A Word on Network Storage Accessibility and Performance

Introduction

You've got yourself some storage attached to your gigabit network - now what? How do you make use of it? How do you access it?

There are several ways in which you can access that storage. Or should I say there are several protocols and software packages which enable you to access that storage. Each has advantages and disadvantages in terms of performance and convenience. This is what this article is about.

If you transfer one small file over the network, the key point in choosing one of these technologies is convenience. If that file is very large (maybe a film, a CD or a DVD) you'll perform sequential access. Most of these technologies offer very good performance for sequential access (with some exceptions though - see below) provided that the hardware (network card, CPU and HDD) can sustain the load. In this scenario the choice again comes down to convenience.

The most interesting part though is when you have to transfer lots of files (maybe small ones). You may want to copy all your documents somewhere else or maybe move your photo or mp3 collection. Or if you're a brave one you might want to compile the kernel tree residing on your network attached storage. There are several real world scenarios that involve accessing a large number of files as fast as possible. This is the most interesting scenario because there are big performance differences between the various technologies.

We all love benchmarks. Those little numbers that tell you that what you bought is 3% faster than the competition. You may never notice the difference but the important part is that you got the best. In our case the performance and convenience differences are so big that you don't really need lab benchmarks to see them. This is why there won't be many numbers in this article.

However we did perform many tests. The tests involved copying a large tree of relatively small real files to and from our tumaBox over and over again. We used our /usr and /usr/lib partitions - that's right: real files from the real world. Of course we took the necessary precautions, such as clearing caches, in order to avoid fake performance improvements. The reasons we don't put the actual numbers in the article are: 1. we don't think it's important whether you get 50 or 52.5 percent difference (or even 20 or 30 percent) - the important thing is that there is a huge performance difference; 2. even though we took precautions to get real results we didn't perform a scientific benchmark by the book. So feel free to throw your stones and discard everything we say here.

File Level Sharing

This is the first type of sharing storage over the network. It basically means that the server exposes the entire shared tree of directories and files to the clients, and client computers see the network attached storage as a tree of directories and files. They can then issue commands to the server to read or write those files.

The key feature of this type of sharing is convenience. The protocols are very well supported by all major operating systems and you can mount the entire remote storage tree on your local computer and access it as a local resource. One other important feature of this type of sharing is that many clients can read and write remote files at the same time. The server takes care of keeping the data consistent all the time. However this comes at a cost because this mechanism implies locking, something that is done transparently by the server but which decreases performance, most notably in our scenario: accessing a very large number of relatively small files.

SMB / CIFS

This is one of the most used protocols. Chances are that you have already used it without even knowing it. It's the protocol used by default by all Windows variants (although in different flavours) but it's also accessible on any other major operating system. On Windows it's very easy to use: if you access something from your “Network Neighborhood” you're already using it. On Windows you can also mount it as a drive (D:, X: or whatever you like). You can also mount it and access it in a variety of ways on all other major operating systems. This is pretty much as easy as it gets.
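
As an illustration, on a Linux client you can mount such a share with the cifs-utils package roughly like this (the server name, share name and user below are made-up placeholders):

  # mount an SMB/CIFS share under /mnt/share, mapping remote files to the local user
  sudo mount -t cifs //tumabox/share /mnt/share -o username=alice,uid=$(id -u),gid=$(id -g)
  # unmount when done
  sudo umount /mnt/share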

The performance for sequential transfers (large files) is quite good - you can easily saturate your gigabit network card or even your old HDD. The downside is that accessing lots of small files is slow compared with the other alternatives. In this scenario you might have difficulties saturating even a 100 Mbps ethernet card, and not because of the HDD. The performance problems come from the inherent network latencies combined with the protocol ping-pong over the network and the locks that need to be acquired and released.

The main competitor for SMB/CIFS is NFS (see below) - the competitor from the Unix world. There's a continuous battle over which has better performance. From our experience it depends on the actual conditions. Over the years we got mixed results with different hardware, protocol variants and configuration options. But in our current particular tests SMB performed much worse on writes (think 30-40%) and a bit worse on reads (think around 10%) than NFS.

You may say that this is a huge performance difference but it doesn't matter much to us. We use our computers and our tumaBox heavily, and on a daily basis we don't see this difference as having an impact on our productivity. Throw a different bunch of files at them and the results may be very different. We don't usually copy /usr for our work.

What is more important from our point of view is the difference in how the files are accessed. Yes, you can mount both SMB and NFS in your local tree, but with SMB you mount the remote tree in the name of some user (by supplying a username and password) and then all operations you perform on the remote files appear as performed by that user. You can see this as a security feature or it may get you into a permission nightmare.

NFS

This is the alternative to SMB from the Unix world. With NFS you mount the remote tree in the local tree and the local user permissions are enforced on the remote tree as well. From the local user perspective it's really like adding another directory with some extra storage. It's a very convenient way of using remote storage on Unixes and Linuxes and all their siblings, but one may argue that this might have some security implications. Knowing these implications we can take the necessary precautions to make NFS a safe thing to use.
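
As a rough sketch (the export path, network and hostname are placeholders), sharing and mounting a tree over NFS on Linux looks something like this:

  # on the server: describe the export in /etc/exports, then apply it
  #   /srv/data 192.168.1.0/24(rw,sync,no_subtree_check)
  sudo exportfs -ra

  # on the client: mount the export into the local tree
  sudo mount -t nfs tumabox:/srv/data /mnt/data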

On the performance level NFS is very similar to SMB. Again single large file transfers can saturate the network or the HDD. Again transferring lots of small files has poor performance. In our tests it did perform better than SMB, especially on the write side, but again it may depend on the particular setup and scenario. The reason why we don't care much about the performance difference between SMB and NFS is that they both perform so much worse on small files than the other alternatives. So if you have to transfer lots of small files stay away from both.

SFTP (rsync, scp, FUSE)

All of these rely on ssh, which (as the name suggests) was meant to provide secure shell access. But it actually provides much more than this: encryption for transferring files, forwarding internet traffic and so on. SFTP is the part that provides file transfers with ftp-like access. SCP is a very simple way to transfer files. FUSE is actually a piece of software to mount filesystems in userspace. What it actually means is that by writing a FUSE plugin you can make any filesystem available without modifying the kernel (or writing a driver). It has an SFTP plugin that can be used to mount remote SFTP filesystems just as you would with SMB. Last but not least, rsync is a very powerful program that can transfer files over ssh. The power comes from its flexibility and features. It is a much more powerful solution than scp or sftp.
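
To give a feel for how these are used, here are the basic invocations (hostnames, users and paths are placeholders):

  # scp: simple one-off copy of a directory
  scp -r ./photos alice@tumabox:/srv/backup/
  # sftp: interactive, ftp-like session
  sftp alice@tumabox
  # rsync over ssh: transfers only what changed, preserves permissions and timestamps
  rsync -a ./photos/ alice@tumabox:/srv/backup/photos/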

Provided you have a powerful CPU, transfer over ssh is the fastest solution available. And by that I mean not only large file transfers but especially small files. When using sftp transfer with on the fly compression you can even achieve transfer rates higher than the actual maximum network speed (for highly compressible data). On the fly compression is a standard feature of ssh which is absent by default in all the other solutions. Add to this that all transfers are encrypted by default and this solution becomes a clear winner on the security side as well. None of the other solutions provide encryption by default and some lack much more than this on the security side.
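
Enabling that compression is just a flag away (-C) with the ssh family of tools, for example:

  # on the fly compression for a one-off copy of highly compressible data
  scp -C -r ./logs alice@tumabox:/srv/backup/
  # the same flag works for sftp; "Compression yes" in ~/.ssh/config makes it permanent
  sftp -C alice@tumabox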

In our particular set of tests transferring the /usr data with rsync was 10 times faster than SMB or NFS. With SSHFS (FUSE with the SFTP plugin) we achieved only about 5 times the performance of SMB or NFS. In our tests we didn't use compression. If you transfer even smaller files the gap compared with SMB or NFS is expected to grow even more. The tremendous performance is achieved because of the protocol design but there's no point in going into details about this.

There are some caveats though. The first is that, if you read carefully, you noticed something about CPU power. All transfers over ssh are encrypted - this was the whole purpose of ssh in the first place. Encryption requires a lot of CPU power. And it gets even better: there are 2 CPUs involved in each transfer - the server one and the client one. If either one does not have enough power to sustain the transfer speed it becomes the bottleneck and limits the entire transfer speed according to its power. Some newer CPUs help in this situation because they provide AES hardware acceleration (the AES-NI instructions), AES being the most popular encryption block cipher. When you use such CPUs both on the server and the client the performance is expected to increase tremendously when using AES encryption. If at least one of the CPUs does not support this instruction set, or you want to use a different encryption algorithm, then your CPU might easily become the bottleneck.
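
You can quickly check whether a Linux machine has AES-NI and how much it helps (these are generic commands, not tied to any particular setup):

  # look for the aes flag in the CPU feature list
  grep -m1 -o aes /proc/cpuinfo
  # compare plain software AES with the accelerated path (used when AES-NI is present)
  openssl speed aes-128-cbc
  openssl speed -evp aes-128-cbc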

One popular way to ease the pain when you don't have AES-NI on both sides is to use the Arcfour cipher aka RC4 (either arcfour, arcfour128 or arcfour256 in the ssh Ciphers option - the performance difference between them is marginal). All benchmarks point to this being the fastest ssh cipher without hardware acceleration. We used it ourselves in our tests (arcfour128). The downside in this case is that arcfour is currently considered insecure. This doesn't mean that any kid can instantly see your traffic, but don't count on the NSA wasting too much time on decrypting your transferred files.
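
A transfer along the lines of our tests would look something like this (paths and host are placeholders; note that recent OpenSSH releases have dropped the arcfour ciphers entirely, so this only works with older versions):

  # rsync a tree over ssh using the lightweight arcfour128 cipher
  rsync -a -e "ssh -c arcfour128" /usr/ alice@tumabox:/srv/test/usr/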

When you don't necessarily want the protection offered by ssh encryption (maybe you're on the local network in your bunker) you might want to drop it altogether and keep only the tremendous speed. Unfortunately some people didn't consider this to be a very good idea, so there's currently no way in official ssh to drop encryption entirely. There are however some third party patches for this which you can use at your own risk.

When we get to the accessibility aspect there's only one way to compare the SSH family to SMB or NFS: SSHFS over FUSE. This is the only way to mount your remote tree in your local tree over ssh and enjoy the performance (of powerful CPUs). All the other ways (rsync, scp, sftp) are linux commands which provide an entirely different experience. They're great for transferring a bunch of files fast but not for every day browsing. There are also file manager plugins and even standalone products that you can use to browse and transfer files in a quick and convenient way. But the key difference with these is that the remote files and folders are perceived locally as remote resources. This means that if, for instance, you click on a film in these file managers the entire film will first be copied locally (don't hold your breath - it will take quite some time) and only then will it start playing. With all the other systems (SMB, NFS, SSHFS etc.) when you click on that film it starts playing immediately and data is transferred in the background as needed while you watch it.
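
Mounting and unmounting with SSHFS is as simple as this (user, host and paths are placeholders):

  # mount the remote tree over ssh via FUSE
  sshfs alice@tumabox:/srv/data /mnt/data
  # unmount the FUSE filesystem when done
  fusermount -u /mnt/data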

On top of this, FUSE is not available on all operating systems (although there are several attempts to implement something similar). When it comes to file manager plugins or standalone programs you will probably find several of them for your operating system of choice, but you won't get the user experience consistency you get from something that mounts as a native resource.

SSHFS does a good job but using it on a daily basis seems cumbersome to us. First of all its permission handling looks more like SMB (but with some notable differences). You are logged in on the remote machine and perform all actions in the name of that user. Furthermore this means that it is difficult to share the remote tree with other local users: they each have to mount the remote tree in their own user space using their own credentials. And when you look at how it handles connection problems you soon start to look for alternative solutions. NFS is no stranger in this department either, but the chances of some eventual recovery are somewhat bigger if it's your lucky day.

Block Level Sharing

In this type of sharing the server exports the local resource as a block device. This means that the server has no knowledge of or interest in the actual files or directories. A block device can be a partition, an entire HDD or SSD, or even a file that is seen by the operating system exactly like a partition or a drive. The resource is exported this way to the client and it's the client's responsibility to do something with it in any way it sees fit. This is supposed to have a speed advantage because there are fewer operations involved and the architecture is simpler, but it transfers much of the responsibility of managing the actual files and directories to the client.

This means that the client is responsible for concurrency. Imagine if two clients write at the same time to the same shared block device. It is the clients' responsibility to ensure that data doesn't get corrupted on the target device. And this brings us to the filesystem problem. Normal filesystems (eg: ext4, ntfs etc.) are designed for a single user (in this context user actually means a single computer that accesses the device at any given moment). If you mount the same block device on two different computers at the same time and you use one of these filesystems, sooner or later you will get data corruption because there is no mechanism in these filesystems to ensure concurrency.

This is why there are several cluster filesystems which were designed for that, along with other goodies that they provide. The trouble with these cluster filesystems is that they are more difficult to set up and maintain and they introduce the overhead that we were trying to get rid of. We won't talk about these filesystems because they're beyond the scope of this article.

In our tests we used ext4 with a single client. You may see this as pointless at first but when you think about it it's not quite so. We saw that large single file transfers can usually saturate the gigabit network with NFS or SMB and even SFTP (on powerful hardware). But imagine the scenario in which you want to put your entire local filesystem on the remote server. That's what NASes should be about. Imagine for instance that you want to build a thin client with powerful processing and you want to get rid of the HDDs inside (maybe to get rid of the noise or the case size or whatever). Then you should put your entire local tree on the NAS because that's what it's there for. An entire tree of files and folders more often than not contains lots and lots of relatively small files. When you just access a single file at a time you can get away with SMB or NFS. But when you want to copy a large directory it might take… forever. This is when block level sharing may come in handy. The problem is that you won't be able to write from another client at the same time. You should get away with mounting read-only though, and thus be able to read at the same time from the same block share on different clients. Depending on your particular usage scenario this might be just what you need.

This type of sharing is also useful if you want to run virtual machines with storage on the NAS. Mounting the NFS or SMB tree locally and then creating a loopback block device on it for the virtual machine might not be such a good idea. Having a remote block device should provide better performance. The actual resource could be exactly the same file on exactly the same HDD; only the mode of access differs. Another great thing is that by using sparse files you can easily get thin provisioning.
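
Creating a sparse file to export as a block device is a one-liner; only the blocks actually written consume space on the NAS (the file name and size are arbitrary):

  # create a 100 GB sparse file that allocates almost no space up front
  truncate -s 100G /srv/images/vm1.img
  # du shows the real usage, ls -lh the apparent size
  du -h /srv/images/vm1.img; ls -lh /srv/images/vm1.img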

One popular block level sharing protocol that we won't discuss is FC (Fibre Channel) and its flavour FCoE. It's usually popular in the enterprise sector (which is beyond the scope of this article) and it's supposed to deliver very good performance. On the other hand it's reportedly difficult to set up.

iSCSI

iSCSI is by far the most used block sharing protocol, except maybe FC and its variants. The concept is pretty simple: transfer SCSI commands over an IP network. So the client sees and accesses the remote block device just like a regular SCSI disk. But don't be fooled by the name: you don't necessarily have to use a SCSI device; it works well with SATA or even files.

It's reportedly simple to set up, but compared with the other 2 block sharing protocols that we analyze it's actually the most difficult to set up. Its main advantage over the other two is that it is well supported by all major operating systems. Its main disadvantage is performance. It's supposed to be fast and it looks good on paper but in real life it's actually quite slow, at least for random access.
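
On the client (initiator) side with open-iscsi the attach sequence goes roughly like this (the portal address and target name are made up; setting up the server-side target is the part that takes more effort):

  # discover the targets offered by the server (portal)
  sudo iscsiadm -m discovery -t sendtargets -p 192.168.1.10
  # log in to a discovered target; a new block device (e.g. /dev/sdb) appears
  sudo iscsiadm -m node -T iqn.2014-11.lan.tumabox:storage.disk1 -p 192.168.1.10 --login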

NBD

NBD (Network Block Device) is by far not as popular as iSCSI or FC although it has been in the linux kernel for quite some time. It relies on TCP/IP so in this regard it's closer to SMB, NFS and SFTP. For sequential transfers it has the best performance of the three block sharing protocols we analyzed and it can saturate the gigabit network. And even for random access (eg: small files) it's much better than iSCSI. And on top of it all it's the easiest to set up. As a matter of fact I don't think it could get any easier than this.
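
A minimal sketch of how simple it can be (file, host, port and device are placeholders; newer nbd-server releases expect a config file with named exports instead of this classic command line form):

  # on the server: export a file (or partition) on TCP port 10809
  nbd-server 10809 /srv/images/disk.img

  # on the client: attach it as a local block device and use it like any other disk
  sudo modprobe nbd
  sudo nbd-client tumabox 10809 /dev/nbd0
  sudo mkfs.ext4 /dev/nbd0          # first time only
  sudo mount /dev/nbd0 /mnt/remote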

But there is a problem with it: it doesn't handle connection failures gracefully. If your connection dies you're on your own. The resource might become available again but the client won't be able to recover and use it. And when you consider that this is actually a block device in the eyes of the operating system you really get into a nightmare.

Another problem is that it's not well supported on many operating systems. It's very well supported on linux but on other operating systems you might have to rely on third party tools, and even those don't seem to handle the full feature set.

AOE

AOE (ATA over Ethernet) comes with a different approach: instead of relying on TCP, UDP or even IP, it relies directly on the Ethernet protocol. Again don't let names fool you: you don't necessarily have to use ATA drives and it works very well on gigabit ethernet. Of course getting rid of the higher level network protocols is supposed to get rid of some overhead, so it's supposed to add more speed. The concept is actually quite simple: encapsulate ATA commands in ethernet frames.

When it comes to random transfers (eg: small files) it is by far the fastest of them all, except for good old rsync (on a powerful CPU). It's faster than NBD and a lot faster than iSCSI. I won't even compare it with SMB or NFS.

But there is always a but. In our tests we weren't able to exceed 650 Mbps with sequential access on a gigabit network with a 1500 MTU. Many people recommend using jumbo frames to improve performance. This is a test we didn't actually do because it wouldn't help in our particular usage. Remember that we did these tests for ourselves and decided to share our experience with you - not the other way around.

And there's another thing that you may not have noticed: working on top of ethernet means that it is only accessible on the ethernet (local) network by design. You won't be able to access the AOE share over the internet for instance, something that is possible with all the other protocols. This may be a deal breaker or it may be a security feature - it depends on your particular needs.

AOE is very easy to set up and it's supported via third party software on many operating systems. Linux supports it natively of course.
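
For example, with vblade on the server and the aoe kernel module on the client it boils down to this (shelf/slot numbers, interface, file and mount point are placeholders):

  # on the server: export a file as AOE shelf 0, slot 1 on eth0
  sudo vblade 0 1 eth0 /srv/images/disk.img &

  # on the client: load the aoe driver; the device shows up under /dev/etherd/
  sudo modprobe aoe
  sudo mount /dev/etherd/e0.1 /mnt/remote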

Web interface and WebDAV

There is a trend of using web interfaces (in a web browser) for accessing and sharing files: think cloud. There are many instances in which you may prefer this type of access. But there won't be any talk about performance here: it's all about convenience. And for real tasks even convenience and accessibility might not be that good.

But there's also WebDAV: a protocol supposed to make all these files and directories shared over HTTP accessible as a normal tree with normal tools. You might even be able to mount such a tree. But again, don't think about performance here. It's OK for a single large file, but when you want to copy a tree of many small files you should look for something else.
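
If you do want to mount a WebDAV share as a directory on Linux, davfs2 is one common option (the URL and mount point are placeholders):

  # mount a WebDAV share with davfs2 (it will prompt for credentials)
  sudo mount -t davfs https://tumabox/webdav/ /mnt/dav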

What do we use

Actually we use almost all of them. The trees in which we keep all our data (which we need to share among us) are shared with NFS. We also use SMB on the rare occasions when we need to access files from Windows computers or Android devices. We use Owncloud (web interface) to share data with other people. We use rsync over ssh to do regular backups and when we have to move large trees around. We have files exported as block devices with AOE to make our computers thinner: think of moving the local HDD (which you don't want to share because all the data to be shared is already on the tumaBox NFS) to the tumaBox. On rare occasions we use NBD or iSCSI (depending on network reliability) to access the same block devices from the internet (over VPN of course).

So the bottom line is: each tool has its purpose.
