Comparing Filesystem Performance in Virtual Machines (mitchellh.com)
18 points by zandi on March 2, 2015 | hide | past | favorite | 5 comments


I'm not sure that there's much value in this article.

1) Comparing benchmarks across OS X and Linux isn't really meaningful - you're dealing with different drivers, kernels, and filesystems.

2) Then there's the matter of EXT3 - there is little to no value in benchmarking such an old filesystem on modern hardware, not to mention that EXT3 lacks native TRIM support, which significantly degrades performance over time (this assumes the host OS is also using EXT3 and therefore not passing the discard IOCTL through to the block device).

3) There's no mention of which kernel version was being run; modern kernels (3.6+) are significantly more efficient with both disk and network I/O.

4) How much memory is provided to the host and to the guest VMs, and how much of the 'benchmark' workload is being served from cache? What kind of disk / filesystem caching are they using? What is the I/O scheduler on both the guest and the host machines?

5) What 'benchmarks' were actually run? I'd bet they involve dd, which is by no means a benchmark - nor can its numbers be trusted (especially when the exact commands aren't shown). If you're going to benchmark disks, use fio or bonnie++ (although I think fio is more useful).
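To make point 5 concrete: a naive dd-style write mostly measures the page cache, not the disk, unless you force the data to stable storage. Here's a minimal sketch of the effect (the `timed_write` helper is mine, not from the article; the exact numbers depend entirely on your hardware and caches):

```python
import os
import tempfile
import time

def timed_write(path, data, sync):
    """Time a single write; with sync=True, force the data to stable storage."""
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    try:
        os.write(fd, data)
        if sync:
            # Without fsync, the "throughput" is mostly page-cache speed.
            os.fsync(fd)
    finally:
        os.close(fd)
    return time.perf_counter() - start

if __name__ == "__main__":
    data = os.urandom(16 * 1024 * 1024)  # 16 MiB of incompressible data
    with tempfile.TemporaryDirectory() as d:
        buffered = timed_write(os.path.join(d, "a"), data, sync=False)
        synced = timed_write(os.path.join(d, "b"), data, sync=True)
        print(f"buffered: {buffered:.3f}s  fsync'd: {synced:.3f}s")
```

The gap between the two timings is exactly the caching that an unqualified dd run silently reports as "disk speed" - which is why fio (with `--direct=1` and an explicit I/O engine) exists.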

Here's a good article on how to perform a few kinds of useful benchmarks: https://www.binarylane.com.au/support/articles/1000055889-ho...

Beyond that, I'd also look at using pgbench to get an idea of IOPS in real-world scenarios: http://www.westnet.com/~gsmith/content/postgresql/pgbench-sc...


I have been benchmarking different scenarios for sharing files between hosts and VMs for the past couple of years. For the majority of use cases, the only way to get near-native performance without much hassle is to use Vagrant's rsync support instead of native shared folders, or even NFS, which exhibits enough jitter to be annoying at times.

See: http://www.midwesternmac.com/blogs/jeff-geerling/nfs-rsync-a...

If your project doesn't have hundreds or thousands of files, or if you don't need the fastest filesystem performance, other approaches may be more convenient. But I stick with rsync/rsync-auto for 99% of my VMs, whether using VirtualBox or VMware.
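For anyone who hasn't used it, switching a Vagrant box to rsync is a one-line change to the synced folder type. A minimal sketch (the box name and exclude list are placeholders, not from the article):

```ruby
# Vagrantfile - use one-way rsync instead of VirtualBox shared folders or NFS.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"  # placeholder box

  # Files are copied host -> guest on `vagrant up` / `vagrant reload`;
  # run `vagrant rsync-auto` in a separate terminal to watch for changes
  # and re-sync automatically.
  config.vm.synced_folder ".", "/vagrant",
    type: "rsync",
    rsync__exclude: [".git/"]
end
```

Because the guest ends up with ordinary local files, reads and writes inside the VM run at native disk speed; the trade-off is that syncing is one-way and not instantaneous.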


One point that might be hidden in the OP is that hypervisors are often configured to lie about whether data has been committed to disk. This is certainly true of most public clouds. As a distributed file system developer I run a lot of I/O tests in the cloud, and I've gotten many results that are impossible to explain any other way. People I know who work at some of these providers have practically admitted it. In a cut-throat industry, any provider who actually did the right thing here would get hammered on performance and price (because they'd be unable to pack as many instances onto each host as their competitors do). It's something to be aware of when you run any data-intensive application in a public cloud, or when you're configuring hosts in a private one.
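One rough way to sanity-check this yourself is to measure per-fsync latency of small synchronous writes. This is a heuristic I'm sketching here, not the parent's method (the `fsync_latencies` helper is hypothetical): a durable flush on spinning media typically costs milliseconds, so a median far below that suggests the "flush" is being absorbed by a volatile cache somewhere in the stack.

```python
import os
import statistics
import tempfile
import time

def fsync_latencies(path, iterations=50):
    """Measure the latency of fsync after each small synchronous write."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT)
    samples = []
    try:
        for _ in range(iterations):
            os.write(fd, b"x" * 4096)
            start = time.perf_counter()
            os.fsync(fd)  # ask for the data to be committed to media
            samples.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return samples

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        lat = fsync_latencies(f.name)
        print(f"median fsync: {statistics.median(lat) * 1e6:.0f} us")
```

Interpreting the result takes care - SSDs, battery-backed controllers, and legitimate write caches also produce fast fsyncs - so a low number is a reason to ask questions, not proof of lying by itself.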


I don't know if it's possible to nail down a vendor for comment, but are we able to back this up? It would be good to know for cases like running an ACID database.


Note that these are from January, 2014. YMMV.



