
Unsurprising. Red Hat hasn't hired upstream Btrfs developers in years, whereas SUSE has hired several. Meanwhile, Red Hat employs upstream ext4, XFS, and LVM developers.

If you're going to support a code base for ~10 years, you need upstream people to support it. And realistically, Red Hat is comfortable putting all its eggs in the device-mapper, LVM, and XFS basket.

But, there's more: https://github.com/stratis-storage/stratisd

"Btrfs has no licensing issues, but after many years of work it still has significant technical issues that may never be resolved." (page 4)

"Stratis version 3.0: Rough ZFS feature parity. New DM features needed." (page 22) https://stratis-storage.github.io/StratisSoftwareDesign.pdf

Both of those are unqualified statements, so fair or not, my inclination is to take the project with a grain of salt.




As for the significant technical issues, one is the core decision to make it a CoW system, which has fundamental performance problems with many workloads, exactly the ones common in the server space. You can disable CoW, but then you lose many of the reasons to use Btrfs in the first place.

When I gave up on it, there were also fundamental issues with metadata vs. data balancing, not-really-working RAID support, and so on...


I find the suggestion that the technical issues are caused by the CoW design a bit strange.

Sure, making the filesystem CoW-based has some inherent costs, but it allows implementing some interesting features (e.g. snapshots) much more efficiently. For example, if you want snapshots with ext4/XFS, you'll probably do that using LVM (which effectively turns the stack into a CoW one). In my experience, creating a snapshot on ext4/LVM costs about 50% of performance, so you cut throughput in half. On ZFS the impact is mostly negligible, because the filesystem is designed as CoW in the first place.
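
To make the comparison concrete, here's a minimal sketch of the two approaches; the volume group, LV, and pool names (vg0, lv_data, tank/data) are hypothetical:

    # Sketch: creating a snapshot on LVM vs. ZFS. All device, volume
    # group, and pool names here are hypothetical.
    import subprocess

    def lvm_snapshot():
        # Classic (thick) LVM snapshot: after this, the first write to
        # any block on the origin LV must first copy the old block into
        # the snapshot area, roughly doubling the write I/O.
        subprocess.run(
            ["lvcreate", "--snapshot", "--size", "10G",
             "--name", "lv_data_snap", "vg0/lv_data"],
            check=True)

    def zfs_snapshot():
        # ZFS snapshot: just pins the current block-pointer tree. Later
        # writes are CoW anyway, so the snapshot adds no extra copying.
        subprocess.run(["zfs", "snapshot", "tank/data@before"], check=True)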

And thanks to ZFS we know it's possible to implement a CoW filesystem with extremely stable and balanced performance. I've done a number of database-related tests (the workload I care about), and it did ~70-80% of the TPS of ext4/XFS (without snapshots). And once you create a snapshot on ext4/XFS, performance tanks, while ZFS keeps working just as before, thanks to the CoW design.
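
A rough sketch of that kind of test (the database name, client/thread counts, and snapshot command are placeholders, not the original setup):

    # Sketch: measure pgbench TPS before and after taking a snapshot.
    # Database name, client/thread counts, and pool name are placeholders.
    import re
    import subprocess

    def run_pgbench(seconds=60):
        out = subprocess.run(
            ["pgbench", "-T", str(seconds), "-c", "8", "-j", "4", "bench"],
            capture_output=True, text=True, check=True).stdout
        # pgbench prints a summary line like "tps = 1234.56 (...)"
        return float(re.search(r"tps = ([\d.]+)", out).group(1))

    tps_before = run_pgbench()
    subprocess.run(["zfs", "snapshot", "tank/pgdata@mid"], check=True)
    tps_after = run_pgbench()
    print(f"before: {tps_before:.0f} tps, after: {tps_after:.0f} tps")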

Unfortunately, BTRFS hasn't reached this level of maturity and stable performance so far (at least not in the workloads I personally care about). But that has nothing to do with the filesystem being CoW, except perhaps that CoW makes the design more complicated.


Didn't one of your benchmarks show that nodatacow on Btrfs resulted in a major performance improvement? Though that might just point to an issue with Btrfs's CoW implementation rather than with CoW in general.


Yes, I've done some tests on BTRFS with nodatacow, and it improved the performance and behavior in general. Still slower than XFS/EXT4, but better than ZFS (with "full" CoW).

But as you mention, that does not say anything about CoW filesystems in general. It merely hints that the BTRFS implementation is not really optimized.

FWIW, while I do a lot of benchmarks (both out of curiosity and as part of my job, when evaluating customer systems), I've learned to value stability and predictability over raw performance. That is, if the system is 20% slower but behaves in a stable and predictable way, that's probably OK. If you really need the extra 20%, you can probably get it by adding a bit more hardware, and that's cheaper than switching filesystems etc. (Sure, if you have many such systems, that changes the formula.)

With EXT4/XFS/ZFS you can get that - predictable, stable performance. With BTRFS not so much, unfortunately.
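
As a toy illustration of what "predictable" means here (the numbers below are invented, not benchmark results): a system with a lower mean but no stalls can be preferable to a faster one with deep dips.

    # Toy illustration: predictability vs. raw throughput.
    # Both per-interval TPS series are invented numbers.
    import statistics

    steady = [800] * 60                  # ~20% slower, but flat
    spiky = [1000] * 50 + [200] * 10     # faster on average, deep stalls

    for name, tps in (("steady", steady), ("spiky", spiky)):
        mean = statistics.mean(tps)
        worst = min(tps)
        print(f"{name}: mean={mean:.0f} tps, "
              f"worst interval={worst} tps ({worst / mean:.0%} of mean)")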


>it allows the filesystem to implement some interesting features (e.g. snapshots) in a more efficient way.

Interesting features are worthless when reading and writing data is prohibitively slow, or when there are documented cases where updating a file in a random-access manner can balloon its storage requirement to blocks^2.


There's a write magnification effect when using CoW. The ZIL helps with this because the ZIL itself is not CoW'ed, and it allows deferring writes, which lets more transactions share interior metadata blocks, reducing the write magnification multiplier. I don't see where you get O(N^2) from.
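
A back-of-envelope sketch of that magnification for random small writes (the sizes are common defaults, not measured figures):

    # Back-of-envelope CoW write magnification for random small writes.
    # 8 KiB is a typical database page, 128 KiB a typical ZFS recordsize.
    page = 8 * 1024        # the application rewrites one 8 KiB page
    record = 128 * 1024    # a CoW filesystem rewrites the whole record
    print(f"data-path magnification: {record // page}x")  # 16x
    # On top of that, each transaction rewrites the modified interior
    # metadata blocks; batching many writes per transaction (which the
    # ZIL enables by deferring them) lets those blocks be shared,
    # shrinking the per-write metadata multiplier.

(In practice one would tune the record size down to the page size for a database, which shrinks the data-path factor; the sketch just shows where the multiplier comes from.)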

As for snapshots, who cares: they cost nothing to create and they do not slow down writes. They only slow down things like zfs send (linearly), and they cost some storage over time, but not much more.


You are confusing storage requirements with write amplification (which is another downside). They're totally different.


Are you suggesting that's a problem with CoW in general, or with the BTRFS implementation specifically?

I would say ZFS works extremely well (at least for the workloads I care about, i.e. PostgreSQL databases, both OLTP and OLAP). I know of companies that migrated to FreeBSD specifically to benefit from this, back when "ZFS on Linux" was not as good as it is today.


Unconvincing. LVM's snapshots are CoW, whether thick or thinly provisioned. And while not yet merged in mainline, XFS devs are working on CoW as well, which is used when modifying reflinked files (shared extents).

Btrfs behaves basically like that with 'nodatacow' today: it will overwrite extents in place as long as there's no reflink/snapshot. If there is one, new writes are CoW'ed, and subsequent modifications are overwrites again until the next reflink/snapshot, at which point CoW happens again.

The 'nodatacow' behavior can be enabled either as a mount option, or selectively per subvolume, directory, or file via the no-CoW file attribute. And in all cases, metadata writes (the file system itself) are still CoW.
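
For concreteness, a sketch of both ways to apply it (the device and paths are hypothetical):

    # Sketch: the two ways to disable data CoW on Btrfs.
    # Device and directory paths are hypothetical.
    import subprocess

    # 1. Filesystem-wide, via the mount option:
    subprocess.run(
        ["mount", "-o", "nodatacow", "/dev/sdb1", "/mnt/data"],
        check=True)

    # 2. Selectively, via the no-CoW file attribute ('C'). It must be set
    #    on an empty file or on a directory (new files inherit it):
    subprocess.run(["chattr", "+C", "/mnt/data/pgdata"], check=True)

    # Verify: the 'C' flag should show up in the attribute list.
    subprocess.run(["lsattr", "-d", "/mnt/data/pgdata"], check=True)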




