Single-monitor is a common way to run Ceph. On top of that, many cluster configurations cause the whole thing to slow to a crawl when a very small minority of nodes go down. Never mind packet loss, bad switches, and other sorts of weird failure mechanisms. Ceph in general is pretty bad at operating in degraded modes. ZFS and systems like Tectonic (FB) and Colossus (Google) do much better when things aren't going perfectly.
Do you know how many administrators CERN has for its Ceph clusters? Google operates Colossus at ~1000x that size with a team of 20-30 SREs (almost all of whom aren't spending their time doing operations).
> Our Configuring ceph section provides a trivial Ceph configuration file that provides for one monitor in the test cluster. A cluster will run fine with a single monitor; however, a single monitor is a single-point-of-failure. To ensure high availability in a production Ceph Storage Cluster, you should run Ceph with multiple monitors so that the failure of a single monitor WILL NOT bring down your entire cluster.
This is complete nonsense. No one running business-critical installs of Ceph runs single-monitor.
You can also tell Ceph to use a single disk as your failure domain. No one does that either. Homelabbers maybe, but then why are you comparing such setups with Google?
We run Ceph with a failure domain of an entire rack. We can literally take down (scheduled or unscheduled) an entire rack of 40 servers and continue to serve critical, latency-sensitive applications with no noticeable performance loss.
We have a Ceph footprint 5x larger than CERN's, run by a team of 4-5 people.
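To make the rack-as-failure-domain idea concrete, here is a toy sketch. It is not CRUSH and not Ceph's API; `place_replicas`, `surviving_copies`, the rack/host names, and the 3-replica count are all made up for illustration. The only property it shares with a rack-level placement rule is that no two copies of an object land in the same rack, so losing any one rack costs at most one copy.

```python
import hashlib


def place_replicas(object_id, racks, replicas=3):
    """Toy stand-in for a rack-level placement rule (hypothetical, not CRUSH).

    Picks `replicas` distinct racks by hash score, then one host in each,
    so no two copies of an object ever share a rack.
    """
    if replicas > len(racks):
        raise ValueError("need at least as many racks as replicas")

    def score(name):
        # Deterministic pseudo-random score per (object, name) pair.
        return hashlib.sha256(f"{object_id}:{name}".encode()).hexdigest()

    chosen_racks = sorted(racks, key=score)[:replicas]
    return [(rack, min(racks[rack], key=score)) for rack in chosen_racks]


def surviving_copies(placement, failed_rack):
    """Copies still reachable after an entire rack goes away."""
    return [copy for copy in placement if copy[0] != failed_rack]


if __name__ == "__main__":
    # Assumed layout: 5 racks of 40 hosts each.
    racks = {f"rack{r}": [f"rack{r}-host{h}" for h in range(40)] for r in range(5)}
    worst = min(
        len(surviving_copies(place_replicas(f"obj{i}", racks), "rack0"))
        for i in range(1000)
    )
    print("fewest surviving copies after losing rack0:", worst)
```

With 3 copies in 3 different racks, every object keeps at least 2 copies when a whole rack disappears, which is why reads and writes can keep going while the cluster re-replicates in the background.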
There are different levels of scalability needs. CERN has over a dozen (Ceph) clusters with over 100PB of total data as of 2023:
* https://www.youtube.com/watch?v=bl6H888k51w
Certainly there are some number of folks that need more than that, but I don't think there are many.
> Like Ceph it is also vulnerable to single points of failure.
The SPOF for ZFS is the host (unless you replicate, e.g., zfs send).
What is the SPOF of Ceph? You can have multiple monitors, managers, and MDSes.
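For the monitor part specifically, the arithmetic is simple: monitors need a strict majority to form a quorum, so the number of monitor failures you can ride out follows directly from the count. A minimal sketch (the function names are mine, not anything from Ceph):

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of `monitors`."""
    if monitors < 1:
        raise ValueError("need at least one monitor")
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can be down while a majority is still up."""
    return monitors - quorum_size(monitors)


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(f"{n} monitor(s): quorum={quorum_size(n)}, "
              f"can lose {tolerated_failures(n)}")
```

One monitor tolerates zero failures, three tolerate one, five tolerate two. An even count buys nothing over the odd count below it (four monitors still tolerate only one failure), which is why odd numbers are the usual recommendation.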