The only data protection options I could find were RAID 1 and 10 (RAID 0 is a performance option), and since the chance of data loss when attempting to re-silver a 3TB mirror is about 1 in 5, data protection here is not enterprise quality yet.
The UI stuff is great, but the tricky bit about building a storage system is not provisioning it or getting the access protocols right; it is all about finding all the ways that data can be destroyed (both silently and noisily) and guarding against them. So if you want to stick with the enterprise target, you need something like the ZFS on Linux page, which describes every way you can get data zapped and how you will prevent that from happening.
If you want to be just an off-the-shelf "hey, here's something that will turn your access point into something like a NAS device," then you get to lose data when a disk goes bad, or a memory chip goes bad, or a network cable is loose, or the power supply cuts out, or the cat knocks it off the table, etc.
NetApp tracks it with their NearStore product line, which used SATA drives in a NAS box (they have been for a while actually; when I left they had data on about 65 million drive hours), and while Seagate quotes it at 1 in 10^15 bits, it's actually closer to 5 in 10^15 bits. A 3TB drive has 3x10^13 bits of data (closer to 3x10^14 when you account for track markers and error recovery bits).
If you're bored some time, try reading every sector from one of these drives. To maximize your chance of success, make sure you operate the drive at a slightly warm temperature (keeps the lubricant from sticking) and isolate it from vibration. It's worse if you read it randomly (you will get some arm servo movement anyway, because the drive will have remapped some blocks to spares, but minimizing it also keeps vibration down).
Long before it became an issue on single drives, like it is today, it was an issue when trying to reconstruct a RAID 4 (or 5) group that was 3.5TB (which at the time was a 7-disk RAID group of 0.5TB drives). 14-disk groups (a full shelf) were pretty much guaranteed to see a second error in the shelf during reconstruction, which is also why RAID 6, or dual-parity RAID, became a must-have enterprise feature back in 2005 or thereabouts.
On an interesting side note, because the chance of hitting an unrecoverable read error is evenly distributed through a drive, 3X replication is still recoverable even with intermittent read failures. There isn't really a RAID number for that, but it works reasonably well and avoids a pesky parity calculation if you embed check data in your blocks, as they do in GFS.
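To make that concrete, here is a minimal sketch of the read path such a scheme implies. The block layout and the CRC32 choice are mine for illustration only, not what GFS actually uses:

    import zlib

    def make_block(payload: bytes) -> bytes:
        # Embed check data in the block itself: a CRC32 of the payload.
        return zlib.crc32(payload).to_bytes(4, "big") + payload

    def read_replicated(replica_reads) -> bytes:
        # Try each replica in turn. A noisy failure (exception) or a silent
        # one (checksum mismatch) just means falling through to the next
        # copy -- no parity reconstruction needed.
        for read in replica_reads:
            try:
                block = read()
            except IOError:
                continue
            crc, payload = block[:4], block[4:]
            if zlib.crc32(payload).to_bytes(4, "big") == crc:
                return payload
        raise IOError("all replicas unreadable or corrupt")

    # Toy demo: replica 1 hits a read error, replica 2 is silently
    # corrupted, replica 3 is fine.
    good = make_block(b"a few KB of user data, say")
    corrupt = bytearray(good)
    corrupt[10] ^= 0x01  # flip one bit to simulate a latent media error

    def replica1():
        raise IOError("unrecoverable read error")

    print(read_replicated([replica1, lambda: bytes(corrupt), lambda: good]))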
"Disks protect against media errors by relocating bad blocks, and by undergoing elaborate retry sequences to try to extract data from a sector that is difficult to read [10]. Despite these precautions, the typical media error rate in disks is specified by the manufacturers as one bit error per 1014 to 1015 bits read, which corresponds approximately to one uncorrectable error per 10TBytes to 100TBytes transferred. The actual rate depends on the disk construction. There is both a static and a dynamic aspect to this rate. It represents the rate at which unreadable sectors might be encountered during normal read activity. Sectors degrade over time, from a writable and readable state to an unreadable state."
And experience from the field puts it at about one error per 15TB transferred, so 3TB into 15TB: one in five.
3TB is 3 * 10^12 bytes, assuming the decimal bytes used in the storage industry. The uncorrectable bit error rate is for the raw block storage; it does not include the low-level formatting, which is no more than 20% of the storage on 512-byte-sector drives and less than 10% on Advanced Format drives. The probability of an uncorrectable bit error when copying 3TB (using decimal bytes) is approximately 1.5% under the assumption of a 5 in 10^15 uncorrectable bit error rate:
1 - (1 - 5 * 10^-15)^(3 * 10^12) ~ 0.01488...
If your 20% figure is accurate, the actual uncorrectable bit error rate would need to be something like 7 in 10^14. I am not disputing your empirical information, but your numbers do not agree with it. The difference between what your numbers say and what you say is only about one order of magnitude; doing the statistical calculations with better records could allow the cause of that to be identified.
And to be clear, it is a bit error rate, not a byte error rate. The nominal coding of data on magnetic media is 10 bits per 8-bit byte, although a specific drive may use a different encoding on the platter. The Barracuda used 5120 NRZ-encoded bits per sector plus a 48-bit NRZ-encoded checkword, giving it a nominal 10.094 bits per byte. You're off by one decimal order of magnitude in the number of bits.
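For what it's worth, here is the arithmetic both of you are pointing at, as a small Python sketch. The 5 in 10^15 rate and the 10.094 bits-per-byte figure are the ones quoted above; whether a manufacturer's quoted rate counts user bits or on-platter bits is left open:

    import math

    UBER = 5e-15          # quoted uncorrectable bit error rate (per bit)
    BYTES_3TB = 3e12      # 3TB in decimal bytes

    def p_at_least_one_error(n_bits, per_bit=UBER):
        # 1 - (1 - p)^n, computed via log1p/expm1 so the tiny p survives
        # double-precision rounding.
        return -math.expm1(n_bits * math.log1p(-per_bit))

    print(p_at_least_one_error(BYTES_3TB))           # ~0.0149  exponent taken as bytes (the 1.5% figure)
    print(p_at_least_one_error(BYTES_3TB * 8))       # ~0.113   counting 8 user bits per byte
    print(p_at_least_one_error(BYTES_3TB * 10.094))  # ~0.14    counting ~10.094 on-platter bits per byte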
To avoid the markdown mangling, either use backslashes to escape your asterisks in paragraphs, or surround them with spaces, or indent short lines that have "special" characters by four spaces.
My very first questions regarding a potential storage solution revolve around data loss:
1. Can we enumerate the data loss scenarios?
2. How is drive failure handled?
3. How may data be corrupted and such corruption detected?
4. For every data loss scenario, what is the recovery procedure?
Of course, there is a wealth of information on such questions for standard RAID, but I would suggest for marketing purposes that rockstor synthesize available information (from the many relevant layers of data management) in a coherent fashion, specific to their product. It doesn't have to be deep, but it should be at least minimally comprehensive and broad, with pointers to more detailed, layer-specific information.
Also, it's fine if the recovery scenario is "restore from backup" for e.g. the scenario where data is deleted by an authorized user. If so, there should be at least a minimal "backup story".
the gui looks pretty cool.
personally i would not trust btrfs for a nas. my experience running various production servers with btrfs has not been the best. i switched (back) to zfs and never looked back, it's just better in every regard.
i also administer a freenas box for a small business and this stuff is rock solid, i just wish there were an _easy_ solution to get the permission stuff right in a multi-user setting.
none the less, thumbs up for creating this, cool stuff!
I'm in the process of rolling out btrfs on a lot of production servers (no raid, just subvolumes and compression) using Ubuntu 14.04 - what problems did you encounter with btrfs?
I've hit problems like reaching ENOSPC (even though the data extents were only 70% full) on a colocated server, and then there isn't enough free space to run a balance operation to get more free space. (The docs literally suggest inserting a USB stick and adding it to your array to help make the balance work.)
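For anyone who hits the same wall, this is roughly that workaround, sketched in Python just to annotate the steps. /dev/sdX and /mnt/pool are placeholders, and the -dusage threshold is only an example; read the wiki's notes on balance filters before running anything like it:

    import subprocess

    DEVICE = "/dev/sdX"    # placeholder: the temporary USB stick / spare disk
    MOUNT = "/mnt/pool"    # placeholder: the ENOSPC'd btrfs mountpoint

    def run(*cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # 1. Add the spare device so the allocator has somewhere to put new chunks.
    run("btrfs", "device", "add", DEVICE, MOUNT)

    # 2. Rewrite half-empty data block groups into fewer, fuller ones;
    #    -dusage=50 restricts the balance to data chunks that are <=50% used.
    run("btrfs", "balance", "start", "-dusage=50", MOUNT)

    # 3. Migrate everything back off the spare and drop it from the array.
    run("btrfs", "device", "delete", DEVICE, MOUNT)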
Also, the fsck tool is still very immature. It takes many years to get good at detecting and recovering from corruption.
If you don't already, I strongly recommend lurking on the btrfs mailing list. There are regular fixes to balancing, ENOSPC, send/receive and the btrfs-progs tools; occasional questions and fixes related to the compression code.
Be prepared to update your kernels and tools often and independent of your vendor. Btrfs-progs will likely need to come from the git repo, so building your own packages for distribution around your production nodes will probably be necessary too.
A word of caution: do not run btrfsck without consulting the wiki and mailing list first, and ideally knowing exactly what you are doing. There are situations you'll encounter that do not require btrfsck to repair (other tools are the right fix instead), and running it can potentially make a recovery less likely.
FWIW, I have been watching the list for years, and reading regularly for about 6 months trying to get a sense of stability with respect to the features I want.
I would not put btrfs in production yet. Though, likely soon.. I'd guess another year or so.
Oh my god. The debate was between ZFS and btrfs, and although I favored ZFS, the extra kernel module and the upcoming support in distros led to the decision for btrfs. However, we won't do anything fancy with it: basically just using the whole disk for a distributed filesystem without snapshots. We use btrfs for the checksumming and for weekly/monthly scrubbing to detect corrupt disks and data, and maybe for compression with lzo and subvolumes. As far as I understood, this should be safe?
New kernels should be no problem, as Ubuntu will likely provide an HWE stack in the future and btrfs-tools is in a well-maintained PPA...
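A weekly/monthly scrub is easy to wire up yourself. A minimal cron-able sketch, assuming /data is the mountpoint; exit-code behaviour differs a bit between btrfs-progs versions, so also check the printed summary:

    import subprocess
    import sys

    MOUNTS = ["/data"]  # placeholder: the btrfs mountpoints to scrub

    for mnt in MOUNTS:
        # -B keeps the scrub in the foreground so the summary (and return
        # code) describe the completed run rather than just "scrub started".
        result = subprocess.run(["btrfs", "scrub", "start", "-B", mnt],
                                capture_output=True, text=True)
        print(result.stdout)
        if result.returncode != 0:
            # Alert however you like; here we just fail loudly so cron mails it.
            print(result.stderr, file=sys.stderr)
            sys.exit(f"scrub reported problems on {mnt}")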
I wouldn't use ZoL either -- I read that mailing list for quite a while too, and skimmed most of the issues on GitHub. As of about six months ago, lockups were too frequent for my taste. All the implementations are improving though, and the OpenZFS movement is promising. A caution here too: if you use ZFS, all implementations are not equal; you'll need to research the specifics for each platform on which you intend to use it, and the compatibility with other implementations if you want to move the file system to a different platform. If I were rolling out ZFS, I'd only use it on Illumos/OpenIndiana (vs., say, ZoL).
I have been waiting and watching for a long time for most of these "new" filesystem features (pools, fs-level RAID, checksums, send/receive), but I am a "filesystem conservative" (especially in production; less so on my own machines) -- I'll keep waiting awhile longer. On production Linux today, I stay with EXT4 or XFS.
Btrfs does support raid5/6, I'm using it right now. It is still being refined and has a couple rough edges, but I haven't had any problems in the year or so I've been using it. It is not "production ready" yet for sure, but the support is there.
Perhaps. I have some ideas about search features, and we can also get some cool stats efficiently from btrfs trees. But I'd love to hear your thoughts. Is it possible for you to give more input on the search/indexing you wish to see? You can even write to us directly -- support@rockstor.com -- or file an issue on github: https://github.com/rockstor/rockstor-core
Hmm, stats - don't know much about this topic, but I'd be keen to hear more about what's possible.
On the file indexing front, I think Recoll and Tracker/MetaTracker are the two most active projects - Recoll being the more active one. Strigi and Beagle are both discontinued.
Anyone have suggestions for better servers? I wonder if Rockstor would work well with the Backblaze case. Maybe some of the OCP cases would work. Anyone played with those?
Performance of SMB on Mac is only about half of AFP/NFS, and NFS is more complex to manage from an authorization/user management point of view in a Mac environment.
Thanks for your concern, but we don't see a point in just removing it from git because it doesn't really help. The key is in several branches, in our ISO file, and in every Rockstor RPM in our yum repo, not to mention that lots of users have already downloaded Rockstor.
We changed the key in our live demo, but for our users we'll roll out the fix in the next update. As part of that fix, we'll also remove the key file from git.
I think that's a reasonable plan. Hope I am not missing something.