Originally TRIM was an un-queued command; all writes had to be flushed, then TRIM executed, then writes could continue. This was bad for performance with automatic on-file-delete trim, so everyone wanted a trim command that could be put in the command queue along with writes. Many new drives have this.
It turns out that Samsung 8XX SSDs advertise they support queued trim but it's buggy. The old TRIM command works fine.
https://lkml.org/lkml/2015/6/10/642
There are in fact lots of "quirks lists" and "blacklists" in the kernel and virtually all computers require some workarounds in the linux kernel for some buggy hardware they have. Pretty amazing when you think about it.
EDIT: another closely related example is MacBook Pro SSDs and NCQ, aka native command queuing. They claim they support it, but on many it's buggy. It gets better though; the linux kernel only started trying to use such functionality by default relatively recently.
https://bugzilla.kernel.org/show_bug.cgi?id=60731
These sorts of things are, as you can see, very confusing and frustrating to track down, identify, and find a general fix for.
EDIT2: now that I've actually read the kernel bugzilla entry further, it's more recently come to light that the actual problem with recent MacBook Pro SSDs is MSI (message-signaled interrupts, a more efficient type of interrupt).
The thing is, almost all hardware accessed through drivers has tons of bugs; at the very least, it's nowhere near as "bug-free" as things like CPUs or DRAM, which cannot hide their bugs behind drivers. What one can hope will work reasonably is a piece of hardware plus an accompanying driver that knows how to hide that hardware's issues.
So another way of putting what you said would be "on Linux there's no working driver for that piece of hardware, unlike on Windows where the 'proprietor' went to the trouble of supplying such a driver."
I didn't see him thinking that. Just that CPUs do not have as many bugs as other hardware - which I think is quite true. With CPUs a larger portion of bugs are found, and smaller bugs matter because they are not hidden by proprietary drivers.
FWIW, memory has plenty of bugs too. With respect to the original point, these are usually not visible to drivers (unless you count EDAC) because they're handled at the chipset level. However, for certain kinds of systems - especially embedded - that don't have chipsets these issues can become painfully visible. My own exposure to this was at SiCortex, where the memory logic was directly on the same single die as everything else that comprised a node.
I assumed that these drives had the same controller chip and the same firmware base as the consumer samsung SSDs, but with higher quality nand and some firmware tweaks. It's very hard to find technical details about these enterprise drives on the internet (compared to the consumer drives).
I guess the smartctl output proves it: these enterprise Samsung SSDs do not have queued trim enabled.
It would make sense for enterprise drives to be more conservative and lag on feature set. But it's very surprising that enterprise drives are corrupted by the original un-queued trim; they're supposed to have more validation, and that's a very common feature.
It sounds to me like even when it's the fstrim utility, which uses an ioctl() to tell the kernel to trim free regions in a range on a filesystem, the kernel ends up issuing the queued trim command if it's available.
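(For reference, the ioctl in question is FITRIM from <linux/fs.h>; here's a minimal sketch of roughly what fstrim does under the hood, assuming the mount point comes in as argv[1] — whether the resulting discards go out on the wire as queued or un-queued TRIM is then up to the kernel/driver:)

/* Minimal sketch of what fstrim does: ask the kernel to discard
 * free space on a mounted filesystem via the FITRIM ioctl. */
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FITRIM, struct fstrim_range */

int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s <mountpoint>\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct fstrim_range range = {
        .start  = 0,
        .len    = UINT64_MAX,   /* whole filesystem */
        .minlen = 0,
    };
    if (ioctl(fd, FITRIM, &range) < 0) {
        perror("FITRIM");
        close(fd);
        return 1;
    }
    /* the kernel updates range.len to the number of bytes trimmed */
    printf("trimmed %llu bytes\n", (unsigned long long)range.len);
    close(fd);
    return 0;
}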
The "blacklist" does not appear to have any constant to blacklist old-style trim, only NCQ_TRIM (and other odd stuff, most notably all NCQ usage).
This makes sense, because if some SSD advertised old-style trim but was corrupted by it, then it would be found and fixed sooner by these vendors, because Windows 7 would exhibit the corruption.
I see the addendum to your post; touche, I guess these drives do indeed lack queued trim, and have some issue with plain old trim. That's rather surprising, to me... I was going to say "especially for an enterprise-grade drive" but I'm not so sure...
Nice debugging story. When I was at NetApp there were lots of times when drive firmware for the 'less used' options would fail. On the Fibre Channel drives, the 'write zeros' command, which was supposed to zero a drive, was notorious for its inability to achieve something that simple. When Google looked at disk encryption technology (I don't know if they finally deployed it), it worked differently from disk to disk and from firmware rev to firmware rev. I think it was Brian Pawlowski at NetApp who said "You can count on two things working right in a hard drive: read, write, and seek." The joke being that you needed all three of them to work for reliable disk operation.
Linux 4.0.5 includes a patch that blacklists queued TRIM for the buggy drives. Windows and OS X apparently don't support queued TRIM at all, so they're unaffected.
The drives on which we detected the issue still had un-queued TRIM. I reached out to one of the kernel I/O developers for help and he confirmed that it is not related.
But isn't the blacklist you link to in the article specifically for queued TRIM? E.g. https://github.com/torvalds/linux/commit/9a9324d. So either that blacklist has nothing to do with this issue (in which case it probably shouldn't be linked from the article), or it does, and we're talking about issues with queued TRIM.
To me, this sort of thing brings home the value of not running your own machines. Sure, Amazon's/Google's clouds have quirks, but it's far less likely that you're going to have to debug faulty hardware in this way. It sounds like a team of more than one person worked on this at least part-time for weeks -- how much is that worth? It's not just the cost of hiring extra people to do the work; often small companies simply can't hire enough good people -- when you do find them, do you want to squander them twiddling servers?
If something similar happens to you on "cloud" infrastructure, you're very limited in what you can do to diagnose or work-around the problem.
At a place I used to work, we had a reasonably large cluster of Windows boxes on Amazon. Randomly, Windows machines on Amazon would suddenly stop accepting new TCP connections.
This means that machines would be running fine, and then half your cluster starts dropping offline. At the time when this happened to us, there were no other reports we could find of this happening.
Turns out, it's some bug in the Xen Virtual NIC driver that wasn't running the offloaded TCP cleanup, and so eventually the system couldn't accept any new connections.
Once we figured out what was happening we could pre-emptively reboot boxes, but that was a problem for us for about 6 months, IIRC.
There's probably dozens of these bugs affecting someone on these cloud platforms at any one time. But because you have no access to the hardware, you don't even have the option of saying "Screw it, let's just get different hardware". You're at the mercy of your cloud provider.
There is no cloud - just other people's computers.
Many use cases just require the job to be done on your own computers for security and privacy reasons. Yes, Amazon's and Google's services are in some ways less secure than your own computer, because they are hosted by companies subject to a government that doesn't value privacy, not even that of its own citizens. That means said government can, just to give a concrete example, NSL those companies into giving up everything they have about you, and you wouldn't even know.
When the government puts national security above fundamental human rights there is something dangerously wrong.
Thinking about individual computers will lead you astray. There are, rather, sets of machines (from single boxes to entire data-centers) that are managed by a given sysadmin staff. The more machines they manage, the more likely it is that problems will have institutionalized and operationalized solutions.
A cloud is just a sysadmin staff with a Sufficiently Large Deployment to have ironed out all the kinks in their hardware.
Or the more likely they are to avoid doing advanced stuff in order to increase profit, as long as they stay a microscopic delta better than running it yourself, for most customers, most of the time, on average. That microscopic delta may not be measurable or noticeable by the end users, of course.
That's assuming their business model doesn't rely on an infinite supply of future customers, so that in the short term, as long as revenue per customer exceeds cost of sales per customer, we're all good, etc. Support costs that exceed the average cost of sales must be beaten down or ignored; otherwise it's cheaper to let those customers go and have sales "earn" a replacement.
Finally, their sysadmins work for them, to meet corporate objectives expressed in various meaningless metrics that need not align in any way with your own corporate objectives.
True, by the literal definition. I continue to interpret "cloud" as "that mysterious part in the middle of the diagram which is a clean encapsulation of Somebody Else's Problem that never bothers you"; obviously, there are no true "clouds" (and there cannot be) by that definition.
But people can try, and they can get close; and one can say that something is a cloud to the degree that it manages to fulfill the "amorphous shape in your diagram you don't have to worry about" promise. So there are some 80%-clouds, some 95%-clouds, some 99.995%-clouds, and so on.
The point I was trying to make is that the degree to which a cloud achieves that promise is correlated to the size (and longevity, and homogeneity) of the deployment. The more man-years have gone into taking care of a given server type at a given DC, the more institutional knowledge is ready-at-hand to solve a problem on your machine of that type, and so the fewer issues become emergencies that break out of the "cloud" abstraction to require your attention.
And it was a reply to the parent precisely because a security problem is just such an "emergency" that represents a failure of institutional knowledge: I would much sooner trust AWS's KMS to not leak my private keys than I would trust a machine I was running myself to not leak my private keys. I'm a much worse sysadmin than AWS!
Let's do some maths on that claim:
AWS: c3.8xlarge with 32 "CPUs" and 60 gigs of RAM.
For the machine alone it's $1,200 a month. Bear in mind it's on shared infrastructure, with noisy neighbours; you'll see about 10-30% CPU steal. In practice you'll see performance about half that of a real machine (from my comparisons).
Then you'll need to factor in disks as well. First things first: EBS is dogshit slow. Yes, ephemeral disks are fast, but then they die, so you're in the same situation. However, you need 10-gig networking to get low latency, avoid puncturing the cache, etc.
For EBS, the maximum IOPS you can guarantee is 20,000, and you need 1 TB for that.
For the IOPS, that's $1,300 a month, plus $125 for the 1 TB of storage.
So per machine it'll be $2,625 a month, or $31,500 per year.
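(Just to spell the arithmetic out in a trivial, self-contained form, using the list prices quoted above:)

/* Spelling out the arithmetic above, using those list prices. */
#include <stdio.h>

int main(void) {
    const double instance_per_month = 1200.0; /* c3.8xlarge              */
    const double piops_per_month    = 1300.0; /* 20,000 provisioned IOPS */
    const double storage_per_month  =  125.0; /* 1 TB of EBS             */

    const double monthly = instance_per_month + piops_per_month + storage_per_month;
    printf("per machine: $%.0f/month, $%.0f/year\n", monthly, 12.0 * monthly);
    return 0;
}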
Every 6 months, you could buy a new machine, which is faster than the fastest EC2 instance + EBS.
Now, the OP stated that they have more than one machine. Obviously one could use reserved instances. However similarly one could negotiate volume discounts.
There is of course the cost of internet and cooling; you're looking at around $500 a month for half a rack, depending on power consumption (if you're colo'ing).
From a valuation point of view, having hardware counts towards your value, as it's an asset you actually own. More importantly, you can use it to lower your tax bill and reduce your run rate, in exchange for an up-front cost.
Now, if you have a lot of bursty traffic that doesn't require much DB activity, then AWS is perfect, as the elastic IP load balancer allows you to spin up machines on demand. However, that's not that helpful for databases. Sure, you can warm-migrate from an EBS snapshot, but you'd best do it quick, otherwise you'll overload an already overloaded DB.
With our architecture, HW requirements, the price of HW, and the price of the cloud VMs, even working on this for a week or two saves us a significant amount of money, both short-term and long-term. The side effect is that we now have tools to recover servers way faster, and it allows us to do things we had not thought about before.
Agreed. Additionally, some business models simply don't mesh with cloud infrastructure pricing no matter the volume. There are definitely advantages to using cloud services, but most of the time bare metal gets you more hardware/performance at a lower cost in the long run, even when you factor in everything else that it entails.
The thing people forget is that the cloud providers have the same issues and expenses. That cost is passed on to the clients. Now, they may be more efficient etc., but once you reach a certain scale, and it's less than people think, you might as well do it in house if you can find qualified people.
How can people forget, when that cost is right there in the price tag? If anything, it's easier to overlook the costs of running your own hardware, since they aren't immediately apparent.
My experience is that people don't understand cloud pricing at all.
First of all, they tend not to look at monthly prices, and are seduced into thinking their instances are cheap. Secondly, they are seduced into thinking they are spending less ops time, though in my experience it's the reverse. Thirdly, people "forget" about extras like bandwidth costs (which are extortionate at all the big cloud providers), extra storage volumes, etc.
Then when people get the bill, it often gets back-rationalised as being ok because it's cloud so it must be cheap.
The greatest innovation AWS did was finding a way to get people to pay absolutely insane rates for hosting.
They simply underestimate the ops cost, and often focus on the monthly cost. The thing that cloud providers like AWS are good at, and IMO the only reason you should choose them, is when you have highly variable loads. Dynamic scaling is something only they can do, because they have such massive scale. Even if you're relatively small and cannot justify hiring a sysadmin, there are plenty of consultants out there you can hire.
It feels like Samsung used the Linux community here as a free testbed.
Samsung knew that only Linux supported queued trim, so releasing it without proper testing is just externalizing the disproportionately increased cost of testing to the Linux community.
In this case it was un-queued TRIM (I forgot to mention it in the blog post). We have reached out to Samsung and, although it looked good at the beginning, they have now been silent for more than a month with no progress.
With Samsung's prewritten support responses, the company already tells Linux users not to expect any support at all. So that is consistent with the testbed theory.
I am sorry. I was too dumb to phrase it understandably, so the shame is on me. The sentence smelled problematic to me, but after rereading it several times I concluded it was understandable. I should have gone with my gut…
Here is another try: Samsung's support stonewalls with prewritten answers saying that Linux is open and thus Linux is unsupported, and this behaviour from Samsung is consistent with the testbed theory.
Only because, up until recently, getting into the HD game was prohibitively expensive due to the engineering and capital requirements for designing spinning disks. Now anyone can buy some flash media and a pre-cooked controller firmware, combine the two, and sell at competitive rates. There are something like five or six competitive SSD makers right now and many more bottom feeders. There are two competitive spinning-disk makers, and it's been that way for decades, ignoring the occasional smaller third-party player like Hitachi.
But they're still in business making hard drives, under the DeskStar name. Seagate had a round of failures at one point. I'm sure there are people out there who've sworn off WD as well.
Who? If you're referring to IBM, then they're not. IBM sold off their entire hard-disk division to Hitachi (which a few years ago sold it off to WD).
If you're referring to Hitachi, then they did continue it, yes, but they bought it on a fire-sale, and their name was not attached to the original affair, so they presumably did not see it as particularly risky.
Is Seagate back to being good? We had 100 drive failures in a batch of 120 HP netbooks, and we had a large number of our server drives go down the tube. I switched to WD at that point.
The EVO was well covered, but these being from the PRO line, it's even worse... Intel SSDs have been praised for a long time; they seem to be the only stable brand around.
Yes, it seems so. Still, I don't like them much since they changed to compressing controllers (SandForce, as far as I know), which have a mixed reputation, as far as I remember. But it looks like Samsung is no better.
Strange, the Samsung 840/850 EVO/PRO are considered [1][2] among the best consumer SSDs. The issues the article mentions do not exist on Windows; the SSDs are very reliable there. I suspect it's not only Samsung's fault. Are we sure Linux's handling of TRIM operations is absolutely correct?
The problem is that "absolutely correct" is a slippery concept. Even the most tightly written standard is likely to have some areas of ambiguity through which bugs can creep. If the way that a particular device deals with that ambiguity is known only to those under NDA, then you can have two drivers that are both "absolutely correct" per the standard but only one actually works in all the edge conditions.
Personally, I find Samsung has an "it boots? fine, then ship" mentality for pretty much all things: their buggy phones, buggy SSDs, buggy TVs, etc. I wouldn't recommend them, even though they do well on SSD speed tests (which are often gamed by on-board RAM caching).
I have this running on my Ubuntu ThinkPad with a Samsung 840 Pro as a weekly cron job. Should I turn it off?
#!/bin/sh
# call fstrim-all to trim all mounted file systems which support it
set -e
# This only runs on Intel and Samsung SSDs by default, as some SSDs with faulty
# firmware may encounter data loss problems when running fstrim under high I/O
# load (e. g. https://launchpad.net/bugs/1259829). You can append the
# --no-model-check option here to disable the vendor check and run fstrim on
# all SSD drives.
exec fstrim-all
Pretty disappointing to see some of those Samsung drives on the list, because in some of the other tests/surveys I've seen they seemed to be among the better choices. Sigh I guess Sturgeon's Law applies to SSDs too.
Using SAS SSD drives on a server is a bad idea for many reasons. One should use PCIe cards that sit directly on the PCIe bus, such as FusionIO or SanDisk. They have been tested and retested (e.g. by Facebook), without the unnecessary added complexity of the SAS/SATA protocols. The I/O performance is also about 20x.
I don't think that testing by Facebook is going to help you unless you are using the exact same model as they are and are assured of using their exact firmware. At work we use SAS SSDs in large quantities and the firmware we use is customized for us (based on the mainline one). Do not assume that a bug that was fixed in our firmware was necessarily fixed in the normal one. One would think it would be, but it is possible that it wasn't ported to the mainline firmware.
Sometime around the end of 2013 I started getting frequently lost data and corrupted filesystems upon reboot.
After much searching and about 4-6 months into the issue, I found out that the culprit was the queued TRIM commands issued by the Linux kernel to my Crucial M500 mSATA disk. The Linux kernel already had a quirks list with many drives, including some of the M500 variants, just not mine.
I added my model, compiled the kernel, and the nightmare ended. I proceeded to submit a bug report and a patch. The patch got accepted (yay!) and the bug report turned out to be very useful for other people with the same problem but a different disk, as I had included the dmesg output specific to the issue. This meant that they could now google the errors and get a helpful result.
Such is the nature of free software; you are allowed to fix your computer yourself. :)
I've worked a lot on some interesting SSD deployments and experiments over the past 12 months. Quite honestly, I wouldn't go anywhere near Samsung products, regardless of their 'PRO' labelling or otherwise.
We have had great success with both Sandisk Extreme Pro SATA and Intel DC NVMe series drives, we've also recently deployed a number of Crucial 'Micron' M600 1TB SATA drives that are performing very well and so far haven't given us any issues.
I've done similar over the last three years and had good luck with the Crucial drives. However if you take a look at the Linux Kernel patch they link to (search for "don't properly handle queued TRIM"):
https://github.com/torvalds/linux/blob/e64f638483a21105c7ce3...
There are Crucial SSDs on the list. I'm going to be keeping a closer eye on them now.
Yeah I saw that - although that's the older, now discontinued series that has a different controller and doesn't show the same consistent performance as the newer M600 drives.
In theory, yes. Unfortunately, every time my Btrfs filesystems have encountered a hardware glitch, it has happily trashed the filesystem beyond recovery (including both drives in a RAID1 mirror, one of which was perfectly OK). I use ZFS now, and while some features are comparable with Btrfs, the implementation quality, documentation, feature completeness, and tool quality set it well above where Btrfs is at.
I fully second that: I'm using Btrfs for / and ZFS for /srv. So many filesystems trashed beyond recovery on Btrfs; so much joy, stability, and easy tooling with ZFS.
I'm really considering migrating / to ZFS now.
I've had issues with these Samsung 8xx drives; unfortunately they all happened at once. I gave up on their RMA/warranty process because I was bounced back and forth between the same two numbers a few times. Each side said that the other was in charge of the process (Samsung bought the SSD division from Seagate... or was it Seagate that bought the HDD division from Samsung? To this day I have no clue).
Can someone clarify the article's claim that these Samsung drives are really "broken" as such? We have a few of these on 3.13 and 3.16 kernels and ext4 with no problems. It seems that there must be something unique to their application in order to expose these trim failures.
Do you have the "discard" mount option enabled? Do you have a cron job that runs the "fstrim" command? It's possible your systems are not running trim. Or maybe your ext4 filesystems have little activity and you haven't had enough corruption to notice yet :)
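(If you want a quick way to check, here's a rough sketch that just scans /proc/mounts for the "discard" option — this assumes the filesystems of interest expose the option there, as ext4 does; absence here doesn't rule out a periodic fstrim cron job, which is the other way trim gets issued:)

/* Rough sketch: list mounted filesystems that currently have the
 * "discard" (online trim) mount option enabled, by scanning
 * /proc/mounts. */
#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *f = fopen("/proc/mounts", "r");
    if (!f) { perror("/proc/mounts"); return 1; }

    char line[1024];
    while (fgets(line, sizeof line, f)) {
        if (strstr(line, "discard"))
            fputs(line, stdout);   /* mount entry with online discard */
    }
    fclose(f);
    return 0;
}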
Also, some Samsung 800 series drives only gained this bug in a recent firmware update (840 EVO specifically).
The 840 EVO joined the club with firmware EXT0DB6Q, which itself is a nasty little hack around a fundamental design problem with the tightly packed NAND cells.
Linux 4.0.5 ships with the patch linked above, but for a while you had to roll with a kernel built from source.
EDIT: The blatant file corruption issues only manifested after updating to firmware EXT0DB6Q.
I'm not sure exactly when the first 840 EVO firmware which advertised queued trim support (along with SATA 3.1/3.2 support) was released, but I think that if you last updated firmware (or acquired the drive) before October 2014, you're safe.
I currently cannot notice any performance degradation, and I bought the drive in May 2014, with no further updates. (Unless Arch Linux automatically applies firmware updates, but I doubt that.)
I'm so sick of this TRIM stuff. Constant configuration needed because of it, constant care like "this is a thing you'd better not do on SSDs". And then problems like this.
Do you think there'll ever be SSDs that don't need it?
I remember when Apple started incorporating SSDs into their computers and didn't support TRIM. Windows users were telling Mac users their Macs were practically obsolete because they couldn't do this one thing that was enabled for Windows. Of course Mac users sent that back to Apple, and Apple replied, for years, "you don't need it."
Eventually, they relented and enabled it on their SSDs. I'm pretty sure the marketing and engineering butted heads over this one stupid bullet point.
Except that without TRIM you'll fill all your blocks and kill the performance of your fancy $1500 Apple, as the SSD performs a dozen operations to create space for a write instead of one operation on a properly TRIM'd drive.
Apple didn't do this because of "windows users whining" but because they knew they didn't want an angry mob of customers wondering why their drive is 10x slower than it was on day one.
Arguably, idle GC was "good enough" for some use cases, but probably not for drives that aren't sitting idle all the time and are on many hours a day. Even then, Apple probably didn't want to tell its customers to "let it sit overnight" to regain performance when supporting plain-jane TRIM was a trivial addition.
On-board GC + OS-driven TRIM are considered the optimal solution for SSDs.
Because it's hard to make an automatic monitoring system that reliably distinguishes between "a failure occurred but everything is fine" and "a failure occurred and now everything is on fire".
We have multiple different pages. In our cluster we have 3 machines, and if one of them is unavailable because of a broken network, we do not page. In this case the page came as an application error that the application was not able to cope with. When we have an issue that we have seen before and the server can handle on its own, we do not page.
I have one of the affected drives mentioned in the article in my development laptop - the Samsung SSD 850 PRO 512GB.
As one of the most expensive SSD drives available on the market, it was disconcerting to find dmesg -T showing trim errors when the drive was mounted with the discard option. Research on mailing lists indicated that the driver devs believe it's a Samsung firmware issue.
Disabling trim in fstab stopped the error messages. However, it's difficult to get good information about whether drive performance or longevity may be impacted without trim support.
Trim really is only a helpful hint when the drive is near full, so the GC can preemptively erase blocks and retain good write speed. Without trim, the firmware must wait until it gets a write for a particular block before it knows the old block can be erased.
If your drive has a reasonable amount of unprovisioned space, it can simply work around the missing trim commands. This is theory, however; I do not know if the firmware actually does this. This is exactly the thing that makes some drives better than others when working without trim.
Thanks. I'll probably end up creating an unprovisioned partition. It's frustrating, exactly because of the uncertainty re future performance. Especially given the price premium for pro/enterprise level hardware.
You can research whether the firmware understands MBR and GPT; if it only understands one, then you have to use that. Alternatively, use Samsung's own software (I think it's called Magician, can't remember exactly); it will make sure you have the unprovisioned space set up correctly.
Interesting! I sometimes work with SSDs as storage media for cameras (where Sandisk is the most popular brand by a mile) and I seriously doubt any camera firmware is doing drive maintenance. From what I know of digital imaging technicians, neither are they - if a drive starts acting up in any way, the usual policy is to just take it out of service immediately, recover anything that was on it, dump it, and buy a replacement.
How do you disable TRIM on common distros? Under Ubuntu, is it just preventing /etc/cron.weekly/fstrim from running, or is there more to it? What about CentOS, etc?
Undoubtedly the same issue happened to me on a 500GB 840 EVO with NTFS.
The SSD zeroed out a part of the disk during runtime; as I watched this happen, music was playing from that very drive. It was mounted from Ubuntu MATE 15.04, playing a music library through Audacious. Suddenly the music glitched and IO errors began appearing. Rebooted to a DISK READ ERROR (the MBR was on the EVO). Ran chkdsk from USB and it showed a ridiculous number of orphaned files for about an hour. Once it finished, the most frequently accessed files had disappeared: the Downloads folder, the Documents folder, some system files. Of course, some of the files could've been recovered had I not run chkdsk right off the bat, but nonetheless it's an approximate measure of the failure's impact.
I began being suspicious of 840 EVO when sorting old files by date became fantastically slow. If you have a feeling this has happened to you recently - buckle up for a shitstorm.