We just lost 3TB of data on a SanDisk Extreme SSD
(www.theverge.com)
I get that a lot of folks are correctly pointing out the need to back up data, but isn’t that a little bit of victim blaming? This isn’t a situation where the guy had a 10-year-old drive with all his photos and videos sitting around without a backup. He had a new drive and it failed. Can we agree that brand new drives aren’t supposed to fail?
No.
The typical failure rates for pretty much all electronics, even mechanical stuff, form a "bathtub curve": relatively many early failures, very few failures for a long time, and finally an increasing number of failures tending to 100%.
That's why you're supposed to have a "burn in" period for everything before you can trust it with some probability (still make backups), and why you should beware of it reaching end of life (make sure the backups actually work).
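To make the bathtub shape concrete, here is a rough sketch of a failure-rate function built from three parts: a decaying early-failure term, a flat baseline, and a Weibull-style wear-out term. Every number in it is made up for illustration and does not describe any real drive.

```python
import math

def bathtub_hazard(age_years,
                   infant_rate=0.05,     # hypothetical early-failure rate at age 0
                   infant_decay=2.0,     # hypothetical: how fast early failures fade
                   baseline=0.002,       # hypothetical constant mid-life rate
                   wear_scale=8.0,       # hypothetical age (years) where wear-out kicks in
                   wear_shape=5.0):      # hypothetical steepness of the wear-out wall
    """Illustrative failure rate (per year) at a given age; not fitted to real data."""
    early = infant_rate * math.exp(-infant_decay * age_years)
    wear = (wear_shape / wear_scale) * (age_years / wear_scale) ** (wear_shape - 1)
    return early + baseline + wear

if __name__ == "__main__":
    for age in (0, 0.5, 2, 5, 8, 10):
        print(f"age {age:>4} y: hazard {bathtub_hazard(age):.4f} per year")
```

The point of a burn-in period is to get past the left wall of the tub before you put anything important on the device. It lowers the odds of an early failure; it does not replace backups.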
That's absolutely true in the physical sense, but in the "commercial"/practical sense, most respectable companies' QA process would shave off a large part of that first bathtub slope through testing and good quality practices. Not everything off of the assembly line is meant to make it into a boxed up product.
Apparently even respectable companies are finding out that it's cheaper to skimp on QA and just ship a replacement item when a customer complains. Particularly when it's small items that aren't too expensive to ship, but some are doing it even with full blown HDDs.
In this case, I think we can remove what's left of the benefit of the doubt from Western Digital (who owns SanDisk). They are as scammy/shady as I know a company to be.
Personally I've been boycotting them since 2016, after I couldn't recover the data from an external drive which WD encrypted without warning or consent. A faulty component on the PCB (unrelated to the drive itself), combined with WD's non-standard practices (non-SATA pins + mandated proprietary encryption), meant that I had to lose this drive and the data it contained so they could make a quick buck. I can't trust a company with such ethics to store anything for me.
In 2020, they got themselves into another scandal. WD Reds, which were advertised as pro/NAS storage and sold at a premium, were found to behave like shingled drives (a technique that trades away some reliability and availability in exchange for extra storage density), exposing many users to heightened risk of critical failure (esp. during disk swaps). WD of course denied it, and denied it again when confronted with evidence, up until the internet burst into flames. Again, consumer-hostile practices.
Here we have SSDs which have been reported for months, by several reputable sources, to be having problems, and which SanDisk even attempted to patch without success. And now, wouldn't you think that they are trying to recall them all in order to protect consumers from likely data loss (like any responsible data storage provider would do)? Nope. They are currently trying to sell them at a significant discount, as quickly as they can. Hurting plenty of consumers in the process matters less to them than their short-term financials.
As far as I care, they can go to hell; bankruptcy is all they deserve, for the greater good.
Indeed. An old EE mentor told me once that most component aging takes place the first two weeks of operation. If it operates for two weeks, it will probably operate for a long, long time after that. When you're burning in a piece of gear, it helps the testing process if you put it in a high temperature environment as well (within reason) to place more stress on the components.
The high temperature part is kind of a trap with SSDs: flash memory is easier to write (less likely to error out) at temperatures above 50C, so if you run a write heavy application at higher temperature, it's less likely to fail than if it was kept colder.
Properly stress testing an SSD would be writing to it while cold (below 20C) and checking read errors while hot (above 60C).
For normal use you'd want the opposite: write hot, read cold.
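A minimal sketch of the write-then-verify half of such a test, assuming you handle the temperature part yourself (code cannot do that for you). The function names, block size, and seed are my own choices for illustration. It writes a reproducible pattern to a file on the drive under test, then later reads it back and reports which blocks no longer match.

```python
import hashlib
import os

def pattern_block(seed, index, block_size):
    """Deterministic pseudo-random-looking block derived from seed and block index."""
    digest = hashlib.sha256(f"{seed}:{index}".encode()).digest()
    return (digest * (block_size // len(digest) + 1))[:block_size]

def write_pattern(path, blocks, block_size=1 << 20, seed=0):
    """Write `blocks` pattern blocks to `path` and ask the OS to flush them to the device."""
    with open(path, "wb") as f:
        for i in range(blocks):
            f.write(pattern_block(seed, i, block_size))
        f.flush()
        os.fsync(f.fileno())

def verify_pattern(path, blocks, block_size=1 << 20, seed=0):
    """Read the file back and return the indices of blocks that differ from the pattern."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(block_size) != pattern_block(seed, i, block_size):
                bad.append(i)
    return bad
```

One caveat: if you verify right after writing, the OS may serve the reads from its page cache rather than from the flash. Unmounting and remounting (or rebooting, or at least unplugging an external drive) between the write pass and the read pass makes the check more meaningful.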
They should at least try to recover the data. Maybe a data recovery program like SpinRite would do it: https://www.grc.com/sr/spinrite.htm
Not running RAID, not backing up, and not even trying the simplest recovery approaches is just sloppy and lazy. Do at least one of the three.
Like someone else said, expect the biggest risk of failure right when you buy it, then rising failure rates maybe 5 years out. Refreshing the disk pattern as it gets older can help too.
Just pay triple! Don't be a poor!
Such great advice.
You can be mad at it but what they said is largely true. Not having the data backed up somewhere and expecting everything to be perfectly fine forever is like not having old photos backed up somewhere and expecting everything to be perfectly fine forever.
It's even more egregious here because if OP can afford a 3TB SSD, they should be able to afford a 3+TB HDD as a backup no problem. The money isn't an issue for OP, just a lack of knowledge about how to handle data storage. It isn't necessarily their fault this happened, since the average person isn't given this info, but at its core, "pay more money" because you need backups is the only true answer.
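For what it's worth, having the second drive is only half of it; you also want to check that the copy matches. A small sketch of that check, assuming the backup is a plain mirrored folder (the function names here are mine, not from any backup tool): it hashes every file under the source and compares it with the file at the same relative path in the backup.

```python
import hashlib
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def verify_backup(source_dir, backup_dir):
    """Return (relative_path, problem) pairs for files missing or different in the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file():
            problems.append((str(rel), "missing"))
        elif file_digest(src) != file_digest(dst):
            problems.append((str(rel), "differs"))
    return problems
```

An empty result means every source file has an identical copy in the backup folder. It says nothing about files that exist only in the backup.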
All of this misses the point. This is the second drive that failed; it was the replacement for an earlier drive that also failed.
That's what the article is all about.
A high, unexpected and unreasonable failure rate.
I had a high failure rate in some Seagate drives in the early 00s. Switched vendors and never had the problem again.
We also do not know how they failed. Are they still image-readable with ddrescue or SpinRite, for example, or are they truly dead? It is not clear if they even tried.
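By "image readable" I mean something like the following: read the drive block by block, keep whatever comes back, and note the spots that error out instead of giving up at the first bad read. This is only a crude sketch of the idea, with a function name and block size I picked for illustration. It is not how ddrescue itself works internally, and for a real rescue you should use the real tool.

```python
import os

def image_with_skips(src, dst, block_size=64 * 1024):
    """Copy src to dst block by block; zero-fill unreadable blocks and return their offsets."""
    bad_offsets = []
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        size = fin.seek(0, os.SEEK_END)   # works for regular files and for block devices
        offset = 0
        while offset < size:
            want = min(block_size, size - offset)
            try:
                fin.seek(offset)
                chunk = fin.read(want)
            except OSError:
                chunk = b""               # treat an I/O error as an unreadable block
            if len(chunk) < want:
                bad_offsets.append(offset)
                chunk += b"\x00" * (want - len(chunk))
            fout.write(chunk)
            offset += want
    return bad_offsets
```

If the returned list is empty, the drive read cleanly end to end and the problem is more likely in the filesystem than in the flash. If the device does not show up at all, no imaging tool will help.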