submitted 1 year ago* (last edited 1 year ago) by patatahooligan@lemmy.world to c/linux@lemmy.ml

I have an SSD from a PC I no longer use. I need to keep a copy of all its data for backup purposes. The problem is that dd reports "Input/output error"s when copying from the drive. There seem to be 20-30 of them in the entire 240GB drive so it is likely that most or all of my data is still intact.

What I'm concerned about is whether these input/output errors can cause issues in the image outside of the particular bad blocks. How does dd handle these errors? Will they be, e.g., zeroed in the output, or will they simply be missing? If they are simply missing, will the filesystem be corrupted because the location of the data has shifted? If so, what tool should I be using to save what can be saved?

EDIT: Thanks for the help, guys. I went with ddrescue and it reports that it saved 99.99% of the data. I guess there could still be significant loss if the 0.01% happens to fall on filesystem structures, but in that case maybe I can use an undeleter or similar utility to see if I can get the files back. In any case, I can work at my leisure now that I have a copy of the data on non-failing storage.

[-] patatahooligan@lemmy.world 3 points 1 year ago

Thanks for the input, guys. I consider my issue resolved.

As for the specific question I had, dd can fill the blocks that failed to read with zeroes if you pass conv=noerror,sync. However, this puts the zeroes at the end of the block and not over the exact bit/byte that failed to read, meaning that a read error will invalidate the rest of the block.

But the consensus across the sources I searched seems to be to use ddrescue instead of dd.
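To illustrate the offset-preserving idea, here is a minimal Python sketch that copies in small fixed-size chunks and, when a read fails, writes zeroes of the same length at the same position. It is not how dd or ddrescue are implemented (ddrescue also retries and keeps a mapfile of bad areas); `copy_preserving_offsets` and `FlakySource` are hypothetical names made up for this example, and the 512-byte chunk size is an assumption.

```python
import io


def copy_preserving_offsets(src, dst, total_size, chunk_size=512):
    """Copy total_size bytes from src to dst in chunk_size pieces.

    A chunk that fails to read is replaced by zeroes of the same length,
    so every later chunk keeps its original offset.
    Returns the offsets of the chunks that could not be read.
    """
    bad_offsets = []
    for offset in range(0, total_size, chunk_size):
        length = min(chunk_size, total_size - offset)
        try:
            src.seek(offset)
            data = src.read(length)
        except OSError:
            data = b""
            bad_offsets.append(offset)
        # Pad failed (or short) reads so the output stays aligned.
        dst.write(data.ljust(length, b"\x00"))
    return bad_offsets


class FlakySource(io.BytesIO):
    """Hypothetical stand-in for a failing drive (assumption for the demo):
    any read that touches bad_range raises OSError."""

    def __init__(self, data, bad_range):
        super().__init__(data)
        self.bad_range = bad_range

    def read(self, size=-1):
        start = self.tell()
        if start < self.bad_range[1] and start + size > self.bad_range[0]:
            raise OSError(5, "Input/output error")
        return super().read(size)


data = bytes(range(256)) * 8            # 2048 bytes of test data
source = FlakySource(data, (600, 610))  # ten unreadable bytes
image = io.BytesIO()
bad = copy_preserving_offsets(source, image, len(data))
print(bad)  # [512]
```

Only the chunk containing the bad bytes is lost; everything after it lands at its original offset. A smaller chunk size loses less data per error, at the cost of more read calls.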

[-] rotopenguin@infosec.pub 2 points 1 year ago* (last edited 1 year ago)

There is no particular bit or byte that is wrong. The error comes from an entire page, anywhere from 128K to megabytes in size, that the drive couldn't make sense of. The drive had already tried a lot of error correction code and re-read the overall analog values of the page (was this a 14/16th millivolt, or a 15/16th millivolt?). That page couldn't be made sense of, the MLC page overlaid on it couldn't be made sense of, the TLC page overlaid on that couldn't be made sense of, etc. Or things could be so bad that the FTL doesn't even know which flash cells your data should be found in.

Everything that I understand about flash storage suggests that it can't reasonably do little errors. You could still get small errors from a bit flip in delivery, or more likely flips in your PC's own RAM. But the flash itself should either be very right, or very very wrong. Nothing in between.

[-] patatahooligan@lemmy.world 1 point 1 year ago

Thanks for the explanation. I don't really know how flash storage works. The fundamental idea of the problem I described would still apply, though, as long as the input block size for dd spans more than one page of the underlying storage.

For example, say that exactly three pages fit in a block. If dd attempts to read pages A, B and C (ABC) and fails to read B, you would want the corresponding part zeroed in the output to preserve the offsets of all the other pages (A0C). But instead dd reads whatever it can for the entire block, then pads the rest of the block size with zeroes, effectively moving C forward (AC0). So essentially you magnify errors.
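The ABC example can be written out as a toy model in Python. This only mirrors the scenario as described in the comment, not dd's actual I/O path; `PAGE`, `read_block` and `pages_at_right_offset` are hypothetical names, and the 4-byte page size is chosen purely to keep the output readable.

```python
PAGE = 4  # toy page size in bytes; real flash pages are far larger


def read_block(pages, zero_in_place):
    """Assemble one output block from pages; None marks an unreadable page."""
    if zero_in_place:
        # A0C: each unreadable page becomes zeroes at its own offset.
        return b"".join(p if p is not None else bytes(PAGE) for p in pages)
    # AC0: keep whatever was read, then pad the tail up to the block size.
    good = b"".join(p for p in pages if p is not None)
    return good.ljust(PAGE * len(pages), b"\x00")


def pages_at_right_offset(pages, block):
    """Count readable pages that ended up at their original offset."""
    return sum(
        1
        for i, p in enumerate(pages)
        if p is not None and block[i * PAGE:(i + 1) * PAGE] == p
    )


pages = [b"AAAA", None, b"CCCC"]  # page B failed to read
padded = read_block(pages, zero_in_place=False)
aligned = read_block(pages, zero_in_place=True)

print(padded)   # b'AAAACCCC\x00\x00\x00\x00'
print(aligned)  # b'AAAA\x00\x00\x00\x00CCCC'
print(pages_at_right_offset(pages, padded))   # 1
print(pages_at_right_offset(pages, aligned))  # 2
```

With padding at the end, only A survives at its correct position, even though C was read successfully. With zeroes written in place, both A and C stay where the filesystem expects them.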

this post was submitted on 05 Oct 2023
21 points (95.7% liked)