Preface:
First of all, there are a variety of things that can cause media to fail. This is why the old throw-your-drive-in-the-freezer trick isn't always a good idea unless you can confirm that you have the particular drive issue it helps with. So this is my little disclaimer - this guide is tailored for failures that you have confirmed are the result of bad sectors.
Incidentally, this is more or less how the pros scavenge your hard drive when you've been up to no good.
Tools:
ddrescue - It drives me up a wall when people recommend dd for data recovery (you know who you are - I will flame you if I'm not doing something more important!). ddrescue offers essentially the same functionality as dd, but it takes into account the fact that the media you are working with could be damaged, which makes the difference between getting files back and grinding your platters to dust or burning up your NAND.
fsck - Standard issue Unix File System Consistency Check utility. (This assumes you are using a "native" file system. Since FAT isn't journaled, dosfsck usually doesn't do any good, and a native win32 chkdsk is really the only way to deal with truly damaged NTFS file systems; the tool from ntfs-3g will do more damage in this case.)
TestDisk - This nifty tool may or may not be needed, depending on how much damage was done to your partition table and file system. It can also recover dereferenced (a.k.a. "deleted") files.
losetup - Standard Linux loop device management utility
mount/umount - If you don't know how to use these yet, turn away now. Don't say I didn't warn you.
Procedure:
Phase 1
We need to make a forensic copy of your drive. Incidentally, if your file system and partition table aren't recognized, plug the media in anyway; we're working at the block level, so that doesn't matter (yet). This entails making a clone of _EVERY_ sector (or block - I'll use sector and block interchangeably here, sorry, bad habit), so make sure you have enough free space. In this guide I'll be using an image file because I find that more convenient to work with. You could always do disk-to-disk cloning, but you are on your own if you choose that route.
We do this so that we don't have to worry about the drive dying on us mid-recovery, or about botching the original data.
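Since the image will be as big as the whole drive, it's worth checking for room before you start. Here is a minimal sketch of such a check; the function name has_room is my own invention, and it assumes a POSIX df and awk.

```shell
#!/bin/sh
# has_room NEEDED_BYTES DIR   (hypothetical helper, not part of any tool)
# Succeeds if the file system holding DIR has at least NEEDED_BYTES free.
has_room() {
    needed_bytes=$1
    target_dir=$2
    # df -P prints a fixed POSIX layout and -k reports 1024-byte units.
    # Column 4 of the second line is the space available.
    avail_kib=$(df -Pk "$target_dir" | awk 'NR == 2 { print $4 }')
    [ $((avail_kib * 1024)) -ge "$needed_bytes" ]
}
```

On Linux you can get the size of the drive in bytes with blockdev --getsize64 /dev/DAMAGED_DISK and hand that number to has_room along with the directory where the image will live.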
Adjust your paths accordingly. It's important that you add the log file (newer ddrescue releases call it the mapfile): it lets us make multiple passes, keeps track of sectors to come back to (initially, bad or slow sectors are skipped to get as much data as possible), and lets us resume the image later even if we are interrupted for some reason. (Told you ddrescue was better at this than dd.)
First pass: the -n option tells ddrescue not to retry bad sectors or to split (I'll explain that in more detail later, but it's another reason you want to use ddrescue for this, not dd). Depending on the size of your disk, you may want to get on with life for a day or two; I've seen little 16 GB flash drives take upwards of a couple of hours to finish pass one.
~$ ddrescue -n -v /dev/DAMAGED_DISK /image.img /ddrescue.log
Second pass: in the first pass we skipped over bad and slow sectors in an attempt to recover as much data as possible; now we are going to go back and grind away at those sectors a few more times. -r 3 tells ddrescue to move on after 3 retries, and -d accesses the disk directly rather than going through the kernel buffer. ddrescue is smart enough to use the log file to go back and hit only the bad sectors. This could also take some time.
~$ ddrescue -r 3 -d -v /dev/DAMAGED_DISK /image.img /ddrescue.log
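If you want a rough idea of how much was rescued between passes, the log file is plain text and easy to total up. Here is a minimal sketch, assuming the usual layout: comment lines starting with #, one status line, then rows of position, size and status in hex, where + means rescued and - means bad. The function name summarize_mapfile is my own, not part of ddrescue.

```shell
#!/bin/sh
# summarize_mapfile FILE   (hypothetical helper)
# Totals the bytes in a ddrescue log/mapfile by status:
#   +  rescued, -  bad sector, anything else  still to be tried/trimmed/split
summarize_mapfile() {
    rescued=0; bad=0; pending=0; seen_status_line=0
    while read -r pos size status _; do
        # Skip blank lines and comments.
        case "$pos" in ''|'#'*) continue ;; esac
        # The first non-comment line is the current-position status line.
        if [ "$seen_status_line" -eq 0 ]; then
            seen_status_line=1
            continue
        fi
        # Sizes are hex (0x...), which shell arithmetic understands.
        case "$status" in
            '+') rescued=$((rescued + $size)) ;;
            '-') bad=$((bad + $size)) ;;
            *)   pending=$((pending + $size)) ;;
        esac
    done < "$1"
    echo "rescued=$rescued bad=$bad pending=$pending"
}
```

Run it as summarize_mapfile /ddrescue.log after each pass; if the bad and pending numbers have stopped shrinking, more retries probably won't buy you much.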
On splitting and trimming: Sectors are the smallest addressable unit of storage space as far as your operating system is concerned. Traditionally sectors are 512 bytes, and newer drives in the terabyte range often have 4096 bytes per sector. On the first pass ddrescue reads many sectors at a time, so when a read fails, not everything in that chunk is necessarily bad. Trimming and splitting (newer ddrescue releases call the latter scraping) have ddrescue go back over those failed chunks a sector at a time to find the parts that are still readable, resulting in more data than if the bad areas were simply ignored.
Phase 2
Pro Tip - if you have enough storage space, now is a good time to make a copy of this image, just in case you screw up.
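A copy is only useful if it is actually identical, so here is a minimal sketch that copies the image and then compares it byte for byte. The function name backup_image and the backup path in the note below are just my placeholders.

```shell
#!/bin/sh
# backup_image SRC DST   (hypothetical helper)
# Copies SRC to DST, then compares the two byte for byte.
# Succeeds only if the copy is identical to the original.
backup_image() {
    cp -- "$1" "$2" && cmp -s -- "$1" "$2"
}
```

For example, backup_image /image.img /image.img.bak && echo "backup verified". The compare reads both files in full, so on a big image expect it to take about as long as the copy did.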
OK, now that we have our image, we need to trick the operating system into treating it like a physical drive. After running this command we'll be able to treat /dev/loop0 like a device node representing a drive (e.g. /dev/sda), only we'll be manipulating our image instead.
~$ losetup /dev/loop0 /image.img
If your image has a bad partition table AND you are using a journaled file system like ext4, you may simply be able to create a new partition table on top of the image (using a tool like fdisk). You just need to make sure the partition starts in exactly the same spot and stops at the exact same spot as before, then fsck the file system.
Only use this if phase 3 does not yield results!
This trick doesn't work with FAT file systems, unfortunately.
If you are fairly certain your partition table was damaged, skip ahead to phase 3.
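Getting the start "in exactly the same spot" comes down to simple arithmetic: partition tools talk in sectors, while offsets into the image are in bytes. Here is a minimal sketch of the conversion. The function name partition_offset is mine, and the 512-byte default is an assumption; check what sector size your partitioning tool reports.

```shell
#!/bin/sh
# partition_offset START_SECTOR [SECTOR_SIZE]   (hypothetical helper)
# Prints the byte offset at which a partition begins.
# SECTOR_SIZE defaults to 512 bytes if not given.
partition_offset() {
    start_sector=$1
    sector_size=${2:-512}
    echo $((start_sector * sector_size))
}
```

As an example, a partition starting at sector 2048 with 512-byte sectors begins 1048576 bytes into the image. That byte count is also what the -o (offset) option of losetup expects if you ever need a loop device that begins at the partition rather than at the start of the disk.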
Let's see if fsck can do anything for us. The -N option makes this a dry run: fsck only shows what it would do, so nothing gets written to the image until we are sure:
~$ fsck -N -V /dev/loop0
At this point it may be necessary to use the file-system-specific fsck tool (e.g. e2fsck), since the fsck wrapper utility may not be able to detect the file system in use.
It's entirely possible that fsck might not be able to help us here; move on to phase 3 in that case. Otherwise, let's try mounting the file system. It's important to use the ro (read-only) option with mount, as this preserves the integrity of our image. You may also need to use the -t option to let mount know which file system driver to use.
~$ mount -o ro -t FILE_SYSTEM /dev/loop0 /mnt
If that worked, you are now free to poke around the file system at your leisure, copying whatever data you want.
Don't forget to unmount your file system when you are done. And when you are done with your loopback device, you can disassociate it from your image like so:
~$ losetup -d /dev/loop0
Sit back, pat yourself on the back. Hopefully the experience taught you it's cheaper just to buy a couple more disks and keep a good backup handy.
Phase 3
You are at phase 3 because there was too much damage to either your partition table or your file system to recover data. In this case you should still have your loopback device loop0 associated with the image.
We'll use a tool called TestDisk, which takes a different approach to handling data: file pointers or even partition tables usually don't need to be intact, because it just deals with raw data.
~$ testdisk /dev/loop0
- It should only show you /dev/loop0; go ahead and select Proceed.
- You will need to know what kind of partition table you have. Chances are, if you aren't sure, you'll want Intel/PC.
- Select Analyse and do a Quick Search
- If this doesn't turn up your partition, you may need to do a Deeper Search
- Once you have found your partition, go back to the main menu and choose [Advanced] File system utilities
- Choose undelete
- You can use 'a' to select all files and 'C' to copy them; it will prompt you for a location to copy them to. Anything highlighted in red was a previously deleted file.
Sit back, pat yourself on the back. Hopefully the experience taught you it's cheaper just to buy a couple more disks and keep a good backup handy.