Saving a dying or damaged hard drive
There are times when a drive starts to kick out hard read errors. FreeBSD and most UNIX-like operating systems log these errors, so you get a heads-up before the drive simply dies. With the exception of power surges, ESD, and lightning strikes, hard drives seldom die instantly. Most often they log warnings while silently reallocating data to good sectors for days or weeks before giving up completely. Once you see hard error messages, though, it is time to replace the drive.
Don’t reboot the system if it is running and complaining. Instead, get the new drive prepared by doing a base install of the same OS version, with similar partitioning and slices of equal or greater size. Then shut down the system. I usually do not fsck the drive if the system has not crashed: with hard errors the drive can no longer tell the good sectors from the bad ones, so running fsck only worsens the data loss. Once the new drive is prepared in a new machine, you should be ready to try copying the data over from the old one.
The information here covers rescuing a FreeBSD system, but the basic steps should apply to any OS. It is not meant to be a step-by-step howto but rather a general overview of the process, with some details omitted. If you are attempting to save a drive you should already have some detailed knowledge of the process involved, so take your steps carefully to avoid data loss.
While the machine is still running (and it is a UNIX-like OS), you may want to try a dump of the file systems over the network to a machine with sufficient capacity for the job. In my experience, however, copying locally is much faster.
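As a rough sketch of the network dump mentioned above, a level 0 dump can be piped through ssh to another machine. The helper name dump_over_ssh, the host backuphost, and the path /backup/usr.dump are made up for illustration, and working ssh access to the receiving machine is assumed.

```shell
# Sketch only: dump_over_ssh, backuphost and /backup/usr.dump are illustrative names.
# Usage: dump_over_ssh <filesystem> <host> <remote-file>
dump_over_ssh() {
    fs=$1
    host=$2
    remote_file=$3
    # -0: full (level 0) dump, -a: skip tape length estimates, -f -: write to stdout
    dump -0af - "$fs" | ssh "$host" "cat > '$remote_file'"
}

# Example call (left commented out so nothing runs by accident):
# dump_over_ssh /usr backuphost /backup/usr.dump
```

The resulting file should be readable later with restore, run inside a freshly created file system on the replacement drive.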
I tend to use dd for saving damaged drives, copying the contents of the old drive to a new one. If you have had a serious number of errors (and have lots of time) you can also try the SpinRite program to recover data, but if you have hard errors you should simply get the data off the drive as fast as possible, using dd to do a sector-by-sector copy.
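We use the dd conversion ‘conv=noerror’ to keep dd from dying on read errors, and ‘sync’ to pad every input block to the input buffer size, so that missing data is replaced with NUL bytes and everything after it stays at the right offset. I don’t specify a block size (bs): dd then uses its 512-byte default, so a read error costs no more than one 512-byte block. With no conversion values other than ‘noerror’, ‘sync’, and ‘notrunc’, there should be no aggregation of short blocks, which might be safer for copying partitions.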
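To see what ‘sync’ does without touching a real disk, the following sketch feeds dd a 3-byte input and counts the bytes that come out. The function name padded_size is made up for this example, and the block size is spelled out only to make the demo self-describing.

```shell
# Feed dd one short (3-byte) input block and count the output bytes.
# With conv=sync the short block is padded with NUL bytes up to the block size.
padded_size() {
    bytes=$(printf 'abc' | dd bs=512 conv=noerror,sync 2>/dev/null | wc -c | tr -d '[:space:]')
    echo "$bytes"
}

padded_size   # prints 512
```

The same padding is what keeps later sectors aligned when a block on the failing drive cannot be read.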
- Do a base install of the target OS first so the partitions are ready, or create the partitions manually.
- Boot in single-user mode (boot -s on FreeBSD).
- Use dd if=/src of=/dest conv=noerror,sync where /src is, for example, /dev/ad0s1g and /dest is the matching partition on the new drive.
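The dd step above can be wrapped in a small helper that also keeps dd's diagnostics, so the failed blocks can be reviewed after the copy. This is a sketch under assumptions: rescue_copy and its log file argument are names invented here, and the devices and log path in the commented example are illustrative only.

```shell
# Usage: rescue_copy <source> <destination> <logfile>
rescue_copy() {
    src=$1
    dest=$2
    log=$3
    # noerror: keep going after read errors; sync: pad unreadable or short blocks.
    # dd reports errors and block counts on stderr, which is saved to the log.
    dd if="$src" of="$dest" conv=noerror,sync 2>"$log"
}

# Example (commented out; double-check the devices before running anything like it,
# and keep the log on a file system that is not on the failing drive):
# rescue_copy /dev/ad0s1g /dev/ad2s1g /mnt/ad0s1g-rescue.log
```

Getting source and destination the wrong way round would overwrite the data being rescued, which is why the example is left commented out.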
Mounting and fscking a bad file system seems only to make the drive worse. Move the data off ASAP and then tend to the bad or missing files.
Data transfer from IDE to IDE proceeded for me at about 1.9 MB per second with the bad drive in place, so copying 60 GB of data this way may take a full workday.
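The workday figure follows from simple arithmetic, sketched below. The helper name estimate_hours is made up, and 1 GB is taken as 1024 MB, which is an assumption about how the sizes above were measured.

```shell
# Usage: estimate_hours <size in GB> <rate in MB/s>
# Treats 1 GB as 1024 MB and prints the estimated hours with one decimal.
estimate_hours() {
    LC_ALL=C awk -v gb="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", gb * 1024 / rate / 3600 }'
}

estimate_hours 60 1.9   # prints 9.0
```

Plugging in the size of your own drive and an observed transfer rate gives a rough idea of how long the machine will be tied up.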
Other things to try (Linux):
http://www.garloff.de/kurt/linux/ddrescue/ dd_rescue
http://www.kalysto.org/utilities/dd_rhelp/index.en.html A front end for dd_rescue
http://www.simplicidade.org/notes/archives/2005/02/recover_day.html a dd_rescue story
Also you can try this:
Install the new drive in the computer, either with the existing drive as the master, or do a fresh base install on it of the same OS version you are replacing.
Boot single user at the loader prompt: ok boot -s
Hit enter for the /bin/sh shell, then run fsck -yp, mount -u /, mount -a, and swapon -a
Run sysinstall from /stand/sysinstall or /usr/sbin/sysinstall
Go to Configure -> Fdisk and add the new drive device (e.g. ad2 for secondary master).
In fdisk choose A to use the entire disk with auto defaults, then enter Q to quit.
Install standard boot manager when prompted
Returning to sysinstall, choose Label for disklabel.
Copy the layout as closely as possible from the original /etc/fstab.
Create your new partitions on the new drive and use M to change the mount point of the root partition only (e.g. /dev/ad2s1a) to /mnt; label all others as regular /var, swap, and /usr. Type Q to quit and exit sysinstall.
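Now mount the new file systems under /mnt so they match the targets of the tar commands, e.g.:
mount /dev/ad2s1a /mnt
mkdir -p /mnt/tmp /mnt/usr /mnt/var
mount /dev/ad2s1f /mnt/tmp
mount /dev/ad2s1g /mnt/usr
mount /dev/ad2s1e /mnt/var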
Copy the existing file systems using tar:
# tar -cf - --one-file-system --ignore-failed-read -C / --exclude='mnt/*' . | tar -xpvf - -C /mnt
# tar -cf - --one-file-system --ignore-failed-read -C /usr . | tar -xpvf - -C /mnt/usr
# tar -cf - --one-file-system --ignore-failed-read -C /var --exclude='mnt/*' . | tar -xpvf - -C /mnt/var
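Since the three copies differ only in their source and target directories, the tar pipe can be wrapped in a small function. This is a sketch: copy_tree is a made-up name, and the extra options from the commands above (one-file-system, ignore-failed-read, exclude) are left out to keep the example portable, so add them back for a real rescue.

```shell
# Usage: copy_tree <source-dir> <target-dir>
copy_tree() {
    src=$1
    dst=$2
    # The first tar writes an archive of src to stdout; the second unpacks it
    # in dst, preserving permissions (-p).
    tar -cf - -C "$src" . | tar -xpf - -C "$dst"
}

# Example (commented out): copy /usr and /var onto the new drive mounted under /mnt
# for fs in usr var; do copy_tree "/$fs" "/mnt/$fs"; done
```

Trying the function on a small scratch directory first is a cheap way to confirm that the local tar accepts these options.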
Shut down and remove the old drive. Boot the new one in single-user mode, run fsck -p on it, and mount it. Reboot and all should be well again.