Archive for December, 2005

How to recover from deleted /usr on Debian

Monday, December 5th, 2005

Ok, this is the second time I have lost my /usr partition. The first was due to overwriting it, and this time a hard drive crashed. Fortunately, as long as you keep /usr/local and /usr/tmp either backed up or empty, the rest can be rebuilt reasonably automatically.

A second machine is required.

On the second machine:
1. Use dselect and update the package list (/var/lib/dpkg/available)

2. Build a list of required and important packages using the Package: and Priority: lines

3. apt-get clean; apt-get --download-only --reinstall install `cat required-essential-list`

4. Use dpkg -x to unpack all of the debs in /var/cache/apt/archives into a single directory

5. Tar and gzip that directory, then either put it on a floppy or CD, or use netcat to send it over.

6. On the target machine, unpack the files directly into /usr

7. Run dpkg --get-selections | cut -f1 to create a list of packages to reinstall
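Steps 2 and 4 on the second machine could be sketched roughly as below. The function names and the idea of passing directories as arguments are my own choices for illustration, and the awk filter assumes each stanza of the available file lists Package: before Priority:.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of dpkg or apt.

# Print the names of all packages whose Priority is required or important.
# $1 is a dpkg "available"-style file: stanzas of "Field: value" lines
# separated by blank lines.
list_base_packages() {
    awk '
        /^Package: /  { pkg = $2 }
        /^Priority: / {
            if (pkg != "" && ($2 == "required" || $2 == "important"))
                print pkg
        }
        /^$/          { pkg = "" }
    ' "$1"
}

# Unpack every .deb found in directory $1 into the single directory $2.
unpack_debs() {
    mkdir -p "$2" || return 1
    for deb in "$1"/*.deb; do
        [ -f "$deb" ] || continue
        dpkg -x "$deb" "$2" || return 1
    done
}
```

For example, list_base_packages /var/lib/dpkg/available > required-essential-list produces the list for step 3, and unpack_debs /var/cache/apt/archives rescue fills a directory named rescue (a made-up name). Since dpkg -x writes full paths, the files destined for /usr end up under rescue/usr.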

Reinstall the packages one by one. I do this by first reinstalling all packages that start with lib, then going through the rest of the list alphabetically. If some programs fail to run because of missing shared libraries, use dpkg -S to find the package that owns the library and reinstall that package before proceeding.
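The ordering described here (lib packages first, then everything else alphabetically) could be produced with something like the following. This is a minimal sketch; order_for_reinstall is a name made up for this example.

```shell
#!/bin/sh
# Sketch only: order_for_reinstall is a hypothetical helper.

# Read package names on stdin, one per line. Print the ones starting
# with "lib" first, then the rest, each group sorted alphabetically.
order_for_reinstall() {
    list=$(mktemp) || return 1
    cat > "$list"
    grep '^lib' "$list" | LC_ALL=C sort
    grep -v '^lib' "$list" | LC_ALL=C sort
    rm -f "$list"
}
```

Feeding it the list from step 7, as in dpkg --get-selections | cut -f1 | order_for_reinstall > reinstall-list, gives a file (reinstall-list is an arbitrary name) to work through from top to bottom.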

To get alternatives reinstalled, remove them first using update-alternatives --remove-all [name]. This brings back commands like 'editor'. You may want to remove all alternatives before beginning so that they are repaired automatically as the packages get reinstalled (for a list, look under /var/lib/dpkg/alternatives).
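If you do want to remove every alternative up front, a cautious approach is to generate the commands first and read them over before running anything. The sketch below only prints the commands; the function name is made up, and it assumes each file in the alternatives directory is named after the alternative it describes.

```shell
#!/bin/sh
# Sketch only: prints commands instead of running them, so the list
# can be reviewed first. The function name is hypothetical.

# $1 is the alternatives admin directory (defaults to the dpkg one).
alternatives_removal_commands() {
    admindir=${1:-/var/lib/dpkg/alternatives}
    for f in "$admindir"/*; do
        [ -f "$f" ] || continue
        printf 'update-alternatives --remove-all %s\n' "$(basename "$f")"
    done
}
```

Once the output looks right, it can be piped to sh as root, for instance alternatives_removal_commands | sh.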

Some packages cannot be reinstalled because you no longer have sources for them; just purge those packages. Others cannot be reinstalled because they depend on a library that conflicts with another package; purge those for now and add them to a list of things to reinstall later.

Some maintainer scripts will behave weirdly if things they expect to be installed are not actually around. Edit the offending script, /var/lib/dpkg/info/package.{pre,post}{inst,rm}, and add an 'exit 0' at the top, just below the #! line. You should probably reinstall all such packages later to get things back to a consistent state.
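Editing the script by hand works fine; if there are many of them, something like this could do the same job. It is only a sketch: neutralize_script and the backup directory argument are my own inventions, and it assumes the first line of the script is the #! line.

```shell
#!/bin/sh
# Sketch only: neutralize_script is a hypothetical helper.

# Insert "exit 0" after the first line of maintainer script $1,
# keeping an untouched copy in backup directory $2.
neutralize_script() {
    script=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    backup="$backup_dir/$(basename "$script")"
    cp -p "$script" "$backup" || return 1
    {
        head -n 1 "$backup"    # keep the interpreter line first
        echo 'exit 0'          # make the script a no-op
        tail -n +2 "$backup"   # rest of the original script
    } > "$script"
}
```

The backup goes to a separate directory so that no stray files are left in /var/lib/dpkg/info, and it doubles as a record of which packages need a proper reinstall later.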

Then once everything has been reinstalled at least once and all the dependency problems cleared up, go through and reinstall everything again.

In the end, things should be mostly back to normal.

Why I hate computers

Sunday, December 4th, 2005

Within the span of a month:

– The third replacement of a Maxtor 200GB drive died in my second workstation, one month out of warranty. Data corrupt but recovered somewhat.

– A Maxtor 40GB drive in my main workstation died with the click of death while running. Total loss.

– An attempted upgrade of my main workstation failed because I have an MS-6905 1.1 Rev B, the revision that happened to be recalled by MSI at some point years ago. Of course, the recall is no longer honored.

– Tried upgrading the fileserver with a used 3ware card (6410) and 250GB Seagate drives in RAID5. Of course, it didn't work. Fortunately, the seller refunded our money even though he claimed it worked.

– Obtained a 3ware 7504. It was working fine. Tonight the driver reported that the controller was not responding and kicked the array offline, necessitating a reboot. Of course this meant all the AFS volumes with open files needed to be salvaged. And I'm certain this will happen again.

The moral of the story is to expect that anything even remotely related to a hard drive will fail in spectacular fashion at the point when it would cost you the most for it to do so.