The Linux and Unix Menagerie: Using TCT To Recover Lost Data On Linux Or Unix

Hey there,

Today's post is a follow-up to yesterday's post (conspicuously titled Recovering Lost Data On Linux or Unix Using TCT, or something like that ;). Please refer to that post for the 3 or 4 paragraphs of over-explanation of minutiae that may or may not be helpful to you :)

Today, we'll move on to the second (easier, but more time-consuming) method of recovering your deleted data (on any Linux or Unix system) with The Coroner's Toolkit (TCT): using lazarus to make file recovery somewhat simpler. Again, please see yesterday's post on Recovering Lost Data if you're recovering simple text files and/or just want to use unrm and be done with it.

The situation today will be the same as yesterday's.

THE SITUATION: You've created a text file, containing valuable information that you couldn't commit to memory, using your favorite text editor and saved it. Then, an hour or so later, you accidentally deleted it, realizing that you'd completely screwed up just seconds after pressing the enter key. Deja Vu? ;)

host # cat /usr/THE_ALMOST_LOST_FILE
we'll just put some
semi-random text in
here to see if we can
find this later with
grep.  For simplicity's
sake, we'll include the
word semi-unusual so that we
have something in this
file that probably won't
be in any other files
host # rm /usr/THE_ALMOST_LOST_FILE
host # cat /usr/THE_ALMOST_LOST_FILE
cat: cannot open /usr/THE_ALMOST_LOST_FILE

Now, since we've already created our recovery area (which must be on a different partition from the one on which we lost the data) and have run unrm to create that one gigantic file composed of all the free blocks on the partition where we accidentally deleted our file, we're ready to make the process of data recovery simpler using lazarus (Longest sentence ever? Maybe not ;)

Lazarus is a simple tool to run, but it does come with a few caveats:

1. Unlike unrm (not a double-negative ;), which only requires your recovery partition to have room for 100% of the free space on the partition where you deleted your file, lazarus requires room for 220% of that space. These sentences are "killing" me! Apologies for any confusion caused by their Byzantine structure ;)

2. Lazarus depends on the output from unrm, so you'll need to run that first. Technically you can run lazarus against any file, but your results may be less than satisfactory.

3. Lazarus picks apart the gigantic block-file created by unrm, separates it into individual files and tags each of those files as being of a certain type (text, audio, HTML, C code, etc). All that sorting makes it take a very, very long time to complete execution. Depending upon the amount of free space on the partition on which you deleted the file you wish to recover, you may be waiting days (literally) for lazarus to complete its work!
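
If you'd rather let the machine do the arithmetic for the 220% figure in caveat 1, here's a minimal sketch. It is not part of TCT: the function names are made up for this example, and it assumes the 220% is measured against the size of the unrm output file, which is one reading of the requirement rather than anything official.

```shell
# Rough pre-flight check before running lazarus (a sketch, not part of TCT).
# Assumption: "220%" is measured against the size of the unrm output file.

# Print the space lazarus would need, in KB, for an input of $1 KB.
lazarus_space_needed_kb() {
    echo $(( $1 * 220 / 100 ))
}

# Print the size of file $1 in KB, rounded up.
file_size_kb() {
    echo $(( ( $(wc -c < "$1") + 1023 ) / 1024 ))
}

# Print the KB available on the filesystem holding directory $1.
# Relies on the POSIX "df -P" layout, where column 4 is "Available".
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if directory $2 has room to process unrm output file $1.
enough_room_for_lazarus() {
    erl_need=$(lazarus_space_needed_kb "$(file_size_kb "$1")")
    erl_have=$(avail_kb "$2")
    echo "need ${erl_need} KB, have ${erl_have} KB"
    [ "$erl_have" -ge "$erl_need" ]
}

# Example (paths taken from this post):
# enough_room_for_lazarus /usr/local/recovery/the_found_file_I_hope /usr/local/recovery
```

For the 1.7GB block file used in this post, that reading works out to roughly 3.7GB of room on the recovery partition.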

Below is a listing of the common file types lazarus will recognize. It will assign the corresponding letters to the files it cranks out. As an example, if it finds a block that's composed of "unresolved text" it will save it as: BLOCKNUMBER.TYPE.txt (All files will have the extension .txt, although lazarus does allow you to produce HTML output instead, with the -h flag). So, if it recovered block number 3714, which happened to be a tar/cpio/etc file, it would name it: 3714.a.txt

THE LAZARUS BLOCK FILE OUTPUT LEGEND:

A "." represents unrecognized binary blocks of data.

Text files are represented by:

type  value   color               meaning

t     777777  gray                unresolved text
f     ff0000  bright red (alarm)  sniffer stuff
m     0066ff  blue                mail
q     6633ff  pale blue           mailq files
s     6699ff  purply              emacs/lisp
p     cc6666  greenish            program file
c     336666  green               C code
h     ff99ff  light purple        HTML
w     cc3333  reddish             password file
l     cc9900  light brown         log file

Binary files are represented by:

type  value   color               meaning

o     bbbbbb  light grey          null block
r     000000  black               removed block
x     000000  black               binary exe
e     d9d9i9  gold                ELF
i     238e68  greenish            JPG/GIF
a     d19275  black               cpio/tar/etc
z     336633  greenish            compressed
!     000000  black               audio
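
Since the type letter ends up in every file name, you can get a quick census of what lazarus found once it's done. The following is only a sketch: block_type and count_block_types are hypothetical helper names (they don't ship with TCT), and they assume the BLOCKNUMBER.TYPE.txt naming described above.

```shell
# Print the lazarus type letter encoded in a block file name.
# Assumes names of the form BLOCKNUMBER.TYPE.txt, e.g. 3714.a.txt or 1...txt
block_type() {
    bt_name=${1##*/}            # drop any leading directory
    bt_name=${bt_name%.txt}     # drop the .txt extension ("3714.a")
    printf '%s\n' "${bt_name#*.}"   # drop "BLOCKNUMBER." ("a")
}

# Count the block files in directory $1 (default: current directory) per type.
count_block_types() {
    for cbt_file in "${1:-.}"/*.txt; do
        [ -e "$cbt_file" ] || continue   # skip the unexpanded glob if no files
        block_type "$cbt_file"
    done | sort | uniq -c
}

# Example:
# count_block_types /var/tmp/tct-1.18/blocks
```

Each output line is a count followed by a type letter from the legend, so a directory dominated by "." and "t" entries is mostly unrecognized binary and unresolved text.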

When you run lazarus, it can be as simple as just invoking the command (assuming you've run unrm already). Using our example from yesterday's post on Data Recovery, we could invoke it using the method below (and then kick back and wait and wait and wait...):

NOTE: The files, etc, used in these examples are the same as in yesterday's post. Please refer back to that post (hyperlinked to, a few times, above) if you have any questions about the command lines below that go without further explanation here. The file entitled "the_found_file_I_hope" is the gigantic block file we created with unrm yesterday.

NOTE: Unless you run lazarus with the -D option, it will create the "blocks" subdirectory in TCT's base directory!

First, we'll check again to see if text from our deleted file even exists in our unrm block recovery file:

host # du -sh /usr/local/recovery/the_found_file_I_hope
1.7G /usr/local/recovery/the_found_file_I_hope
host # egrep -il 'unusual|later|random' /usr/local/recovery/the_found_file_I_hope
/usr/local/recovery/the_found_file_I_hope
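
Note that the egrep above succeeds if any one of those three (fairly common) words turns up anywhere in 1.7GB of free blocks. If you need a stricter test, a sketch like the one below insists that every distinctive string be present. contains_all is a made-up helper name, and the strings in the example are the ones from our test file.

```shell
# Succeed only if every given fixed string occurs somewhere in file $1.
# -F treats each pattern as a literal string; -q suppresses normal output.
# Each pattern costs one full pass over the file, so keep the list short.
contains_all() {
    ca_file=$1
    shift
    for ca_pat in "$@"; do
        LC_ALL=C grep -q -F -e "$ca_pat" "$ca_file" || return 1
    done
}

# Example:
# contains_all /usr/local/recovery/the_found_file_I_hope \
#     'semi-unusual' 'semi-random text' && echo "looks like it's in there"
```

Keep each string short enough to fit on one line of the original file, since grep matches line by line.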

Since we've confirmed that our deleted file is in there (your test may need to be more exact. This experiment has the luxury of being controlled and isn't subject to the normal laws of break-neck work stress ;) we'll go ahead and kick off lazarus:

NOTE: While the display below will seem cool for about 10 minutes, tops, you'll eventually have to walk away from your terminal (unless you can bring it with you into the bathroom ;) - Combing through 1.7GB of data took about 17 hours on a middle-weight desktop server. I can't say for sure, because I left it to finish up while I went on living my life ;) You'll also find that the output to your screen will be long and, probably, useless to you. Generally, it's not worth watching unless you're really jaded, tired and/or bored ;)

host # ./lazarus /usr/local/recovery/the_found_file_I_hope
....t........tt.....pp.......tt....t.....pp....pp...tt...t...pp..pp........c...
......!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!t.ttt...
....ttt.t.t.aaaaaaatt...tt.t...t....ttt...tt..t...t.tt...t....ttt.t...t.tt..tt.
.tt...tt..t.tt..tt.t...t.tt..tt...tt.t..t.t.....t.ttt.....t.......t...tt......t
t..tt.tt..t..tt..tt..tt..t..t...tt..tt..t...tt...........xxxxxxxxxxxt......aaaa
ahhh..ttt..ttt....ttt..t.t..t..t..t.t.t...t........aaaaaaaaat.t.t.....gggttt.tt
t.ttt.ttt.ttt.ttt.ttt..aaaaaaaaaaaaaaaaaat.t.aaaaaaaaacc..t........aaaaaattt.tt
t.ttt.ttt.ttt.ttt.ttt........t...t......xxxxxxxaaaeeeeee!!!ttt.....t...eeeeeeee
...
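
Since nobody wants to babysit a terminal for 17 hours, one option is to start lazarus detached from your login session, with the progress characters going to a log file and the block files going to a directory of your choosing. This is a sketch under assumptions: run_lazarus_bg is a hypothetical wrapper name, and it assumes -D takes the target directory as its argument (check the lazarus man page that came with your copy of TCT before trusting it).

```shell
# Start lazarus in the background, immune to hangups, with its progress
# output captured in a log file.
# Arguments: path to lazarus, unrm output file, blocks directory, log file.
# Assumption: -D takes the directory to write the block files into.
run_lazarus_bg() {
    # Create the blocks directory up front; harmless if lazarus would
    # have created it on its own.
    mkdir -p "$3" || return 1
    nohup "$1" -D "$3" "$2" > "$4" 2>&1 &
    echo "lazarus started as PID $!, logging to $4"
}

# Example (paths are illustrative):
# run_lazarus_bg /var/tmp/tct-1.18/bin/lazarus \
#     /usr/local/recovery/the_found_file_I_hope \
#     /usr/local/recovery/blocks /usr/local/recovery/lazarus.log
```

You can then peek at the log with tail whenever curiosity strikes, and keep the blocks directory on the recovery partition rather than wherever TCT happens to be unpacked.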

As noted above, since we ran lazarus without the -D option, the "blocks" subdirectory (which it creates by default) will be located in the base directory in which TCT resides (as opposed to the directory in which you invoked lazarus). In our case, for the sake of argument, we have the binary in /var/tmp/tct-1.18/bin and the "blocks" directory will get automatically created in /var/tmp/tct-1.18.

host # cd /var/tmp/tct-1.18
host # ls
Beware LICENSE TODO blocks help-recovering-file patchlevel
CHANGES MANIFEST TODO.before-next-release conf help-when-broken-into quick-start
COPYRIGHT Makefile additional-resources docs lazarus reconfig
Date OS-NOTES bibliography etc lib src
INSTALL README.FIRST bin extras man www
host # cd blocks
host # ls -1|wc -l
10163
host # ls
1...txt 116389.t.txt 138610.t.txt 162373...txt 174689...txt 178282.t.txt 191521...txt 19517.t.txt 213764.t.txt 38401.p.txt 6250.x.txt
...
116375...txt 13861...txt 162371.t.txt 174682.t.txt 178281...txt 191514.t.txt 195169...txt 213759...txt 38377.t.txt 62006.x.txt

You'll note that the contents of the "blocks" directory are truncated in the output above. This directory (once lazarus has completed its run) is filled with one file per character that you saw in the output during its run. In the interest of keeping this post under 300 pages, the middle was clipped ;)

Now, as we did yesterday, we can use grep to very simply discover which of the 10,163 recovered block files in the "blocks" directory most probably contains our deleted file:

host # grep -l semi-random *
16.t.txt
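
The plain "grep -l semi-random *" worked fine here, but on a partition with far more free space the * can expand to more file names than the shell is allowed to pass to one command. A hedged alternative, using a hypothetical helper called find_text_blocks, lets find hand the names to grep in batches and only bothers with the "t" (unresolved text) blocks:

```shell
# List the block files under directory $1 that contain the fixed string $2.
# Only text blocks (type "t") are searched; find batches the file names, so
# a very large blocks directory cannot overflow the command line.
find_text_blocks() {
    find "$1" -type f -name '*.t.txt' -exec grep -l -F -e "$2" {} +
}

# Example:
# find_text_blocks /var/tmp/tct-1.18/blocks 'semi-random'
```

If your lost file might have been tagged as something other than unresolved text (mail, C code, a log file, etc), widen the -name pattern to '*.txt'.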

And, now we can (hopefully) verify that our file is actually in there. As with the output from yesterday's post (just using unrm), this file will probably contain plenty of garbage surrounding the simple text. In this case (since I don't believe in revisionist-history, I'm not going to delete the previous sentence ;) it turns out that my assumption was wrong and we have one very clean copy of our deleted text file :)

host # cat 16.t.txt
we'll just put some
semi-random text in
here to see if we can
find this later with
grep. For simplicity's
sake, we'll include the
word semi-unusual so that we
have something in this
file that probably won't
be in any other files
host #

And, that's all there is to it :) As mentioned previously, this process can take lots and lots of time, although it makes the data discovery much easier in the end; especially if you're dealing with binary data or once-contiguous blocks that are now all still available for recovery, but scattered about. If you use the -h flag to have lazarus output HTML data for you, it will create a mini-website in the "blocks" directory that can make it much easier to piece together binary data, like a picture or an audio file.
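
If you skip the HTML output and want to try stitching scattered pieces back together by hand, something like the sketch below may help. To be clear, this is a different, cruder technique than the HTML mini-website: join_blocks is a made-up helper, it assumes the block files are named BLOCKNUMBER.TYPE.txt and sit in the current directory, and it assumes that ascending block number matches the original order of the data, which is a guess that won't always hold.

```shell
# Concatenate the named block files to stdout in ascending block-number
# order. Assumes plain names like 62006.x.txt in the current directory.
join_blocks() {
    printf '%s\n' "$@" |
        sort -t . -k 1,1n |              # numeric sort on the block number
        while IFS= read -r jb_file; do
            cat "$jb_file"
        done
}

# Example (file names are illustrative):
# join_blocks 62006.x.txt 6250.x.txt > /usr/local/recovery/maybe_my_binary
```

Pick the candidate files yourself (by type letter and by what's in them) and write the result to the recovery partition, not to the partition you're recovering from.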

Have fun waiting, and here's to your success in recovering your lost data!

Cheers,

Mike
