Analysis for Scan of the Month 15. Doing it the Hard Way!

By Albert Bendicho, May 2001

mail: bendi#at#redestb.es

Introduction

What we want to do

The purpose of our work is to recover the rootkit from the disk. But this time we will use only "unrm" and "lazarus", to learn more about the use of these tools. There's an easier approach with "icat", and that would be the preferred way. We are providing this other solution to practice with "unrm" and "lazarus" and get to know them.

How do we do it

First we recover all the unallocated blocks on the disk and try to get some structure out of them. We will do it by using "unrm" and "lazarus" with some help from "entropy". Those are part of "The Coroner's Toolkit" (TCT) by Dan Farmer and Wietse Venema (available at http://www.porcupine.org/forensics/tct.html).

Next we analyze the pieces found through this process to try to find the rootkit. A quick look at those pieces turns up what seems to be the rootkit, as well as several scripts and files that seem to belong to it.

After that we recover the rootkit by "re-building" it from the pieces found by unrm/lazarus.

Knowing what the rootkit does we do some analysis on the other pieces found by unrm/lazarus.

What we need

All we need is the image to analyze, TCT, patience and a BIG amount of free disk space. We have to store the image, the output of unrm, the output of lazarus and some extra space to try to reconstruct the rootkit. That's something like 700MBytes of storage. We are going to store the image file (honeypot.hda8.dd) on /honeynet.scan. We are using RedHat 7.0 with a 2.2.16-22 kernel (using a 2.4 kernel gave problems trying to mount honeypot.hda8.dd).

Steps to recover the rootkit

Execute unrm and lazarus

Once we have TCT installed we go to the directory that contains honeypot.hda8.dd (/honeynet.scan) and execute

/path/to/TCT/bin/unrm honeypot.hda8.dd > unrm.results

This collects all unallocated data blocks and puts them on the file unrm.results.

Next we run "lazarus" on "unrm.results".

/path/to/TCT/bin/lazarus -h -D /honeynet.scan/blocks -w /honeynet.scan/www unrm.results

"Lazarus" tries to recognize blocks of information and classifies them, leaving the resulting blocks in separate files. With the -h option it generates a set of html files that makes it easier to browse the results later on. The -D and -w parameters keep the output organized: the blocks go to one directory and the html files to another.

Looking around the results

We open an html browser and point it to "file:/honeynet.scan/www/unrm.results.frame.html". We immediately identify a really "attractive" zone with compressed information (starting with an uppercase "Z"). If a rootkit has been present on the system it surely arrived in a compressed form. We look and see that it's block #9. To test it we run "file" on that block. That is:

file blocks/9.z.txt

We see that it's a gzip file, a good suspect for being a compressed rootkit. But before analyzing it further we continue looking around.
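What "file" recognizes here is the gzip magic number: a gzip stream starts with the two bytes 0x1f 0x8b. As a minimal sketch (the function name is_gzip is ours, not part of TCT), the same check can be done by hand on any block, which is handy when we want to test many lazarus blocks in a loop:

```shell
# is_gzip FILE: succeed if FILE starts with the gzip magic bytes 0x1f 0x8b.
# Hypothetical helper for illustration; "file" performs a more complete check.
is_gzip() {
    # Dump only the first two bytes as hex and strip the spacing od adds.
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Possible use on the lazarus output:
#   for b in blocks/*; do is_gzip "$b" && echo "$b"; done
```

A match only tells us where a gzip stream begins. It says nothing about whether the rest of the block belongs to the same stream.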

The next "attractive zones" are two zones with programs (starting with an uppercase "P"). We check the first one (it turns out to be block 8499) and we see that it's a shell script. A closer look reveals that it's an installer for a rootkit. We can also see that the script ends by doing

cd /
rm -rf last lk.tgz computer lk.tar.gz

so now we know that probably the installed rootkit was named lk.tgz or lk.tar.gz. We will do a deeper analysis of this installation script later on.

The second program "zone" (block 8529) is a perl script to sort out the output of LinSniffer. So it seems that LinSniffer is installed too, and in fact we can see it in the installation script from block 8499.

The two program zones are quite close, so we look closely at the blocks around blocks 8499 and 8529 and we find several parts of the rootkit (or perhaps other pieces that don't belong to the rootkit). We can especially mention:

Ok, so it seems that the rootkit was installed and deleted, but we'll find more evidence later on. Now it's time to recover the rootkit.

Try to reconstruct the tgz file

We've seen that block #9 contains a gzip file, and we can suspect that it was once "lk.tar.gz" or "lk.tgz". To see if we are on the right track we copy block #9 to a file named attempt1.tgz, since it's our first attempt to recover the rootkit. So we issue the following commands:

cp blocks/9.z.txt attempt1.tgz
tar -xvzf attempt1.tgz
we get
last/
tar: Archive contains future timestamp 2002-02-08 14:08:13
last/ssh
tar: Skipping to next header
gzip: stdin: unexpected end of file
tar: 4 garbage bytes ignored at end of archive
tar: Child returned status 1
tar: Error exit delayed from previous errors

First we see a directory named "last". Sounds familiar; we've seen the order to delete it in the script at block 8499.

We see that there's a future timestamp. That probably means that whoever created the archive had the clock set wrong on their machine. If we wanted a "paranoid" forensic study we should make sure that it doesn't imply anything else (the chances are really low, but in a forensic study you have to consider everything!). We won't do it here.

Next we see ssh, so we have more evidence that ssh was involved in this rootkit. But the archive is not completely reconstructed: gzip complains about an "unexpected end of file". That could make us think that we don't have the complete original file, but we have to be careful, since it can also mean that what we have recovered contains "garbage". We must not forget that we're trying to recover a file from the unallocated zones, so we can easily have "garbage" in the middle and, most important, this garbage can make our file unrecoverable. We look at the file we are trying to recover, "attempt1.tgz", and we see that its size (an "ls -l attempt1.tgz" reveals a size of 237,568 bytes) is big enough to contain a tarred copy of "ssh" and much more, so we are probably facing the "garbage" problem. Another clue is that tar says "tar: Skipping to next header", so it seems that the header it finds after ssh is not well formed, which again suggests the "garbage" problem.

But we have to be careful! The file "ssh" is not necessarily a version of "secure shell" as we know it. It could even be something completely different, but for now let's try the "garbage" approach.

Ok, so we have to find out which blocks of attempt1.tgz don't belong to the original ".tgz" file. Another nice application that comes with TCT comes to the rescue here: "entropy". It gives the entropy of a file. For those who don't know, the entropy of a set of data measures how much information the set contains. So if we look at the entropy of a file filled with zeros we'll get an entropy of 0, since there's almost no information there. For a file filled with text we'll get a medium value, since it really has information, but for a compressed file (there are lots of them nowadays; not only data but also audio, video and images are usually highly compressed) the entropy is really high.
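To make the idea concrete, here is a minimal sketch that computes the Shannon entropy of a file in bits per byte, using only od, tr and awk. It is not the TCT "entropy" program. We are only assuming that TCT reports a comparable figure, since its values for compressed data stay just below 8. The name byte_entropy is ours.

```shell
# byte_entropy FILE: print the Shannon entropy of FILE in bits per byte.
# 0.00 means a single repeated byte value; 8.00 is the maximum, reached when
# all 256 byte values are equally frequent.
# Illustrative stand-in, NOT the TCT "entropy" tool.
byte_entropy() {
    # od prints every byte as an unsigned decimal; tr puts one value per line.
    od -An -v -tu1 "$1" | tr -s ' ' '\n' | awk '
        NF { count[$1]++; total++ }
        END {
            h = 0
            for (b in count) {
                p = count[b] / total
                h -= p * log(p) / log(2)    # log base 2 of p
            }
            printf "%.2f\n", h
        }'
}

# Possible use on a set of small files, lowest entropy first:
#   for f in x*; do printf "%s %s\n" "$f" "$(byte_entropy "$f")"; done | sort -n -k 2
```

On small pieces even purely random data will not reach 8.00, because a short sample cannot contain every byte value equally often. What matters is the contrast between a piece and its neighbours.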

So we want to find what doesn't belong to the file, or in other words, we want to find which blocks have low entropy. But looking at the entropy of the whole file won't help much, so what we do is split attempt1.tgz into several files, one per 1 KByte block, and then we look at the entropy of each block to find the ones with the lowest entropy. To split attempt1.tgz we use the "split" command, part of the GNU textutils. So we issue the following commands:

split -b 1k attempt1.tgz
/path/to/TCT/extras/entropy/entropy x*|sort -n -k 2

we get the following lines

xam 3.68 0 0.00 0.00
xcy 6.83 0 0.00 0.00
xho 7.28 0 0.00 0.00

followed by a lot of blocks with higher entropy up to

xbp 7.84 0 0.00 0.00

So we can see that the block "xam" has an entropy way lower than the others. In fact if we look at xam, with hexdump for example, we see that it doesn't look like part of a compressed file at all; it has a structure that's too regular. Given the way "split" names its output files (xaa, xab, ...), xam corresponds to block number 13 inside attempt1.tgz. So we will construct attempt2.tgz as a copy of attempt1.tgz without its 13th block and try again to uncompress/untar it. We issue the following commands:

dd if=attempt1.tgz of=attempt2a.tgz bs=1024 count=12
dd if=attempt1.tgz of=attempt2b.tgz bs=1024 skip=13
cat attempt2a.tgz attempt2b.tgz > attempt2.tgz
tar -xvzf attempt2.tgz

we get the following output

last/
tar: Archive contains future timestamp 2002-02-08 14:08:13
last/ssh
last/pidfile
last/install
last/linsniffer
last/cleaner
last/inetd.conf
last/lsattr
last/services
last/sense
last/ssh_config
last/ssh_host_key
last/ssh_host_key.pub
last/ssh_random_seed
last/sshd_config
last/sl2
last/last.cgi
last/ps
 
gzip: stdin: unexpected end of file
tar: 192 garbage bytes ignored at end of archive
tar: Unexpected EOF in archive
tar: Child returned status 1
tar: Error exit delayed from previous errors
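The block arithmetic and the dd/cat sequence used above are easy to get wrong by one, so here is a small sketch that wraps both. The function names suffix_to_block and drop_blocks are ours, and the sketch assumes the default two-letter "split" suffixes and 1 KByte blocks, as used in this analysis.

```shell
# suffix_to_block NAME: map a default "split" output name (xaa, xab, ...)
# to its 1-based block number, e.g. xam -> 13.
suffix_to_block() {
    letters=abcdefghijklmnopqrstuvwxyz
    suffix=${1#x}                 # drop the "x" prefix that split adds
    hi_letter=${suffix%?}         # first suffix letter
    lo_letter=${suffix#?}         # second suffix letter
    hi_before=${letters%%"$hi_letter"*}   # alphabet letters before it
    lo_before=${letters%%"$lo_letter"*}
    echo $(( ${#hi_before} * 26 + ${#lo_before} + 1 ))
}

# drop_blocks IN OUT START [COUNT]: copy IN to OUT, leaving out COUNT
# (default 1) 1 KByte blocks beginning at 1-based block START.
drop_blocks() {
    src=$1 dst=$2 start=$3 howmany=${4:-1}
    {
        dd if="$src" bs=1024 count=$(( start - 1 )) 2>/dev/null
        dd if="$src" bs=1024 skip=$(( start - 1 + howmany )) 2>/dev/null
    } > "$dst"
}

# Equivalent to the three commands used to build attempt2.tgz:
#   drop_blocks attempt1.tgz attempt2.tgz "$(suffix_to_block xam)"
```

Writing both dd outputs into one redirection avoids the two intermediate files, at the cost of not keeping the halves around for inspection.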

OK! We get better results, so we can conclude that we have eliminated garbage. But again we face the dilemma of deciding whether we're missing blocks or we have more garbage. The fact that tar hasn't complained about a bad header this time (there is no "Skipping to next header") hints that we're missing blocks. So we try to grow our file. The logical thing is to use the next block identified by lazarus. That is block #241, so we add this block to our result so far:

cat attempt2.tgz blocks/241.x.txt > attempt3.tgz
tar -xvzf attempt3.tgz

we get

last/
tar: Archive contains future timestamp 2002-02-08 14:08:13
last/ssh
last/pidfile
last/install
last/linsniffer
last/cleaner
last/inetd.conf
last/lsattr
last/services
last/sense
last/ssh_config
last/ssh_host_key
last/ssh_host_key.pub
last/ssh_random_seed
last/sshd_config
last/sl2
last/last.cgi
last/ps
last/netstat
last/ifconfig
last/top
tar: Skipping to next header
 
gzip: stdin: unexpected end of file
tar: 486 garbage bytes ignored at end of archive
tar: Child returned status 1
tar: Error exit delayed from previous errors

Again, before we get the gzip error of "unexpected end of file" we see "tar: Skipping to next header", so we repeat the process we've done with attempt1.tgz to clear the garbage (the "rm x*" is to delete the previous "split" output, just in case);

rm x*
split -b 1k attempt3.tgz
/path/to/TCT/extras/entropy/entropy x*|sort -n -k 2

we get

xki 0.02 0 0.00 0.00
xkj 3.48 0 0.00 0.00
xcx 6.83 0 0.00 0.00

followed by a lot of blocks with higher entropy up to

xnt 7.84 0 0.00 0.00

So it seems that the culprit is now xki, that's block 269 (k=11, i=9, (11-1)*26+9 = 269) of attempt3.tgz. xkj also appears to have a low entropy and comes right after xki. Looking at both of them with hexdump (or ghex) reveals that they're probably not part of a compressed file, so we proceed to construct our next attempt following the same process as before but skipping two blocks:

dd if=attempt3.tgz of=attempt4a.tgz bs=1024 count=268
dd if=attempt3.tgz of=attempt4b.tgz bs=1024 skip=270
cat attempt4a.tgz attempt4b.tgz > attempt4.tgz
tar -xvzf attempt4.tgz

We get

last/
tar: Archive contains future timestamp 2002-02-08 14:08:13
last/ssh
last/pidfile
last/install
last/linsniffer
last/cleaner
last/inetd.conf
last/lsattr
last/services
last/sense
last/ssh_config
last/ssh_host_key
last/ssh_host_key.pub
last/ssh_random_seed
last/sshd_config
last/sl2
last/last.cgi
last/ps
last/netstat
last/ifconfig
last/top
last/logclear
last/s
last/mkxfs
 
gzip: stdin: unexpected end of file
tar: 309 garbage bytes ignored at end of archive
tar: Unexpected EOF in archive
tar: Child returned status 1
tar: Error exit delayed from previous errors
 

OK! Again we've cleared some garbage and we've uncompressed/untarred 3 more files. Now we're missing more blocks. Again we take the next block recovered by lazarus. That's block #385. So we repeat our process:

cat attempt4.tgz blocks/385.x.txt > attempt5.tgz
tar -xvzf attempt5.tgz

we get

last/
tar: Archive contains future timestamp 2002-02-08 14:08:13
last/ssh
last/pidfile
last/install
last/linsniffer
last/cleaner
last/inetd.conf
last/lsattr
last/services
last/sense
last/ssh_config
last/ssh_host_key
last/ssh_host_key.pub
last/ssh_random_seed
last/sshd_config
last/sl2
last/last.cgi
last/ps
last/netstat
last/ifconfig
last/top
last/logclear
last/s
last/mkxfs
 
gzip: stdin: decompression OK, trailing garbage ignored
tar: Child returned status 2
tar: Error exit delayed from previous errors

So it seems that at last tar finds no errors of its own and gzip only complains that there is trailing garbage. That's not important, but if we want to guess what the original compressed file size was, we can re-tar/compress the result (the "last" directory) with:

tar -cvzf attempt5-back.tgz last

We get a file with a size of 520,334 bytes. So we try to cut our attempt5.tgz down to a similar size without getting errors. After a few trials we end up with:

dd if=attempt5.tgz of=attempt6.tgz bs=1024 count=509
tar -xvzf attempt6.tgz

we get

last/
tar: Archive contains future timestamp 2002-02-08 14:08:13
last/ssh
last/pidfile
last/install
last/linsniffer
last/cleaner
last/inetd.conf
last/lsattr
last/services
last/sense
last/ssh_config
last/ssh_host_key
last/ssh_host_key.pub
last/ssh_random_seed
last/sshd_config
last/sl2
last/last.cgi
last/ps
last/netstat
last/ifconfig
last/top
last/logclear
last/s
last/mkxfs
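The "few trials" above were done by hand. As a hedged sketch of how the search could be automated, the function below (min_gzip_blocks, a name of ours) grows the file one 1 KByte block at a time and stops at the first length that gzip no longer considers truncated. It assumes GNU gzip exit codes: 1 for an error such as "unexpected end of file", 2 for a warning such as "trailing garbage ignored", 0 for a clean stream.

```shell
# min_gzip_blocks FILE: print the smallest number of 1 KByte blocks of FILE
# that still holds the complete gzip stream; return 1 if no prefix works.
# Assumes GNU gzip exit codes (0 = ok, 1 = error, 2 = warning only).
min_gzip_blocks() {
    size=$(wc -c < "$1")
    total=$(( (size + 1023) / 1024 ))
    n=1
    while [ "$n" -le "$total" ]; do
        # Test only the first n blocks; the pipeline status is gzip's.
        if head -c $(( n * 1024 )) "$1" | gzip -t 2>/dev/null; then
            status=0
        else
            status=$?
        fi
        # Anything other than 1 means the stream itself is complete.
        if [ "$status" -ne 1 ]; then
            echo "$n"
            return 0
        fi
        n=$(( n + 1 ))
    done
    return 1
}

# Possible use:
#   dd if=attempt5.tgz of=attempt6.tgz bs=1024 count="$(min_gzip_blocks attempt5.tgz)"
```

A linear scan re-reads the file many times, which is fine for a file of about half a megabyte. It only finds where the stream ends and cannot repair garbage inside it.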

So we have concluded our successful recovery of the rootkit!!

Finding traces of the Rootkit

The main method to find the traces has been;

Run "strings" on the files uncompressed from "attempt6.tgz", then "grep" for those strings in the blocks directory written by "lazarus" to find the blocks of deleted files.
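A minimal sketch of that method follows. The function name find_traces and the default minimum string length of 12 are our assumptions, not values taken from the analysis. Short strings match far too many blocks, so the threshold is worth tuning.

```shell
# find_traces EXTRACTED_DIR BLOCKS_DIR [MINLEN]: list the block files that
# contain any printable string of at least MINLEN characters (default 12,
# an arbitrary choice) found in the files under EXTRACTED_DIR.
find_traces() {
    patterns=$(mktemp) || return 1
    # Collect the printable strings of every extracted file, without duplicates.
    find "$1" -type f -exec strings -n "${3:-12}" {} + | sort -u > "$patterns"
    if [ -s "$patterns" ]; then
        # -F: fixed strings, -f: read patterns from file, -l: print names only.
        grep -l -F -f "$patterns" "$2"/*
    fi
    rm -f "$patterns"
}

# Possible use:
#   find_traces last blocks
```

Every hit still has to be inspected by hand. Strings that also appear in ordinary system files will produce blocks unrelated to the rootkit.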

Here are some of the results we get:

We haven't gone further, but probably almost everything will be found (at least anything not overwritten!!)