[ds@localhost tmp]$ file the-binary
the-binary: ELF 32-bit LSB executable, Intel 80386, version 1, statically linked, stripped

It's an ELF binary for some operating system that runs on i386. Unfortunately the binary is statically linked and stripped. That means it's going to be rather difficult to disassemble. Let's look at the 'strings' output.
Here is the interesting string output that isn't from the libraries:
[mingetty]
/tmp/.hj237349
/bin/csh -f -c "%s" 1> %s 2>&1
TfOjG
/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:.
PATH
HISTFILE
linux
TERM
/bin/sh
/bin/csh -f -c "%s"
%d.%d.%d.%d
%u.%u.%u.%u
%c%s
And here is some interesting string output that is from the libraries:
@(#) The Linux C library 5.3.12
Ok, now we know that this binary is supposed to be run on Linux. From the other strings we can guess that the binary opens a file in the /tmp directory at some time, and that it likely runs some shell commands. The '[mingetty]' looks like the kind of thing that a malicious hacking program might want to overwrite argv[0] with.
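For anyone who hasn't looked at how 'strings' finds this stuff, here is a rough sketch of the idea: scan the file for runs of printable ASCII bytes and report the ones that are long enough. This is only an approximation of what the real tool does, and the function name and the sample bytes are made up for illustration.

```python
import re


def printable_strings(data: bytes, min_len: int = 4):
    """Yield runs of at least min_len printable ASCII bytes (tab included)."""
    # 0x20-0x7e is the printable ASCII range; anything else ends a run.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


# Hypothetical sample: a few of the strings seen above with binary junk between them.
sample = (
    b"\x7fELF\x01\x01\x01\x00[mingetty]\x00\x01\x02"
    b"/tmp/.hj237349\x00ab\x00/bin/sh\x00"
)
found = list(printable_strings(sample))
for text in found:
    print(text)
```

Runs shorter than the minimum length (like "ELF" and "ab" in the sample) are dropped, which is why most of the random noise in a binary never shows up in the output.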
Ok, let's run it and see what it does!
[ds@localhost tmp]$ strace -ff ./the-binary
execve("./the-binary", ["./the-binary"], [/* 31 vars */]) = 0
personality(0 /* PER_??? */) = 0
geteuid() = 500
_exit(-1) = ?
[ds@localhost tmp]$

Wow, that's pretty boring. I guess it wants me to run it as root. Feeling reckless? The nice honeynet folks wouldn't actually send me a binary to reverse engineer that destroys my computer, would they?
(Just kidding, I actually ran this in a VMware virtual machine that is running in nonpersistent mode. The binary can eat all my files and all I have to do is reboot.)
[ds@localhost tmp]$ su
Password:
[root@localhost tmp]# strace -ff ./the-binary
execve("./the-binary", ["./the-binary"], [/* 29 vars */]) = 0
personality(0 /* PER_??? */) = 0
geteuid() = 0
sigaction(SIGCHLD, {SIG_IGN}, {SIG_DFL}, 0x40063848) = 0
fork() = 13295
[pid 13295] setsid() = 13295
[pid 13295] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid 13295] fork() = 13296
[pid 13294] _exit(0) = ?
[pid 13295] _exit(0) = ?
chdir("/") = 0
close(0) = 0
close(1) = 0
close(2) = 0
time(NULL) = 1022881147
socket(PF_INET, SOCK_RAW, 0xb /* IPPROTO_??? */) = 0
sigaction(SIGHUP, {SIG_IGN}, {SIG_DFL}, 0x40063848) = 0
sigaction(SIGTERM, {SIG_IGN}, {SIG_DFL}, 0x40063848) = 0
sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
recv(0,

Ok, so now we know from the trace that the binary exits immediately unless it is run as root, puts itself in the background (fork, setsid, fork again, chdir to "/", close stdin, stdout and stderr), opens a raw socket for IP protocol 0xb (which lands on file descriptor 0 because stdin was just closed), ignores SIGHUP, SIGTERM and SIGCHLD, and then sits in recv() waiting for a packet.
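The raw socket in the trace is worth a closer look. On Linux, recv() on a raw IPv4 socket hands back the whole packet including the IP header, so whatever sits in that recv() call has to pick the header apart itself. Here is a minimal sketch of that step. It is my own illustration, not code recovered from the binary: the function name, the example addresses and the payload are all made up, and what the binary actually does with the payload is not shown here.

```python
import socket
import struct

WATCHED_PROTOCOL = 0x0B  # the protocol number passed to socket() in the trace


def parse_ipv4_header(packet: bytes):
    """Return (protocol, source, destination, payload) for a raw IPv4 packet."""
    if len(packet) < 20:
        raise ValueError("too short for an IPv4 header")
    version_ihl = packet[0]
    if version_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    header_len = (version_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    if header_len < 20 or len(packet) < header_len:
        raise ValueError("bad header length")
    protocol = packet[9]
    source = socket.inet_ntoa(packet[12:16])
    destination = socket.inet_ntoa(packet[16:20])
    return protocol, source, destination, packet[header_len:]


# Hypothetical packet built by hand so the sketch runs without root.
# A real listener would need root and something like:
#   socket.socket(socket.AF_INET, socket.SOCK_RAW, WATCHED_PROTOCOL)
header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 24, 0, 0, 64, WATCHED_PROTOCOL, 0,
    socket.inet_aton("192.0.2.1"), socket.inet_aton("192.0.2.2"),
)
proto, src, dst, payload = parse_ipv4_header(header + b"data")
print(proto, src, dst, payload)
```

The dotted-quad format strings in the 'strings' output would fit with the binary printing or building addresses like the ones parsed here, but that is a guess until the disassembly confirms it.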
This is about as far as we're going to get without actually disassembling the binary to see what it does. Unless we can solve the problem of the binary not having any symbols, we're going to spend a year disassembling half of libc.
Why don't we try putting the symbols back into the binary?
How are we going to do that?
Now in order for this to work very well we need to locate the exact original static libc file that the binary was built with. Remember that there was a pretty big clue in the strings output:
@(#) The Linux C library 5.3.12

I decided to go looking for Redhat RPMs containing a static version of libc 5.3.12. I chose Redhat for no other reason than that it's a fairly popular Linux distribution. I was surprised to find that the most recent version of Redhat that shipped an RPM with a static version of this library was 5.2. I grabbed the static libc from versions 5.2 and 5.0 and used them to develop the unstrip utility. Once I had it matching object files, I found that the older version produced more matches than the newer one. Very curious, I decided to investigate the 4.x versions. I could only find updates on the internet, not the original RPMs, so I tried those and found that an update to 4.0 was the best match that I could find.

Actual Redhat distributions older than 5.x are impossible to find on the internet, so I hunted down a CD of Redhat 4.0, which I was lucky enough to stumble across. The libc.a from the RPM that shipped with Redhat 4.0 fit like a glove! I am fairly certain that the binary was built on this platform. Running unstrip on the binary with this libc.a resolved the symbols for all of the library functions that the binary used. There were a couple of misidentifications that were easily corrected.
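To give a feel for what "matching object files" against a stripped binary can look like, here is a small sketch of one common approach: take the bytes of each library function, mask out the bytes that the linker patches (relocations such as call targets), and search the binary for what is left. This is not the unstrip source and I'm not claiming unstrip works exactly this way. The function names, signatures and byte sequences below are made up for illustration.

```python
def find_signature(image: bytes, pattern: bytes, mask: bytes):
    """Return offsets where pattern matches image, skipping bytes whose mask is 0."""
    if not pattern or len(pattern) != len(mask):
        raise ValueError("pattern and mask must be non-empty and the same length")
    hits = []
    for offset in range(len(image) - len(pattern) + 1):
        window = image[offset:offset + len(pattern)]
        if all(m == 0 or w == p for w, p, m in zip(window, pattern, mask)):
            hits.append(offset)
    return hits


def label_functions(image: bytes, signatures: dict):
    """Map offset -> name for every signature that matches exactly once."""
    labels = {}
    for name, (pattern, mask) in signatures.items():
        hits = find_signature(image, pattern, mask)
        if len(hits) == 1:  # ambiguous or missing matches are left unlabeled
            labels[hits[0]] = name
    return labels


# Hypothetical signatures. func_a contains a call (0xe8) whose 4-byte target
# is filled in at link time, so those bytes are masked out.
signatures = {
    "func_a": (bytes.fromhex("5589e5e800000000c9c3"),
               bytes([1, 1, 1, 1, 0, 0, 0, 0, 1, 1])),
    "func_b": (bytes.fromhex("5589e531c0c9c3"), bytes([1] * 7)),
}

# Hypothetical "stripped binary": padding, func_a with a resolved call target,
# one more padding byte, then func_b.
image = (b"\x90\x90" + bytes.fromhex("5589e5e812345678c9c3")
         + b"\x90" + bytes.fromhex("5589e531c0c9c3"))
labels = label_functions(image, signatures)
print(labels)
```

Matching like this only works well when the signatures come from the same build of the library as the binary, which would explain why one libc.a can fit noticeably better than another with the same version string.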
A detailed description of the protocol for communication between the client and the server can be found in the advisory.