Honeynet Project Scan of the Month Challenge 22

Reverse Engineering the foo binary

There were a number of characteristics of the foo binary that were similar to those of the-binary in the Reverse Challenge. These included:

Given these similarities, I assumed that the operating system used to compile the-binary, namely Slackware 3.1, was used to compile foo. This assumption later proved to be correct.

My next step was to obtain Slackware 3.1's libc.a and libgcc.a. I could then set about reconstructing the symbol table using a version of Dion Mendel's search_static, modified slightly to run under Cygwin (it should still run under other modern Unix and Unix-like systems).

$ bin/search_static foo slackware3.1 > files/object_files

Once I had resolved the conflicts (see Resolving symbol table conflicts), I generated the symbol table and applied it to the disassembly:

$ bin/gensymbols files/object_files > files/symbols
$ bin/decomp_strip files/object_files < dump1 > dump2
$ bin/decomp_insert_symbols files/symbols < dump2 > dump3

This left the following call targets without a symbol:

$ grep 'call   0x' dump3 | grep -v '<' | cut -c18- | sort | uniq
0x08048080
0x08048134
0x08048258
0x080482b8
0x08048300
0x08048318
0x08048384
0x0804841c
0x08048670
0x080489a8

From Dion's work with a dummy C file, I knew that 0x8048080 is the common gcc start-up code and 0x8048134 is the main function. From this, the symbol table can be updated to include:

0x08048134 main
0x08048258 func1
0x080482b8 func2
0x08048300 func3
0x08048318 func4
0x08048384 func5
0x0804841c func6
0x08048670 func7
0x080489a8 func8

$ cp files/symbols files/symbols.modified
$ vi files/symbols.modified
$ bin/decomp_insert_symbols files/symbols.modified < dump3 > dump4
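
The placeholder entries listed above were added by hand, but they could equally be generated. The following is a minimal sketch, assuming the symbols file uses the "address name" format shown above and that functions are simply numbered in address order; the helper name placeholder_symbols is hypothetical and not part of Dion Mendel's tools.

```python
# Call targets that had no symbol after the library symbols were inserted
# (copied from the grep output in this write-up).
unresolved = [
    0x08048080, 0x08048134, 0x08048258, 0x080482b8, 0x08048300,
    0x08048318, 0x08048384, 0x0804841c, 0x08048670, 0x080489a8,
]

START_CODE = 0x08048080  # gcc start-up code, not given a name here
MAIN = 0x08048134        # main, as identified from the dummy C file


def placeholder_symbols(addresses):
    """Return 'address name' lines: main, then func1, func2, ... by address."""
    lines = []
    counter = 0
    for addr in sorted(addresses):
        if addr == START_CODE:
            continue
        if addr == MAIN:
            name = "main"
        else:
            counter += 1
            name = f"func{counter}"
        lines.append(f"0x{addr:08x} {name}")
    return lines


symbol_lines = placeholder_symbols(unresolved)
print("\n".join(symbol_lines))
```

The printed lines can be appended to a copy of the symbols file instead of editing it in vi.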

Now that we have a disassembly containing symbols, we can list every call to main and the funcN functions, grouped by the function making the call:

$ egrep '<main>|<func.*>' dump4
080480e6: call   0x08048134 <main>
# 08048134 <main>:
08048223: call   0x0804841c <func6>
08048240: call   0x08048670 <func7>
# 08048258 <func1>:
# 080482b8 <func2>:
# 08048300 <func3>:
# 08048318 <func4>:
08048337: call   0x08048258 <func1>
# 08048384 <func5>:
080483c1: call   0x08048258 <func1>
# 0804841c <func6>:
08048448: call   0x080482b8 <func2>
0804849b: call   0x08048318 <func4>
080484c0: call   0x08048384 <func5>
08048536: call   0x08048318 <func4>
0804855b: call   0x08048384 <func5>
08048586: call   0x08048300 <func3>
080485b8: call   0x08048300 <func3>
08048643: call   0x08048300 <func3>
# 08048670 <func7>:
0804869d: call   0x08048258 <func1>
0804879c: call   0x080489a8 <func8>
# 080489a8 <func8>:

This makes generating the call graph trivial:

main
  +-- func6
  |     +-- func2
  |     +-- func3
  |     +-- func4
  |           +-- func1
  |     +-- func5
  |           +-- func1
  |
  +-- func7
        +-- func1
        +-- func8
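
The same graph can be derived mechanically from the egrep output. The sketch below is only an illustration: it assumes the exact line formats shown above ("# address <name>:" for a function header, "address: call   0xaddress <name>" for a call), and the label "(startup)" for calls made before the first header is a made-up placeholder.

```python
import re

# The egrep output from this write-up.
listing = """\
080480e6: call   0x08048134 <main>
# 08048134 <main>:
08048223: call   0x0804841c <func6>
08048240: call   0x08048670 <func7>
# 08048258 <func1>:
# 080482b8 <func2>:
# 08048300 <func3>:
# 08048318 <func4>:
08048337: call   0x08048258 <func1>
# 08048384 <func5>:
080483c1: call   0x08048258 <func1>
# 0804841c <func6>:
08048448: call   0x080482b8 <func2>
0804849b: call   0x08048318 <func4>
080484c0: call   0x08048384 <func5>
08048536: call   0x08048318 <func4>
0804855b: call   0x08048384 <func5>
08048586: call   0x08048300 <func3>
080485b8: call   0x08048300 <func3>
08048643: call   0x08048300 <func3>
# 08048670 <func7>:
0804869d: call   0x08048258 <func1>
0804879c: call   0x080489a8 <func8>
# 080489a8 <func8>:
"""

HEADER = re.compile(r"^# [0-9a-f]{8} <(\w+)>:$")
CALL = re.compile(r"^[0-9a-f]{8}: call\s+0x[0-9a-f]{8} <(\w+)>$")


def build_call_graph(text, entry="(startup)"):
    """Map each function to the distinct functions it calls, in first-seen order."""
    graph = {entry: []}
    current = entry
    for line in text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            graph.setdefault(current, [])
            continue
        call = CALL.match(line)
        if call and call.group(1) not in graph[current]:
            graph[current].append(call.group(1))
    return graph


def print_tree(graph, name, depth=0):
    """Print callees as an indented tree (fine here, as there is no recursion)."""
    print("  " * depth + name)
    for callee in graph.get(name, []):
        print_tree(graph, callee, depth + 1)


graph = build_call_graph(listing)
print_tree(graph, "main")
```

Callees are listed in the order they are first called, so func6's children appear as func2, func4, func5, func3 rather than in numeric order.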

The next stage was to tidy the assembly code and make it easier to follow, using:

$ bin/decomp_fixup_signs < dump4 > tmp1
$ bin/decomp_xref_data foo < tmp1 > tmp2
$ bin/decomp_xref_jumps < tmp2 > dump5

Once this was done, I first went through the system calls, working out which arguments were passed and therefore what each call achieved. In func2, for example, a socket is created:

080482be: push   $0x11
080482c0: push   $0x2
080482c2: push   $0x2
080482c4: call   0x08063ff0 <socket>

Arguments are pushed in reverse order, so this is the call socket(2,2,17). Examining the Slackware socket.h, it is clear that this translates to socket(AF_INET,SOCK_DGRAM,IPPROTO_UDP); that is, a UDP socket is set up. From this information, I renamed func2 open_udp.
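
The mapping from pushed values to constants can be cross-checked without the headers to hand. This is a small sketch that compares the three pushed values against the constants exposed by Python's socket module; it assumes the values on the machine running it match the Slackware 3.1 headers, which is the case for these three constants on current Linux systems.

```python
import socket

# Values in the order of the push instructions in func2.
pushed = [0x11, 0x2, 0x2]

# cdecl pushes arguments right to left, so reverse them to get the C order.
domain, sock_type, protocol = reversed(pushed)

matches = {
    "AF_INET": domain == socket.AF_INET,
    "SOCK_DGRAM": sock_type == socket.SOCK_DGRAM,
    "IPPROTO_UDP": protocol == socket.IPPROTO_UDP,
}
print((domain, sock_type, protocol), matches)
```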

Another example is func3, which seems to have one purpose - to close whatever is passed in as an argument. This function was renamed close_socket.

Much of the other information was worked out over time - for example, the call to sendto in func4 requires an argument of type struct sockaddr. From this, I worked out that some of the code in that function must have been setting the sa_family field, and other parts the sa_data. Unfortunately, it took me longer (with no prior socket programming experience) to realise that what was really used as an argument to sendto was a struct sockaddr_in. Running foo under strace really helped in that particular part of the discovery phase.
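
To illustrate why the sa_family/sa_data view and the sockaddr_in view describe the same 16 bytes, here is a minimal sketch of the Linux layout. The address and port are made-up example values, not those used by foo, and pack_sockaddr_in is a hypothetical helper.

```python
import socket
import struct


def pack_sockaddr_in(ip, port):
    """Pack an IPv4 endpoint in the 16-byte Linux struct sockaddr_in layout."""
    sin_family = struct.pack("=H", socket.AF_INET)  # 2 bytes, host byte order
    sin_port = struct.pack("!H", port)              # 2 bytes, network byte order
    sin_addr = socket.inet_aton(ip)                 # 4 bytes, network byte order
    sin_zero = bytes(8)                             # padding up to 16 bytes
    return sin_family + sin_port + sin_addr + sin_zero


# Example values only: a documentation address and an arbitrary port.
raw = pack_sockaddr_in("192.0.2.1", 53)

# Viewed as a generic struct sockaddr: 2-byte sa_family + 14-byte sa_data.
sa_family = raw[:2]
sa_data = raw[2:]
print(sa_family.hex(), sa_data.hex())
```

Seen this way, the code that appears to fill in sa_data piece by piece is really setting sin_port and sin_addr.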

My translation of the behaviour of foo, from assembly to C, is available.

Resolving symbol table conflicts

Conflict 1

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/auth_none.o - match at 0x080744b4 (0x0000010f bytes)
# slackware3.1/pthread_stubs.o - match at 0x080745b0 (0x00000013 bytes)
# slackware3.1/__errno_loc.o - match at 0x080745a4 (0x0000000c bytes)
# slackware3.1/__h_errno_loc.o - match at 0x080745a4 (0x0000000c bytes)
# slackware3.1/__res_loc.o - match at 0x080745a4 (0x0000000c bytes)
# slackware3.1/_clear_cache.o - match at 0x0807459c (0x00000007 bytes)
# slackware3.1/_clear_cache.o - match at 0x080745bc (0x00000007 bytes)
# slackware3.1/_udiv_w_sdiv.o - match at 0x0807459c (0x00000007 bytes)
# slackware3.1/_udiv_w_sdiv.o - match at 0x080745bc (0x00000007 bytes)

Looking at the surrounding object files:

slackware3.1/loadlocale.o - match at 0x0807401c (0x00000497 bytes)
slackware3.1/xdr_ref.o - match at 0x080745c4 (0x00000105 bytes)

there is a 0x111 byte gap (0x80745c4 - (0x807401c + 0x497)). The object file that best fits this gap is auth_none.o, at 0x10f bytes.
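
The gap calculation and the choice of candidate can be reproduced as follows. This is a sketch of the reasoning only, using the addresses and sizes from the search_static output above; picking the largest candidate that lies entirely within the gap is my reading of "best fits".

```python
# Neighbouring object files that matched unambiguously.
prev_start, prev_size = 0x0807401c, 0x497   # loadlocale.o
next_start = 0x080745c4                     # xdr_ref.o

gap_start = prev_start + prev_size
gap = next_start - gap_start

# (name, match address, size) for every candidate in conflict 1.
candidates = [
    ("auth_none.o", 0x080744b4, 0x10f),
    ("pthread_stubs.o", 0x080745b0, 0x13),
    ("__errno_loc.o", 0x080745a4, 0x0c),
    ("__h_errno_loc.o", 0x080745a4, 0x0c),
    ("__res_loc.o", 0x080745a4, 0x0c),
    ("_clear_cache.o", 0x0807459c, 0x07),
    ("_clear_cache.o", 0x080745bc, 0x07),
    ("_udiv_w_sdiv.o", 0x0807459c, 0x07),
    ("_udiv_w_sdiv.o", 0x080745bc, 0x07),
]

# Keep candidates lying entirely inside the gap, then take the largest.
fits = [c for c in candidates
        if c[1] >= gap_start and c[1] + c[2] <= next_start]
best = max(fits, key=lambda c: c[2])

print(f"gap = {gap:#x}, best fit = {best[0]} ({best[2]:#x} bytes)")
```

All of the smaller candidates fall inside the address range covered by auth_none.o, which is why they show up as spurious matches.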

Conflict 2

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/__errno_loc.o - match at 0x0804db80 (0x0000000c bytes)
# slackware3.1/__h_errno_loc.o - match at 0x0804db80 (0x0000000c bytes)
# slackware3.1/__res_loc.o - match at 0x0804db80 (0x0000000c bytes)

Looking at the object files used prior to 0x0804db80, we see:

slackware3.1/res_comp.o - match at 0x0804b678 (0x00000717 bytes)
slackware3.1/res_init.o - match at 0x0804bd90 (0x0000089c bytes)
slackware3.1/res_query.o - match at 0x0804c62c (0x00000655 bytes)
slackware3.1/res_send.o - match at 0x0804cc84 (0x00000ef9 bytes)

As these are all resolver object files (res_*.o), this suggests that the correct object file is __res_loc.o.

Conflict 3

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/__errno_loc.o - match at 0x08064160 (0x0000000c bytes)
# slackware3.1/__h_errno_loc.o - match at 0x08064160 (0x0000000c bytes)
# slackware3.1/__res_loc.o - match at 0x08064160 (0x0000000c bytes)

Looking at the object file used prior to 0x08064160, we see:

slackware3.1/_strerror.o - match at 0x08064110 (0x0000004e bytes)

As __res_loc.o has already been used above, the match is most likely to be __errno_loc.o, as this provides errno, used as an argument to the strerror function.

Conflicts 4 and 5

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/asprintf.o - match at 0x0804ddf4 (0x00000018 bytes)
# slackware3.1/iosprintf.o - match at 0x0804ddf4 (0x00000018 bytes)
# slackware3.1/iosscanf.o - match at 0x0804ddf4 (0x00000018 bytes)

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/asprintf.o - match at 0x0804de0c (0x00000018 bytes)
# slackware3.1/iosprintf.o - match at 0x0804de0c (0x00000018 bytes)
# slackware3.1/iosscanf.o - match at 0x0804de0c (0x00000018 bytes)

These two conflicts are adjacent, and surrounded by object files:

slackware3.1/iofclose.o - match at 0x0804db8c (0x00000081 bytes)
slackware3.1/iofgets.o - match at 0x0804dc10 (0x0000005b bytes)
slackware3.1/iofopen.o - match at 0x0804dc6c (0x00000060 bytes)
slackware3.1/iofprintf.o - match at 0x0804dccc (0x00000053 bytes)
slackware3.1/iogetline.o - match at 0x0804dd20 (0x000000b8 bytes)
slackware3.1/ioprintf.o - match at 0x0804ddd8 (0x00000019 bytes)
(0x33 byte gap)
slackware3.1/iovsprintf.o - match at 0x0804de24 (0x00000065 bytes)
slackware3.1/iovsscanf.o - match at 0x0804de8c (0x00000043 bytes)
slackware3.1/iovfprintf.o - match at 0x0804ded0 (0x000035f7 bytes)
slackware3.1/iovfscanf.o - match at 0x080514c8 (0x00001be6 bytes)

With all of the surrounding files being io*.o, it seems apparent that the appropriate files are iosprintf.o and iosscanf.o.
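
As a sanity check on the layout, the sketch below verifies that two 0x18 byte objects exactly fill the 0x33 byte gap. It assumes each object file starts on a 4-byte boundary, which is my observation from the addresses listed above rather than something stated by the tools. It does not tell the three candidates apart; it only confirms that the two conflicts sit back to back between ioprintf.o and iovsprintf.o.

```python
def align_up(addr, boundary=4):
    """Round addr up to the next multiple of boundary (a power of two)."""
    return (addr + boundary - 1) & ~(boundary - 1)


# (name, start address, size) in address order, from the listing above.
layout = [
    ("iofclose.o", 0x0804db8c, 0x81),
    ("iofgets.o", 0x0804dc10, 0x5b),
    ("iofopen.o", 0x0804dc6c, 0x60),
    ("iofprintf.o", 0x0804dccc, 0x53),
    ("iogetline.o", 0x0804dd20, 0xb8),
    ("ioprintf.o", 0x0804ddd8, 0x19),
    ("conflict 4", 0x0804ddf4, 0x18),
    ("conflict 5", 0x0804de0c, 0x18),
    ("iovsprintf.o", 0x0804de24, 0x65),
    ("iovsscanf.o", 0x0804de8c, 0x43),
    ("iovfprintf.o", 0x0804ded0, 0x35f7),
    ("iovfscanf.o", 0x080514c8, 0x1be6),
]

# Each object should start where the previous one ends, rounded up to 4 bytes.
contiguous = all(
    align_up(start + size) == following[1]
    for (_, start, size), following in zip(layout, layout[1:])
)

gap = 0x0804de24 - (0x0804ddd8 + 0x19)
print(f"gap = {gap:#x}, contiguous = {contiguous}")
```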

In the disassembly of foo, generated using objdump -d foo, the call to the first address (0x804ddf4) is made by:

 8048614:	83 c4 0c             	add    $0xc,%esp
 8048617:	a1 84 ad 07 08       	mov    0x807ad84,%eax
 804861c:	50                   	push   %eax
 804861d:	68 74 4a 07 08       	push   $0x8074a74
 8048622:	8b 45 0c             	mov    0xc(%ebp),%eax
 8048625:	50                   	push   %eax
 8048626:	e8 c9 57 00 00       	call   0x804ddf4

and the second by:

 80485e8:	68 84 ad 07 08       	push   $0x807ad84
 80485ed:	68 70 4a 07 08       	push   $0x8074a70
 80485f2:	8d 85 f4 fb ff ff    	lea    0xfffffbf4(%ebp),%eax
 80485f8:	8d 50 02             	lea    0x2(%eax),%edx
 80485fb:	52                   	push   %edx
 80485fc:	e8 0b 58 00 00       	call   0x804de0c

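Examining the read-only data with Dion Mendel's decomp_xref_data shows that 0x8074a74 contains SE%lu and 0x8074a70 contains %lu. The first call is passed the value stored at 0x807ad84, as sprintf would expect, whereas the second is passed the address 0x807ad84 itself, as sscanf would expect. This suggests the call to 0x804ddf4 is iosprintf, and the call to 0x804de0c is iosscanf.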
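
The call targets in the two objdump listings can be recomputed from the raw instruction bytes, which confirms which call site reaches which conflict address. This is a minimal sketch for the 5-byte near call encoding only (opcode e8 followed by a little-endian 32-bit displacement relative to the next instruction); call_target is a hypothetical helper.

```python
def call_target(insn_addr, insn_hex):
    """Return the target of an x86 near call (e8 + rel32) at insn_addr."""
    raw = bytes.fromhex(insn_hex)
    if len(raw) != 5 or raw[0] != 0xE8:
        raise ValueError("expected a 5-byte e8 call instruction")
    rel32 = int.from_bytes(raw[1:], "little", signed=True)
    # The displacement is relative to the address of the next instruction.
    return (insn_addr + len(raw) + rel32) & 0xFFFFFFFF


# Addresses and bytes copied from the two listings above.
first = call_target(0x8048626, "e8 c9 57 00 00")
second = call_target(0x80485fc, "e8 0b 58 00 00")

print(f"{first:#x} {second:#x}")
```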