Analysis of The-Binary




Thursday, May 30, 2002





Christopher Gragsone (chris.gragsone@eds.com)



Contents


1 Methodology Behind the Analysis
2 Binary Analysis
3 Code Analysis
4 Identifiable Trends
5 Appendix


Methodology Behind the Analysis


Source of the Binary

The binary was retrieved from a compromised Honeynet University system. The attacker downloaded, installed, and ran it on that system. The copy analyzed here was confirmed against the original binary's MD5 hash using md5sum.
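
The md5sum check described above can also be scripted. The following is a minimal sketch only: the function names are hypothetical, and the reference digest must come from the published challenge material, since it is not reproduced in this report.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reference(path: str, reference_hex: str) -> bool:
    """Compare a file's MD5 with a reference digest (case and whitespace tolerant)."""
    return md5_of_file(path) == reference_hex.strip().lower()
```

Calling matches_reference("the-binary", reference) with the published digest should return True for an unmodified copy.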

Essential Tools for this Investigation

I used Red Hat 7.2 on the test machine for dynamic analysis. Netstat and ps were used to analyze the binary in its running state. File, readelf, and objdump were used to analyze and disassemble the binary. A personally developed tool called Nightmode aided the code analysis. Nightmode is a binary interrogator which outputs a list of functions, library functions, and strings, together with their virtual memory locations.



Binary Analysis


Initial Information Gathering

[scarbaci@testhost reverse]$ file the-binary
the-binary: ELF 32-bit LSB executable, Intel 80386, version 1, statically linked, stripped
The file command shows me that this binary has its libraries compiled in (statically linked), has had its symbol tables stripped, and was compiled for the Intel 386 architecture.
[scarbaci@testhost reverse]$ readelf the-binary -S
There are 13 section headers, starting at offset 0x31f2c:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .init             PROGBITS        08048080 000080 000008 00  AX  0   0 16
  [ 2] .text             PROGBITS        08048090 000090 01f53c 00  AX  0   0 16
  [ 3] __libc_subinit    PROGBITS        080675cc 01f5cc 000004 00   A  0   0  4
  [ 4] .fini             PROGBITS        080675d0 01f5d0 000008 00  AX  0   0 16
  [ 5] .rodata           PROGBITS        080675d8 01f5d8 004c4a 00   A  0   0  4
  [ 6] .data             PROGBITS        0806d228 024228 00c084 00  WA  0   0  4
  [ 7] .ctors            PROGBITS        080792ac 0302ac 000008 00  WA  0   0  4
  [ 8] .dtors            PROGBITS        080792b4 0302b4 000008 00  WA  0   0  4
  [ 9] .bss              NOBITS          080792bc 0302bc 0058dc 00  WA  0   0  4
  [10] .note             NOTE            00000000 0302bc 000d5c 00      0   0  1
  [11] .comment          PROGBITS        00000000 031018 000ea6 00      0   0  1
  [12] .shstrtab         STRTAB          00000000 031ebe 00006c 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
Here I use readelf to output a list of the binary's section headers, which shows how the binary is logically organized. By comparing this list to the lists produced for other statically compiled binaries, I can also tell that it is missing the .eh_frame, .got, .sbss, and .note.ABI-tag sections. These missing sections could be used as clues to determine the linker/compiler, or its version.
[scarbaci@testhost reverse]$ readelf the-binary -l
Elf file type is EXEC (Executable file)
Entry point 0x8048090
There are 2 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x08048000 0x08048000 0x24222 0x24222 R E 0x1000
  LOAD           0x024228 0x0806d228 0x0806d228 0x0c094 0x11970 RW  0x1000

 Section to Segment mapping:
  Segment Sections...
   00     .init .text __libc_subinit .fini .rodata
   01     .data .ctors .dtors .bss
This is the list of program segments reported by readelf. The segments are mainly used to separate writable memory from executable memory. Once again, by comparing this list to the program segment lists of gcc statically compiled programs, I see that the "Note" program segment is missing.
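
The header-level facts that file and readelf report above (architecture, entry point, and the number and location of the program and section headers) all come from the fixed 52-byte ELF32 header at the start of the file. As a rough sketch, and not a replacement for readelf, they can be read with Python's struct module. The function name and the dictionary keys are my own choices for illustration; only 32-bit little-endian files are handled.

```python
import struct

# Layout of the ELF32 file header: e_ident[16], then type, machine, version,
# entry, phoff, shoff, flags, ehsize, phentsize, phnum, shentsize, shnum, shstrndx.
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")


def parse_elf32_header(data: bytes) -> dict:
    """Decode the leading ELF32 header of `data` into a small dictionary."""
    if len(data) < ELF32_HEADER.size:
        raise ValueError("too short for an ELF32 header")
    (ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff, _flags,
     _ehsize, _phentsize, e_phnum, _shentsize, e_shnum, _shstrndx) = \
        ELF32_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic")
    if ident[4] != 1 or ident[5] != 1:
        raise ValueError("this sketch only handles 32-bit little-endian files")
    return {
        "type": e_type,                      # 2 means ET_EXEC
        "machine": e_machine,                # 3 means EM_386 (Intel 80386)
        "entry": e_entry,
        "program_header_offset": e_phoff,
        "program_headers": e_phnum,
        "section_header_offset": e_shoff,
        "section_headers": e_shnum,
    }
```

Passing the first 52 bytes of the-binary to parse_elf32_header should reproduce the entry point and header counts shown in the readelf output.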

Detection

The binary does not attempt to prevent itself from being seen by tools such as ps.

[scarbaci@testhost reverse]$ ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 Aug09 ?        00:00:03 init [3]
root         2     1  0 Aug09 ?        00:00:05 [keventd]
root         3     1  0 Aug09 ?        00:00:00 [kapm-idled]
root         4     0  0 Aug09 ?        00:00:00 [ksoftirqd_CPU0]
root         5     0  0 Aug09 ?        00:00:02 [kswapd]
...
root     21793     1  0 08:12 ?        00:00:00 [mingetty]
The binary relies on changing its process name to [mingetty] to avoid detection. A bracketed process name normally denotes a kernel thread. What is strange is that the binary chooses the name of a terminal program (mingetty), which shouldn't be running as a kernel thread. I would assume that most administrators would quickly spot this as suspicious behavior, which is the opposite of the binary's goal. My only guess as to the choice of name is that it is meant to hide from some automated security tool which scans running processes but not kernel threads.
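
One way to tell a real kernel thread from an impostor is that a kernel thread has an empty /proc/<pid>/cmdline on Linux, while a user process that merely renames itself still exposes a non-empty command line. The sketch below is a heuristic built on that fact. It assumes the rename is visible in /proc/<pid>/cmdline, which this analysis has not verified, and the function names are hypothetical.

```python
import os


def claims_to_be_kernel_thread(cmdline: bytes) -> bool:
    """True when a non-empty command line starts with a bracketed name."""
    first_arg = cmdline.split(b"\0", 1)[0].strip()
    return (len(first_arg) > 2
            and first_arg.startswith(b"[")
            and first_arg.endswith(b"]"))


def suspicious_pids(proc_root: str = "/proc") -> list:
    """List PIDs whose command line imitates the bracketed kernel-thread style."""
    hits = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "net"
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as handle:
                cmdline = handle.read()
        except OSError:
            continue  # the process exited or is not readable
        if claims_to_be_kernel_thread(cmdline):
            hits.append(int(entry))
    return sorted(hits)
```

A legitimate program could also choose a bracketed title, so any hit is only a reason to look closer, not proof of compromise.
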
[scarbaci@testhost reverse]$ netstat -anp
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:1024            0.0.0.0:*               LISTEN      587/rpc.statd
tcp        0      0 127.0.0.1:1025          0.0.0.0:*               LISTEN      873/xinetd
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      559/portmap
tcp        0      0 0.0.0.0:6000            0.0.0.0:*               LISTEN      21109/X
...
raw        0      0 0.0.0.0:11              0.0.0.0:*               7           21793/[mingetty]
Here I use netstat to list all open sockets and the processes that own them. I see that our fake [mingetty] binary has opened a raw socket using protocol 11. Opening a raw socket gives it the advantage of not putting the interface into promiscuous mode, so it is not quickly detected. By not using a traditional IP protocol (TCP/UDP/ICMP), the communication to and from the socket can bypass poorly configured firewalls and IDSs.
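
The "protocol 11" in the netstat output is the value of the one-byte protocol field at offset 9 of the IPv4 header. To illustrate why a filter that only reasons about ICMP (1), TCP (6), and UDP (17) would miss this traffic, here is a small sketch. The helper names are hypothetical, and the header builder exists only to produce test input; it leaves the checksum and addresses zeroed.

```python
import struct

COMMON_PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}


def ip_protocol(packet: bytes) -> int:
    """Return the protocol number stored at byte offset 9 of an IPv4 header."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    return packet[9]


def covered_by_common_rules(packet: bytes) -> bool:
    """True if the packet uses one of the protocols most rule sets look at."""
    return ip_protocol(packet) in COMMON_PROTOCOLS


def build_ipv4_header(protocol: int, payload_len: int = 0) -> bytes:
    """Build a bare 20-byte IPv4 header for experiments (checksum left at 0)."""
    return struct.pack("!BBHHHBBH4s4s",
                       0x45, 0, 20 + payload_len,  # version/IHL, TOS, total length
                       0, 0,                       # identification, flags/fragment
                       64, protocol, 0,            # TTL, protocol, checksum
                       bytes(4), bytes(4))         # source and destination address
```

A header built with protocol 11 is not recognized by covered_by_common_rules, while one built with protocol 6 is.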

Strings Analysis

My program, Nightmode, reported 435 strings. Reviewing these strings, it looks like the program is not encrypted or compressed. Most of the strings seem to belong to statically linked libraries. No "elite speak", email addresses, or websites were detected. However, there were suspicious strings that could point to the purpose of this binary.

[scarbaci@testhost reverse]$ nightmode the-binary
...
0x80675d8       [mingetty]
0x80675e6       /tmp/.hj237349
0x80675f5       /bin/csh -f -c "%s" 1> %s 2>&1
...
0x806766d       /bin/sh
0x8067675       /bin/csh -f -c "%s"
Here is the filtered output of nightmode listing the binary's strings and their virtual memory locations. The first string should be recognizable, as it is the string that the program overwrites its process name with. The second string is likely the path to its temp file. The last three strings point to the possibility that this binary is a "bind shell", a program which listens on the network and runs received commands at a privileged level.
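
The virtual memory locations in this listing follow from the LOAD segments shown earlier: a string at file offset 0x1f5d8 lies in the first segment (file offset 0, virtual address 0x08048000) and so maps to 0x80675d8, where [mingetty] is reported. The sketch below shows that mapping together with a simple printable-string scan. It is not Nightmode's actual implementation, which is described separately; the function names and the minimum string length of 4 are my own assumptions.

```python
import re

# (file offset, virtual address, size in file) of the two LOAD segments,
# copied from the `readelf -l` output earlier in this report.
LOAD_SEGMENTS = [
    (0x000000, 0x08048000, 0x24222),
    (0x024228, 0x0806D228, 0x0C094),
]


def offset_to_vaddr(offset: int, segments=LOAD_SEGMENTS):
    """Map a file offset to its virtual address, or None if it is not loaded."""
    for file_off, vaddr, file_size in segments:
        if file_off <= offset < file_off + file_size:
            return vaddr + (offset - file_off)
    return None


def strings_with_addresses(data: bytes, min_len: int = 4):
    """Yield (virtual address or None, text) for each printable ASCII run."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield offset_to_vaddr(match.start()), match.group().decode("ascii")
```

Strings in sections that are not part of a LOAD segment, such as .comment, come back with an address of None.
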
[scarbaci@testhost reverse]$ strings -a the-binary
...
GCC: (GNU) 2.7.2.l.2
This output was generated using the strings tool to print the strings in all sections of the-binary. Here I have obtained a GCC version string, which can then be checked against the sections and program segments.



Code Analysis


Summary

Nightmode reports that the-binary contains at least 292 library functions and 192 non-library functions. If one person were to reverse engineer all the non-library functions in one month, they would need to reverse approximately 6 functions a day.

Most of the functions seem never to be called, such as the functions which use the "bind shell" strings previously discussed. It seems that these dormant functions were part of the object code which was linked into the binary. It could be possible to continue reverse engineering the dormant functions to understand the purpose and inner workings of the other, unknown tools which the attacker is using.

Initialization Phase

The active functions can be grouped into two phases: an initialization phase, which performs all the setup work, and a loop phase, which handles the incoming and outgoing packets. In the initialization phase, the binary checks whether it has an effective user ID of zero. Then it replaces its process name with [mingetty], does a double fork, kills off the parents, and makes itself the leader of a new process group. It closes the stdin, stdout, and stderr file descriptors. Then it opens a raw IP socket which will be used to receive incoming packets on protocol 11. Finally, it ignores the hang-up, segmentation fault, and child signals.

Loop Phase

The loop phase of the-binary performs the communication relay using four functions. First it receives packet1 from the socket, then it checks the protocol, packet size, and a byte value in packet1. It then passes packet1 to the Decryption Function (0x804a1e8), which produces packet2, and checks whether the second byte of packet2 is less than 12. Then it modifies packet1, and sends it and packet2 to the Encryption Function (0x804a194). It then calls a function at 0x8056058. Finally, it sends packet2 out to the network using the Output Function (0x8048ecc).

The Decryption Function starts with the last byte of packet1. It subtracts the value of the previous byte from the current byte and checks the sign of the result. If the result is less than 0, it adds 256 to it. It then places this value in its respective position in packet2.
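
Read literally, the description above is a byte-wise difference taken modulo 256 and computed from the end of the packet backwards. The following sketch implements only what is stated here. How the first byte is treated (it has no previous byte) is not covered by my analysis, so copying it unchanged is an assumption, and the routine at 0x804a1e8 may apply further steps that this sketch does not capture. The function name is hypothetical.

```python
def decode(packet1: bytes) -> bytes:
    """Undo a running-sum encoding: packet2[i] = (packet1[i] - packet1[i-1]) mod 256."""
    if not packet1:
        return b""
    packet2 = bytearray(len(packet1))
    # Walk from the last byte down to index 1, as the analysis describes.
    for i in range(len(packet1) - 1, 0, -1):
        diff = packet1[i] - packet1[i - 1]
        if diff < 0:
            diff += 256  # wrap negative results back into the 0..255 range
        packet2[i] = diff
    packet2[0] = packet1[0]  # assumption: the first byte is copied as-is
    return bytes(packet2)
```

For example, the bytes 0x10 0x30 0x20 decode to 0x10 0x20 0xf0: the second byte is 0x30 - 0x10, and the third is 0x20 - 0x30 wrapped by adding 256.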

The Encryption Function basically does the opposite of the decryption, using the modified packet1 and packet2 (which is the decrypted packet1), and stores the result in packet2. After the Encryption Function, another function is called at 0x8056058, whose purpose I was not able to determine. The Output Function sends out a new packet using the data from packet2, using IP protocol 255, to an address generated by a function called at 0x8049138.
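
Taking "the opposite of the decryption" at face value, the encoding would be a running sum modulo 256, in which each output byte is the plaintext byte plus the previous output byte. The sketch below shows that inverse only. How the routine at 0x804a194 combines the modified packet1 with packet2 was not fully worked out, so this is an illustration under that assumption, with a hypothetical function name.

```python
def encode(plain: bytes) -> bytes:
    """Running-sum encoding: out[i] = (plain[i] + out[i-1]) mod 256, out[0] = plain[0]."""
    out = bytearray(len(plain))
    previous = 0  # no previous byte exists for index 0, so it passes through unchanged
    for i, value in enumerate(plain):
        out[i] = (value + previous) % 256
        previous = out[i]
    return bytes(out)
```

Encoding 0x10 0x20 0xf0 gives 0x10 0x30 0x20, and taking byte-wise differences of that output modulo 256 recovers the original values.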



Identifiable Trends


This binary doesn't use any encryption, compression, or obfuscation to protect itself. No alterations were made to the ELF headers to prevent tools such as gdb or objdump from functioning. Its only methods of staying hidden from the administrator are modifying its process name and using a non-TCP/UDP protocol.

The main trend I see with this binary is its communication system. The attacker compromises a box and places a relay on it for his covert channels with his other compromised systems. Most likely he also has other binaries, listening on this covert channel, that he uses to compromise more systems.

This may become a trend in which systems are compromised from other systems whose administrators don't observe any abnormal user activity, because the attacker does not need to log in. In the future, more binaries will use covert channels to receive commands, most likely through a trojaned service which is more firewall friendly, such as a web, mail, or DNS server.



Appendix


A description of Nightmode is provided in nightmode.txt
Nightmode's output of the-binary is provided in nm-scan.txt