The Reverse Challenge Results


*******************************************************************
*                                                                 *
*             The "Nazgul" Attack tool: An Analysis               *
*                                                                 *
*******************************************************************

By G. Lamastra, P. Abeni, D. Sestito, E. Caprella
   F. Frosali, F. Coda Zabetta, G. Cangini
Be-Secure, Telecom Italia Labs
May 5th, 2002

1.Introduction
--------------

The following paper details the analysis that our research group
performed for the Honeynet Project's "Reverse Challenge".
The paper is organized as follows: section 1 discusses the initial
steps; section 2 discusses the Reverse Engineering process,
focusing on the methodology we adopted; section 3 is a detailed
commentary on the binary's functions and implementation; section 4
presents the tests we performed on a real working system; and
section 5 proposes possible countermeasures and presents our
conclusions.

After downloading the binary from the
http://www.honeynet.org/reverse/ site on May 16th, 2002, we
immediately began the analysis.
We started with some basic tests in order to understand the
nature of the binary executable; first of all, we ran objdump
and ldd, which helped to determine that the binary was statically
linked and that it was probably a standard C program.
We also used strings to extract all intelligible data from the
binary. This way we gathered further evidence that the program
was linked with libc 5.3.12 (see strings:387).

The strings analysis also provided evidence about the possible
functions of the binary. We guessed that some kind of DNS interaction
was involved, because of the presence of several resolver commands.
Most of the strings clearly came from the statically linked
libc.
Using a perl script, we extracted from the strings list everything
that could match a filename; we obtained the following list:

/tmp/.hj237349                 |   Clearly a temporary file
                               |
/bin/sh                        |
/bin/csh -f -c "%s"            |   Some kind of shell execution
/bin/csh -f -c "%s" 1> %s 2>&1 |
                               |
/sbin:/bin:/usr/sbin: \        |   A path (found close to the shell
/usr/bin:/usr/local/bin/:.     |   execution strings); are they
                               |   related?
                               |
/dev/console                   |
/dev/log                       |   File used during various
/usr/lib/zoneinfo              |   libc functions
/var/yp/bindings               |
/etc/locale/....               |
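
The perl script itself is not reproduced in this paper. The following
is a minimal Python sketch of the same idea, assuming the strings
output is available as a list of lines. The regular expression and the
name path_like are illustrative assumptions, not the original filter.

```python
import re

# Heuristic: a line that starts with "/" followed by typical path characters.
# This pattern is an assumption, not the filter used by the original perl script.
PATH_RE = re.compile(r"^/[A-Za-z0-9_.\-/]+")

def path_like(lines):
    """Return the lines of `strings` output that begin like an absolute path."""
    return [line for line in lines if PATH_RE.match(line.strip())]

# A few strings taken from the list above, mixed with non-path strings.
sample = [
    "/tmp/.hj237349",
    '/bin/csh -f -c "%s" 1> %s 2>&1',
    "TfOjG",
    "/dev/console",
    "Unknown server error",
]
for line in path_like(sample):
    print(line)
```

A filter this loose will also keep libc-internal paths, so the result
still has to be sorted by hand, as in the table above.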

The binary has been code-named "nazgul" because this string was
found embedded in the binary. The Nazgul are the "black knights" of
Tolkien's "The Lord of the Rings", a modern mythology classic
much appreciated in the computer/hacking community.

In order to proceed more effectively, we divided ourselves
into two distinct groups: 
- Group A, doing the reverse engineering stage;
- Group B, doing some live analysis on a working testbed.

2.Reverse Engineering
---------------------

The RE stage started by disassembling the binary code using
the IDA freeware tool and the .ELF plugin.
After doing so, we had a 2802 kB .asm file and a 5777 kB .lst file
to dig through; we adopted a bottom-up approach, starting from the
Linux syscalls, which are easy to spot because of the familiar
int $0x80 instruction.
Using the Linux syscall list, we associated each entry point in
the program with a specific Linux syscall.
This was the first stage, which started the process of separating
the tool's own code from the standard libc code.
IDA did a great job of identifying the subroutine
surrounding each instruction, making our job easier.
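
As an illustration of why int $0x80 is easy to spot, here is a rough
Python sketch that scans a code blob for the two-byte encoding CD 80.
It is not the procedure used with IDA: the function names are
hypothetical, and the "mov eax, imm32 right before the trap" pattern is
only a heuristic assumption, since the syscall number may be loaded in
other ways and the byte pair can also occur inside data.

```python
def find_int80(code: bytes):
    """Return the offsets of every `int 0x80` (bytes CD 80) in a code blob."""
    offsets = []
    pos = code.find(b"\xcd\x80")
    while pos != -1:
        offsets.append(pos)
        pos = code.find(b"\xcd\x80", pos + 1)
    return offsets

def guess_syscall(code: bytes, offset: int):
    """If `mov eax, imm32` (B8 + 4 bytes) immediately precedes the trap,
    return the immediate as the probable syscall number, else None."""
    start = offset - 5
    if start >= 0 and code[start] == 0xB8:
        return int.from_bytes(code[start + 1:offset], "little")
    return None

# Hand-assembled example: mov eax,102; int 0x80; nop; mov eax,1; int 0x80
code = bytes.fromhex("b866000000cd8090b801000000cd80")
for off in find_int80(code):
    print(off, guess_syscall(code, off))
```

On Linux/i386, number 102 is socketcall() and number 1 is exit(), so
the example blob would be labelled accordingly in the libcall file.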

Working on the basic IDA disassembly file, we started by generating
a file (libcall) which associates each function address with
the guessed libcall routine.
Using this file, we were able to substitute the symbolic subroutine
names for the numerical addresses in the assembly code.

Printf-like format strings were another extremely useful hook for the
reverse engineering stage. By spotting these strings in the code
(again, IDA helps greatly by providing symbolic names for such
strings), it was possible to discover where a printf()-like
function was invoked. Traditionally, these functions are all
built upon a basic formatting function that does the real job,
and a group of wrapper-like functions which implement the different
libcalls. 
malloc() could be identified through its use of the mmap() syscall,
invoked on a non-significant file descriptor with the ANONYMOUS
flag set.
The fopen() and fclose() libcalls could be identified by looking for the
corresponding open() and close() syscalls. After identifying fopen(),
it became easier to identify file-related operations.
Several character conversion functions (tolower, toupper, isXXXX) use
large look-up tables and can be identified through cross-references
to such tables.
The socket-related libcalls are all built on the socketcall()
syscall; different parameters are used to implement the different libcalls;
these parameters can be found in the libc source code and can be
used to reconstruct the different libcalls.
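
For reference, the mapping between the first socketcall() argument and
the corresponding libcall can be written down as a small table. The
numbers below are the Linux/i386 values from the kernel header
linux/net.h; the helper name libcall_name is only illustrative.

```python
# Sub-call numbers passed as the first argument of socketcall() on Linux/i386.
SOCKETCALL = {
    1: "socket", 2: "bind", 3: "connect", 4: "listen", 5: "accept",
    6: "getsockname", 7: "getpeername", 8: "socketpair", 9: "send",
    10: "recv", 11: "sendto", 12: "recvfrom", 13: "shutdown",
    14: "setsockopt", 15: "getsockopt", 16: "sendmsg", 17: "recvmsg",
}

def libcall_name(subcall: int) -> str:
    """Translate a socketcall() sub-call number into the libcall it implements."""
    return SOCKETCALL.get(subcall, "unknown(%d)" % subcall)

print(libcall_name(1), libcall_name(10))
```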

As a final remark, we would like to point out that several functions use
various flag combinations, which are represented in the assembly code
as hex numbers; in order to clarify their meaning, it is very useful
to browse the files contained in /usr/include/bits, which often contain
relevant information.

At this point, several interesting pieces of code are already clear;
for example, the socket() call immediately tells us that this
program uses the unusual IP protocol number 11.
Moreover, the typical start-up sequence of a classical server
program (signal setup, close, fork, accept/receive) is easy to
recognize.

This stage was the longest phase of the reverse-engineering
process; but after the libc decoding, the amount of code left to
analyze dropped from 2802kB to 764kB. The complete libcall list is contained
in the libcall file. This process allowed us to separate the
code originally written by the attacker from the residual libc code.

The second stage of the RE process produced human-readable
C source code from the remaining assembly code; we discuss the code
in the next section.

3.Results
---------

The main() code is simple enough to be reconstructed; it essentially 
consists of a standard server-like start-up sequence: 
- It checks whether it was started as root
- It changes the process name to "[mingetty]" to hide it from the casual
  sysadmin; however, it is possible to find the real binary executing under
  the "[mingetty]" name by checking the exe link in the /proc/<pid>/
  directory, where <pid> is the process PID obtained through ps.
- It detaches the process from the controlling terminal, closes the
  standard file descriptors, and calls chdir("/")
- A raw socket using IP protocol 11 is created
- The program starts a never-ending loop which processes the received
  packets; it expects a standard IP packet with a 20-byte IP header;
  after the IP header, there is a kind of signature, represented by
  the first payload byte being 0x02
- A packet is considered valid if and only if
  * IP Protocol == 11
  * The first byte == 0x02 (IP.payload[0] == 0x02)
  * The received packet size is longer than 200 bytes (including the IP
    header)
- If a packet is valid, it is descrambled using a trivial deobfuscation
  function: starting from the third byte of the payload, we have:
  
  L = length(payload);
  clear[2] = scrambled[2] - 23;
  for (i = 3; i < L; i++) {
    clear[i] = scrambled[i] - scrambled[i-1] - 23;
  }
  
- In the fourth byte of the descrambled payload (payload[3]) we
  have a code which identifies the specific function requested
  to the zombie:
- Code 1: This is used to provide information about the zombie status;
  the response is sent using a raw socket, encapsulated as an IP packet
  with protocol 11. 
  The first two bytes are a signature: 0x03, any-value.
  They are followed by (0x00, 0x01, 0x07, X sequence), where X is
  1 if an attack is ongoing (in this case, the following byte
  identifies the attack type); 0 if no attack/child is active.
  The packet is scrambled, starting from the third byte, so that
  the signature is left unmodified (0x03, ANY).
  The packet size is a random value between 400 and 600 bytes.
- Code 2: this code is associated with 3 different sub-commands:
  * Subcommand 0: tells the zombie the IP address of the client,
    to which reply packets are sent;
  * Subcommand 1: sets 9 random IP addresses plus the IP address of the
    real client; these 10 IPs will be used for sending reply packets,
    in order to mask the real IP address of the client;
  * Subcommand 2: sets 10 random IP addresses, which are used as decoys
    for sending reply packets, so as to better mask the real IP address
    of the client;
  A command 2 should be sent to the zombie process if the attacker wants
  to monitor the remote activity. Using subcommand 2, it is possible to
  create a fictitious set of random IP addresses which makes identifying
  the attacker much more difficult.
  If command 2 is not issued, any response packet will be
  sent to a random IP address (derived from a 4-byte uninitialized 
  value on the main() stack).
- Code 3: the parameter following the command byte is a string which
  contains a command to execute. The zombie redirects the output (both
  stdout and stderr) to the /tmp/.hj237349 file. When the command terminates,
  a reply packet is built with the data contained in the temporary file;
  the packet is scrambled and sent back to the client.
  The payload size is 400 bytes; if necessary, the zombie may send more
  than one packet. All the packets sent by the zombie to the client
  have payload[0]=0x03 and payload[1]=ANY (this is a kind of
  "signature", like the 0x02 value for the packets sent to the zombie).
  The first packet has payload[3]=0x03; the following packets have
  payload[3]=0x04; payload[2]=ANY in both cases.
- Code 4: Executes a DNS-Flood (attack #4, see below) against 8000 
  hard-coded addresses; it uses a doubly nested loop; the inner loop 
  iterates over the packets to send; the outer loop iterates over the IP 
  addresses. The starting IP address is chosen at random.
- Code 5: The ICMP-Flood (attack #2, see below) is executed against the 
  target host.
- Code 6: A shell is spawned on port 23281; the shell is protected with
  a password: SeNiF; after supplying the password, the attacker has full
  access to the box. We used netcat to communicate with the zombie.
  The password is encoded in the binary as TfOjG (strings:259), which
  is exactly SeNiF where each letter has been replaced with the
  following letter (S -> T, e -> f, ...).
- Code 7: Executes a command sent in the payload, but does not provide
  any output.
- Code 8: Kills the on-going attack/command.
- Code 9: Executes the DNS-Flood (attack #3, see below)  with the delay 
  parameter equal to 0.
- Code 10: Executes the SYN-Flood Attack (attack #1, see below)  with
  delay parameter equal to 0. 
- Code 11: Executes the SYN-Flood Attack with the specified delay
  parameter.
- Code 12: The DNS-Flood attack (attack #3) is executed with the specified 
  delay parameter.

  Commands 1,2,3,7,8 may be executed at any time; commands 4,5,6,9,10,
  11,12 can only be executed if no other attack is ongoing.
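
The packet handling described in the list above can be sketched in
Python as follows. This is an illustration, not the reconstructed C
source: the byte arithmetic is assumed to wrap modulo 256, scramble()
is simply the algebraic inverse of the descrambling formula, and the
labels in COMMANDS are our own shorthand for the command descriptions.

```python
KEY = 23  # constant subtracted by the deobfuscation routine

def descramble(payload: bytes) -> bytes:
    """Undo the scrambling; bytes 0 and 1 (the signature) stay untouched."""
    clear = bytearray(payload)
    if len(payload) > 2:
        clear[2] = (payload[2] - KEY) & 0xFF
    for i in range(3, len(payload)):
        clear[i] = (payload[i] - payload[i - 1] - KEY) & 0xFF
    return bytes(clear)

def scramble(clear: bytes) -> bytes:
    """Inverse of descramble(), derived from the formula (an assumption)."""
    out = bytearray(clear)
    if len(clear) > 2:
        out[2] = (clear[2] + KEY) & 0xFF
    for i in range(3, len(clear)):
        out[i] = (clear[i] + out[i - 1] + KEY) & 0xFF
    return bytes(out)

# Shorthand labels for the command codes listed above.
COMMANDS = {
    1: "status", 2: "set reply addresses", 3: "execute, return output",
    4: "DNS flood, hard-coded targets", 5: "ICMP/UDP flood",
    6: "shell on 23281/tcp", 7: "execute, no output", 8: "kill child",
    9: "DNS flood, delay 0", 10: "SYN flood, delay 0",
    11: "SYN flood, with delay", 12: "DNS flood, with delay",
}

def command_code(ip_packet: bytes):
    """Return the command code of a valid control packet, else None.
    Assumes a 20-byte IP header, as the binary does."""
    if len(ip_packet) <= 200 or ip_packet[9] != 11:
        return None
    payload = ip_packet[20:]
    if payload[0] != 0x02:
        return None
    return descramble(payload)[3]

# First four payload bytes of the packet captured in section 4.
code = descramble(bytes.fromhex("02001734"))[3]
print(code, COMMANDS[code])
```

Applied to the first payload bytes of the packet captured in section 4
(02 00 17 34), the fourth byte descrambles to 6, the command that
spawns the shell.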
  
We discovered four different attack patterns:
- Attack #1: TCP SYN Flood
  This function gets the destination IP address and port, a flag (if
  set, the attack will use the given source IP address, otherwise the 
  source IP address will be randomly generated), a source IP address 
  (for spoofing purposes), a delay parameter (a counter value X:
  after sending X packets, the zombie waits for a small fixed
  delay), and a pair (int flag, char *hostname); if this flag is set,
  the hostname will be looked up in the DNS and the corresponding
  address will be used for the attack.
  The function runs an infinite loop: it generates a SYN packet with the
  requested parameters, calculates the checksum, and sends it.
  The attack continues until a command 8 stops it.

- Attack #2: ICMP/UDP flood
  This function gets a flag (if set, the attack will be an ICMP flood,
  otherwise it will be a UDP flood); a destination port (used only
  for the UDP flood; the ICMP flood is based on echo request packets, type 8);
  a source IP address (again, the IP is used for spoofing purposes); and a 
  (int flag, char *hostname) pair used exactly as in attack #1.
  As in the above case, the function executes an infinite loop; it
  assembles the IP packet using a random TTL = 120 + random(130)
  and the same IP id for all the packets. The checksum is calculated
  for UDP and ICMP packets.
  The attack continues until a command 8 stops it.
  
- Attack #3: DNS Flood
  This function gets a destination IP address, a source IP address (used
  for spoofing purposes), a delay parameter (like the one used in attack #1),
  the source port and a (int flag, char *hostname) pair.
  This function uses 9 different payload types, encapsulated in UDP
  packets; these payloads are different kinds of DNS requests.
  It seems that this function does not calculate the checksum.
  
- Attack #4: DNS flood/2
  This function gets a destination IP address, a source port, a delay
  parameter and a (int flag, char *hostname) pair.
  This function attacks 8000 hard-coded IP addresses, sending the same 9 payloads
  used for attack #3. The outer loop iterates over payload type, the inner
  loop iterates over IP address. The starting IP address is picked at random.
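
The per-packet check value computed by the flood routines is the
standard Internet checksum (RFC 1071). A minimal Python sketch of that
routine is shown below; it is not the attacker's code, and for TCP and
UDP the sum also has to cover a pseudo-header, which is omitted here.
As sample input we use the IP header of the packet captured in
section 4, with its checksum field set to zero.

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones' complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# IP header from the tcpdump capture in section 4, checksum field zeroed.
header = bytes.fromhex("451000dc00f20000400b00000a0000020a000001")
print(hex(internet_checksum(header)))  # prints 0x6513
```

The result, 0x6513, matches the checksum field visible in the captured
header.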

4.Live Analysis
---------------

The next step, after obtaining the initial results, was to set up a 
simple test-bed composed of two systems:
- A victim host, where the binary was run (IP: 10.0.0.1)
- A scanning host, used for traffic analysis and network mapping
  (IP: 10.0.0.2)
The binary was launched using strace -f, which makes it possible to
follow the syscall execution.
This is the strace output:

execve("./the-binary", ["./the-binary"], [/* 37 vars */]) = 0
personality(PER_LINUX)                  = 0
geteuid()                               = 0
sigaction(SIGCHLD, {SIG_IGN}, {SIG_DFL}, 0x400557c8) = 0
fork()                                  = 3828
[pid  3828] setsid()                    = 3828
[pid  3828] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid  3828] fork()                      = 3829
[pid  3828] _exit(0)                    = ?
[pid  3829] chdir("/")                  = 0
[pid  3829] close(0)                    = 0
[pid  3829] close(1)                    = 0
[pid  3829] close(2)                    = 0
[pid  3829] time(NULL)                  = 1022154669
[pid  3829] socket(PF_INET, SOCK_RAW, 0xb /* IPPROTO_??? */) = 0
[pid  3829] sigaction(SIGHUP, {SIG_IGN}, {SIG_DFL}, 0x400557c8) = 0
[pid  3829] sigaction(SIGTERM, {SIG_IGN}, {SIG_DFL}, 0x400557c8) = 0
[pid  3829] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid  3829] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid  3829] recv(0,  
[pid  3827] --- SIGCHLD (Child exited) ---
[pid  3827] _exit(0)                    = ?
<... recv resumed> "E\0\0\36\200\341\0\0@\v\321\334\n\n\n\1\n\n\n\3XXXXXXX"...,
2048, 0) = 30
oldselect(1, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
recv(0, "E\0\0\36\366\26\0\0@\v\\\247\n\n\n\1\n\n\n\3XXXXXXXXXX"..., 2048, 0) =
30
oldselect(1, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
recv(0,  

In this example, we sent some garbage data (using the hping packet
injection tool) over IP protocol 11.
As expected, the process wakes up and tries to process the packet;
since the packet does not match anything, the process executes a
usleep(10000).

Checking with ps, we find a [mingetty] entry in the list:

# ps
...
 1846 pts/1    S      0:00 bash
 1870 ?        S      0:00 [mingetty]
 1871 pts/1    R      0:00 ps ax

Checking the corresponding /proc/1870 entry shows that the exe
link points to the suspect binary.
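
The same check can be scripted. The sketch below is a minimal Python
helper, assuming a Linux /proc layout and sufficient privileges to read
the link; the function name and the proc_root parameter (useful for
testing against a copy of /proc) are illustrative.

```python
import os

def real_executable(pid, proc_root="/proc"):
    """Return the target of <proc_root>/<pid>/exe, or None if unreadable."""
    try:
        return os.readlink(os.path.join(proc_root, str(pid), "exe"))
    except OSError:
        return None

# Example: inspect the process that ps listed as [mingetty].
print(real_executable(1870))
```

A process whose name looks like a getty but whose exe link points
somewhere else (for example into a home or temporary directory)
deserves a closer look.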

As expected, if the program is not started as root, it will abort since
it cannot open the required raw socket.

Using the information gathered in the RE stage, we assembled a
packet injector which is able to talk to nazgul.
The injector uses the scrambling code we reverse-engineered from the
binary to process the packets.

We ran tcpdump to capture all the packets exchanged:
# tcpdump -i eth0 -s 0 -X
14:38:33.063965 10.0.0.2 > 10.0.0.1:  ip-proto-11 200 [tos 0x10] 
0x0000	 4510 00dc 00f2 0000 400b 6513 0a00 0002	E.......@.e.....
0x0010	 0a00 0001 0200 1734 bb2c 449b b3ca e1f8	.......4.,D.....
0x0020	 27c0 dbfa c530 489f c52f d4f2 1882 2745	'....0H../....'E
0x0030	 4c59 6f45 74de f64d 7c18 3352 6a81 98af	LYoEt..M|.3Rj...
0x0040	 9245 5eb5 5840 59b0 c736 4ea5 30a8 9eb8	.E^.X@Y..6N.0...
0x0050	 dfed 03d9 f05f 77ce b3cd e73e 41a9 c219	....._w....>A...
0x0060	 309f b70e f1a4 bd14 2b9a b209 2138 4f66	0.......+...!8Of
0x0070	 750e 2948 137e 96ed 92a8 364e f309 97af	u.)H.~....6N....
0x0080	 0614 2a00 2f99 b108 25c1 dcfb 4f3b 54ab	..*./...%...O;T.
0x0090	 8e9b b187 5a39 52a9 c02f 479e 35a7 bf16	....Z9R../G.5...
0x00a0	 2e45 5c73 8aa1 b8cf e6fd 142b 4259 7087	.E\s.......+BYp.
0x00b0	 9eb5 cce3 08e1 f84f 8ef2 0a61 90fa 1269	.......O...a...i
0x00c0	 9834 4f6e 8593 a97f 360f 267d 48b3 cb22	.4On....6.&}H.."
0x00d0	 d94a 62b9 d1e8 ff16 2d44 0240          	.Jb.....-D.@
1 packets received by filter
0 packets dropped by kernel
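
To make the dump easier to read, the IP header of the captured packet
can be decoded with a few lines of Python. This is only a reading aid
under the assumption of a plain 20-byte IPv4 header; the function name
is illustrative and the bytes are copied from the first two rows of the
dump above.

```python
import struct

# First 24 bytes of the captured packet: IP header plus 4 payload bytes.
captured = bytes.fromhex("451000dc00f20000400b65130a0000020a000001"
                         "02001734")

def parse_ip_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header."""
    (ver_ihl, tos, total_len, ident, frag, ttl, proto, cksum,
     src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "header_len": (ver_ihl & 0x0F) * 4,
        "tos": tos,
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
    }

hdr = parse_ip_header(captured)
print(hdr)
print("payload bytes:", hdr["total_len"] - hdr["header_len"])
```

The header carries protocol 11 and a total length of 220 bytes, that
is, 200 bytes of payload, which agrees with the "ip-proto-11 200"
summary printed by tcpdump. The payload starts with the 0x02 signature.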

We also ran strace on the victim:
# strace -f ./the-binary
execve("./the-binary", ["./the-binary"], [/* 49 vars */]) = 0
personality(0 /* PER_??? */)            = 0
geteuid()                               = 0
sigaction(SIGCHLD, {SIG_IGN}, {SIG_DFL}, 0x40053478) = 0
fork()                                  = 3381
[pid  3380] _exit(0)                    = ?
setsid()                                = 3381
sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
fork()                                  = 3382
[pid  3382] chdir("/")                  = 0
[pid  3382] close(0)                    = 0
[pid  3382] close(1)                    = 0
[pid  3382] close(2)                    = 0
[pid  3382] time(NULL)                  = 1022587878
[pid  3382] socket(PF_INET, SOCK_RAW, 0xb /* IPPROTO_??? */) = 0
[pid  3382] sigaction(SIGHUP, {SIG_IGN}, {SIG_DFL}, 0x40053478) = 0
[pid  3382] sigaction(SIGTERM, {SIG_IGN}, {SIG_DFL}, 0x40053478) = 0
[pid  3382] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid  3382] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid  3382] recv(0,  
[pid  3381] _exit(0)                    = ?
<... recv resumed> "E\20\0\334\0\362\0\0@\v{\23\177\0\0\1\177\0\0\1\2\0\027"..., 2048, 0) = 220
sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
fork()                                  = 3525
[pid  3525] setsid()                    = 3525
[pid  3525] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid  3525] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 1
[pid  3525] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid  3525] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid  3525] sigaction(SIGHUP, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid  3525] sigaction(SIGTERM, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid  3525] sigaction(SIGINT, {SIG_IGN}, {SIG_DFL}, 0x40053478) = 0
[pid  3525] setsockopt(1, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
[pid  3525] bind(1, {sin_family=AF_INET, sin_port=htons(23281), sin_addr=inet_addr("0.0.0.0")}}, 16 
[pid  3382] oldselect(1, NULL, NULL, NULL, {0, 10000} 
[pid  3525] <... bind resumed> )        = 0
[pid  3525] listen(1, 3)                = 0
[pid  3525] accept(1,  
[pid  3382] <... oldselect resumed> )   = 0 (Timeout)
[pid  3382] recv(0,  

There is a listener on port 23281, as NMap shows:

# nmap -sS -p 23281 victim
Starting nmap V. 2.54BETA34 ( www.insecure.org/nmap/ )
Interesting ports on victim (10.0.0.2):
Port       State       Service
23281/tcp  open        unknown                 

Nmap run completed -- 1 IP address (1 host up) scanned in 0 seconds

Netcat shows the shell doing its work
# nc 10.0.0.2 23281
SeNiF
ls
hostname
darkmoor
...


To make our life more exciting, the attacker encoded the
password by shifting each letter forward by 1; so the corresponding string
we find in the binary is TfOjG (see strings:259).
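
Recovering the password from the stored string is a one-line
operation; the Python sketch below (with a hypothetical function name)
simply shifts every character back by one code point.

```python
def decode_password(stored: str) -> str:
    """Shift every character back by one code point."""
    return "".join(chr(ord(c) - 1) for c in stored)

print(decode_password("TfOjG"))  # prints SeNiF
```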

5.Countermeasures
-----------------

Although the attacker uses a scrambling scheme for packet cloaking,
it is fairly easy to block the control traffic; we suggest
blocking IP protocol 11 and port 23281/tcp on the border router.

This also gives us a few rules for our IDS:
- IP.Protocol == 11 and (payload[0] == 0x02 || payload[0] == 0x03)
- TCP.port == 23281
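
Independently of any particular IDS product, the two rules can be
expressed as a small Python predicate over a raw IPv4 packet. This is a
sketch for experimenting with captured traffic, not a rule file for a
specific IDS; the function name is an assumption, and the TCP rule is
interpreted here as matching either the source or the destination port.

```python
def matches_nazgul_rule(ip_packet: bytes) -> bool:
    """Return True if a raw IPv4 packet matches one of the two rules."""
    if len(ip_packet) < 20:
        return False
    header_len = (ip_packet[0] & 0x0F) * 4
    protocol = ip_packet[9]
    payload = ip_packet[header_len:]
    if protocol == 11:
        # Control traffic: 0x02 towards the zombie, 0x03 from the zombie.
        return len(payload) > 0 and payload[0] in (0x02, 0x03)
    if protocol == 6 and len(payload) >= 4:
        # TCP: source port in bytes 0-1, destination port in bytes 2-3.
        sport = int.from_bytes(payload[0:2], "big")
        dport = int.from_bytes(payload[2:4], "big")
        return 23281 in (sport, dport)
    return False
```

The first rule can produce false positives on networks where protocol
11 (NVP-II) is legitimately used, although this is rare in practice.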

To analyze a compromised machine, it can be useful to look for a
"[mingetty]" process in the ps/top list. In case the machine has been
trojaned so that this string does not show up, we developed a simple
client which supports a limited interaction with the nazgul zombie.
When the client sends a command 6, the zombie opens a TCP/23281 socket and
Nmap will reveal (remotely) the presence of the zombie.
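
If Nmap is not at hand, the open port can also be verified with a plain
TCP connect. The following Python sketch is a minimal probe; the
function name and the default timeout are illustrative choices, and it
should only be pointed at hosts you are authorized to test.

```python
import socket

def port_open(host: str, port: int = 23281, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A call such as port_open("10.0.0.1") after sending command 6 should
return True on an infected host; a False result does not prove the host
is clean, since the port may be filtered along the path.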

We believe that removing the binary is sufficient to get rid of the
zombie itself; however, since this zombie is usually installed after
a full root compromise, we cannot make any assumptions about the other 
system components; hence a full reinstall should be performed.
