*******************************************************************
* *
* The "Nazgul" Attack tool: An Analysis *
* *
*******************************************************************
By G. Lamastra, P. Abeni, D. Sestito, E. Caprella
F. Frosali, F. Coda Zabetta, G. Cangini
Be-Secure, Telecom Italia Labs
May 5th, 2002
1.Introduction
--------------
The following paper details the analysis that our research group
performed for the Honeynet Project's "Reverse Challenge".
The paper is organized as follows: section 1 discusses the initial
steps; section 2 discusses the reverse engineering process,
focusing on the methodology we adopted; section 3 is a detailed
commentary on the binary's functions and implementation; section 4
presents the tests we performed on a real working system; and
section 5 proposes possible countermeasures and presents our
conclusions.
After downloading the binary from the
http://www.honeynet.org/reverse/ site on May 16th, 2002, we
immediately began the analysis.
We started with some basic tests in order to understand the
nature of the binary executable; first of all, we ran objdump
and ldd, which helped determine that the binary was statically
linked and was probably a standard C program.
We also used strings to extract all intelligible data from the
binary. This gave us further evidence that the program was
linked with libc 5.3.12 (see strings:387).
The strings analysis also hinted at the possible functions of the
binary. We guessed that some kind of DNS interaction was involved,
because of the presence of several resolver commands.
Most of the strings clearly came from the statically linked libc.
Using a perl script, we extracted from the strings list everything
that could match a filename; we obtained the following list:
/tmp/.hj237349                  | Clearly a temporary file
                                |
/bin/sh                         |
/bin/csh -f -c "%s"             | Some kind of shell execution
/bin/csh -f -c "%s" 1> %s 2>&1  |
                                |
/sbin:/bin:/usr/sbin: \         | A path (close to the shell
/usr/bin:/usr/local/bin/:.      | execution strings); are they
                                | related?
                                |
/dev/console                    |
/dev/log                        | Files used by various
/usr/lib/zoneinfo               | libc functions
/var/yp/bindings                |
/etc/locale/....                |
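The filtering step can be approximated as follows. This is a minimal
sketch, not the perl script we used: the regular expression PATH_RE and
the helper path_like() are illustrative assumptions, and the sample
lines are a few of the strings listed above.

```python
import re

# Assumed heuristic: a path-like token is "/" followed by path characters.
PATH_RE = re.compile(r"/[A-Za-z0-9_.\-]+(?:/[A-Za-z0-9_.\-]*)*")

def path_like(lines):
    """Return the lines of `strings` output containing a path-like token."""
    return [line for line in lines if PATH_RE.search(line)]

sample = [
    "/tmp/.hj237349",
    '/bin/csh -f -c "%s"',
    "TfOjG",
    "/dev/console",
]
for line in path_like(sample):
    print(line)
```

A heuristic like this over-matches (format strings, URLs), so the
resulting list still has to be reviewed by hand.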
The binary has been code-named "nazgul" because this string was
found embedded in it. The Nazgul are "black knights" from
Tolkien's "The Lord of the Rings", a modern mythology classic
much appreciated in the computer/hacking community.
In order to proceed more effectively, we divided ourselves
into two distinct groups:
- Group A, doing the reverse engineering stage;
- Group B, doing some live analysis on a working testbed.
2.Reverse Engineering
---------------------
The RE stage started by disassembling the binary code using
the IDA freeware tool and the .ELF plugin.
This left us with a 2802kB .asm file and a 5777kB .lst file
to dig through; we adopted a bottom-up approach, starting from the
Linux syscalls, which are easy to spot because of the familiar
int $0x80 instruction.
Using the Linux syscall List, we associated each entry point in
the program with a specific Linux syscall.
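The idea of locating syscall entry points can be illustrated with a
small byte scanner. It is only a sketch under a strong assumption: it
recognizes the syscall number only when a "mov eax, imm32" (opcode B8)
immediately precedes the int $0x80 trap (bytes CD 80), which is not the
only way a libc loads the number. find_syscalls() and the NAMES table
are hypothetical helpers; the numbers are the standard i386 ones.

```python
import struct

INT80 = b"\xcd\x80"  # machine encoding of `int $0x80` on i386

# A few i386 syscall numbers; 102 is socketcall.
NAMES = {1: "exit", 2: "fork", 3: "read", 4: "write",
         5: "open", 6: "close", 102: "socketcall"}

def find_syscalls(code, names=NAMES):
    """Return (offset, name) for every `int $0x80` found in code."""
    hits = []
    pos = code.find(INT80)
    while pos != -1:
        # B8 imm32 is `mov eax, imm32`; only the case where it directly
        # precedes the trap is handled (an assumption, see the lead-in).
        if pos >= 5 and code[pos - 5] == 0xB8:
            (nr,) = struct.unpack_from("<I", code, pos - 4)
            hits.append((pos, names.get(nr, "syscall_%d" % nr)))
        else:
            hits.append((pos, "unknown"))
        pos = code.find(INT80, pos + 2)
    return hits

# nop; mov eax, 102; int 0x80; int 0x80
demo = b"\x90" + b"\xb8\x66\x00\x00\x00\xcd\x80" + b"\xcd\x80"
print(find_syscalls(demo))
```

The byte pair CD 80 can also occur inside data or other instructions,
so a disassembler remains the reliable way to confirm each hit.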
This was the first stage, which started the process of separating
the actual tool code from the standard libc code.
IDA did a great job of identifying the subroutine surrounding
each instruction, making our job easier.
Working on the basic IDA disassembly file, we started by generating
a file (libcall) which associates each function address with
the guessed libcall routine.
Using this file, we were able to substitute in the assembly code
the subroutine symbolic name in place of the numerical address.
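The substitution step might look like the sketch below. The layout of
the libcall file is an assumption made for illustration (one
"hex-address name" pair per line), as are the IDA-style sub_XXXXXXXX
labels and the two addresses in the example, which are hypothetical.

```python
import re

def load_libcalls(text):
    """Parse lines of the assumed form '<hex address> <name>' into a dict."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            table[int(parts[0], 16)] = parts[1]
    return table

def rename(asm, table):
    """Replace sub_XXXXXXXX labels with their guessed libcall names."""
    def repl(match):
        addr = int(match.group(1), 16)
        # Unknown addresses are left untouched.
        return table.get(addr, match.group(0))
    return re.sub(r"\bsub_([0-9A-Fa-f]+)\b", repl, asm)

# Hypothetical addresses, for illustration only.
table = load_libcalls("08056A2C socket\n0805720C fork\n")
print(rename("call sub_8056A2C\ncall sub_8049000", table))
```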
Printf-like format strings were another extremely useful hook for
the reverse engineering stage. By spotting these strings in the code
(again, IDA helps greatly by providing symbolic names for each
string), it was possible to discover where a printf()-like
function was invoked. Traditionally, these functions are all
built upon a basic formatting function that does the real job,
and a group of wrapper functions which implement the different
libcalls.
malloc() could be identified through the use of the mmap() syscall,
invoked on a non-significant file descriptor with the ANONYMOUS
flag set.
The fopen() and fclose() libcalls could be identified by looking for
the corresponding open() and close() syscalls. After identifying
fopen(), it became easier to identify file-related operations.
Several character classification and conversion functions (tolower,
toupper, isXXXX) use large look-up tables and can be identified
through cross-references to such tables.
The socket related libcalls are all built on the socketcall()
syscall; different parameters are used to implement different libcalls;
these parameters can be found in the libc source code and can be
used to reconstruct the different libcalls.
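For reference, the first argument of socketcall() selects the
operation. The table below uses the call numbers defined in the Linux
header linux/net.h; the helper libcall_for() is a hypothetical name
used only for this illustration.

```python
# socketcall() sub-function numbers, as defined in <linux/net.h>.
SOCKETCALL = {
    1: "socket", 2: "bind", 3: "connect", 4: "listen", 5: "accept",
    6: "getsockname", 7: "getpeername", 8: "socketpair", 9: "send",
    10: "recv", 11: "sendto", 12: "recvfrom", 13: "shutdown",
    14: "setsockopt", 15: "getsockopt", 16: "sendmsg", 17: "recvmsg",
}

def libcall_for(call):
    """Map the first socketcall() argument to the libcall it implements."""
    return SOCKETCALL.get(call, "unknown")

for nr in (1, 2, 4, 5, 10):
    print(nr, libcall_for(nr))
```

On i386 the number is passed in ebx, so a constant loaded into ebx
shortly before the int $0x80 of syscall 102 identifies the wrapper.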
As a final remark, we would like to point out that several functions
use various flag combinations, which are represented in the assembly
code as hex numbers; to clarify their meaning, it is very useful
to browse the files contained in /usr/include/bits, which often
contain relevant information.
At this point, several interesting pieces of code are already clear;
for example, the socket() call immediately tells us that this
program uses the unusual IP protocol number 11.
Moreover, the typical start-up sequence of a classical server
program (signal setup, close, fork, accept/receive) is easy to
recognize.
This stage was the longest phase of the reverse-engineering
process; but after the libc decoding, the effective program size
was reduced from 2802kB to 764kB. The complete libcall list is
contained in the libcall file. This process allowed us to separate
the code originally written by the attacker from the residual libc
code.
The second stage of the RE process produced human-readable
C source code from the remaining assembly code; we discuss the code
in the next section.
3.Results
---------
The main() code is easy enough to reconstruct; it essentially
consists of a standard server-like start-up sequence:
- It checks whether it was started as root
- It changes the process name to "[mingetty]" to hide it from the
occasional sysadmin; however, it is possible to find the real binary
executing under the "[mingetty]" name by checking the exe link in the
/proc/PID/ directory, where PID is the process PID obtained through ps.
- It disassociates the process from the controlling terminal, closes
the standard file descriptors, and calls chdir("/")
- It creates a raw socket using IP protocol 11
- The program starts a never-ending loop which processes the received
packets; it expects a standard IP packet with a 20-byte IP header;
after the IP header, there is a kind of signature, represented by
the first payload byte being 0x02
- A packet is considered valid if and only if
* IP Protocol == 11
* The first byte == 0x02 (IP.payload[0] == 0x02)
* The received packet size is greater than 200 bytes (including the
IP header)
- If a packet is valid, it is descrambled using a trivial deobfuscation
function; starting from the third byte of the payload, we have:
/* all arithmetic is on unsigned bytes, i.e. modulo 256 */
L = length(payload);
clear[2] = scrambled[2] - 23;
for (i = 3; i < L; i++) {
    clear[i] = scrambled[i] - scrambled[i-1] - 23;
}
- The fourth byte of the descrambled payload (payload[3]) holds
a code which identifies the specific function requested
of the zombie:
- Code 1: This is used to provide information about the zombie status;
the response is sent using a raw socket, encapsulated as an IP packet
with protocol 11.
The first two bytes are a signature: 0x03, any-value.
They are followed by (0x00, 0x01, 0x07, X sequence), where X is
1 if an attack is ongoing (in this case, the following byte
identifies the attack type); 0 if no attack/child is active.
The packet is scrambled, starting from the third byte, so that
the signature is left unmodified (0x03, ANY).
The packet size is a random value between 400 and 600 bytes.
- Code 2: this code is associated with 3 different sub-commands:
* Subcommand 0: tells the zombie the IP address of the client,
i.e. where to send reply packets;
* Subcommand 1: sets 9 random IP addresses plus the IP address of the
real client; these 10 IPs will be used for sending reply packets,
in order to mask the real IP address of the client;
* Subcommand 2: sets 10 random IP addresses, which are used as decoys
for sending reply packets, so as to better mask the real IP address
of the client;
A command 2 should be sent to the zombie process if the attacker wants
to monitor the remote activity. Using subcommand 2, it is possible to
create a fictitious set of random IP addresses which makes identifying
the attacker much more difficult.
If command 2 is not issued, any response packet will be sent to a
random IP address (derived from a 4-byte uninitialized value on the
main() stack).
- Code 3: the parameter following the command byte is a string which
contains a command to execute. The zombie redirects the output (both
stdout & stderr) to the /tmp/.hj237349 file. When the command terminates,
a reply packet is built with the data contained in the temporary file;
the packet is scrambled and sent back to the client.
The payload size is 400 bytes; if necessary, the zombie may send more
than one packet. All the packets sent by the zombie to the client
have payload[0]=0x03 and payload[1] = ANY (this is a kind of
"signature", like the 0x02 value for the packets sent to the zombie).
The first packet has payload[3]=0x03; the following packets have
payload[3]=0x04; payload[2]=ANY in both cases.
- Code 4: Executes a DNS-Flood (attack #4, see below) against 8000
hard-coded addresses using a doubly nested loop; the inner loop
iterates over the packets to send; the outer loop iterates over the IP
addresses. The starting IP address is chosen at random.
- Code 5: The ICMP-Flood (attack #2, see below) is executed against the
target host.
- Code 6: A shell is spawned on port 23281; the shell is protected with
a password: SeNiF; after issuing the password, the attacker has full
access to the box. We used netcat to communicate with the zombie.
The password is encoded in the binary as TfOjG (strings:259), which
is exactly SeNiF with each letter replaced by the next letter of the
alphabet (S -> T, e -> f, ...).
- Code 7: Executes a command sent in the payload, but does not provide
any output.
- Code 8: Kills the on-going attack/command.
- Code 9: Executes the DNS-Flood (attack #3, see below) with the delay
parameter equal to 0.
- Code 10: Executes the SYN-Flood Attack (attack #1, see below) with
delay parameter equal to 0.
- Code 11: Executes the SYN-Flood Attack with the specified delay
parameter.
- Code 12: The DNS-Flood attack (attack #3) is executed with the specified
delay parameter.
Commands 1,2,3,7,8 may be executed at any time; commands 4,5,6,9,10,
11,12 can only be executed if no other attack is ongoing.
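The validity test and the descrambling step described above can be
checked against real traffic. The sketch below is a re-implementation
for illustration, not the reconstructed C source; is_valid() and
descramble() are names chosen here. The sample bytes are the first
eight payload bytes of the packet captured with tcpdump in section 4.

```python
IP_HEADER_LEN = 20   # the zombie expects a plain 20-byte IP header
PROTO_OFFSET = 9     # position of the protocol field in the IP header

def is_valid(packet):
    """Apply the three validity conditions to a raw IP packet."""
    return (len(packet) > 200
            and packet[PROTO_OFFSET] == 11
            and packet[IP_HEADER_LEN] == 0x02)

def descramble(payload):
    """Undo the scrambling; payload[0] and payload[1] are left untouched."""
    clear = bytearray(payload)
    if len(payload) > 2:
        clear[2] = (payload[2] - 23) & 0xFF
    for i in range(3, len(payload)):
        # Byte arithmetic wraps around, hence the mask.
        clear[i] = (payload[i] - payload[i - 1] - 23) & 0xFF
    return bytes(clear)

# First payload bytes of the packet captured in section 4.
captured = bytes.fromhex("02001734bb2c449b")
command = descramble(captured)[3]
print("command code:", command)
```

For this capture the command byte decodes to 6, which agrees with the
strace output of section 4, where the zombie binds port 23281.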
We discovered four different attack patterns:
- Attack #1: TCP SYN Flood
This function gets the destination IP address and port, a flag (if
set, the attack will use the given source IP address, otherwise the
source IP address will be randomly generated), a source IP address
(for spoofing purposes), a delay parameter (a counter value X: after
sending X packets, the zombie waits for a small fixed delay), and
a pair (int flag, char *hostname); if this flag is set,
the hostname is looked up in the DNS and the corresponding
address is used for the attack.
The function runs an infinite loop that generates a SYN packet with
the requested parameters, calculates the checksum, and sends it.
The attack continues until a command 8 stops it.
- Attack #2: ICMP/UDP flood
This function gets a flag (if set, the attack will be an ICMP flood,
otherwise it will be a UDP flood); a destination port (used only
for the UDP flood; the ICMP flood is based on echo request packets,
type 8); a source IP address (again, used for spoofing purposes); and
an (int flag, char *hostname) pair used exactly as in attack #1.
As in the above case, the function executes an infinite loop; it
assembles the IP packet using a random TTL = 120 + random(130)
and the same IP id for all the packets. The checksum is calculated
for UDP and ICMP packets.
The attack continues until a command 8 stops it.
- Attack #3: DNS Flood
This function gets a destination IP address, a source IP address (used
for spoofing purposes), a delay parameter (like the one used in attack #1),
the source port and an (int flag, char *hostname) pair.
This function uses 9 different payload types, encapsulated in UDP
packets; these payloads are different kinds of DNS requests.
It seems that this function does not calculate the checksum.
- Attack #4: DNS flood/2
This function gets a destination IP address, a source port, a delay
parameter and an (int flag, char *hostname) pair.
This function attacks 8000 hard-coded IP addresses, sending the same 9
payloads used for attack #3. The outer loop iterates over the payload
types, the inner loop iterates over the IP addresses. The starting IP
address is picked at random.
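The per-packet checksum computed by attacks #1 and #2 is, for IP, TCP,
UDP and ICMP, the 16-bit one's-complement Internet checksum. The sketch
below is a generic implementation of that algorithm, not code recovered
from the binary; internet_checksum() is a name chosen here. The test
vector is the IP header of the packet captured in section 4, with its
checksum field set to zero.

```python
def internet_checksum(data):
    """16-bit one's-complement checksum over data (padded to even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# IP header from the section 4 capture, checksum field zeroed.
header = bytes.fromhex("451000dc00f20000400b00000a0000020a000001")
print(hex(internet_checksum(header)))  # 0x6513, as seen in the capture
```

For TCP and UDP the same sum also covers a pseudo-header made of the
source and destination addresses, the protocol and the segment length.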
4.Live Analysis
---------------
The next step, after obtaining the initial results, was to set up a
simple test-bed composed of two systems:
- A victim host, where the binary was run (IP: 10.0.0.1)
- A scanning host, used for traffic analysis and network mapping;
(IP: 10.0.0.2)
The binary was launched under strace -f, which makes it possible to
follow the syscall execution.
This is the strace output:
execve("./the-binary", ["./the-binary"], [/* 37 vars */]) = 0
personality(PER_LINUX) = 0
geteuid() = 0
sigaction(SIGCHLD, {SIG_IGN}, {SIG_DFL}, 0x400557c8) = 0
fork() = 3828
[pid 3828] setsid() = 3828
[pid 3828] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid 3828] fork() = 3829
[pid 3828] _exit(0) = ?
[pid 3829] chdir("/") = 0
[pid 3829] close(0) = 0
[pid 3829] close(1) = 0
[pid 3829] close(2) = 0
[pid 3829] time(NULL) = 1022154669
[pid 3829] socket(PF_INET, SOCK_RAW, 0xb /* IPPROTO_??? */) = 0
[pid 3829] sigaction(SIGHUP, {SIG_IGN}, {SIG_DFL}, 0x400557c8) = 0
[pid 3829] sigaction(SIGTERM, {SIG_IGN}, {SIG_DFL}, 0x400557c8) = 0
[pid 3829] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid 3829] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid 3829] recv(0,
[pid 3827] --- SIGCHLD (Child exited) ---
[pid 3827] _exit(0) = ?
<... recv resumed> "E\0\0\36\200\341\0\0@\v\321\334\n\n\n\1\n\n\n\3XXXXXXX"...,
2048, 0) = 30
oldselect(1, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
recv(0, "E\0\0\36\366\26\0\0@\v\\\247\n\n\n\1\n\n\n\3XXXXXXXXXX"..., 2048, 0) =
30
oldselect(1, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
recv(0,
In this example, we sent some garbage data (using the hping packet
injection tool) over IP protocol 11.
As expected, the process wakes up and tries to process the packet;
since the packet does not match anything, the process executes a
usleep(10000).
We checked with ps and found a [mingetty] entry in the list:
# ps ax
...
1846 pts/1 S 0:00 bash
1870 ? S 0:00 [mingetty]
1871 pts/1 R 0:00 ps ax
Checking the corresponding /proc/1870 entry shows that the exe
link points to the suspect binary.
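This check is easy to automate on a Linux host. The sketch below is
illustrative: real_executable() and suspects() are hypothetical helper
names, and it assumes that the renamed process exposes "[mingetty]" at
the start of /proc/PID/cmdline, which we did not verify separately.

```python
import os

def real_executable(pid):
    """Return the target of /proc/<pid>/exe, or None if it cannot be read."""
    try:
        return os.readlink("/proc/%d/exe" % pid)
    except OSError:
        return None

def suspects(name=b"[mingetty]"):
    """List (pid, executable) for processes whose command line starts with name."""
    found = []
    if not os.path.isdir("/proc"):
        return found
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/cmdline" % entry, "rb") as handle:
                cmdline = handle.read()
        except OSError:
            continue  # the process exited or is not readable
        if cmdline.startswith(name):
            found.append((int(entry), real_executable(int(entry))))
    return found

for pid, exe in suspects():
    print(pid, exe)
```

Reading the exe link of another user's process requires root, so the
scan should be run with the same privileges as the ps listing above.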
As expected, if the program is not started as root, it will abort since
it cannot open the required raw socket.
Using the information gathered in the RE stage, we assembled a
packet injector which is able to talk to nazgul.
The injector uses the scrambling code we reverse-engineered from the
binary to process the packets.
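The scrambling direction is simply the inverse of the descrambling
loop given in section 3. The sketch below is an illustration of that
inverse, not the source of our injector; scramble() is a name chosen
here, and the clear-text bytes are those obtained by descrambling the
start of the captured packet shown in the tcpdump output that follows.

```python
def scramble(clear):
    """Inverse of the descrambling loop; bytes 0 and 1 are left untouched."""
    out = bytearray(clear)
    if len(clear) > 2:
        out[2] = (clear[2] + 23) & 0xFF
    for i in range(3, len(clear)):
        # Each byte depends on the previous *scrambled* byte.
        out[i] = (clear[i] + out[i - 1] + 23) & 0xFF
    return bytes(out)

# Signature 0x02, one free byte, a zero byte, then command code 6.
clear = bytes([0x02, 0x00, 0x00, 0x06, 0x70, 0x5A, 0x01, 0x40])
print(scramble(clear).hex())
```

Because each output byte chains on the previous one, a single changed
clear-text byte alters every scrambled byte after it.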
We ran tcpdump to capture all the packets exchanged:
# tcpdump -i eth0 -s 0 -X
14:38:33.063965 10.0.0.2 > 10.0.0.1: ip-proto-11 200 [tos 0x10]
0x0000 4510 00dc 00f2 0000 400b 6513 0a00 0002 E.......@.e.....
0x0010 0a00 0001 0200 1734 bb2c 449b b3ca e1f8 .......4.,D.....
0x0020 27c0 dbfa c530 489f c52f d4f2 1882 2745 '....0H../....'E
0x0030 4c59 6f45 74de f64d 7c18 3352 6a81 98af LYoEt..M|.3Rj...
0x0040 9245 5eb5 5840 59b0 c736 4ea5 30a8 9eb8 .E^.X@Y..6N.0...
0x0050 dfed 03d9 f05f 77ce b3cd e73e 41a9 c219 ....._w....>A...
0x0060 309f b70e f1a4 bd14 2b9a b209 2138 4f66 0.......+...!8Of
0x0070 750e 2948 137e 96ed 92a8 364e f309 97af u.)H.~....6N....
0x0080 0614 2a00 2f99 b108 25c1 dcfb 4f3b 54ab ..*./...%...O;T.
0x0090 8e9b b187 5a39 52a9 c02f 479e 35a7 bf16 ....Z9R../G.5...
0x00a0 2e45 5c73 8aa1 b8cf e6fd 142b 4259 7087 .E\s.......+BYp.
0x00b0 9eb5 cce3 08e1 f84f 8ef2 0a61 90fa 1269 .......O...a...i
0x00c0 9834 4f6e 8593 a97f 360f 267d 48b3 cb22 .4On....6.&}H.."
0x00d0 d94a 62b9 d1e8 ff16 2d44 0240 .Jb.....-D.@
1 packets received by filter
0 packets dropped by kernel
We also ran strace on the victim:
# strace -f ./the-binary
execve("./the-binary", ["./the-binary"], [/* 49 vars */]) = 0
personality(0 /* PER_??? */) = 0
geteuid() = 0
sigaction(SIGCHLD, {SIG_IGN}, {SIG_DFL}, 0x40053478) = 0
fork() = 3381
[pid 3380] _exit(0) = ?
setsid() = 3381
sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
fork() = 3382
[pid 3382] chdir("/") = 0
[pid 3382] close(0) = 0
[pid 3382] close(1) = 0
[pid 3382] close(2) = 0
[pid 3382] time(NULL) = 1022587878
[pid 3382] socket(PF_INET, SOCK_RAW, 0xb /* IPPROTO_??? */) = 0
[pid 3382] sigaction(SIGHUP, {SIG_IGN}, {SIG_DFL}, 0x40053478) = 0
[pid 3382] sigaction(SIGTERM, {SIG_IGN}, {SIG_DFL}, 0x40053478) = 0
[pid 3382] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid 3382] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid 3382] recv(0,
[pid 3381] _exit(0) = ?
<... recv resumed> "E\20\0\334\0\362\0\0@\v{\23\177\0\0\1\177\0\0\1\2\0\027"..., 2048, 0) = 220
sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
fork() = 3525
[pid 3525] setsid() = 3525
[pid 3525] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid 3525] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 1
[pid 3525] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid 3525] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid 3525] sigaction(SIGHUP, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid 3525] sigaction(SIGTERM, {SIG_IGN}, {SIG_IGN}, 0x80575a8) = 0
[pid 3525] sigaction(SIGINT, {SIG_IGN}, {SIG_DFL}, 0x40053478) = 0
[pid 3525] setsockopt(1, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
[pid 3525] bind(1, {sin_family=AF_INET, sin_port=htons(23281), sin_addr=inet_addr("0.0.0.0")}}, 16
[pid 3382] oldselect(1, NULL, NULL, NULL, {0, 10000}
[pid 3525] <... bind resumed> ) = 0
[pid 3525] listen(1, 3) = 0
[pid 3525] accept(1,
[pid 3382] <... oldselect resumed> ) = 0 (Timeout)
[pid 3382] recv(0,
There is a listener on port 23281, as Nmap shows:
# nmap -sS -p 23281 victim
Starting nmap V. 2.54BETA34 ( www.insecure.org/nmap/ )
Interesting ports on victim (10.0.0.2):
Port State Service
23281/tcp open unknown
Nmap run completed -- 1 IP address (1 host up) scanned in 0 seconds
Netcat shows the shell doing its work
# nc 10.0.0.2 23281
SeNiF
ls
hostname
darkmoor
...
To make our life more exciting, the attacker encoded the password by
shifting each letter by 1; so the corresponding string we find in the
binary is TfOjG (see strings:259).
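The shift is trivial to reverse. A minimal sketch, where
decode_password() is a name chosen for illustration:

```python
def decode_password(stored):
    """Shift every character back by one to recover the clear password."""
    return "".join(chr(ord(c) - 1) for c in stored)

print(decode_password("TfOjG"))
```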
5.Countermeasures
-----------------
Although the attacker uses a scrambling scheme for packet cloaking,
it is fairly easy to block the control traffic; we suggest
blocking IP protocol 11 and port 23281/tcp on the border router.
This also gives us a few rules for our IDS:
- IP.Protocol == 11 and (payload[0] == 0x02 || payload[0] == 0x03)
- TCP.port == 23281
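Expressed as a predicate over a raw IPv4 packet, the two rules could
look like the sketch below. It is an illustration rather than a rule
set for any particular IDS; matches_nazgul_rules() is a hypothetical
name, and matching the TCP port on either side of the connection is an
assumption made here.

```python
def matches_nazgul_rules(packet):
    """Return True if a raw IPv4 packet matches one of the two rules."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return False
    ihl = (packet[0] & 0x0F) * 4          # IP header length in bytes
    proto = packet[9]
    if proto == 11:                        # control channel
        return len(packet) > ihl and packet[ihl] in (0x02, 0x03)
    if proto == 6 and len(packet) >= ihl + 4:   # TCP
        sport = int.from_bytes(packet[ihl:ihl + 2], "big")
        dport = int.from_bytes(packet[ihl + 2:ihl + 4], "big")
        return 23281 in (sport, dport)
    return False

# IP header of the packet captured in section 4, plus the signature byte.
control = bytes.fromhex("451000dc00f20000400b65130a0000020a000001") + b"\x02"
print(matches_nazgul_rules(control))
```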
To analyze a compromised machine, it can be useful to look for a
"[mingetty]" process in the ps/top list; in case the machine has been
trojaned so that this string does not show up, we developed a simple
client which supports a limited interaction with the nazgul zombie.
By sending a command 6, the client makes the zombie open a TCP/23281
socket, and Nmap will then reveal (remotely) the presence of the zombie.
We believe that removing the binary is sufficient to sanitize
the system; however, since this zombie is usually installed after
a full root compromise, we cannot make any assumptions about other
system components; hence a full reinstall should be performed.