Answers to Honeynet's reverse challenge
This answer was written by Felix von Leitner (felix-honeynet@fefe.de)
with help from Olaf Dreesen.
We kept a log of our findings in syscalls.txt, which doubles as the
configuration file for annotate.pl, a small perl script I wrote to
annotate an objdump -dr disassembly dump of the binary.
annotate.pl reads syscalls.txt and keeps a symbol table. Lines starting
with a hex number and a colon are put into the table. If the string
after the colon looks like a function (i.e. "\w+\(" in perl notation),
calls to that address are rewritten as "call name-of-function".
Also, to make reading the disassembly easier, the script changes several
of gcc's alignment "multi-byte NOPs" to real sequences of NOPs, i.e.
"nop; nop", "nop; nop; nop" and so on. annotate.pl can also annotate
local variables in a function; the syntax for that is "funcname/0x8:
arg1".
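To make the symbol table rule concrete, here is a minimal sketch in C of
how one such line could be recognized. It is not a port of annotate.pl
(which is perl and is not reproduced here); the function name
parse_symbol_line, the sample line in the usage note, and the decision
to require the function name directly after the colon are all
assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of annotate.pl: parse one
   "hexaddr: text" line.  Returns 1 and fills *addr and name when the
   text right after the colon starts with a function-like token
   (a run of word characters followed by '('), 0 otherwise. */
int parse_symbol_line(const char *line, unsigned long *addr,
                      char *name, size_t namesz)
{
    char *end;
    unsigned long a;
    const char *p;
    size_t n = 0;

    assert(line != NULL && addr != NULL && name != NULL && namesz > 0);
    if (!isxdigit((unsigned char)line[0]))
        return 0;                      /* must start with a hex digit */
    a = strtoul(line, &end, 16);
    if (*end != ':')
        return 0;                      /* no colon after the number */
    p = end + 1;
    while (*p == ' ' || *p == '\t')    /* skip blanks after the colon */
        ++p;
    while (isalnum((unsigned char)p[n]) || p[n] == '_')
        ++n;                           /* length of the \w+ run */
    if (n == 0 || p[n] != '(' || n >= namesz)
        return 0;                      /* not function-like, or too long */
    memcpy(name, p, n);
    name[n] = '\0';
    *addr = a;
    return 1;
}
```

With a made-up line such as "8048134: main(argc, argv)" this yields the
address 0x8048134 and the name "main"; a line whose text is a plain
comment is rejected.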
Q1: Identify and explain the purpose of the binary.
The binary is a DDoS (distributed denial of service) client that can be
installed on a compromised machine. It can then be used to remote
control the machine and in particular launch different types of network
based attacks against victims.
The binary does not carry any identifying marks that make for a good
name, except for the letters "Gn" (see syscalls.txt for an explanation).
Since that is an awfully short name, I hereby call this DDoS tool "Grand
Nagus" (same initials, and like the Star Trek figure it will
ruthlessly maim its victims with back-stabbing methods for personal
gain).
Q2: Identify and explain the different features of the binary. What
are its capabilities?
The binary establishes an encrypted (well, obfuscated) remote control
channel using a raw socket to listen for IP traffic with protocol number
11. This protocol number is not normally used, so this kind of traffic
should be easy to spot using an IDS or even tcpdump with "ip proto 11".
The binary answers commands using protocol number 11 as well.
However, it does not just send answers back to the same IP; it has
an internal list of 10 IPs and answers to all of them. One of the
commands (2) can be used to set one or all of these 10 IPs (if only one
is set, the rest are overwritten with random data). The binary itself
does not know which one is genuine and which are not. This
establishes some basic deniability for the one controlling the binary,
because the initial command can be sent with a spoofed source IP and the
answer is sent to 9 others as well, so unless the whole transaction is
sniffed on his side, he can deny everything. It also makes it possible
to send the command from host a and read the answer on host b, which
makes tracing even harder. This is optional, however: the binary can
also answer to only one IP.
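The decoy idea can be sketched in a few lines of C. This is an
illustration of the behaviour described above, not code recovered from
the binary: the names GN_REPLY_SLOTS and gn_set_single_reply, the use of
rand(), and the choice of which slot holds the real address are all
assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define GN_REPLY_SLOTS 10   /* the binary keeps a list of 10 reply IPs */

/* Hypothetical sketch: store one genuine address and fill the other
   nine slots with random data, so the list itself does not reveal
   which entry is real. */
void gn_set_single_reply(uint32_t list[GN_REPLY_SLOTS], int slot,
                         uint32_t real_ip)
{
    int i;

    assert(slot >= 0 && slot < GN_REPLY_SLOTS);
    for (i = 0; i < GN_REPLY_SLOTS; ++i)
        list[i] = ((uint32_t)rand() << 16) ^ (uint32_t)rand();
    list[slot] = real_ip;   /* overwrite one decoy with the real one */
}
```

An observer who sees the ten outgoing replies cannot tell from the
traffic alone which destination belongs to the controller.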
Over that channel, the binary can be
- discovered and told to return stats (returns some constants and
whether an attack is running) [command 1].
- reconfigured (one or all of the IPs can be set, and spread spectrum
mode can be enabled or disabled) [command 2].
- told to execute a command using /bin/csh and return the first few
hundred bytes of the output [command 3].
- told to launch a DNS based flood (see below) [command 4].
- told to do a normal UDP or ICMP based flood. The ICMP packets are
"echo request" (i.e. ping) and the UDP packets deliberately have an
invalid checksum [command 5].
- told to spawn a password protected root shell on port 23281 [command 6].
- told to execute a command using /bin/csh but throw away the output
[command 7].
- told to clean up (kill the process doing the flood attack) [command 8].
- told to do the DNS flood with one more parameter set (whose meaning
eludes me so far) [command 9].
- told to do a SYN flood [command 10].
- told to do a variation on the SYN flood where you can set one more
parameter (don't know what it does) [command 11].
- told to do a variation on the DNS flood (same attack but use a
destination IP that is configured as part of the command) [command 12].
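For anyone writing a decoder or an IDS rule, the command numbers above
can be collected in one place. The sketch below does that; the
identifier names are invented for readability, only the numbers and
their meanings come from the list.

```c
#include <assert.h>

/* Hypothetical names; the numbers and meanings are from the list of
   commands observed in the binary. */
enum gn_command {
    GN_STATUS         = 1,   /* return stats */
    GN_CONFIGURE      = 2,   /* set reply IPs, spread spectrum mode */
    GN_EXEC_OUTPUT    = 3,   /* run via /bin/csh, return output */
    GN_DNS_FLOOD      = 4,
    GN_UDP_ICMP_FLOOD = 5,
    GN_ROOT_SHELL     = 6,   /* password protected shell, port 23281 */
    GN_EXEC_SILENT    = 7,   /* run via /bin/csh, discard output */
    GN_STOP_FLOOD     = 8,   /* kill the flooding process */
    GN_DNS_FLOOD_EXT  = 9,   /* DNS flood, one extra parameter */
    GN_SYN_FLOOD      = 10,
    GN_SYN_FLOOD_EXT  = 11,  /* SYN flood, one extra parameter */
    GN_DNS_FLOOD_DST  = 12   /* DNS flood, destination in command */
};

/* Returns 1 for the commands that start an attack, 0 otherwise. */
int gn_is_flood_command(int cmd)
{
    switch (cmd) {
    case GN_DNS_FLOOD:
    case GN_UDP_ICMP_FLOOD:
    case GN_DNS_FLOOD_EXT:
    case GN_SYN_FLOOD:
    case GN_SYN_FLOOD_EXT:
    case GN_DNS_FLOOD_DST:
        return 1;
    default:
        return 0;
    }
}
```

Six of the twelve commands start a flood; the rest are status, control,
or remote execution.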
About the DNS based flood
This attack sends spoofed DNS queries to public DNS servers with the
source IP of the victim. The binary comes with a list of over 11400
public DNS servers from all over the world. The code sends queries for
IN SOA records for well-known top-level domains and "usc.edu" to the
servers. Each of the servers will send an answer to the victim. The
code cycles through the DNS servers, which means that although it sends
packets rapidly, each DNS server gets a packet from each instance of
the binary about every 3.5 seconds.
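Those two figures imply a rough send rate per zombie. The helper below
just does that division; it assumes a plain round-robin walk over the
whole list, and the function name is made up for this note.

```c
#include <assert.h>

/* If a sender walks a list of `servers` resolvers round-robin and
   each resolver is revisited every `revisit_seconds`, it must emit
   servers / revisit_seconds packets per second. */
double implied_send_rate(double servers, double revisit_seconds)
{
    assert(servers >= 0 && revisit_seconds > 0);
    return servers / revisit_seconds;
}
```

implied_send_rate(11400, 3.5) comes out at roughly 3257 packets per
second per zombie, while each individual DNS server sees well under one
packet per second from it.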
Previous methods of dealing with DDoS zombies focused on looking where
the most traffic comes from and severing that link with static routes.
With this attack, the traffic comes from the public DNS servers, not
from the zombies. You can't sever the links to these important
infrastructure components.
Also, since each zombie is seen by each public DNS server only every few
seconds, its queries do not stand out by volume in the huge
amount of normal DNS traffic for these servers. So, basically, you
can't do anything if you are hit by this attack. That obviously
makes this attack very dangerous.
Q3: The binary uses a network data encoding process. Identify the
encoding process and develop a decoder for it.
Both decoder and encoder are in the binary, but the decoder is written
in a very confused way. Here are simple encrypt and decrypt routines:
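/* Each code byte is plaintext + 23 + previous code byte (mod 256), so
   decoding subtracts 23 and the previous code byte.
   plaintext must have room for len+1 bytes. */
void decrypt(int len,const unsigned char* code,
             unsigned char* plaintext) {
  int i;
  for (i=0; i<len; ++i)
    plaintext[i]=code[i]-23-(i?code[i-1]:0);
  plaintext[len]=0;   /* NUL-terminate for convenience */
}
/* Inverse of decrypt; code must have room for len bytes. */
void encrypt(int len,const unsigned char* plaintext,
             unsigned char* code) {
  int i;
  for (i=0; i<len; ++i)
    code[i]=plaintext[i]+23+(i?code[i-1]:0);
}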
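One detail matters when decoding a captured packet in its own buffer:
each plaintext byte depends on the previous code byte, so an in-place
decoder has to run from the last byte to the first, while an in-place
encoder runs forward. The sketch below shows both; the gn_ names are
invented here and are not symbols from the binary.

```c
#include <assert.h>
#include <stddef.h>

/* Forward pass: buf[i-1] is already encoded when byte i is handled. */
void gn_encode_inplace(unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; ++i)
        buf[i] = (unsigned char)(buf[i] + 23 + (i ? buf[i - 1] : 0));
}

/* Backward pass: buf[i-1] is still encoded when byte i is handled. */
void gn_decode_inplace(unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = len; i-- > 0; )
        buf[i] = (unsigned char)(buf[i] - 23 - (i ? buf[i - 1] : 0));
}
```

Encoding the two bytes 'A', 'B' (65, 66) gives 88 and 177, i.e. 65+23
and 66+23+88; decoding in place restores the original bytes.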
Q: Identify one method of detecting this network traffic using a
method that is not just specific to this situation, but other ones as
well.
Having a white list for allowed IP protocols severs the command link
completely. This binary uses protocol 11. If a firewall blocks all IP
traffic except the well-known IP protocols (UDP, TCP, ICMP,
IGMP, and maybe IPsec), the binary can no longer be reached.
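As a sketch of what such a white list check looks like, the C below
reads the protocol field from an IPv4 header (byte offset 9) and tests
it against the protocols named above. The function names are made up;
the protocol numbers are the IANA assignments, with ESP and AH standing
in for "IPsec". A real firewall would express this as a rule set, not
as application code.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the protocol field of an IPv4 header, or -1 if the buffer
   is shorter than a minimal header or is not IPv4. */
int ipv4_protocol(const unsigned char *pkt, size_t len)
{
    assert(pkt != NULL);
    if (len < 20 || (pkt[0] >> 4) != 4)
        return -1;
    return pkt[9];
}

/* White list: 1 if the protocol may pass, 0 if it is to be dropped. */
int ip_protocol_allowed(int proto)
{
    switch (proto) {
    case 1:   /* ICMP */
    case 2:   /* IGMP */
    case 6:   /* TCP */
    case 17:  /* UDP */
    case 50:  /* ESP (IPsec) */
    case 51:  /* AH (IPsec) */
        return 1;
    default:
        return 0;
    }
}
```

A packet with protocol 11 fails the check, and so does any other
unusual protocol number a future tool might pick.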
Q: Identify and explain any techniques in the binary that protect it
from being analyzed or reverse engineered.
The binary was not particularly well protected. It was statically
linked against an old libc 5 and stripped, however. That removes libc
symbols and makes it harder to differentiate program code and libc code.
Also, the password for the interactive backdoor is in the binary, but
each character in it is incremented by 1, so you can't just run strings
on the binary and use this string as password.
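Undoing that obfuscation is a one-line loop. The sketch below assumes
only what is stated above (every character stored one higher than the
real one); the function name and the sample string in the usage note
are invented, and the sample is not the password from the binary.

```c
#include <assert.h>
#include <stddef.h>

/* Subtract 1 from every character of stored and write the result to
   out.  Returns 0 on success, -1 if out is too small. */
int unshift_string(const char *stored, char *out, size_t outsz)
{
    size_t i;

    assert(stored != NULL && out != NULL);
    if (outsz == 0)
        return -1;
    for (i = 0; stored[i] != '\0'; ++i) {
        if (i + 1 >= outsz)
            return -1;              /* no room for char plus NUL */
        out[i] = (char)(stored[i] - 1);
    }
    out[i] = '\0';
    return 0;
}
```

For the made-up input "qbttxpse" this produces "password".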
Other than that, the code was not protected. It looked very obfuscated
in several parts, but I doubt that was done on purpose. There are much
better ways to protect a binary from reverse engineering. If the author
knew how to do it, he would have done it properly.
The binary sets its argv[0] to "[mingetty]" to avoid detection.
Q: Identify two tools in the past that have demonstrated similar
functionality.
I am not an expert in DDoS tools and backdoors, but the ICMP/UDP flood
and synflood are features of most good DDoS tools as far as I know, e.g.
TFN and Stacheldraht.
Q: What kind of information can be derived about the person who
developed this tool? For example, what is their skill level? (Bonus)
The code looks like the author was a script kid who just copied code
from others together and randomly shuffled lines until it worked. The
program has some very innovative new ideas, for example the DNS attack,
but other parts are really bad. For example, an IP is passed as four
characters (i.e. in the correct format, not even in the wrong byte
order). The code then uses sprintf with "%d.%d.%d.%d" to write an ASCII
representation and calls inet_addr on it. This looks like the sort of
voodoo programming a newbie would do.
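To show why the detour is pointless, the sketch below puts the
roundabout conversion next to a direct copy of the four bytes. Both
function names are invented, and snprintf is used here where the binary
uses sprintf; the point is only that the two results are identical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The roundabout route: bytes -> dotted text -> inet_addr. */
in_addr_t ip_via_text(const unsigned char b[4])
{
    char text[16];   /* "255.255.255.255" plus NUL */

    assert(b != NULL);
    snprintf(text, sizeof text, "%d.%d.%d.%d", b[0], b[1], b[2], b[3]);
    return inet_addr(text);
}

/* The direct route: the four bytes already are the address in
   network byte order. */
in_addr_t ip_direct(const unsigned char b[4])
{
    in_addr_t addr;

    assert(b != NULL);
    memcpy(&addr, b, sizeof addr);
    return addr;
}
```

For any four input bytes both functions return the same value, so the
sprintf and inet_addr calls achieve nothing a memcpy would not.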
Also, the commands are not authenticated by a checksum or cryptographic
signature. That means one can accidentally trigger random floods
if one gets the encryption routine wrong.
On the other hand, the DNS flood is a really bright idea. And the
spread spectrum stuff is a clever way to protect the identity of the
puppet master. This does not fit together very well. I guess that this
code is the work of some not very experienced but ambitious kid who got
access to the code of one of the existing similar tools and hacked it
until it did what he wanted.
Q: What advancements in tools with similar purposes can we expect in
the future? (Bonus)
I am not very good at looking into the future. The DNS attack was already
a new idea to me, so this is already the pinnacle of what can be done if
you ask me ;-) However, this kind of indirection is not limited to
DNS. This attack does not use indirection to amplify the traffic, but
to hide. So every way to talk to one machine and get a response can be
used like this. For example, sending forged TCP packets to FTP or web
servers and having the victim flooded with RSTs. I don't think this
kind of attack can be avoided except if all ISPs activate egress
filtering in their routers.
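The egress filtering rule itself is simple: a packet may leave a
network only if its source address belongs to that network. Here is a
minimal sketch of the check, with addresses in host byte order; the
function name is made up, and real routers implement this in their
filter rules, not in code like this.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if src lies inside net/prefix_len, 0 if the packet
   carries a source address that does not belong to this network. */
int egress_source_ok(uint32_t src, uint32_t net, unsigned prefix_len)
{
    uint32_t mask;

    assert(prefix_len <= 32);
    mask = prefix_len ? ~(uint32_t)0 << (32 - prefix_len) : 0;
    return (src & mask) == (net & mask);
}
```

With the network 192.0.2.0/24, a packet from 192.0.2.17 passes, while a
packet claiming to come from 198.51.100.17 (a spoofed victim address,
say) is dropped before it ever reaches a DNS server.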