Answers to Honeynet's reverse challenge

This answer was written by Felix von Leitner (felix-honeynet@fefe.de) with help from Olaf Dreesen.

We kept a log of our findings in syscalls.txt, which doubles as the configuration file for annotate.pl, a small Perl script I wrote to annotate an objdump -dr disassembly of the binary. annotate.pl reads syscalls.txt and builds a symbol table from it: every line starting with a hex number and a colon is put into the table. If the string after the colon looks like a function (i.e. it matches "\w+\(" in Perl notation), calls to that address are rewritten as "call name-of-function". To make the disassembly easier to read, the script also rewrites several of gcc's alignment "multi-byte NOPs" as plain sequences of NOPs, i.e. "nop; nop", "nop; nop; nop" and so on. annotate.pl can also annotate local variables in a function; the syntax for that is "funcname/0x8: arg1".
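The symbol table rule can be sketched in C as follows. This is not annotate.pl itself (which is Perl); the struct and function names are hypothetical, and the line grammar is an assumption based on the description above: a hex address, a colon, then free text that counts as a function if it starts with an identifier followed by "(".

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one symbol table entry. */
struct symbol {
    unsigned long addr;   /* address parsed from the hex prefix */
    char name[64];        /* identifier before '(', empty if not a function */
    int is_function;      /* 1 if the text looks like "name(" */
};

/* Parse one line such as "8048134: main(argc, argv)" (a made-up example).
 * Returns 1 and fills *out if the line starts with a hex number and a
 * colon, 0 otherwise (such lines would be plain notes). */
int parse_symbol_line(const char *line, struct symbol *out)
{
    char *end;
    size_t n = 0;

    if (!isxdigit((unsigned char)line[0]))
        return 0;
    out->addr = strtoul(line, &end, 16);
    if (*end != ':')
        return 0;
    end++;
    while (*end == ' ' || *end == '\t')
        end++;

    /* Count word characters, the equivalent of \w+ in the Perl pattern.
     * Assumption: the identifier starts right after the colon. */
    while ((isalnum((unsigned char)end[n]) || end[n] == '_') &&
           n < sizeof out->name - 1)
        n++;
    out->is_function = (n > 0 && end[n] == '(');
    if (out->is_function) {
        memcpy(out->name, end, n);
        out->name[n] = '\0';
    } else {
        out->name[0] = '\0';
    }
    return 1;
}
```

A line like "funcname/0x8: arg1" is rejected by this function, because the text before the colon is not a pure hex number; the local variable syntax would need its own parser.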

Q1: Identify and explain the purpose of the binary.

The binary is a DDoS (distributed denial of service) client that can be installed on a compromised machine. It can then be used to control the machine remotely and, in particular, to launch different types of network-based attacks against victims.

The binary does not carry any identifying marks that make for a good name, except for the letters "Gn" (see syscalls.txt for an explanation). Since that is an awfully short name, I hereby call this DDoS tool "Grand Nagus" (same initials, and like the Star Trek figure it will ruthlessly maim its victims with backstabbing methods for personal gain).

Q2: Identify and explain the different features of the binary. What are its capabilities?

The binary establishes an encrypted (well, obfuscated) remote control channel using a raw socket to listen to IP traffic with protocol number 11. This protocol number is not normally used, so this kind of traffic should be easy to spot using an IDS or even tcpdump with "ip proto 11".

The binary answers commands using protocol number 11 as well. However, it does not just send answers back to the same IP; it keeps an internal list of 10 IPs and answers to all of them. One of the commands (2) can be used to set one or all of these 10 IPs (if only one is set, the rest are overwritten with random data). The binary itself does not know which one is genuine and which is not. This establishes some basic deniability for the one controlling the binary: the initial command can be sent with a spoofed source IP and the answer is sent to 9 other addresses as well, so unless the whole transaction is sniffed on his side, he can deny everything. It also makes it possible to send the command from host a and read the answer on host b, which makes tracing even harder. This is optional, however; the binary can also answer to a single IP only.

Over that channel, the binary can be:

discovered and told to return stats (returns some constants and whether an attack is running) [command 1].
reconfigured (one or all of the IPs can be set, and spread spectrum mode can be enabled or disabled) [command 2].
told to execute a command using /bin/csh and return the first few hundred bytes of the output [command 3].
told to launch a DNS based flood (see below) [command 4].
told to do a normal UDP or ICMP based flood. The ICMP packets are "echo request" (i.e. ping) and the UDP packets deliberately have an invalid checksum [command 5].
told to spawn a password protected root shell on port 23281 [command 6].
told to execute a command using /bin/csh but throw away the output [command 7].
told to clean up (kill the process doing the flood attack) [command 8].
told to do the DNS flood with one more parameter set (whose meaning eludes me so far) [command 9].
told to do a SYN flood [command 10].
told to do a variation on the SYN flood where you can set one more parameter (don't know what it does) [command 11].
told to do a variation on the DNS flood (same attack but use a destination IP that is configured as part of the command) [command 12].
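
For anyone writing a decoder or an IDS rule on top of this list, the command numbers can be captured in a small table. This is a sketch only: the numbers come from the list above, but the enumerator and function names are made up here and do not appear in the binary.

```c
/* Command numbers as listed above; the names are hypothetical. */
enum gn_command {
    GN_STATUS          = 1,   /* return stats */
    GN_CONFIGURE       = 2,   /* set reply IPs, spread spectrum mode */
    GN_EXEC_OUTPUT     = 3,   /* run via /bin/csh, return output */
    GN_DNS_FLOOD       = 4,
    GN_UDP_ICMP_FLOOD  = 5,
    GN_ROOT_SHELL      = 6,   /* shell on port 23281 */
    GN_EXEC_QUIET      = 7,   /* run via /bin/csh, discard output */
    GN_STOP_FLOOD      = 8,
    GN_DNS_FLOOD_EXTRA = 9,   /* DNS flood, one more parameter */
    GN_SYN_FLOOD       = 10,
    GN_SYN_FLOOD_EXTRA = 11,  /* SYN flood, one more parameter */
    GN_DNS_FLOOD_DEST  = 12   /* DNS flood, destination IP in command */
};

/* Returns 1 if the command starts one of the flood attacks. */
int gn_command_is_flood(int cmd)
{
    switch (cmd) {
    case GN_DNS_FLOOD:
    case GN_UDP_ICMP_FLOOD:
    case GN_DNS_FLOOD_EXTRA:
    case GN_SYN_FLOOD:
    case GN_SYN_FLOOD_EXTRA:
    case GN_DNS_FLOOD_DEST:
        return 1;
    default:
        return 0;
    }
}

/* Returns 1 if the command number is one of the twelve listed above. */
int gn_command_is_known(int cmd)
{
    return cmd >= GN_STATUS && cmd <= GN_DNS_FLOOD_DEST;
}
```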

About the DNS based flood

This attack sends spoofed DNS queries to public DNS servers with the source IP of the victim. The binary comes with a list of over 11400 public DNS servers from all over the world. The code sends queries for IN SOA records for well-known top-level domains and "usc.edu" to the servers. Each of the servers will send an answer to the victim. The code cycles through the DNS servers, which means that although it sends packets rapidly, each DNS server gets a packet from each instance of the binary only about every 3.5 seconds.
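
The two numbers above imply a send rate per instance, which the following back-of-the-envelope sketch works out. The function names are hypothetical, and the figures are only as exact as the "over 11400" and "about 3.5 seconds" they are based on.

```c
#include <assert.h>

/* Packets per second one instance must send so that each of `servers`
 * DNS servers is revisited every `revisit_seconds`. */
double gn_send_rate(unsigned servers, double revisit_seconds)
{
    assert(revisit_seconds > 0.0);
    return servers / revisit_seconds;
}

/* Queries per second that a single DNS server sees from `zombies`
 * instances, each revisiting it every `revisit_seconds`. */
double gn_per_server_rate(unsigned zombies, double revisit_seconds)
{
    assert(revisit_seconds > 0.0);
    return zombies / revisit_seconds;
}
```

With 11400 servers and a 3.5 second cycle, gn_send_rate gives roughly 3257 packets per second per instance, while each individual server sees less than one query every three seconds from that instance.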

Previous methods of dealing with DDoS zombies focused on looking where the most traffic comes from and severing that link with static routes. With this attack, the traffic comes from the public DNS servers, not from the zombies. You can't sever the links to these important infrastructure components.

Also, since each zombie is seen by each public DNS server only every few seconds, it does not stand out by volume in the huge amount of normal DNS traffic for these servers. So, basically, you can't do anything if you are hit by this attack. That obviously makes this attack very dangerous.

Q3: The binary uses a network data encoding process. Identify the encoding process and develop a decoder for it

Both decoder and encoder are in the binary, but the decoder is written in a very confused way. Here are simple encrypt and decrypt routines:

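  /* Decode len bytes of code into plaintext. plaintext needs room for
     len+1 bytes (a terminating 0 is appended) and must not alias code. */
  void decrypt(int len,const unsigned char* code,
               unsigned char* plaintext) {
    int i;
    for (i=0; i<len; ++i)
      /* subtract the constant 23 and the previous code byte, mod 256 */
      plaintext[i]=code[i]-23-(i?code[i-1]:0);
    plaintext[len]=0;
  }

  /* Encode len bytes of plaintext into code: each output byte is the
     plaintext byte plus 23 plus the previous output byte, mod 256. */
  void encrypt(int len,const unsigned char* plaintext,
               unsigned char* code) {
    int i;
    for (i=0; i<len; ++i)
      code[i]=plaintext[i]+23+(i?code[i-1]:0);
  }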
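
To check that the two routines really are inverses, a round trip can be run on a short buffer. The sketch below restates the same scheme under hypothetical names (gn_encode, gn_decode) so that it stands alone; it makes no claim about how the payload inside a real packet is laid out.

```c
#include <stddef.h>
#include <string.h>

/* Same scheme as above: add 23 and the previous code byte, mod 256. */
void gn_encode(size_t len, const unsigned char *plain, unsigned char *code)
{
    size_t i;
    for (i = 0; i < len; ++i)
        code[i] = (unsigned char)(plain[i] + 23 + (i ? code[i - 1] : 0));
}

/* Inverse: subtract 23 and the previous code byte, mod 256.
 * plain must not alias code. */
void gn_decode(size_t len, const unsigned char *code, unsigned char *plain)
{
    size_t i;
    for (i = 0; i < len; ++i)
        plain[i] = (unsigned char)(code[i] - 23 - (i ? code[i - 1] : 0));
}

/* Returns 1 if decoding the encoding of msg gives msg back. */
int gn_roundtrip_ok(const unsigned char *msg, size_t len)
{
    unsigned char code[256], back[256];

    if (len > sizeof code)
        return 0;
    gn_encode(len, msg, code);
    gn_decode(len, code, back);
    return memcmp(msg, back, len) == 0;
}
```

As a worked example, the two bytes 0x41 0x42 ("AB") encode to 0x58 0xB1: 0x41+23 is 0x58, and 0x42+23+0x58 is 0xB1.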

Q: Identify one method of detecting this network traffic using a method that is not just specific to this situation, but other ones as well.

Having a white list for allowed IP protocols severs the command link completely. This binary uses protocol 11. If a firewall blocks all IP traffic that is not one of the well-known IP protocols (UDP, TCP, ICMP, IGMP, and maybe IPsec), the binary can no longer be reached.
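
The white list idea boils down to one comparison on the protocol field of the IPv4 header. A minimal sketch, assuming the caller hands over a raw IPv4 packet starting at the IP header; the function name is hypothetical and the list is just the protocols named above (with ESP and AH standing in for IPsec).

```c
#include <stddef.h>

/* Returns 1 if the IPv4 packet carries a white-listed protocol,
 * 0 otherwise (including packets too short to be IPv4). */
int ip_proto_allowed(const unsigned char *pkt, size_t len)
{
    /* ICMP=1, IGMP=2, TCP=6, UDP=17, ESP=50, AH=51 */
    static const unsigned char allowed[] = { 1, 2, 6, 17, 50, 51 };
    size_t i;

    if (len < 20 || (pkt[0] >> 4) != 4)
        return 0;
    /* Byte 9 of the IPv4 header is the protocol number. */
    for (i = 0; i < sizeof allowed; ++i)
        if (pkt[9] == allowed[i])
            return 1;
    return 0;
}
```

A packet with protocol 11 in byte 9 is rejected, and logging every rejected packet turns the same check into a detector for any tool that hides in an unusual protocol number.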

Q: Identify and explain any techniques in the binary that protect it from being analyzed or reverse engineered.

The binary was not particularly well protected. It was statically linked against an old libc 5 and stripped, however. That removes libc symbols and makes it harder to differentiate program code and libc code. Also, the password for the interactive backdoor is in the binary, but each character in it is incremented by 1, so you can't just run strings on the binary and use this string as password.
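
Undoing the password obfuscation is a one-liner per character. The sketch below uses a hypothetical function name and a made-up example string, not the string found in the binary.

```c
#include <stddef.h>
#include <string.h>

/* Undo a +1 shift on every character of stored, writing at most
 * outsize-1 characters plus a terminating 0 to out. */
void unshift_by_one(const char *stored, char *out, size_t outsize)
{
    size_t i, n = strlen(stored);

    if (outsize == 0)
        return;
    if (n > outsize - 1)
        n = outsize - 1;
    for (i = 0; i < n; ++i)
        out[i] = (char)(stored[i] - 1);
    out[n] = '\0';
}
```

For example, a stored string "qbttxpse" would turn back into "password".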

Other than that, the code was not protected. It looked very obfuscated in several parts, but I doubt that was done on purpose. There are much better ways to protect a binary from reverse engineering. If the author knew how to do it, he would have done it properly.

The binary sets its argv[0] to "[mingetty]" to avoid detection.

Q: Identify two tools in the past that have demonstrated similar functionality.

I am not an expert in DDoS tools and backdoors, but the ICMP/UDP flood and the SYN flood are features of most good DDoS tools as far as I know, e.g. TFN and Stacheldraht.

Q: What kind of information can be derived about the person who developed this tool? For example, what is their skill level? (Bonus)

The code looks like the author was a script kid who just pasted together code copied from others and randomly shuffled lines until it worked. The program has some very innovative new ideas, for example the DNS attack, but other parts are really bad. For example, an IP is passed as four characters (i.e. already in the correct format, not even in the wrong byte order). The code then uses sprintf with "%d.%d.%d.%d" to write an ASCII representation and calls inet_addr on it. This looks like the sort of voodoo programming a newbie would do.
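
To show why the sprintf detour is pointless, here are both routes side by side. This is my reconstruction of the idea, not the code from the binary; the function names are hypothetical, and I use snprintf instead of sprintf to keep the sketch safe.

```c
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>

/* The roundabout route: bytes -> dotted string -> inet_addr. */
in_addr_t addr_via_sprintf(const unsigned char ip[4])
{
    char text[16];   /* "255.255.255.255" plus terminating 0 */

    snprintf(text, sizeof text, "%d.%d.%d.%d", ip[0], ip[1], ip[2], ip[3]);
    return inet_addr(text);
}

/* The direct route: the four bytes already are the address in
 * network byte order, so copying them is enough. */
in_addr_t addr_direct(const unsigned char ip[4])
{
    in_addr_t a;

    memcpy(&a, ip, sizeof a);
    return a;
}
```

Both functions return the same value for any four input bytes, so the formatting and parsing step adds nothing.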

Also, the commands are not authenticated by a checksum or cryptographic signature. That means one can accidentally cause random floods to be triggered if one gets the encryption routine wrong.

On the other hand, the DNS flood is a really bright idea. And the spread spectrum stuff is a clever way to protect the identity of the puppet master. This does not fit together very well. I guess that this code is the work of some not very experienced but ambitious kid who got access to the code of one of the existing similar tools and hacked it until it did what he wanted.

Q: What advancements in tools with similar purposes can we expect in the future? (Bonus)

I am not very good at looking into the future. The DNS attack was already a new idea to me, so this is already the pinnacle of what can be done if you ask me ;-) However, this kind of indirection is not limited to DNS. This attack does not use indirection to amplify the traffic, but to hide. So every way to talk to one machine and get a response can be used like this. For example, one could send forged TCP packets to FTP or web servers and have the victim flooded with RSTs. I don't think this kind of attack can be avoided unless all ISPs activate egress filtering in their routers.
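
Egress filtering itself is a simple test at the border router: a packet may only leave if its source address belongs to the network behind the router. A minimal sketch with a hypothetical function name, assuming addresses in host byte order and a single customer prefix.

```c
#include <stdint.h>

/* Returns 1 if src lies inside the prefix net/mask, i.e. the packet
 * may leave the network; 0 means the source address is forged. */
int egress_permits(uint32_t src, uint32_t net, uint32_t mask)
{
    return (src & mask) == (net & mask);
}
```

With the prefix 192.0.2.0/24 (net 0xC0000200, mask 0xFFFFFF00), a packet from 192.0.2.23 passes, while a packet claiming to come from 198.51.100.1 is dropped before it ever reaches a DNS server.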