HoneyNet Scan 25

(submission by David E. Anderson, November 2002)

(dave@unixhome.net)

Answers | Time Log | Resources | Notes | Summary


Answers

1) Which is the type of the .unlock file? When was it generated?

The .unlock file is a gzip-compressed tar archive containing two files, .unlock.c and .update.c.  According to the tar header, it was generated Fri Sep 20 13:28:11 2002 GMT.

The file was identified using the strings and vi commands.  Many file types have characteristic traits that give them away.  Compressed and encrypted files show a random distribution of characters under the strings command.  The first bytes of a gzip file display as "^_\213" in Solaris vi and "^_~K" in Linux vim.  Once the file was uncompressed with gunzip, repeating the step shows the string "ustar" in the middle of the first line, revealing that a tar file is inside.
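
The same two checks can be done programmatically instead of by eye.  The following is a minimal sketch, not something I ran during the challenge: the function names are my own, the gzip magic bytes are 0x1f 0x8b (the "^_\213" seen in vi), and the "ustar" magic is assumed to sit at offset 257 of the first 512-byte header block, as in the POSIX ustar layout.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the buffer starts with the gzip magic bytes 0x1f 0x8b. */
int looks_like_gzip(const unsigned char *buf, size_t len)
{
    return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}

/* Returns 1 if a tar header block carries "ustar" at offset 257
   (assumed POSIX ustar layout; older tar formats lack this magic). */
int looks_like_ustar(const unsigned char *buf, size_t len)
{
    return len >= 262 && memcmp(buf + 257, "ustar", 5) == 0;
}
```

Feed looks_like_gzip() the first bytes of .unlock, and looks_like_ustar() the first 512 bytes of the gunzipped result.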

The date can be obtained by consulting the Linux include file for tar, /usr/include/tar.h, which describes where the timestamp sits in the header.  Taking this value and translating it to human-readable format yields the tar creation date.  The Notes section contains more detail on how I did this.
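
As a rough sketch of that translation, the timestamp can be pulled straight out of a header block rather than copied by hand from vi.  The offset (136) and width (12 octal digits) are assumptions taken from the POSIX ustar header layout, and the function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define TAR_MTIME_OFFSET 136  /* assumed ustar layout: mtime field offset */
#define TAR_MTIME_LEN    12   /* octal digits, NUL or space terminated */

/* Reads the octal mtime field from a 512-byte tar header block. */
time_t tar_header_mtime(const unsigned char *header)
{
    char field[TAR_MTIME_LEN + 1];

    memcpy(field, header + TAR_MTIME_OFFSET, TAR_MTIME_LEN);
    field[TAR_MTIME_LEN] = '\0';
    return (time_t)strtoul(field, NULL, 8);
}

/* Formats a time_t as UTC, independent of the TZ setting.
   Returns the number of characters written, or 0 on failure. */
size_t format_utc(time_t t, char *out, size_t outlen)
{
    struct tm *tm = gmtime(&t);

    if (tm == NULL)
        return 0;
    return strftime(out, outlen, "%a %b %d %H:%M:%S %Y", tm);
}
```

For a header whose field holds 07542621153, this gives 1032528491 seconds since the epoch, which formats as Fri Sep 20 13:28:11 2002.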

2) Based on the source code, who is the author of this worm? When it was
created? Is it compatible with the date from question 1?

The source code for .unlock.c lists multiple authors: contem@efnet seems to be the commented author, with contributions from ensane, st and aion@ukr.net.  Multiple authors are also apparent from the differing code styles and from the fact that some sections have CR/LF sequences, likely from a Windows machine, while other sections are clear of the ^M character.  Indentation also varies.

The author of .update.c appears to be aion@ukr.net.

The dates in the tar archive of the files themselves are:

Sep 19 21:57 for .update.c

Sep 20 13:28 for .unlock.c

There is a VERSION string in the file with the value 20092002.  Presuming this is in the form <dy><mo><year>, it would be September 20th 2002.

These are consistent with the date in question 1), as the tar file header date must be the same as or newer than that of its contents.  The same is presumably true of the version number.  This does not mean the dates have not been altered, just that the tar file creation date, the tar file content dates and the version are consistent.
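
To make the VERSION reading concrete, here is a small sketch that splits the string under the same presumption, day then month then year.  The struct and function names are hypothetical, and the range checks are only a sanity filter, not full calendar validation.

```c
#include <stdio.h>

struct version_date {
    int day;
    int month;
    int year;
};

/* Parses an 8-digit DDMMYYYY string; returns 1 on success, 0 otherwise. */
int parse_ddmmyyyy(const char *s, struct version_date *out)
{
    int n = 0;

    if (sscanf(s, "%2d%2d%4d%n", &out->day, &out->month, &out->year, &n) != 3)
        return 0;
    if (n != 8 || s[8] != '\0')
        return 0;
    return out->month >= 1 && out->month <= 12 &&
           out->day >= 1 && out->day <= 31;
}
```

Given "20092002" it yields day 20, month 9, year 2002, the same calendar day as the newest file in the archive.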

The command "tar tvf ..." was used to list the file dates after setting the time zone to GMT0, because the TZ environment variable influences the date/time output.  I have used GMT0 for all dates; see the Notes for more details.

3) Which process name is used by the worm when it is running?

The worm process runs with the name httpd.  It is launched as /tmp/httpd but changes its name to just httpd once it is running.

This can be found by reviewing the code right after the main(...) startup.  The code copies a new PSNAME into argv[0].  The define statement for PSNAME is "httpd".  The argv[] array is what stores the process path/name and the passed parameters.

argv[0] is overwritten to prevent "ps ax" from showing where the worm was started from, which is an attempt to hide the process start point.  It also makes the worm look like the Apache server component named httpd, another attempt by the process to hide itself.

4) In which format the worm copies itself to the new infected machine? Which files are created in the whole process? After the worm executes itself, which files remain on the infected machine?

The worm copies itself to a machine in uuencoded format.  The following files are created in the whole process:

All files are removed after the worm is running except for /tmp/.unlock, which remains as a source of the worm for future exploits from the new host.  It is important to note here that the worm does not modify .unlock, so the dates it contains are in fact the version date of this form of the worm.

The function int sh(int sockfd) shows the process of transferring and activating the worm on a host.  It first sets the environment, cleans up old files that might exist and uses "cat > /tmp/.unlock.uu" to send the file to the target host.  For cat to work, however, zhdr() is first called to control xterm settings.  The encode(sockfd) function then feeds the local /tmp/.unlock to cat in uuencode format, and zhdr() is called again to simulate an EOF so that cat finishes.  Once cat is finished, the worm uudecodes, uncompresses, untars, compiles and then runs the executables.  The final part is the cleanup.  If all this works on the target host, it is now infected and running the two daemons.
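
For anyone inspecting a leftover /tmp/.unlock.uu or a capture of the transfer, it helps to recognize what a uuencoded line looks like.  The following is a minimal sketch of decoding a single line of the classic uuencode format (a length character followed by groups of four characters, each carrying 6 bits offset by 32).  It is not the worm's encode() routine, and the function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Maps one uuencode character to its 6-bit value; '`' and ' ' both give 0. */
static int uu_val(char c)
{
    return (c - 32) & 0x3f;
}

/* Decodes one uuencoded line into out.
   Returns the number of bytes decoded, or -1 on malformed input. */
int uudecode_line(const char *line, unsigned char *out, size_t outlen)
{
    size_t len = strlen(line);
    size_t need, i;
    int n, written = 0;

    if (len == 0)
        return -1;
    n = uu_val(line[0]);                 /* first character is the byte count */
    need = (size_t)((n + 2) / 3) * 4;    /* encoded characters required */
    if (len - 1 < need || (size_t)n > outlen)
        return -1;

    for (i = 1; written < n; i += 4) {
        int a = uu_val(line[i]), b = uu_val(line[i + 1]);
        int c = uu_val(line[i + 2]), d = uu_val(line[i + 3]);
        unsigned char bytes[3];
        int k;

        bytes[0] = (unsigned char)((a << 2) | (b >> 4));
        bytes[1] = (unsigned char)(((b & 0x0f) << 4) | (c >> 2));
        bytes[2] = (unsigned char)(((c & 0x03) << 6) | d);
        for (k = 0; k < 3 && written < n; k++)
            out[written++] = bytes[k];
    }
    return written;
}
```

As a check, the line "#0V%T" decodes to the three bytes "Cat".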

5) Which port is scanned by the worm?

Port 80/tcp is scanned for the vulnerability: the worm captures the "Server: " string to identify the server architecture and type and, indeed, whether the system has HTTP services running at all.  The code that does this is in the function GetAddress().

The "Server: " line is how a web server identifies its architecture and OS type.  To see what a web server will give you, you can "telnet <hostip> 80" and enter "GET / HTTP/1.1".  Hit enter a second time and the web server will error out with a "400 Bad Request" error, as the request is incomplete.  The server then identifies itself and its capabilities.  The worm uses this string later to look up key information in the architectures[] array that it needs to infect a host.
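
Pulling the banner out of such a response takes only a few lines.  This is a simplified sketch and not the worm's GetAddress() code: the function name is hypothetical, and it takes the first occurrence of "Server: " anywhere in the buffer, so a careful parser would also confirm the match sits at the start of a header line.

```c
#include <stddef.h>
#include <string.h>

/* Copies the value of the first "Server: " header into out.
   Returns 1 if a header was found, 0 otherwise. */
int extract_server_header(const char *response, char *out, size_t outlen)
{
    const char *key = "Server: ";
    const char *p = strstr(response, key);
    size_t n;

    if (p == NULL || outlen == 0)
        return 0;
    p += strlen(key);
    n = strcspn(p, "\r\n");       /* value runs to the end of the line */
    if (n >= outlen)
        n = outlen - 1;           /* truncate rather than overflow */
    memcpy(out, p, n);
    out[n] = '\0';
    return 1;
}
```

Checking your own servers this way shows exactly what an outside scanner learns from the banner.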

6) Which vulnerability the worm tries to exploit? In which architectures?

This worm exploits a buffer overflow vulnerability in OpenSSL in versions before 0.9.6e.  Apache/mod_ssl is targeted as it uses OpenSSL for mod_ssl.  The buffer overflow is in the session negotiation code that permits the execution of arbitrary code on the host.

The worm is looking for x86 Apache http servers with the following Linux/Apache version combinations:

By default, it tries Red-Hat 1.3.23 if no match on the architectures table exists, so versions not on this list may still be vulnerable to this or future enhanced versions of this worm.

The architectures[] array and the code at the start of exploit() show how the worm determines the best function address to use for the infection.  In exploit() it first sets "arch=-1", calls GetAddress() to receive the server string and then loops with strstr() to find a match in architectures[].  If a match is found, arch is set to the matching index; if not, it remains -1.  After the loop, if arch is still -1 it is set to 9, the 9th item in architectures[], and the worm continues its attempt to infect the target host.  This 9th item was undoubtedly a popular Linux/Apache/mod_ssl combination and is a last-ditch effort to infect hosts whose web administrators may have changed their server strings.
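
The lookup-with-fallback pattern described above can be sketched generically as follows.  This is an illustration only: the function name, the table passed in and the rule that the first match wins are my assumptions, and none of the worm's actual table entries or addresses are reproduced.

```c
#include <stddef.h>
#include <string.h>

/* Returns the index of the first table entry found inside the server
   string, or the fallback index when nothing matches. */
int pick_arch(const char *server, const char *const *table, int count,
              int fallback)
{
    int arch = -1;
    int i;

    for (i = 0; i < count; i++) {
        if (strstr(server, table[i]) != NULL) {
            arch = i;             /* assumption: first match wins */
            break;
        }
    }
    if (arch == -1)
        arch = fallback;          /* no match: use the default entry */
    return arch;
}
```

The point for defenders is the fallback branch: a server whose banner matches nothing is still attacked with the default entry, so altering the server string alone is no protection.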

The code then sets up 2 SSL sessions.  The first, ssl1, sets up a weakness that lets the second session, ssl2, get an interactive session with the target host.  Once the overflows are in, it runs the sh() function over ssl2's socket, which actually transfers the worm.

"Apache/mod_ssl Worm", Variant "B" of the Cert Alert Advisory described at http://www.cert.org/advisories/CA-2002-27.html has a general description of this worm.

7) What kind of information is sent by the worm by email? To which account?

The worm sends 3 items in the mail message: the host id, the hostname and the "sip".  It sends the message directly to freemail.ukr.net with aion@ukr.net as the recipient.

The function to review in .unlock.c is mailme(char *sip).

To find the source of the sip variable passed into mailme(sip) we must go back to where the program is started.  Just after the main() function forks, the program calls mailme(argv[1]), which is the first parameter passed on the command line.  The sh(int sockfd) function shows that /tmp/httpd is started with the localip as the first argument.  Following it through conv() shows it is a text representation of the IP of the host from which the worm was launched.

It is telling aion@ukr.net where the infected host is, likely so they can take advantage of the .update.c code discussed later and to leverage the infected hosts for a denial of service attack.

8) Which port (and protocol) is used by the worm to communicate to other infected machines?

Port 4156/udp is used for peer-to-peer communications.  It appears as if the worm maintains a list of peer systems it has infected and sends regular broadcasts, of which senderror() is one.

One giveaway is the line "if (audp_listen (&udpserver, PORT) != 0)" in main().  Following this, it is clear that udpserver is just that, a listening server for peer messages, which the subsequent switch() statements convert to commands.

To clean up indenting for code readability I used dos2unix to remove DOS CR/LF sequences and indent to indent the code consistently.

9) Name 3 functionalities built in the worm to attack other networks.

Three functionalities are: udp flood, tcp flood and dns flood.  Others in the case statement in main() are for information gathering, like email scan and info.  The main purpose of these floods is to cause a denial of service.

The commands come in from the udpserver.  It should be noted that the system also has other features such as the ability to run a command and scan the file system looking for email addresses.  Commands like wget could be executed to extend attacks to other services.

10) What is the purpose of the .update.c program? Which port does it use?

.update.c is a back door daemon program allowing direct shell access to the system.  It listens on port 1052/tcp and expects only the password aion1981 for shell access.  It overwrites argv[0] to show only the process name "update" and forks to the background.  By writing over argv[0] the process hides the fact that it started in /tmp.  See also 11).  It would appear in many respects like a normal Apache web server process.

11)  Bonus Question: What is the purpose of the SLEEPTIME and UPTIME values in the .update.c program?

The purpose of these two values is to reduce the probability that this program will be detected.  It does this by opening the service for a limited time, specifically 10 seconds (UPTIME), and then not accepting connections for 5 minutes (SLEEPTIME).  So if you were to use a utility like nmap or netstat, you would see it for only 10 seconds in each 310 second period, reducing the odds of detecting it on a system to 10/310, or 1 in 31.

From another host (Solaris in my case) you can take advantage of this with something like:

#!/bin/ksh
while [ 1 -gt 0 ]
do
  telnet hostname 1052
  # try again in 2 seconds
  sleep 2
done

Once the telnet connects, just type the password aion1981, hit enter and you're in.
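
The 1 in 31 figure is simple arithmetic, sketched here so the effect of other UPTIME and SLEEPTIME values can be checked.  The function names are hypothetical, and the model assumes a single instantaneous check at a random moment, which is a simplification of how nmap or netstat would really be used.

```c
/* Fraction of each cycle during which the back door is listening. */
double listen_fraction(int uptime, int sleeptime)
{
    return (double)uptime / (double)(uptime + sleeptime);
}

/* The same odds expressed as "1 in N", rounded down. */
int one_in_n(int uptime, int sleeptime)
{
    return (uptime + sleeptime) / uptime;
}
```

With UPTIME 10 and SLEEPTIME 300, listen_fraction() is 10/310 and one_in_n() is 31.  A probe repeated more often than every 10 seconds, like the loop above, cannot miss an open window entirely.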


Time Log


Resources


Notes

These are the verbose notes and steps I took to get the information.  Some of the notes may be incorrect, as I didn't pursue some avenues further, but this is the path I took to understand its operation.

Downloaded the .unlock file onto a Solaris system and reviewed the questions.  Running the strings command on it, the distribution looks even, so it must be a compressed or encrypted file.  So I vi the file; the first byte is consistent with other gzip files I have.  Let's try:

cp .unlock temp.gz
gzip -d temp.gz

Hm, this worked.  So it is gzip.  Now editing the file with vi I see "ustar" in the middle and know tar files have headers.  I review the /usr/include/tar.h file in Linux.  Dates, and dates mean time zones.  So I set my time zone to GMT0 to get GMT dates and times.  I am using ksh, so I executed:

TZ=GMT0
export TZ

Back to the tar, let's get some dates.  This is a tar file, so I execute:

$ tar tvf temp
-rw-r--r-- root/wheel 70981 2002-09-20 13:28:11 .unlock.c
-rw-r--r-- root/wheel 2792 2002-09-19 21:57:48 .update.c

With vi I see the date/time in the tar header; it appears to be 07542621153.  For lack of knowing a better way, I write a little C program to convert it:

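#include <stdio.h>
#include <stdlib.h>   /* getenv */
#include <time.h>

int main(void)
{
  time_t t = 07542621153;  /* octal date from tar file */
  const char *tz = getenv("TZ");

  /* ctime() already ends its string with a newline */
  printf("%s %s", tz ? tz : "(TZ unset)", ctime(&t));
  return 0;
}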

And run it to get the output:

GMT0 Fri Sep 20 13:28:11 2002

Spent a few minutes reviewing the code.

Doing well here... day 1 is over but got some stuff.

------------

Started by re-reviewing the questions.  Review the code for question 2).  Notice the CR/LF issue; it is trivial, but would help if looking for proof on the suspect's PC.

On to question 3).  For any program to compile and run source it needs to execute other programs, so it might call system() or one of the exec()/spawn() system calls.  So let's grep the code with:

$ egrep -e exec -e spawn -e system .*.c
.unlock.c: "exec bash -i\n");
.unlock.c: printf("%s: Exec format ....
.unlock.c: else senderror(&udpclient,id,"Unable ...
.update.c: execl("/bin/sh","sh -i",(char *)0);

There also has to be an entry point into the program, which would be
main():

$ grep main .*.c
.unlock.c: if (!strcmp(email,"webmaster@mydomain.com")) continue;
.unlock.c:int main(int argc, char **argv) {
.update.c:main (int argc,char **argv)

Hm, places to look and both files have a main, likely two parts to this beast. To answer this question we will have to examine the code and figure out basically how it works to know which is the worm and what the other piece is for.

.update.c seems smaller, let's look at this source first.  It appears to overwrite argv[0] right away, hiding its startup information, probably to hide its path from ps commands.  And then it forks.  It then opens up a listening socket.  This is a typical server daemon that listens on port 1052/tcp for just a password of aion1981.  I will look at the weird for/SLEEPTIME/UPTIME later.  If the password is right it starts up an interactive shell ("/bin/sh -i").  This would be suitable for "telnet <ip> 1052" and password to gain access.  So .update.c is for remote access, likely after the compromise.  Will compile later once I know more.

So now to .unlock.c.  To make this a little simpler, let's write down what each interesting function does; it helps when reverse engineering code to have such a reference.

So for question 3), the process name the worm uses when it is running is httpd, it sets this after starting up.

-------------

Go to the web to answer 6).  Find it pretty quickly, knowing a rough overview of this worm.  Because it looks like it pushes buffer overflows down SSL, a good match is identified.

I like the bonus question; it looks like the program limits how long it is listening.  Re-review .update.c to make sure it is safe to compile and run, fix an accept integer issue with it and test it out on Red-Hat 7.3.  Works as I suspected.

I think I have most of the answers except for I am not comfortable with 8 and 9.  Going to read the CERT in detail and review the whole main() and exploit() functions of .unlock.c again...

Looking at how patchwork the code seems, with duplications and even bugs, it looks like perhaps a library of code snippets exists that hackers use to paste together worms and hacks.  For example, .update.c could be reused by any number of root kits or worms.  Parts of .unlock.c seem to be "canned" functions.  The ^M in the .unlock file looks like code pasted from a Windows box.

On my Solaris box I use dos2unix to remove the CR/LF's ^M and indent to reformat the code for easier reading.

Reviewing the code again seeing a pattern in the VERSION define, looks like a date... modifying the answers.  Pays to take a second, third .. look.  Especially since I want to know more about ESCANPORT.

Question 8 is the most difficult thus far, it requires some more code interpretation, grep is our friend:

$ grep PORT .unlock.c 
#define PORT 4156
#define SCANPORT 80
#define ESCANPORT 10100
q->port=PORT;
audp_relay(&udpserver,&ts,srv,PORT);
audp_relay(&udpserver,&ts,srv,PORT);
#define FINDSCKPORTOFS 208 + 12 + 46
overwrite_next_chunk[FINDSCKPORTOFS] = (char) (port & 0xff);
overwrite_next_chunk[FINDSCKPORTOFS+1] = (char) ((port >> 8) & 0xff);
if (audp_listen(&udpserver,PORT) != 0) {
atcp_sync_connect(&clients[n],srv,SCANPORT);
audp_setup(&client,(char*)ip,ESCANPORT); 

SCANPORT 80 we know about, saw this with getting the "Server: " string.  There are two other port defines not yet known, 4156 and 10100.  "udpserver" looks interesting.  But grep'ing for it is verbose.  Let's look for PORT using vi and the search command "/PORT".

Using dos2unix and indent fixes the erratic indenting and makes this patchwork code more readable.

In a function called relay() I see "audp_relay (&udpserver, &ts, srv, PORT);" which strikes my eye.  Looking at how this works, this sets up a list of servers to broadcast to. Not sure I fully understand how, but at this point it is safe to say port 4156/udp is the one for 8).  There is also the give away at "if (audp_listen (&udpserver, PORT) != 0)" where the peer listening udp is started.

isreal() pops up; it looks like the code skips isreal() networks.  senderror() looks like it can be chatty, as all peers will get errors.  This is not efficient.  Still not there for 9), but there is more time to review the code.  It does appear this worm avoids internal networks.
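
To show the kind of test that "avoiding internal networks" implies, here is a sketch that flags the usual private and loopback IPv4 ranges from the first two octets of an address.  This is my own illustration and not the worm's isreal() code; which ranges the worm really skips is not reproduced here, and the function name is hypothetical.

```c
/* Returns 1 if an IPv4 address whose first two octets are a and b falls
   in an RFC 1918 private range or the loopback net, 0 otherwise. */
int is_internal_ipv4(unsigned char a, unsigned char b)
{
    if (a == 10)
        return 1;                          /* 10.0.0.0/8 */
    if (a == 172 && b >= 16 && b <= 31)
        return 1;                          /* 172.16.0.0/12 */
    if (a == 192 && b == 168)
        return 1;                          /* 192.168.0.0/16 */
    if (a == 127)
        return 1;                          /* loopback */
    return 0;
}
```

A scanner that skips such addresses never touches the inside networks where unexpected traffic stands out the most.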

-------------

Mostly brute force code study.  Following the actions of ssl1 and ssl2 around and more on the exploit().  I can compile .unlock.c on RH 8.0 and 7.3 and it produces an executable, but I am not in a lab with this so running it isn't an option.

-------------

Going to spend about 2-4 hours today, completing the write up of what I know and submitting.


Summary

There is little doubt this worm is the real item, designed to worm through many systems and give someone control of a large number of machines for a denial of service attack.  By avoiding internet addresses used only for internal purposes, it avoids detection by loosely configured firewalls.  The purpose is clearly to be deceptive and to gain control of large numbers of systems.

Although this worm in its current form is limited to Linux/Intel/Apache/mod_ssl systems, there should be no comfort taken if your platform is not listed.  It would be easy enough to correct byte order issues, integer sizing issues, other portability issues and socket leaks to make this worm much more effective and hazardous than it currently is.  It could also be easily adapted to other buffer overflow vulnerabilities and cross-platform variants.

In fact, after reviewing the code and seeing a few bugs, I wonder if this worm didn't just get away early from its creator.  With more polish, it could become a really bad worm.  The patchwork nature of the source code indicates that a "hack and worm" code kit exists.  And since the source code is now widely available, I would expect to see future derivatives of this code.

If you are a "White Hat", this worm exemplifies the need for a good firewall architecture and best practices.  A number of good practices would have gone a long way toward minimizing the impact and detecting this worm immediately.  For example, a firewall that allows incoming connections only on 80/tcp and 443/tcp and denies connections initiated outbound would log the outbound attempts to ports 25/tcp, 80/tcp and 4156/udp, giving away that a system was infected.

However, a good firewall is not a reason not to patch susceptible Apache/mod_ssl servers.  It is often hard to explain to business managers why you need to drop what you're doing, but keeping vulnerabilities down and maintaining a good firewall setup should be considered a necessity.  The time from the CERT alert to this worm's version was short, too short to rely on patching alone: I know first hand that many commercial vendors did not have a fix out by September 20th.