Project Honeynet Scan of the Month 24
redhive laboratories
This month's project involved the forensic analysis of a floppy that was seized by the police from a suspected drug dealer (a made-up scenario). You may want to read the police report for the appropriate background information. The following is a detailed write-up covering our methodologies and conclusions. We begin with brief answers to the six questions that were asked. We then discuss our analysis and conclusions in detail. It is important to note that we discuss not just the correct approach but our entire approach, incorrect paths and all. We end the write-up with a listing of the tools and reference materials we used to conduct our analysis.
1. Who is Joe Jacobs' supplier of marijuana and what is the address listed for the supplier?
By examining Jimmy Jungle.doc, which is apparently a letter from Joe to his supplier, we determined that the following is the name and address of Joe Jacobs' supplier:
Jimmy Jungle

2. What crucial data is available within the coverpage.jpg file and why is this data crucial?
The cover page file contains the password to the encrypted zip file (scheduled visits.zip). The password appears at the end of the file (offset 0x3d20) as "pw=goodtimes".

3. What (if any) other high schools besides Smith Hill does Joe Jacobs frequent?
After successfully opening the encrypted scheduled visits spreadsheet it is apparent that Joe Jacobs also frequents the following high schools:
Key High School

4. For each file, what processes were taken by the suspect to mask them from others?
The restored and repaired floppy image contains 3 files. These files were modified to mask them from others in the following ways:

5. What processes did you (the investigator) use to successfully examine the entire contents of each file?
The answer to this question lies in the Detailed Analysis section.

6. Bonus Question: What Microsoft program was used to create the Cover Page file? What is your proof? (Proof is the key to getting this question right, not just making a guess.)
The cover page file contains padding in the 4-byte repeating sequence 0x28a28a00. We created JFIF images using Microsoft Paint and other programs. Of the files we compared, the 0x28a28a00 padding appeared only in those produced by Microsoft Paint.
First things first: we need to download, verify, and decompress the floppy image. We began the analysis on our Linux box, as it provides all of our favorite standard tools (grep, strings, perl, etc.) plus excellent forensic tools.
    $ curl http://project.honeynet.org/scans/scan24/image.zip > image.zip
      % Total    % Received % Xferd  Average Speed          Time             Curr.
                                     Dload  Upload Total    Current  Left    Speed
    100 18146  100 18146    0     0  14077      0  0:00:01  0:00:01  0:00:00 34284
    $ md5sum image.zip
    b676147f63923e1f428131d59b1d6a72  image.zip
    $ unzip image.zip
    Archive:  image.zip
      inflating: image

We check to see what 'file' has to say about image and then we mount it.

    $ file image
    image: x86 boot sector, system MSDOS5.0, FAT (12 bit)
    # insmod vfat
    # insmod loop
    # losetup /dev/loop1 image
    # mkdir mounted
    # mount -o ro /dev/loop1 mounted/
    # cd mounted/
    # ls
    cover page.jpgc            schedu~1.exe

'file' reports that image is a FAT12 MSDOS disk. We switch to root, load the FAT module, load the loop module, set up a loop device, and mount the image on that device. You'll notice that we don't use the noexec and nosuid flags for mount because we trust 'file' and because we like to live dangerously. Two files: a JPEG and an executable. The familiar .exe extension further confirms that we are indeed dealing with a DOS image. Let's try to figure out what these files are.

    # file *
    cover page.jpgc           : PC formatted floppy with no filesystem
    schedu~1.exe: Zip archive data, at least v2.0 to extract
    # hexdump cover\ page.jpgc\ \ \ \ \ \ \ \ \ \ \ 
    0000000 f6f6 f6f6 f6f6 f6f6 f6f6 f6f6 f6f6 f6f6
    *
    0000200 0000 0000 0000 0000 0000 0000 0000 0000
    *
    0003ce0
    # cp schedu~1.exe ..
    # cd ..
    # unzip
    # mv schedu~1.exe schedu~1.zip
    # unzip schedu~1.zip
    Archive:  schedu~1.zip
      End-of-central-directory signature not found.  Either this file is not
      a zipfile, or it constitutes one disk of a multi-part archive.  In the
      latter case the central directory and zipfile comment will be found on
      the last disk(s) of this archive.
    note:  schedu~1.zip may be a plain executable, not an archive
    unzip:  cannot find zipfile directory in one of schedu~1.zip or
            schedu~1.zip.zip, and cannot find schedu~1.zip.ZIP, period.
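The unzip error above refers to one of the record signatures that every zip file carries. As a minimal sketch (written in Python for illustration, not a tool we ran during the analysis), the following lists which of the three standard "PK" record signatures occur in a blob of bytes. The signature values are the standard ones from the zip format; the function name and the synthetic sample data are our own assumptions.

```python
# Sketch: report which standard ZIP record signatures appear in some bytes.
# find_zip_signatures and the sample data are hypothetical, for illustration.

ZIP_SIGNATURES = {
    b"PK\x03\x04": "local file header",
    b"PK\x01\x02": "central directory file header",
    b"PK\x05\x06": "end of central directory record",
}


def find_zip_signatures(data: bytes) -> dict:
    """Map each record name to the list of offsets where its signature occurs."""
    found = {}
    for signature, name in ZIP_SIGNATURES.items():
        offsets = []
        position = data.find(signature)
        while position != -1:
            offsets.append(position)
            position = data.find(signature, position + 1)
        found[name] = offsets
    return found


if __name__ == "__main__":
    # Synthetic stand-in for an archive whose trailing record is missing:
    # one local header, one central directory header, no end record.
    sample = b"PK\x03\x04" + b"\x00" * 26 + b"PK\x01\x02" + b"\x00" * 42
    for name, offsets in find_zip_signatures(sample).items():
        print(f"{name}: {offsets if offsets else 'not found'}")
```

To try it on a real file, read the file in binary mode and pass the bytes to find_zip_signatures. An empty list for the end of central directory record would be consistent with the error unzip printed.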
The initial thought that cover page.jpgc was a JPEG was dismissed by 'file'. The contents of the file were extremely confusing: a slew of 0xf6's followed by a slew of nulls. This raised a series of questions. What is this file? An XOR table to be used in the future? Perhaps the file's statistics (offsets, sizes, etc.) provide some kind of important information. Why does the filename contain trailing spaces?

The executable is reportedly a zip file. We renamed it and attempted to decompress it but failed. Is it corrupt, or is it not a zip file? We jumped to Google and found a reference on zip file formats [4]. Examining the contents of the file verified that it is indeed a zip file; however, the trailing PK tag had been removed, rendering the file unrecognizable by standard zip utilities. We restored the zip file in two ways: by hand (beyond the scope of this document) and the easy way, using PKZIPFIX, a DOS utility from the old-school pkzip package. We transferred the fixed zip file to our Windows box and were able to open it using Windows compressed folders. The zip file contained an encrypted Excel spreadsheet, "scheduled visits.xls". Despite our best efforts we were never able to restore the zip file well enough for it to be recognized by *all* compression utilities. At this point we thought we had everything we needed and came up with three methods to crack the password:
Our next attempt at cracking the zip file was to launch a partial known plaintext attack [1, 3]. There are two prerequisites to launching this attack. The first is that you need at least 13 bytes of known plaintext. The second is that you must compress the known plaintext using exactly the same method that was used to compress the target zip file. The target zip file contains only one file. Though we cannot be absolutely sure what kind of file it is, the extension suggests that it is probably an Excel spreadsheet. Viewing various Excel files, it is obvious that they share a standard header. We extracted 48 bytes of this header and stored them as excel_header.dat. We then compressed that file with various utilities and at various compression levels, took each of these compressed header files, and launched the known plaintext attack against the target zip file using Advanced ZIP Password Recovery. Again we were disappointed, as the results were not fruitful. Our initial thought in response to this failure was that the file was not an actual Excel spreadsheet and was merely named as such to serve as a diversion.

At this point we had no option left but to launch a brute force attack. We started the brute force attack on our Linux box using Zip Cracker and, while waiting for the results, restarted our analysis with an entirely new approach. In the end it took Zip Cracker 9 days to determine the password.

If you run 'strings' against the image file and examine the output, you will notice that there is significantly more data than what appears when the image is mounted. It's time to throw the image file into our favorite hex/text editor, Ultra Edit. We can clearly identify 3 different files:
     --------------------   <-- 1 sector -- Starts at 0
    |   Dos Boot Code    |
    |____________________|
    |       FAT #1       |  <-- 9 sectors -- Starts at 200h
    |____________________|
    |       FAT #2       |  <-- 9 sectors -- Starts at 1400h
    |____________________|
    |     Directory      |  <-- 14 sectors -- Starts at 2600h
    |____________________|
    |    Data Section    |  <-- Remainder of disk -- Starts at 4200h
    |____________________|

We began the reconstruction process with the Microsoft Word document. This file did not appear when we mounted the original image. It is apparent that the file was deleted, as its FAT entry did not exist. To restore the FAT we copied the file by hand to a fresh floppy and took the appropriate entry from that floppy's FAT. The file is also marked as erased in the directory entry. This is evident from the 0xe5 at offset 0x2600, which we changed to 0x42.

We next moved on to the reconstruction of the JPG image. To begin with, it was obvious that the long filename in the directory entry had been changed from .jpg to .jpgc. This was fixed by nulling out the 'c' at offset 0x2662. We then wondered why this file showed up as 0xf6's instead of the actual file. Analyzing the directory entry for this file, it is apparent that the starting cluster value was modified. The current starting cluster at offset 0x26ba (bytes 26-27 of the entry) is 0x01a4, which needs to be remedied. Cluster numbering is based at offset 0x3e00 (the data section at 0x4200 begins with cluster 2) and the JPG file starts at offset 0x9200; clusters are 0x200 bytes each. A little math: 0x9200 - 0x3e00 = 0x5400, and 0x5400 / 0x200 = 42, or 0x002a (the bytes have to be swapped on disk), so we replaced 0x01a4 with 0x002a.

The final file that needed restoration was the zip file. The first thing that needed to be done was to change the extension from .exe back to .zip (offset 0x2708). Checking through the rest of the directory entry, it becomes apparent that the file length was incorrectly set to 1000 bytes (stored as 0xe803) when it should really be 2416 bytes (stored as 0x7009). This change was made at offset 0x271c (bytes 28-31 of the entry).
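The offset and cluster arithmetic above can be double-checked with a few lines of code. The following is a rough sketch in Python, not part of our original toolset. It assumes the sector counts of a standard 1.44 MB FAT12 floppy (1 boot sector, two FATs of 9 sectors each, 14 root directory sectors, one 0x200-byte sector per cluster), which reproduce the start offsets in the layout above. All function and constant names are our own.

```python
import struct

# Assumed geometry of a standard 1.44 MB FAT12 floppy (see lead-in).
SECTOR_SIZE = 0x200        # bytes per sector; one sector per cluster here
BOOT_SECTORS = 1
FAT_COPIES = 2
SECTORS_PER_FAT = 9
ROOT_DIR_SECTORS = 14


def layout() -> dict:
    """Return the byte offset at which each region of the image starts."""
    fat1 = BOOT_SECTORS * SECTOR_SIZE
    fat2 = fat1 + SECTORS_PER_FAT * SECTOR_SIZE
    root_dir = fat1 + FAT_COPIES * SECTORS_PER_FAT * SECTOR_SIZE
    data = root_dir + ROOT_DIR_SECTORS * SECTOR_SIZE
    return {"fat1": fat1, "fat2": fat2, "root_dir": root_dir, "data": data}


def offset_to_cluster(offset: int, data_start: int) -> int:
    """Cluster number of the cluster beginning at a byte offset.

    FAT numbers the first cluster of the data section as cluster 2.
    """
    return (offset - data_start) // SECTOR_SIZE + 2


def cluster_to_offset(cluster: int, data_start: int) -> int:
    """Byte offset in the image at which a given cluster begins."""
    return data_start + (cluster - 2) * SECTOR_SIZE


def encode_start_cluster(cluster: int) -> bytes:
    """Directory entry bytes 26-27: 16-bit little-endian start cluster."""
    return struct.pack("<H", cluster)


def encode_file_size(size: int) -> bytes:
    """Directory entry bytes 28-31: 32-bit little-endian file length."""
    return struct.pack("<I", size)


if __name__ == "__main__":
    regions = layout()
    for name, start in regions.items():
        print(f"{name:8s} starts at 0x{start:04x}")
    cluster = offset_to_cluster(0x9200, regions["data"])
    print("JPG start cluster:", cluster, encode_start_cluster(cluster).hex())
    print("zip length bytes: ", encode_file_size(2416).hex())
```

The little-endian packing is why a value such as 2416 (0x0970) shows up in a hex editor as the bytes 70 09.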
At offset 0xcf20 we find this very intriguing line: "pw=goodtimes". Could this be the password to the encrypted zip file? Apparently it is. What is interesting is that "goodtimes" exists in our standard dictionary, so the dictionary attack must have failed due to the improper formatting of the zip file. This line falls right below the JPG file. Is this the crucial data that question number 2 is referencing? We weren't convinced at first. The fact that the JPG was grainy led us to believe that perhaps the image contained some steganographic data. We ran Stegdetect against the file and did not find anything. This was enough evidence to convince us to change the file length of the JPG so that it officially contains the "pw=goodtimes" string.

There really isn't much more to the bonus question than what was mentioned in the answers section. The repeating 0x28a28a00 pattern caught our attention, and we thought that perhaps this sequence was unique to a certain Microsoft program. It didn't take much effort to create a series of JPG (JFIF) images using different image applications and compare them all. Our theory held up in every comparison we made. While we could be wrong, we've already beaten this thing to death and didn't feel like proceeding any further.
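The padding comparison described above could be automated along the following lines. This is only a sketch in Python, and not how we actually compared the files (we looked at them in a hex editor). It assumes the pattern appears on disk as the byte sequence 28 a2 8a 00, and the function name and the synthetic sample are hypothetical.

```python
# Assumption: 0x28a28a00 is stored as the bytes 28 a2 8a 00, in that order.
PADDING = bytes.fromhex("28a28a00")


def longest_run(data: bytes, pattern: bytes = PADDING) -> tuple:
    """Return (offset, repeats) of the longest back-to-back run of pattern.

    Returns (-1, 0) when the pattern does not occur at all.
    """
    best_offset, best_repeats = -1, 0
    position = data.find(pattern)
    while position != -1:
        repeats = 1
        # Count how many copies of the pattern follow immediately.
        while data.startswith(pattern, position + repeats * len(pattern)):
            repeats += 1
        if repeats > best_repeats:
            best_offset, best_repeats = position, repeats
        # Resume the search after the run that was just measured.
        position = data.find(pattern, position + repeats * len(pattern))
    return best_offset, best_repeats


if __name__ == "__main__":
    # Synthetic bytes standing in for image data; not taken from the floppy.
    sample = b"\xff\xd8" + PADDING * 3 + b"\x11" + PADDING * 5 + b"\xff\xd9"
    offset, repeats = longest_run(sample)
    print(f"longest run: {repeats} repeats at offset 0x{offset:x}")
```

Running longest_run over the bytes of each test image produced by a different application would show which ones contain long runs of the pattern and which contain none.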
[1] "A Known Plaintext Attack on the PKZIP Stream Cipher", Eli Biham & Paul C. Kocher, http://citeseer.nj.nec.com/122586.html