Here we go. Please forgive my English.

> 1. Identify the encryption algorithm used to encrypt the file.

The encryption algorithm is very simple. Every bit of the original file
was negated (bitwise NOT, i.e. each byte XOR 0xff), so the encryption and
decryption processes are identical.

> 2. How did you determine the encryption method?

My initial assumption was that your challenge would not require any
sophisticated cryptanalysis, given the amount of data and time available.
My second thought was that the file is so small (532 B) that it must be
some kind of text, probably a little piece of code (shell script, C, perl,
etc.), unless compression was used of course.

What I started to look for was a character-based encryption algorithm
using bitwise and/or arithmetic operations, so that characters are
transformed regardless of their position in the file. Why? Maybe because
I had noticed recurring sequences of the same bytes
(e.g. 0x9d 0x96 [0x91]).

At first I picked a few bytes from the file whose decrypted values I
thought I could guess:

- last byte:    0xf5, B11110101, 245
  guess:        '\n' 0x0a, B00001010, 10
- first byte:   0xa4, B10100100, 164
  guess:        '#' 0x23, B00100011, 35
- second byte:  0x99, B10011001, 153
  guess #1:     '!' 0x21, B00100001, 33   ( #!/bin/sh )
  guess #2:     'i' 0x69, B01101001, 105  ( #include ... )

Every encrypted byte has the most significant bit set (>= 128). This is
true for _all_ the bytes in 'somefile', but this alone is nothing really
fascinating.

The last byte struck me. All I had to do to obtain its presumed decrypted
value was to negate all the bits (i.e. XOR 0xff), but I thought that would
be too simple (!). The other hypothetical decrypted values seemed so
likely to me that I tried to find something more elaborate which would be
common to all three of my guesses.
I considered bit rotations, different XOR masks and adding a constant
value (as in the Caesar cipher), with no astonishing results, naturally ;)

Then I tried to be a little more general. After all, the above reasoning
was based on very uncertain assumptions. I wrote a C program which counted
the total number of _different_ characters in a given file and the
relative frequency (percentage) of each, and compared the encrypted
message with different types of files. Below you can find some examples
(only the four most frequent byte values are listed for each file):

[ some of the files were taken from Linux (PLD distro) running on an i386
  machine ]

- encrypted message ('somefile' - 532 B):
     5.6 % : d3
     6.4 % : 8f
     8.4 % : d0
     9.8 % : 8c
    Symbols total: 45

- long shell script with comments (/etc/rc.d/rc.sysinit - 10 KB):
     4.6 % : 6e
     4.6 % : 74
     6.2 % : 65
    13.8 % : 20
    Symbols total: 94

- short shell script (/etc/rc.d/init.d/rstatd - 780 B):
     6.3 % : 0a
     6.3 % : 73
     8.5 % : 74
    12.5 % : 20
    Symbols total: 60

- short binary file (gettext.o from /usr/lib/libc.a - 540 B):
  [ ELF 32-bit LSB relocatable, Intel 80386, version 1, stripped ]
     2.2 % : 74
     2.4 % : 01
     2.8 % : 2e
    64.5 % : 00
    Symbols total: 70

- short gzip file (509 B):
     1.2 % : 97
     1.2 % : 9e
     1.6 % : 78
     2.0 % : 00
    Symbols total: 219

- English text (PPP-HOWTO ;) - 150 KB):
     5.8 % : 74
     7.3 % : 5f
     7.3 % : 65
    19.4 % : 20
    Symbols total: 97

If my initial assumption concerning the encryption method was right, the
encryption would only change the values of the characters, but neither
their total number nor their "frequency pattern". If not, the observations
below would be completely irrelevant.

It is quite clear that the encrypted file could not be a typical binary
file, because of the distance between the two most frequent characters
(9.8 % and 8.4 % in 'somefile' vs. 64.5 % and 2.8 % in the binary).
A compressed file can be ruled out as well: its information is spread
much more evenly over the character space (more distinct values are used;
that is how compression works AFAIK). It was not shown above, but the
min/max frequency values were 0.2 % / 9.8 % for the analyzed file and
0.2 % / 2.0 % for the gzip file.

The rest of the compared files have one thing in common: the most
frequent character is 0x20 (' '). Unfortunately, the distance between
0x20 and the other values is clearly bigger than the analogous distance
between 0x8c and 0xd0 in our 'somefile'. Nevertheless, I changed every
occurrence of 0x8c into 0x20 and of 0xf5 into 0x0a ('\n' - I just had the
impression that it was right) to see if I could recognize any structure
or pattern from some source code. In vain.

At this point I started losing faith that any of my assumptions were
close to the truth. The algorithm could be position dependent or not
character based. The original file could be strangely formatted (e.g.
with no '\n' characters and a minimum number of white spaces) or not a
text file at all.

I decided to give it one last shot and check the aforementioned XOR 0xff
transformation. Much to my surprise, I hit the nail on the head. :)

> 3. Decrypt the file, be sure to explain how you decrypted the file.

Here is the code of an encryptor/decryptor. I think it is completely
self-explanatory. The program sends the result to the standard output.
--- begin ---
#include <stdio.h>

int main(int argc, char *argv[])
{
    FILE *fin;
    int c;

    /* open in binary mode; the input is not text */
    if (argc != 2 || (fin = fopen(argv[1], "rb")) == NULL) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }

    /* read until EOF; testing feof() before the read would emit
       one extra byte at the end of the output */
    while ((c = fgetc(fin)) != EOF) {
        /* XOR 0xff - bitwise NOT */
        putchar(c ^ 0xff);
    }

    fclose(fin);
    return 0;
}
--- end ---

And here is the decrypted file:

--- begin ---
[file]
find=/dev/pts/01/bin/find
du=/dev/pts/01/bin/du
ls=/dev/pts/01/bin/ls
file_filters=01,lblibps.so,sn.l,prom,cleaner,dos,uconf.inv,psbnc,lpacct,USER
[ps]
ps=/dev/pts/01/bin/psr
ps_filters=lpq,lpsched,sh1t,psr,sshd2,lpset,lpacct,bnclp,lpsys
lsof_filters=lp,uconf.inv,psniff,psr,:13000,:25000,:6668,:6667,/dev/pts/01,sn.l,prom,lsof,psbnc
[netstat]
netstat=/dev/pts/01/bin/netstat
net_filters=47018,6668
[login]
su_loc=/dev/pts/01/bin/su
ping=/dev/pts/01/bin/ping
passwd=/dev/pts/01/bin/passwd
shell=/bin/sh
su_pass=l33th4x0r
--- end ---

> 4. Once decrypted, explain the purpose/function of the file and why it
> was encrypted.

With a high degree of certainty I can say that the original name of the
file was 'uconf.inv':

  'uconf' from 'Universal CONFiguration' (?)
  'inv'   from 'bit INVersion'

It is little wonder now why you changed it to 'somefile'.

The file is part of a quite popular rootkit which has been found, in
different versions, on compromised hosts running Solaris and Linux.
Intrusions were made through popular and vulnerable services like
rpc.ttdbserverd, snmpXdmid (Solaris) or lprng (Linux). A thorough analysis
of the complete rootkit is not the subject of this report. You can visit:

- http://www.sans.org/y2k/the_compromise.htm
- http://www.sans.org/infosecFAQ/malicious/comp_sys.htm
- http://archives.neohapsis.com/archives/sf/sun/2001-q2/0096.html

to obtain some information (probably not complete).

One of the components of the tool was a set of altered basic system
utilities, i.e. ls, find, du, ps, netstat, passwd, su, ping.
In the version of the rootkit that I know, the size of the binaries
stayed unchanged, but the beginning of each file was occupied by what was
probably the same wrapper program. It simply spawned the original
utilities located at the paths given in uconf.inv (the path to uconf.inv
was hardcoded - probably /dev/pts/01 in our case) and filtered their
output so that it did not contain any of the comma-separated strings from
the 'x_filters' lines. The general purpose was to hide malicious activity
on the host.

Uconf.inv syntax:

  [section_name1]
  utility_name1=exact_path1
  utility_name2=exact_path2
  ....
  section_name1_filters=comma_separated_list
  [section_name2]
  [..]

The exact behaviour of the wrapper was determined by the name under which
it ran. The name (i.e. 'utility_name', with the exception of 'su') was
looked up in the file, the appropriate binary ('exact_path') was executed
and the filtering ('section_name_filters') was performed.

Example: running 'ps' consisted of running '/dev/pts/01/bin/psr' and
cutting out the lines containing any of the following strings: 'lpq,
lpsched, sh1t, psr, sshd2, lpset, lpacct, bnclp, lpsys'. The original
utility had to be renamed to something other than 'ps' ('psr' here) so
that it could be hidden from the process list without also hiding the
line with 'ps' (i.e. the wrapper process).

The line with 'lsof_filters' is quite mysterious because there is no
corresponding 'lsof=' definition. Possibly the lsof forgery was not based
on the idea of a common wrapper.

'su_pass' holds the plain-text password for some kind of local backdoor
(through 'su' itself or 'xlogin' - a binary supplied with the rootkit).

Explanation of some of the filter strings:

file_filters
  - 01           - to hide the '/dev/pts/01' directory,
  - sn.l         - file with logs from a sniffer,
  - prom         - directory where logs from the sniffer were placed,
  - cleaner      - utility to clean logs,
  - psbnc,lpacct - binaries of 'Psychoid Bouncer'.
ps_filters
  - lpq          - TFN (Tribe Flood Network) client,
  - lpsched      - eggdrop (irc bot),
  - sshd2        - sshd for remote backdoor (port 13000 or 25000),
  - lpset        - sniffer,
  - lpsys        - identd (?).

lsof and net_filters
  - 6667, 6668, 47018 - ports, to hide the activity of an eggdrop, psbnc
                        or TFN client.

Why was 'uconf.inv' encrypted? To hide the password and to make finding
all the substituted binaries more difficult. I do not see any other
serious reason.

> 5. What lesson did you learn from this challenge?

I actually encountered this rootkit some time ago on a Solaris 2.5 box.
Although I found all the uploaded binaries and the changes made to the
system, I did not even try to decode uconf.inv. Here are my conclusions:

- knowing that something can be done is sometimes necessary in order to
  do it (or at least highly recommended),
- one should start with the simplest solutions :)

> 6. How long did this challenge take you?

Approximately 10 h (including writing this report).

Thanks go to imoszarh.