Name: Sebastien DAMAYE
Email: sebastien.damaye@aldeid.com
Answer 1: 00:25:00:fe:07:c4
Answer 2: AppleTV/2.4
Answer 3a: h
Answer 3b: ha
Answer 3c: hac
Answer 3d: hack
Answer 4: Hackers
Answer 5: http://a227.v.phobos.apple.com/us/r1000/008/Video/62/bd/1b/mzm.plqacyqb..640x278.h264lc.d2.p.m4v
Answer 6: Sneakers
Answer 7: $9.99
Answer 8: iknowyourewatchingme
Description:

+-----------------------------------------------------+
|                  FORENSICS CONTEST                  |
|              PUZZLE #3 - Ann’s AppleTV              |
| http://forensicscontest.com/2009/12/28/anns-appletv |
+-----------------------------------------------------+
|            Sébastien DAMAYE - 2010-01-11            |
+-----------------------------------------------------+

Questions:
-----------
1. What is the MAC address of Ann's AppleTV?
   00:25:00:fe:07:c4
2. What User-Agent string did Ann's AppleTV use in HTTP requests?
   AppleTV/2.4
3. What were Ann's first four search terms on the AppleTV (all incremental searches count)?
   h, ha, hac, hack
4. What was the title of the first movie Ann clicked on?
   Hackers
5. What was the full URL to the movie trailer (defined by "preview-url")?
   http://a227.v.phobos.apple.com/us/r1000/008/Video/62/bd/1b/mzm.plqacyqb..640x278.h264lc.d2.p.m4v
6. What was the title of the second movie Ann clicked on?
   Sneakers
7. What was the price to buy it (defined by "price-display")?
   $9.99
8. What was the last full term Ann searched for?
   iknowyourewatchingme

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
0. Rough analysis
-----------------------------------------------------------------------------
We first have to check the integrity of the pcap file:

------------- file integrity -------------
$ wget http://forensicscontest.com/contest03/evidence03.pcap
$ md5sum evidence03.pcap
f8a01fbe84ef960d7cbd793e0c52a6c9  evidence03.pcap
------------- /file integrity -------------

Using capinfos from nsm-console, we can extract the following information:

------------- nsm-console/capinfos -------------
File name: /home/sdamaye/puzzle3/evidence03.pcap
File type: Wireshark/tcpdump/... - libpcap
File encapsulation: Ethernet
Number of packets: 1778
File size: 1537222 bytes
Data size: 1508750 bytes
Capture duration: 171.171561 seconds
Start time: Mon Dec 28 05:08:01 2009
End time: Mon Dec 28 05:10:52 2009
Data rate: 8814.26 bytes/s
Data rate: 70514.05 bits/s
Average packet size: 848.57 bytes
------------- /nsm-console/capinfos -------------

Note that the capture covers a very short period (less than 3 minutes).

Using tshark, we can request the protocol hierarchy:

------------- protocol hierarchy -------------
$ tshark -r evidence03.pcap -z io,phs
(...truncated...)
frame                          frames:1778 bytes:1508750
  eth                          frames:1778 bytes:1508750
    ip                         frames:1778 bytes:1508750
      udp                      frames:28 bytes:6102
        dns                    frames:28 bytes:6102
      tcp                      frames:1750 bytes:1502648
        http                   frames:167 bytes:93189
          image-gif            frames:33 bytes:21202
          xml                  frames:18 bytes:20852
        tcp.segments           frames:65 bytes:46469
          http                 frames:65 bytes:46469
            xml                frames:17 bytes:11732
            image-jfif         frames:48 bytes:34737
------------- /protocol hierarchy -------------

So we know that we have to deal with XML and images (GIF [Graphics Interchange
Format] and JPEG [JPEG File Interchange Format]).
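
The content types can also be tallied without any external tool. The sketch
below is a minimal Python 3 illustration and not part of the original analysis:
it scans raw bytes for "Content-Type:" header lines, much as running strings
over the data would. The function name count_content_types is hypothetical, and
a header that happens to be split across two packets would be missed.

```python
import re
import sys
from collections import Counter

# Matches one Content-Type header line, stopping before the CR/LF.
CONTENT_TYPE_RE = re.compile(rb"Content-Type:[ \t]*([^\r\n]+)", re.IGNORECASE)


def count_content_types(raw):
    """Count the Content-Type header values found in a bytes buffer."""
    values = (match.group(1).decode("latin-1").strip()
              for match in CONTENT_TYPE_RE.finditer(raw))
    return Counter(values)


# Assumed usage: python3 content_types.py evidence03.pcap
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        for value, count in sorted(count_content_types(fh.read()).items()):
            print(count, value)
```

Because the counter works on any bytes buffer, it can be applied either to the
raw pcap file or to reassembled flow files.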
The content types could also have been listed with the following commands:

------------- Content-types -------------
$ tcpflow -r evidence03.pcap
$ strings * | grep Content-Type | sort | uniq
Content-Type: image/gif
Content-Type: image/jpeg
Content-Type: text/xml
Content-Type: text/xml; charset=UTF-8; encoding=UTF-8
------------- /Content-types -------------

It is also useful to list the hosts involved:

------------- List of hosts -------------
$ argus -r evidence03.pcap -w evidence03.ra
$ rahosts -r evidence03.ra
192.168.1.10: (11) 4.2.2.1, 8.18.65.10, 8.18.65.27, 8.18.65.32, 8.18.65.58,
8.18.65.67, 8.18.65.82, 8.18.65.88 - 8.18.65.89, 66.235.132.121, 224.0.0.251
$ racluster -M norep -m saddr daddr -nr evidence03.ra -w - \
  | rasort -L0 -m bytes -s saddr daddr pkts bytes
     SrcAddr          DstAddr   TotPkts   TotBytes
192.168.1.10       8.18.65.82       760     739734   <-- (image/jpeg)
192.168.1.10       8.18.65.10       391     399031   <-- (image/jpeg)
192.168.1.10       8.18.65.58       189     180068   <-- (image/jpeg)
192.168.1.10   66.235.132.121       147      46145   <-- (image/gif)
192.168.1.10       8.18.65.27        69      39138   <-- (text/xml)
192.168.1.10       8.18.65.67        67      34888   <-- (text/xml)
192.168.1.10       8.18.65.32        46      30470   <-- (text/xml)
192.168.1.10       8.18.65.88        39      17231   <-- (text/xml)
192.168.1.10       8.18.65.89        42      15943   <-- (text/xml)
192.168.1.10          4.2.2.1        22       3458   <-- (DNS)
192.168.1.10      224.0.0.251         6       2644   <-- (MDNS [Apple's Multicast DNS])
------------- /List of hosts -------------

The file-type information in the right-hand column is not provided by rahosts or
racluster. It comes from pyHttpXtract.py, the script I wrote for this puzzle.
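
For readers without Argus, the per-host totals above can be approximated with
the Python standard library alone. The following is a minimal Python 3 sketch
under stated assumptions: it only handles classic libpcap files with Ethernet
framing and IPv4, it groups both directions of a host pair together, and it
sums original frame lengths, so its byte counts are not guaranteed to match
Argus's accounting. The function name host_pair_totals is hypothetical.

```python
import socket
import struct
import sys
from collections import defaultdict


def host_pair_totals(pcap_bytes):
    """Return {(ip_a, ip_b): [packets, bytes]} for IPv4-over-Ethernet frames."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"          # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"          # file written on a big-endian host
    else:
        raise ValueError("not a classic libpcap file")

    totals = defaultdict(lambda: [0, 0])
    offset = 24               # skip the 24-byte global header
    while offset + 16 <= len(pcap_bytes):
        # Per-packet record header: ts_sec, ts_usec, incl_len, orig_len
        _sec, _usec, incl_len, orig_len = struct.unpack_from(
            endian + "IIII", pcap_bytes, offset)
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        # 14-byte Ethernet header; EtherType 0x0800 means IPv4
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        src = socket.inet_ntoa(frame[26:30])   # IPv4 source address
        dst = socket.inet_ntoa(frame[30:34])   # IPv4 destination address
        key = tuple(sorted((src, dst)))        # same key for both directions
        totals[key][0] += 1
        totals[key][1] += orig_len
    return dict(totals)


# Assumed usage: python3 host_pairs.py evidence03.pcap
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        pairs = host_pair_totals(fh.read())
    for (a, b), (pkts, nbytes) in sorted(pairs.items(),
                                         key=lambda kv: -kv[1][1]):
        print(f"{a:>15}  {b:>15}  {pkts:6d}  {nbytes:9d}")
```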
Whois requests made through ip2asn (a module of nsm-console) resolve the hosts as
follows:

------------- nsm-console/ip2asn -------------
Bulk mode; whois.cymru.com [2009-12-29 11:46:00 +0000]
3356    | 8.18.65.10       | LEVEL3 Level 3 Communications
3356    | 8.18.65.32       | LEVEL3 Level 3 Communications
3356    | 8.18.65.27       | LEVEL3 Level 3 Communications
3356    | 8.18.65.82       | LEVEL3 Level 3 Communications
3356    | 8.18.65.88       | LEVEL3 Level 3 Communications
3356    | 4.2.2.1          | LEVEL3 Level 3 Communications
3356    | 8.18.65.89       | LEVEL3 Level 3 Communications
15224   | 66.235.132.121   | OMNITURE ====
3356    | 8.18.65.67       | LEVEL3 Level 3 Communications
NA      | 224.0.0.251      | NA
3356    | 8.18.65.58       | LEVEL3 Level 3 Communications
NA      | 192.168.1.10     | NA
------------- /nsm-console/ip2asn -------------

Most servers resolve to Level 3 Communications, an international provider of
fiber-based communications services.

Note also the presence of Omniture (confirmed as "Omniture DC/2.0.0" in the HTTP
headers), which is known as a tracking (Web analytics) system. As we will see
later in the capture, it only delivers blank 2x2px GIF images. These images are
called "web-beacons" and are used to collect statistics.

"At present web analytics data are typically collected from server logs or using
web beacons. Web-beacons are small image requests placed in a web page to cause
communication between the user's device and a server. The server may be
controlled by the analytics provider, by the vendor whose website contains the
web-beacons, or by another party. Web-beacons are also known as clear GIFs, web
bugs, image requests, or pixel tags. Web-beacons can be used for advertising,
behavioral targeting, and other processes, to gather information a visits to
websites. Web-beacons are commonly used by analytics providers to gather
analytics data."
(Source: http://www.faqs.org/patents/app/20080249905)

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
1. What is the MAC address of Ann's AppleTV?
-----------------------------------------------------------------------------
The puzzle states that "Ann got a brand new AppleTV, and configured it with the
static IP address 192.168.1.10". The corresponding physical address is easy to
obtain with tcpdump's "-e" option, which prints the link-level (MAC) header:

------------- mac address -------------
$ tcpdump -e -r evidence03.pcap -c 1 'host 192.168.1.10'
reading from file evidence03.pcap, link-type EN10MB (Ethernet)
05:08:01.139183 00:25:00:fe:07:c4 (oui Unknown) > 00:23:69:ad:57:7b (oui Unknown), ethertype IPv4 (0x0800), length 79: 192.168.1.10.49174 > vnsc-pri.sys.gtei.net.domain: 40605+ A? ax.itunes.apple.com. (37)
------------- /mac address -------------

The source MAC address of this frame, sent from 192.168.1.10, shows that the MAC
address of Ann's AppleTV is "00:25:00:fe:07:c4".

Note that nsm-console's p0f (2.0.8) fingerprints the OS as "FreeBSD 6.x".
According to Wikipedia (source: http://en.wikipedia.org/wiki/Apple_TV), the
Apple TV software is in fact based on Mac OS X 10.4 (version 3.0.1 as of
October 29, 2009).

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
2. What User-Agent string did Ann's AppleTV use in HTTP requests?
-----------------------------------------------------------------------------
The User-Agent header is sent in each HTTP request. tcpdump can display it:

------------- User-agent -------------
$ tcpdump -X -r evidence03.pcap
05:08:01.453151 IP 192.168.1.10.49163 > 8.18.65.67.www: Flags [P.], ack 1, win 65535, options [nop,nop,TS val 1093999775 ecr 2140351007], length 346
(...truncated...)
0x00e0:  3031 3135 4437 465b 4345 5d0d 0a55 7365  0115D7F[CE]..Use <-------
0x00f0:  722d 4167 656e 743a 2041 7070 6c65 5456  r-Agent:.AppleTV <-------
0x0100:  2f32 2e34 0d0a 4966 2d4d 6f64 6966 6965  /2.4..If-Modifie <-------
(...truncated...)
------------- /User-agent -------------

We can see that the User-Agent string is: AppleTV/2.4

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
3. What were Ann's first four search terms on the AppleTV (all incremental searches count)?
-----------------------------------------------------------------------------
The following Python script extracts the HTTP requests whose path starts with
the string "/WebObjects".

------------- webObjects.py -------------
#!/usr/bin/python
# Python 2 script; requires the pcapy and impacket modules.
import pcapy
import impacket.ImpactDecoder as Decoders

reader = pcapy.open_offline("evidence03.pcap")
(header, payload) = reader.next()
# Loop until pcapy returns an empty payload (end of capture)
while payload!='':
    try:
        # Decode Ethernet -> IP -> transport layer and get the payload
        decoder = Decoders.EthDecoder()
        eth = decoder.decode(payload)
        ip = eth.child()
        tcp = ip.child()
        data = tcp.get_data_as_string()
        # HTTP lines are separated by CR/LF
        arrline = data.split('\x0d\x0a')
        for line in arrline:
            if line.startswith("GET /WebObjects"):
                # Strip the common prefixes to keep only the action and query
                line = line.replace('GET /WebObjects/MZStore.woa/wa/', '')
                line = line.replace('GET /WebObjects/MZSearch.woa/wa/', '')
                print line
        (header, payload) = reader.next()
    except:
        # Stop on the first frame that cannot be decoded or read
        break
------------- /webObjects.py -------------

Below are the results of the execution of this script:

------------- webObjects.py results -------------
$ ./webObjects.py
viewGrouping?id=39 HTTP/1.1
incrementalSearch?media=movie&q=h HTTP/1.1
incrementalSearch?media=movie&q=ha HTTP/1.1
incrementalSearch?media=movie&q=hac HTTP/1.1
incrementalSearch?media=movie&q=hack HTTP/1.1
viewMovie?id=333441649&s=143441 HTTP/1.1
relatedItemsShelf?ct-id=3&id=333441649&storeFrontId=143441&mt=6 HTTP/1.1
incrementalSearch?media=movie&q=s HTTP/1.1
incrementalSearch?media=movie&q=sn HTTP/1.1
incrementalSearch?media=movie&q=sne HTTP/1.1
incrementalSearch?media=movie&q=sneb HTTP/1.1
incrementalSearch?media=movie&q=snea HTTP/1.1
incrementalSearch?media=movie&q=sneak HTTP/1.1
viewMovie?id=283963264&s=143441 HTTP/1.1
relatedItemsShelf?ct-id=3&id=283963264&storeFrontId=143441&mt=6 HTTP/1.1
incrementalSearch?media=movie&q=i HTTP/1.1
incrementalSearch?media=movie&q=ik HTTP/1.1
incrementalSearch?media=movie&q=ikn HTTP/1.1
incrementalSearch?media=movie&q=ikno HTTP/1.1
incrementalSearch?media=movie&q=iknow HTTP/1.1
incrementalSearch?media=movie&q=iknowy HTTP/1.1
incrementalSearch?media=movie&q=iknowyo HTTP/1.1
incrementalSearch?media=movie&q=iknowyou HTTP/1.1
incrementalSearch?media=movie&q=iknowyour HTTP/1.1
incrementalSearch?media=movie&q=iknowyoure HTTP/1.1
incrementalSearch?media=movie&q=iknowyourew HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewa HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewat HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatc HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatch HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatchi HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatchin HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatching HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatchingm HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatchingme HTTP/1.1
------------- /webObjects.py results -------------

The output shows four types of requests:
- viewGrouping: initial display of movie results
- incrementalSearch: Ann's searches
- viewMovie: movies Ann clicked on
- relatedItemsShelf: information related to viewMovie (movies others also watched)

The output shows that Ann's first four (incremental) search terms are: "h", "ha",
"hac" and "hack". This is an AJAX-like search, which updates the results each
time a new character is entered.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
4. What was the title of the first movie Ann clicked on?
-----------------------------------------------------------------------------
From the previous results (webObjects.py), we deduce that Ann first clicked on
the reference with id #333441649:

viewMovie?id=333441649&s=143441 HTTP/1.1

In the report produced by pyHttpXtract.py, we find this request on frame #307:

------------- Frame #307 -------------
GET /WebObjects/MZStore.woa/wa/viewMovie?id=333441649&s=143441 HTTP/1.1
Accept: */*
Accept-Language: en
Accept-Encoding: gzip, deflate
Cookie: s_vi=[CS]v1|259C176A85010C29-6000010D80115D7F[CE]
User-Agent: AppleTV/2.4
X-Apple-Store-Front: 143441-1,3
Connection: keep-alive
Host: ax.itunes.apple.com
------------- /Frame #307 -------------

Frame #309 shows the server's response:

------------- Frame #309 -------------
HTTP/1.1 200 OK
Last-Modified: Sat, 26 Dec 2009 08:58:38 GMT
x-apple-lok-response-date: Sat Dec 26 04:37:43 PST 2009
Content-Encoding: gzip
x-apple-lok-current-storefront: 143441-1,3
x-apple-application-site: NWK
Content-Type: text/xml
x-apple-lok-expire-date: Sat Dec 26 01:38:38 PST 2009
x-apple-lok-mode: disaster-recovery
x-apple-lok-stor: memcached
x-apple-max-age: 3600
x-apple-woa-inbound-url: /WebObjects/MZStore.woa/wa/viewMovie?id=333441649&s=143441
x-apple-application-instance: 16026
x-apple-lok-path: v0_1:MZStore/viewMovie&id=333441649&s=143441-143441-1,3,pc-3-Ak
x-apple-aka-ttl: Generated Sat Dec 26 04:37:43 PST 2009, Expires Sat Dec 26 05:37:43 PST 2009, TTL 3600s
x-apple-lok-ttl: Generated Sat Dec 26 00:58:38 PST 2009, Expires Sat Dec 26 01:38:38 PST 2009, TTL 2400s
x-webobjects-loadaverage: 0
Content-Length: 2278
Expires: Mon, 28 Dec 2009 04:08:36 GMT
Cache-Control: max-age=0, no-cache
Pragma: no-cache
Date: Mon, 28 Dec 2009 04:08:36 GMT
Connection: keep-alive
Vary: Accept-Encoding
Vary: X-Apple-Store-Front
X-Apple-Partner: origin.0
------------- /Frame #309 -------------

Note that the server's response includes a header called
"x-apple-woa-inbound-url", which echoes the URL of the request.
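
Since the response echoes the request URL, a response can be tied back to a
movie id without looking at the request frame. The following Python 3 sketch
illustrates the idea and is not taken from pyHttpXtract.py: the helper names
parse_headers and inbound_movie_id are hypothetical. Header values are kept in
lists because some headers (Vary, for instance) appear more than once in these
responses.

```python
from urllib.parse import parse_qs, urlsplit


def parse_headers(raw):
    """Split a raw HTTP header block into (start_line, {name: [values]}).

    Header names are lowercased; repeated headers keep all their values.
    """
    lines = raw.split("\r\n")
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return lines[0], headers


def inbound_movie_id(headers):
    """Return the 'id' parameter of x-apple-woa-inbound-url, or None."""
    urls = headers.get("x-apple-woa-inbound-url")
    if not urls:
        return None
    query = parse_qs(urlsplit(urls[0]).query)
    return query.get("id", [None])[0]


if __name__ == "__main__":
    excerpt = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Encoding: gzip\r\n"
        "Content-Type: text/xml\r\n"
        "x-apple-woa-inbound-url: "
        "/WebObjects/MZStore.woa/wa/viewMovie?id=333441649&s=143441\r\n"
        "\r\n"
    )
    status, hdrs = parse_headers(excerpt)
    print(status, inbound_movie_id(hdrs))
```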
This response carries an XML file (Content-Type: text/xml), originally
compressed (Content-Encoding: gzip), which can be opened from the generated
report.html (pyHttpXtract.py). An extract of the XML file is shown below:

------------- Extract of xml file (frame #309) -------------
page-type
template-name
item
template-parameters
title
Hackers
store-version
1.0
pings
(...truncated...)
------------- /Extract of xml file (frame #309) -------------

In the key/string pairs, the value for "title" is "Hackers", which is the name
of the first movie Ann clicked on.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
5. What was the full URL to the movie trailer (defined by "preview-url")?
-----------------------------------------------------------------------------
In the same XML file (frame #309), the string associated with the "preview-url"
key is identical for all types (STDQ, SVOD, HDVOD, ...):

------------- Extract of xml file (frame #309) -------------
(...truncated...)
STDQ
(...truncated...)
preview-url
http://a227.v.phobos.apple.com/us/r1000/008/Video/62/bd/1b/mzm.plqacyqb..640x278.h264lc.d2.p.m4v
(...truncated...)
SVOD
(...truncated...)
preview-url
http://a227.v.phobos.apple.com/us/r1000/008/Video/62/bd/1b/mzm.plqacyqb..640x278.h264lc.d2.p.m4v
(...truncated...)
HDVOD
(...truncated...)
preview-url
http://a227.v.phobos.apple.com/us/r1000/008/Video/62/bd/1b/mzm.plqacyqb..640x278.h264lc.d2.p.m4v
(...truncated...)
flavors
(...truncated...)
preview-url
http://a227.v.phobos.apple.com/us/r1000/008/Video/62/bd/1b/mzm.plqacyqb..640x278.h264lc.d2.p.m4v
(...truncated...)
8:480p
(...truncated...)
preview-url
http://a227.v.phobos.apple.com/us/r1000/008/Video/62/bd/1b/mzm.plqacyqb..640x278.h264lc.d2.p.m4v
(...truncated...)
7:720p
(...truncated...)
preview-url
http://a227.v.phobos.apple.com/us/r1000/008/Video/62/bd/1b/mzm.plqacyqb..640x278.h264lc.d2.p.m4v
(...truncated...)
(...truncated...)
------------- /Extract of xml file (frame #309) -------------

STDQ  = Standard Quality
VoD   = Video on Demand
SVoD  = Subscription Video on Demand
HDVoD = High Definition Video on Demand

The preview address is:
http://a227.v.phobos.apple.com/us/r1000/008/Video/62/bd/1b/mzm.plqacyqb..640x278.h264lc.d2.p.m4v

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
6. What was the title of the second movie Ann clicked on?
-----------------------------------------------------------------------------
Using the same results as for question #3, we see that the second movie Ann
clicked on has id #283963264 and corresponds to the following line:

viewMovie?id=283963264&s=143441 HTTP/1.1

This reference appears in the generated report.html (pyHttpXtract.py) on frame
#1181, with the following request:

------------- Frame #1181 -------------
GET /WebObjects/MZStore.woa/wa/viewMovie?id=283963264&s=143441 HTTP/1.1
Accept: */*
Accept-Language: en
Accept-Encoding: gzip, deflate
Cookie: s_vi=[CS]v1|259C176A85010C29-6000010D80115D7F[CE]
User-Agent: AppleTV/2.4
X-Apple-Store-Front: 143441-1,3
Connection: keep-alive
Host: ax.itunes.apple.com
------------- /Frame #1181 -------------

The server's response is on frame #1183:

------------- Frame #1183 -------------
HTTP/1.1 200 OK
Last-Modified: Sat, 26 Dec 2009 04:51:14 GMT
x-apple-lok-response-date: Sat Dec 26 02:33:51 PST 2009
Content-Encoding: gzip
x-apple-lok-current-storefront: 143441-1,3
x-apple-application-site: CUP
Content-Type: text/xml
x-apple-lok-expire-date: Fri Dec 25 21:31:14 PST 2009
x-apple-lok-mode: disaster-recovery
x-apple-lok-stor: memcached
x-apple-max-age: 3600
x-apple-woa-inbound-url: /WebObjects/MZStore.woa/wa/viewMovie?id=283963264&s=143441
x-apple-application-instance: 3127
x-apple-lok-path: v0_1:MZStore/viewMovie&id=283963264&s=143441-143441-1,3,pc-3-Ak
x-apple-aka-ttl: Generated Sat Dec 26 02:33:51 PST 2009, Expires Sat Dec 26 03:33:51 PST 2009, TTL 3600s
x-apple-lok-ttl: Generated Fri Dec 25 20:51:14 PST 2009, Expires Fri Dec 25 21:31:14 PST 2009, TTL 2400s
x-webobjects-loadaverage: 0
Content-Length: 2586
Vary: Accept-Encoding
Vary: X-Apple-Store-Front
Expires: Mon, 28 Dec 2009 04:09:29 GMT
Cache-Control: max-age=0, no-cache
Pragma: no-cache
Date: Mon, 28 Dec 2009 04:09:29 GMT
Connection: keep-alive
Vary: X-Apple-Store-Front
X-Apple-Partner: origin.0
------------- /Frame #1183 -------------

This response carries an XML file that provides the following results:

------------- Extract of xml file on frame #1183 -------------
page-type
template-name
item
template-parameters
title
Sneakers
(...truncated...)
------------- /Extract of xml file on frame #1183 -------------

The "string" for the key "title" is "Sneakers", which is the second movie Ann
clicked on.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
7. What was the price to buy it (defined by "price-display")?
-----------------------------------------------------------------------------
Further down in the same XML file (frame #1183), the same price appears both
for the standard-quality offer and for the 640x480 flavor:

------------- Extract of xml file on frame #1183 -------------
(...truncated...)
store-offers
STDQ
(...truncated...)
price-display
$9.99
(...truncated...)
(...truncated...)
flavors
4:640x480LC-128
(...truncated...)
price-display
$9.99
(...truncated...)
(...truncated...)
------------- /Extract of xml file on frame #1183 -------------

We see that the price to buy the movie is $9.99.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
8. What was the last full term Ann searched for?
-----------------------------------------------------------------------------
Using the results of the webObjects.py script, we see that Ann's last
incremental search sequence is:

incrementalSearch?media=movie&q=i HTTP/1.1
incrementalSearch?media=movie&q=ik HTTP/1.1
incrementalSearch?media=movie&q=ikn HTTP/1.1
incrementalSearch?media=movie&q=ikno HTTP/1.1
incrementalSearch?media=movie&q=iknow HTTP/1.1
incrementalSearch?media=movie&q=iknowy HTTP/1.1
incrementalSearch?media=movie&q=iknowyo HTTP/1.1
incrementalSearch?media=movie&q=iknowyou HTTP/1.1
incrementalSearch?media=movie&q=iknowyour HTTP/1.1
incrementalSearch?media=movie&q=iknowyoure HTTP/1.1
incrementalSearch?media=movie&q=iknowyourew HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewa HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewat HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatc HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatch HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatchi HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatchin HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatching HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatchingm HTTP/1.1
incrementalSearch?media=movie&q=iknowyourewatchingme HTTP/1.1

The last full term Ann searched for is "iknowyourewatchingme". Does Ann really
know that she has been tracked?

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
9. pyHttpXtract.py
-----------------------------------------------------------------------------
9.1. Objectives

This script aims at decoding HTTP conversations from a pcap file. It displays
each request and response and, for responses, decodes any attached body
(text/xml, image/gif, image/jpeg), even when it is compressed with gzip.

The gzip format is specified in RFC 1952. Applied to our example, we recognize
a gzip signature (1f8b) on frame #45:

746e 6572 3a20 6f72 6967 696e 2e30 0d0a  tner: origin.0..
0d0a 1f8b 0800 0000 0000 0000 edbd e976  ...............v
dc38 9a2d fabf 9f02 95f7 56af acb5 2c81  .8.-......V...,.

According to RFC 1952, the first ten bytes of the stream break down as follows:

+---+---+---+---+---+---+---+---+---+---+
|ID1|ID2|CM |FLG|     MTIME     |XFL|OS |
+---+---+---+---+---+---+---+---+---+---+
|1f |8b |08 |00 |  00 00 00 00  |00 |00 |
+---+---+---+---+---+---+---+---+---+---+

We can identify:

+-------+-------------+----------------------------------------------------------------+
| ID1   | 1f          | Fixed value 31 (0x1f, \037): first byte of gzip magic number   |
+-------+-------------+----------------------------------------------------------------+
| ID2   | 8b          | Fixed value 139 (0x8b, \213): second byte of gzip magic number |
+-------+-------------+----------------------------------------------------------------+
| CM    | 08          | Compression method. "This identifies the compression method    |
|       |             | used in the file. CM = 0-7 are reserved. CM = 8 denotes the    |
|       |             | "deflate" compression method, which is the one customarily     |
|       |             | used by gzip and which is documented elsewhere". Since the     |
|       |             | value here is 8, the body is in the "deflate" format.          |
+-------+-------------+----------------------------------------------------------------+
| FLG   | 00          | All bits are 0 in our capture. This flag byte is divided into  |
|       |             | individual bits as follows:                                    |
|       |             |   bit 0   FTEXT                                                |
|       |             |   bit 1   FHCRC                                                |
|       |             |   bit 2   FEXTRA                                               |
|       |             |   bit 3   FNAME                                                |
|       |             |   bit 4   FCOMMENT                                             |
|       |             |   bit 5   reserved                                             |
|       |             |   bit 6   reserved                                             |
|       |             |   bit 7   reserved                                             |
+-------+-------------+----------------------------------------------------------------+
| MTIME | 00 00 00 00 | Modification time; 0 means no time stamp is available          |
+-------+-------------+----------------------------------------------------------------+
| XFL   | 00          | Extra flags (none set)                                         |
+-------+-------------+----------------------------------------------------------------+
| OS    | 00          | Type of file system on which compression took place:           |
|       |             |     0 - FAT filesystem (MS-DOS, OS/2, NT/Win32)                |
|       |             |     1 - Amiga                                                  |
|       |             |     2 - VMS (or OpenVMS)                                       |
|       |             |     3 - Unix                                                   |
|       |             |     4 - VM/CMS                                                 |
|       |             |     5 - Atari TOS                                              |
|       |             |     6 - HPFS filesystem (OS/2, NT)                             |
|       |             |     7 - Macintosh                                              |
|       |             |     8 - Z-System                                               |
|       |             |     9 - CP/M                                                   |
|       |             |    10 - TOPS-20                                                |
|       |             |    11 - NTFS filesystem (NT)                                   |
|       |             |    12 - QDOS                                                   |
|       |             |    13 - Acorn RISCOS                                           |
|       |             |   255 - unknown                                                |
+-------+-------------+----------------------------------------------------------------+

9.2. Script limitations

- Doesn't check the input file type (pcap)
- Only recognizes text/xml, image/gif, image/jpeg
- Only works "off line" (use of pcapy.open_offline)
- Tested on Linux Debian 5 only (should be compatible with other systems, but untested)
- The script is slow because it uses the Python chardet module

9.3. Script content

[ See pyHttpXtract.py ]

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
10. Post-analysis and conclusions
-----------------------------------------------------------------------------
10.1. Post-analysis

Two images could not be read by the script, apparently because they are
encrypted. According to the "command-tab" blog (source:
http://www.command-tab.com/2006/07/27/itunes-art/), "Apple limited iTunes album
art access to users who download a song or album requiring the artwork".
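
One simple way to test whether a body is what its Content-Type claims is to
look at its first bytes. The Python 3 sketch below is an illustration only, not
an excerpt of pyHttpXtract.py: it decodes the fixed 10-byte gzip header
described in section 9.1 and guesses a body type from well-known signatures
(gzip 1f 8b, JPEG ff d8 ff, GIF "GIF87a"/"GIF89a"). The function names are
hypothetical. An encrypted ".enc.jpg" body would be expected to lack the JPEG
signature, but that is an assumption: the bytes of the two images are not shown
in this write-up.

```python
import struct


def parse_gzip_header(data):
    """Decode the fixed 10-byte gzip header (RFC 1952) into a dict."""
    if len(data) < 10:
        raise ValueError("need at least 10 bytes")
    # ID1, ID2, CM, FLG (1 byte each), MTIME (4 bytes, little-endian), XFL, OS
    id1, id2, cm, flg, mtime, xfl, os_code = struct.unpack(
        "<BBBBIBB", data[:10])
    if (id1, id2) != (0x1F, 0x8B):
        raise ValueError("not a gzip stream")
    return {"cm": cm, "flg": flg, "mtime": mtime, "xfl": xfl, "os": os_code}


def sniff_body(data):
    """Guess a body type from its leading magic bytes."""
    if data[:2] == b"\x1f\x8b":
        return "gzip"
    if data[:3] == b"\xff\xd8\xff":
        return "jpeg"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    return "unknown"


if __name__ == "__main__":
    # First bytes of the body seen on frame #45
    body = bytes.fromhex("1f8b0800000000000000edbde976")
    print(sniff_body(body), parse_gzip_header(body))
```

A body reported as "unknown" despite a Content-Type of image/jpeg is a
candidate for closer inspection.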
The first image is requested on frame #331:

------------- Frame #331 -------------
GET /us/r1000/032/Video/f0/48/dd/mzi.pizbdeal.enc.jpg?downloadKey2=1265245618_f3a714a27ea9388f7c07104353e1d763 HTTP/1.1
Accept: */*
Accept-Language: en
Accept-Encoding: gzip, deflate
Cookie: s_vi=[CS]v1|259C176A85010C29-6000010D80115D7F[CE]
User-Agent: AppleTV/2.4
X-Apple-Store-Front: 143441-1,3
Connection: keep-alive
Host: a1.phobos.apple.com
------------- /Frame #331 -------------

The server's response is on frame #333:

------------- Frame #333 -------------
HTTP/1.1 200 OK
Server: Apache
ETag: "083a10d28fed5c5ffb12f54f3626960d:1254204840"
Last-Modified: Tue, 29 Sep 2009 06:09:24 GMT
Accept-Ranges: bytes
Content-Length: 380247
Content-Type: image/jpeg
Cache-Control: max-age=2592000
Expires: Wed, 27 Jan 2010 04:08:36 GMT
Date: Mon, 28 Dec 2009 04:08:36 GMT
Connection: keep-alive
------------- /Frame #333 -------------

The second image is requested on frame #1201:

------------- Frame #1201 -------------
GET /us/r1000/013/Music/ab/89/91/mzi.mndbzjie.enc.jpg?downloadKey2=1265341450_c424218767dba839ab1937b1fce2bd04 HTTP/1.1
Accept: */*
Accept-Language: en
Accept-Encoding: gzip, deflate
Cookie: s_vi=[CS]v1|259C176A85010C29-6000010D80115D7F[CE]
User-Agent: AppleTV/2.4
X-Apple-Store-Front: 143441-1,3
Connection: keep-alive
Host: a1.phobos.apple.com
------------- /Frame #1201 -------------

And the server's response is on frame #1353:

------------- Frame #1353 -------------
HTTP/1.1 200 OK
Server: Apache
ETag: "e83c98c11ed2e1f3d24d475ace96b731:1223262319"
Last-Modified: Mon, 06 Oct 2008 03:02:06 GMT
Accept-Ranges: bytes
Content-Length: 11923
Content-Type: image/jpeg
Cache-Control: max-age=2592000
Expires: Wed, 27 Jan 2010 04:09:30 GMT
Date: Mon, 28 Dec 2009 04:09:30 GMT
Connection: keep-alive
------------- /Frame #1353 -------------

These images are requested with a key (downloadKey2) that seems to be composed
of two parts, according to Marv on Record's blog (source:
http://marv.kordix.com/archives/000849.html#3):
- The first part (before "_") seems to be the expiration date (Unix timestamp)
- The second part seems to be an MD5 hash

I would also like to mention what I think is an error on frame #1559
("Cneonction" instead of "Connection"). In addition, the Connection header then
appears twice (close, then keep-alive), as if the first one had been written by
hand (one of Ann's hacks? ;-)

------------- Frame #1559 -------------
HTTP/1.1 200 OK
Server: Apache/2.2.6 (Unix)
Last-Modified: Thu, 17 Sep 2009 17:52:42 GMT
ETag: "17c0789-4517-ae9d2e80"
Accept-Ranges: bytes
Content-Length: 17687
Cneonction: close   <<<--------- error ?
Content-Type: image/jpeg
Cache-Control: max-age=2592000
Expires: Wed, 27 Jan 2010 04:09:42 GMT
Date: Mon, 28 Dec 2009 04:09:42 GMT
Connection: keep-alive
------------- /Frame #1559 -------------

10.2. Conclusions

This is my second participation in http://forensicscontest.com/ and I always
enjoy these puzzles. It was really interesting to work on gzip and XML files
and to discover some Apple peculiarities (encrypted JPEG images). I also
discovered "web-beacons", used as a tracking system via the Omniture server.
However, I found this puzzle harder to script than the previous ones.

Although many earlier scripts (puzzles #1 and #2) were written in Perl or Ruby,
I chose Python, first because it was a challenge, and second because it is used
by excellent network-oriented tools (Scapy, Honeysnap, ...). I was nevertheless
surprised by the lack of online documentation for some Python modules
(Impacket).

I am sure my script can be improved, but I am quite satisfied with it, given
that I started learning Python only a few days before this contest was
published and so had little time to build up knowledge. I am now going to try
to improve my script, and I look forward to the next puzzle to answer these
questions: What is the next surprise? Are we going to discover whether Ann
really knows about us?
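
As a footnote to section 10.1, the two-part reading of downloadKey2 can be
checked with a few lines of Python 3. This is a minimal sketch under the
hypothesis stated there (Unix expiration timestamp, "_", then a 32-character
hexadecimal string that may be an MD5 hash); the function name
split_download_key is hypothetical, and nothing in the capture proves what the
second part is actually a hash of.

```python
import re
from datetime import datetime, timezone

# Assumed layout: <decimal Unix timestamp>_<32 lowercase hex digits>
KEY_RE = re.compile(r"^(\d+)_([0-9a-f]{32})$")


def split_download_key(key):
    """Split a downloadKey2 value into (expiry as UTC datetime, hex digest)."""
    match = KEY_RE.match(key)
    if match is None:
        raise ValueError("unexpected downloadKey2 layout: %r" % key)
    expires = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
    return expires, match.group(2)


if __name__ == "__main__":
    for key in ("1265245618_f3a714a27ea9388f7c07104353e1d763",
                "1265341450_c424218767dba839ab1937b1fce2bd04"):
        expires, digest = split_download_key(key)
        print(expires.isoformat(), digest)
```

The two keys seen in the capture decode to 2010-02-04 01:06:58 UTC and
2010-02-05 03:44:10 UTC, a little over five weeks after the capture date, which
is at least compatible with an expiration date.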