Name: Wesley McGrew 
Email: wesley@mcgrewsecurity.com
Answer 1: 00:25:00:fe:07:c4
Answer 2: AppleTV/2.4
Answer 3a: h
Answer 3b: ha
Answer 3c: hac
Answer 3d: hack
Answer 4: Hackers
Answer 5: http://a227.v.phobos.apple.com/us/r1000/008/Video/62/bd/1b/mzm.plqacyqb..640x278.h264lc.d2.p.m4v
Answer 6: Sneakers
Answer 7: $9.99
Answer 8: iknowyourewatchingme

Description: 
Network Forensics Puzzle Contest #3 Submission
Wesley McGrew
Mississippi State University National Forensics Training Center
wesley@mcgrewsecurity.com
http://mcgrewsecurity.com
http://security.cse.msstate.edu/ftc

Summary
-------

I have created a tool, atvsnarf.py, that a forensic examiner can use to extract useful information from a packet capture of AppleTV traffic into a more familiar and flexible format suitable for analysis and reporting.  Usage of this tool requires very little prior knowledge of how the AppleTV interacts with Apple's servers.  This tool has a number of useful features for forensic examiners:

   * Ease of use: All that is required to begin processing is to give atvsnarf.py the pcap file name.

   * CSV output: Easy to import into spreadsheet software for analysis, filtering, and to create tables for reports.

   * .plist extraction: Responses from Apple's servers are extracted, decompressed, and saved to .plist files that are referenced in the CSV output.  These can be analyzed easily using Apple's Property List Editor (part of the Xcode suite), or by viewing the XML directly in a text editor.

   * Accurate timestamps: To avoid issues involving time zones and clocks on the capture station, timestamps are extracted from the Apple server's responses to the AppleTV's requests, rather than using the timestamps from the pcap metadata.

   * Robustness: Since I do not have an AppleTV device to test with, my only test case is the supplied pcap.  All requests analyzed in a given pcap, including those that atvsnarf.py does not recognize, are listed with a starting packet index that can be used by an investigator to find the relevant traffic in Wireshark.  This allows for more detailed analysis when it is required.  Adding other request and response handlers should be relatively simple.
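As a rough illustration of the timestamp feature, the sketch below pulls the Date header out of a raw HTTP response and parses it with the standard library. It is written against Python 3 rather than the Python 2.5 the tool was tested with, it is not the code atvsnarf.py uses, and the function name and sample response are invented for the example.

```python
from email.utils import parsedate_to_datetime

def server_timestamp(raw_response):
    """Return the HTTP Date header of a raw response as a datetime, or None."""
    head = raw_response.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    for line in head.split("\r\n")[1:]:          # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "date":
            return parsedate_to_datetime(value.strip())
    return None

# Hypothetical response, invented for illustration only.
sample = (b"HTTP/1.1 200 OK\r\n"
          b"Date: Mon, 01 Mar 2010 12:00:00 GMT\r\n"
          b"Content-Length: 0\r\n\r\n")
stamp = server_timestamp(sample)
```

Parsing the header into a datetime, instead of keeping it as a string, makes it easier to sort rows or convert them to a local time zone in a report.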

Procedure
---------

The hash given for the pcap was verified:

   MD5 (evidence03.pcap) = f8a01fbe84ef960d7cbd793e0c52a6c9
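
The same check can be scripted. This is a minimal sketch using Python 3's hashlib; the helper names are made up for the example, and the in-memory stand-in is there only because the capture file is not part of this text.

```python
import hashlib
import io

# Digest quoted above for evidence03.pcap.
EXPECTED_MD5 = "f8a01fbe84ef960d7cbd793e0c52a6c9"

def md5_of_stream(stream, chunk_size=65536):
    """Hash a binary file object in chunks so a large capture need not fit in memory."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def matches_expected(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 against the published digest."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle) == expected.lower()

# Self-check on an in-memory stand-in for a file.
demo_digest = md5_of_stream(io.BytesIO(b"abc"))
```

Calling matches_expected("evidence03.pcap") on a copy of the evidence file should return True if the copy is intact.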

Being unfamiliar with traffic generated by AppleTV units, I opened the pcap in Wireshark and began examining how the device made its requests to the server.  By looking at the list of TCP conversations, it was obvious that the AppleTV software makes its requests using HTTP.  Examining the requests revealed that all of the requests in the supplied pcap are GETs, meaning that the parameters of the request are given in the URL itself.

Six request types were identified from the pcap, all of which are recognized, identified, and recorded by atvsnarf.py:

   * Viewed Grouping Page: Requests for pages that represent groups of media, such as the initial "Movies" page.
   * Incremental Search: Requests made as the user types search terms, presumably used to populate the autocomplete suggestions (the suggestions are included in the server responses).
   * Viewed Movie Page: Requests made for individual movie information (the response plist contains a large amount of detail).
   * Related Items: Requests (presumably) made to populate a list of media similar to the one just requested.  One of these followed each of the "View" type requests.
   * Submitted Metrics: These requests are apparently a way for Apple to gather usage metrics.  Each one corresponds with a request of another type.
   * Image: Simple requests for images to populate pages.

The last two request types may be redundant for some investigations, and can be suppressed from atvsnarf.py's output by using the "-q" flag.
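
The classification and the "-q" filter can be pictured as a small lookup table. The sketch below is an illustration in Python 3, not the script's own code; the substrings are the ones the script's get_type() matches on, while classify(), keep() and the example.invalid URLs are hypothetical.

```python
# (substring, label) pairs; the first substring found in the URL wins.
REQUEST_TYPES = [
    ("metrics.apple.com", "Submitted Metrics"),
    (".jpg", "Image"),
    ("viewGrouping", "Viewed Grouping Page"),
    ("incrementalSearch", "Incremental Search"),
    ("viewMovie", "Viewed Movie Page"),
    ("relatedItems", "Related Items"),
]

# The two types that quiet mode leaves out of the report.
QUIET_TYPES = {"Submitted Metrics", "Image"}

def classify(url):
    """Return the request type label for a URL, or 'Unknown'."""
    for needle, label in REQUEST_TYPES:
        if needle in url:
            return label
    return "Unknown"

def keep(url, quiet=False):
    """Decide whether a request should appear in the output."""
    return not (quiet and classify(url) in QUIET_TYPES)
```

A table like this keeps new request types to a one-line addition, which fits the goal of making further handlers easy to add.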

The Tool
--------

atvsnarf.py requires Python and the Scapy packet manipulation library. 

On BackTrack 4, it has been tested with Python 2.5.2 and Scapy 2.1.0.

On Mac OS X, it has been tested with Python 2.5.4 and Scapy 2.0.1, both from the MacPorts repository.  On this platform and configuration, the first line of the script should be changed to read:
	
	#!/opt/local/bin/python2.5

This ensures that atvsnarf.py uses the MacPorts-installed version of Python that is associated with the MacPorts-installed version of Scapy.

atvsnarf.py is convenient to use on OS X, since the Property List Editor makes it easy to view the extracted server responses.  The version of Scapy in MacPorts has a wickedly huge dependency tree that may take the port command a long time to install.  If you are in a hurry, you may be better off installing Scapy by hand (only the bare minimum of Scapy functionality is needed).

atvsnarf.py's Usage:

./atvsnarf.py [-q] <pcap file>

-q : Quiet(er) mode.  Suppress output of metrics submission
     and image requests.  Default is to include these entries

The output is to a .csv file named "<pcap file name>.csv", as well as a series of plist files with the naming format:

<pcap file name>_<num>_<type>.plist

These files contain the plist responses from the iTunes servers the AppleTV device connects to.  The number <num> is the packet number, within the pcap file, of the corresponding request.  This number is indexed starting at 1, so the investigator can use it to find the request and its TCP stream in Wireshark.

The <type> refers to the type of request the server response is for.
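
To make the naming scheme concrete, here is a small sketch (Python 3) that builds a plist name, recovers the packet number from it, and turns that number into a Wireshark display filter. The helper names and the packet number 42 are invented for the example; frame.number is Wireshark's display-filter field for the packet number.

```python
import re

def plist_name(pcap_name, packet_number, request_type):
    """Build '<pcap file name>_<num>_<type>.plist'."""
    return "%s_%d_%s.plist" % (pcap_name, packet_number, request_type)

def packet_number_of(plist_file):
    """Recover <num> from a plist file name, or None if it does not fit the scheme."""
    match = re.search(r"_(\d+)_[A-Za-z]+\.plist$", plist_file)
    return int(match.group(1)) if match else None

def wireshark_filter(packet_number):
    """Display filter that jumps to the request packet in Wireshark."""
    return "frame.number == %d" % packet_number

# Hypothetical packet number, for illustration only.
name = plist_name("evidence03.pcap", 42, "viewedMovie")
```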

The "<pcap file name>.csv" file is a comma-delimited file that contains the following information for each request:

   * Packet number (for reference in Wireshark)
   * Timestamp (extracted from server response for accuracy) 
   * Apple TV IP address  (\ Assuming pcap captured on local network)
   * Apple TV MAC address (/         to the Apple TV                )
   * Apple TV User Agent, including Version Number
   * Type of Request.  Supported Types:
      Submitted Metrics
      Image
      Viewed Grouping Page
      Incremental Search
      Viewed Movie Page
      Related Items
      (Unrecognized requests are logged and indexed, too)
   * Notes (Search terms, movie name)
      Also contains a reference to a plist with the server's response
      for most requests where it's useful
   * Full URL of the request
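
For examiners who would rather script the analysis than use a spreadsheet, the sketch below (Python 3) reads rows in this column order and filters them by request type. The column keys and the sample row are invented for the example. Because the tool writes fields without quoting, a value that itself contains a comma would shift the later columns, so treat this as a starting point.

```python
import csv
import io

# Hypothetical short keys for the eight columns listed above, in order.
COLUMNS = ["packet", "timestamp", "ip", "mac", "user_agent", "type", "notes", "url"]

def load_rows(handle):
    """Read the header-less CSV into a list of dicts keyed by COLUMNS."""
    return [dict(zip(COLUMNS, row)) for row in csv.reader(handle) if row]

def rows_of_type(rows, request_type):
    """Keep only the rows with the given request type."""
    return [row for row in rows if row["type"] == request_type]

# Invented sample row; the addresses are documentation-only values.
sample = io.StringIO(
    "12,01 Mar 2010 12:00:00 GMT,192.0.2.10,00:00:5e:00:53:01,AppleTV/2.4,"
    "Incremental Search,Query: h XML: 12_incrementalSearch.plist,"
    "http://example.invalid/incrementalSearch?q=h\n"
)
rows = load_rows(sample)
searches = rows_of_type(rows, "Incremental Search")
```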

Tool Internals
--------------

atvsnarf.py searches through the pcap file for HTTP requests being made with User-Agent fields that begin with "AppleTV".  When it finds one, it begins processing that request to include it in the CSV.  Useful data about the request is extracted from the request packet and responses from the server.
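
The detection step might look like the following sketch. It parses the request headers instead of running a regular expression over the whole payload as the script does, and it operates on raw bytes so it runs without Scapy (Python 3). The function names and sample requests are hypothetical.

```python
def parse_request(payload):
    """Split a raw HTTP request into (method, path, headers); None if it is not HTTP."""
    head = payload.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return None
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return parts[0], parts[1], headers

def is_appletv_request(payload):
    """True when the payload is an HTTP request whose User-Agent begins with 'AppleTV'."""
    parsed = parse_request(payload)
    return bool(parsed) and parsed[2].get("user-agent", "").startswith("AppleTV")

# Invented request, for illustration only.
request = (b"GET /viewMovie?id=1 HTTP/1.1\r\n"
           b"User-Agent: AppleTV/2.4\r\n"
           b"Host: example.invalid\r\n\r\n")
```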

When a request is found, atvsnarf.py attempts to reconstruct the server's response from the pcap file.  The tool gathers packets from the packet list that belong to that TCP stream, beginning at the packet immediately following the request.  Since the AppleTV uses keep-alive to make more than one request per established connection, atvsnarf.py gathers response packets up to the next request made by the AppleTV on that stream.  The tool finally reassembles the payloads of these packets into the server's response to that request.
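
The reassembly logic can be sketched without Scapy by modelling each packet as a small tuple. This is an illustration of the idea in Python 3, not the script's code; the Segment type and the sample traffic are invented. Like the tool, it assumes packets appear in order with no retransmissions or loss.

```python
from collections import namedtuple

# Simplified stand-in for a captured TCP segment.
Segment = namedtuple("Segment", "src sport dst dport payload")

def same_stream(a, b):
    """True when two segments share the same pair of (address, port) endpoints."""
    return ({(a.src, a.sport), (a.dst, a.dport)} ==
            {(b.src, b.sport), (b.dst, b.dport)})

def collect_response(segments, request_pos):
    """Join payloads on the request's stream up to the next AppleTV request."""
    request = segments[request_pos]
    parts = []
    for seg in segments[request_pos + 1:]:
        if not same_stream(seg, request):
            continue                      # other conversation
        if b"User-Agent: AppleTV" in seg.payload:
            break                         # keep-alive: next request starts here
        parts.append(seg.payload)
    return b"".join(parts)

# Invented traffic using documentation-only addresses.
TV, SERVER, OTHER = "192.0.2.10", "192.0.2.1", "192.0.2.99"
segments = [
    Segment(TV, 50000, SERVER, 80, b"GET /a HTTP/1.1\r\nUser-Agent: AppleTV/2.4\r\n\r\n"),
    Segment(SERVER, 80, TV, 50000, b"HTTP/1.1 200 OK\r\n\r\nfir"),
    Segment(OTHER, 50001, SERVER, 80, b"noise"),
    Segment(SERVER, 80, TV, 50000, b"st"),
    Segment(TV, 50000, SERVER, 80, b"GET /b HTTP/1.1\r\nUser-Agent: AppleTV/2.4\r\n\r\n"),
    Segment(SERVER, 80, TV, 50000, b"second"),
]
```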

From the response, the timestamp can be extracted.  From the request itself, the tool can gather the complete user agent (with version information), the URL of the request, and what may be the IP and MAC addresses of the AppleTV device (assuming the packets were captured on the local network with the AppleTV, and not from some other point between the device and Apple's servers).

The request type is determined by the contents of the URL.  For the most useful request types, the server responses are gzip-compressed and not immediately readable.  The tool uses Python's gzip library to decompress the .plist responses to these requests, and saves them to files that can be more easily viewed by the investigator.
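
A minimal sketch of the decompression step follows (Python 3). Unlike the script, which uses Content-Length to take the body from the end of the response, this version splits at the blank line that ends the headers. It does not handle chunked transfer encoding, and the function name and sample response are invented.

```python
import gzip
import io

def gunzip_body(raw_response):
    """Return the response body, decompressed if the headers declare gzip."""
    head, _, body = raw_response.partition(b"\r\n\r\n")
    if b"content-encoding: gzip" not in head.lower():
        return body
    return gzip.GzipFile(fileobj=io.BytesIO(body)).read()

# Build an invented compressed response to exercise the function.
payload = gzip.compress(b"<plist version=\"1.0\"></plist>")
response = (b"HTTP/1.1 200 OK\r\n"
            b"Content-Encoding: gzip\r\n"
            b"Content-Length: " + str(len(payload)).encode("ascii") + b"\r\n\r\n"
            + payload)
plain = gunzip_body(response)
```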

For the more informative requests, a "notes" field is populated in the CSV that contains, at the very least, a reference to the .plist file containing the server's response.  In the case of incremental searches, this field also includes the search term the user typed.  For movie page requests, this field also contains the name of the movie.  This "notes" field is meant to be a more free-form summary of the most important information in a request, and may be used for different things as more request handlers are added to atvsnarf.py.
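
The two pieces of information that go into the notes field could also be pulled out with the standard library, as in this Python 3 sketch. It is an alternative to the regular expressions the script uses, not a description of them. The plist layout shown (including the "item-metadata" key) is invented, and the depth-first search may pick a different "title" than a first-match regular expression would on a real response.

```python
import plistlib
from urllib.parse import urlsplit, parse_qs

def search_term(url):
    """Return the value of the 'q' query parameter, or None."""
    values = parse_qs(urlsplit(url).query).get("q")
    return values[0] if values else None

def first_title(node):
    """Depth-first search for the first string stored under a 'title' key."""
    if isinstance(node, dict):
        if isinstance(node.get("title"), str):
            return node["title"]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = first_title(child)
        if found is not None:
            return found
    return None

# Invented plist, for illustration only.
SAMPLE_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
  <key>item-metadata</key>
  <dict>
    <key>title</key>
    <string>Example Movie</string>
  </dict>
</dict>
</plist>"""
title = first_title(plistlib.loads(SAMPLE_PLIST))
```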

Caveats
-------

I do not have an AppleTV, so testing of atvsnarf.py is limited to the packet capture provided.  This means that there are almost certainly other request types that it does not handle.  These can be added easily, though, given further tinkering with an AppleTV device while sniffing.

The tool attempts to gracefully acknowledge in the CSV that a request isn't recognized, and provide a packet number that can be looked up in Wireshark to find out more information.  It is possible, maybe even likely, that some unknown requests may crash atvsnarf.py, and/or cause it to consume too much time or memory to process a pcap.  This would have to be determined and remedied with more hands-on testing.

Answering the Questions
-----------------------

All of the questions posed by the challenge were very easily answered by examining atvsnarf.py's CSV output in a spreadsheet program (like Excel or Numbers), and by examining the corresponding .plist files in Apple's Property List Editor.  The same information could be found by examining the .plist file in a text editor, or by other XML viewing tools.

Additional Text:
#!/usr/bin/python
#
# AppleTV Snarf v.01
# Wesley McGrew
# Mississippi State University National Forensics Training Center
# wesley@mcgrewsecurity.com
# http://mcgrewsecurity.com
# http://security.cse.msstate.edu/ftc
# 
# Usage:
#
# ./atvsnarf.py [-q] <pcap file>
#
# -q : Quiet(er) mode.  Suppress output of metrics submission
#      and image requests.  Default is to include these entries
#
#
# The output is to a .csv file named "<pcap file name>.csv", as well 
# as a series of plist files with the naming format:
# 
# <pcap file name>_<num>_<type>.plist
#
# These files contain the plist responses from the iTunes servers
# the AppleTV device connects to.  The number <num> refers to the 
# packet number in the pcap file of the corresponding request from 
# the pcap file.  This number is indexed starting at 1, to allow 
# the investigator to use this number to find the request and TCP
# stream in Wireshark.
#
# The <type> refers to the type of request the server response is
# for.
#
# The "<pcap file name>.csv" file is a comma-delimited file that
# contains the following information for each request:
#
# * Packet number (for reference in Wireshark)
# * Timestamp (extracted from server response for accuracy)
# * Apple TV IP address  (\ Assuming pcap captured on local network)
# * Apple TV MAC address (/         to the Apple TV                )
# * Apple TV User Agent, including Version Number
# * Type of Request.  Supported Types:
#    Submitted Metrics
#    Image
#    Viewed Grouping Page
#    Incremental Search
#    Viewed Movie Page
#    Related Items
#    (Unrecognized requests are logged and indexed, too)
# * Notes (Search terms, movie name)
#    Also contains a reference to a plist with the server's response
#    for most requests where it's useful
# * Full URL of the request

from scapy.all import *
import time
import re
import gzip
import StringIO
import sys

index = 1

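def get_response(p,packets):
   # Reassemble the server's response to the request in packet p.
   # 'index' is the global 1-based number of p, so packets[index:]
   # begins with the packet immediately following the request.
   resp = ""
   for i in packets[index:]:
      try:
         # Skip packets that do not belong to the request's TCP stream
         if i.getlayer(IP).src != p.getlayer(IP).src and i.getlayer(IP).dst != p.getlayer(IP).src:
            continue
         if i.getlayer(IP).src != p.getlayer(IP).dst and i.getlayer(IP).dst != p.getlayer(IP).dst:
            continue
         if i.getlayer(TCP).sport != p.getlayer(TCP).sport and i.getlayer(TCP).dport != p.getlayer(TCP).sport:
            continue
         if i.getlayer(TCP).sport != p.getlayer(TCP).dport and i.getlayer(TCP).dport != p.getlayer(TCP).dport:
            continue
         data = str(i.getlayer(TCP).payload)
         # The AppleTV's next request on this stream ends the response
         if re.search("User-Agent: AppleTV",data):
            break
      except Exception:
         # Not an IP/TCP packet (getlayer() returned None)
         continue
      resp += data
   return resp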

def extract_gzip(response):
   # Use Content-Length to take the gzip-compressed body from the end
   # of the raw HTTP response, then decompress it.
   m = re.search(r"Content-Length: (\d+)\r\n",response)
   length = int(m.group(1))
   compressed = response[len(response)-length:]
   sio = StringIO.StringIO(compressed)
   gz = gzip.GzipFile(fileobj=sio)
   uncompressed = gz.read()
   return uncompressed

def get_type(url):
   # Classify a request by substrings of its URL; the first match wins.
   if re.search(r"metrics\.apple\.com",url):
      req_type = "Submitted Metrics"
   elif re.search(r"\.jpg",url):
      req_type = "Image"
   elif re.search("viewGrouping",url):
      req_type = "Viewed Grouping Page"
   elif re.search("incrementalSearch",url):
      req_type = "Incremental Search"
   elif re.search("viewMovie",url):
      req_type = "Viewed Movie Page"
   elif re.search("relatedItems",url):
      req_type = "Related Items"
   else:
      req_type = "Unknown"
   return req_type

def usage():
   print "Usage:"
   print sys.argv[0] + " [-q] <pcap file>"
   print ""
   print "-q : Quiet mode - Suppress metrics and image requests"
   print ""
   print "See the comments in the header of the code for more"
   print "documentation"

if (len(sys.argv) != 2) and (len(sys.argv) != 3):
   print "Wrong # of command line arguments"
   usage()
   sys.exit()

quiet = False
if (sys.argv[1] == "-q"):
   quiet = True
   if len(sys.argv) != 3:
      print "Missing pcap filename"
      usage()
      sys.exit()
   filename = sys.argv[2]
else:
   filename = sys.argv[1]

csvfile = open(filename+".csv",'w')
pkt = rdpcap(filename)
for p in pkt:
   try:
      data = str(p.getlayer(TCP).payload)
   except:
      index += 1
      continue
   if re.search("User-Agent: AppleTV",data):
      response = get_response(p,pkt)
      m = re.match(r".*?Date: .+?, (.+?)\r.*",response,re.DOTALL)
      # Leave the timestamp blank if no Date header was captured
      if m:
         atv_time = m.group(1)
      else:
         atv_time = ""
      atv_ip   = p.getlayer(IP).src
      atv_mac  = p.src
      m = re.match(r".*User-Agent: (.+?)\r.*",data,re.DOTALL)
      atv_ua  = m.group(1)
      m = re.match(r"GET (.+) HTTP/1\.1",data)
      atv_req = m.group(1)
      # Stop at the end of the Host line, so Host need not be the last header
      m = re.search(r"Host: (.+?)\r\n",data)
      atv_req_host = m.group(1)
      atv_url = "http://" + atv_req_host + atv_req
      atv_type = get_type(atv_url) 
      if atv_type == "Incremental Search":
         m = re.match(r".*q=(.+)",atv_url)
         atv_notes = "Query: " + m.group(1)
         data = extract_gzip(response)
         fp = open(filename+'_'+str(index)+'_incrementalSearch.plist','w')
         fp.write(data)
         fp.close()
         atv_notes += " XML: "+str(index)+"_incrementalSearch.plist"
      elif atv_type == "Viewed Movie Page":
         data = extract_gzip(response)
         fp = open(filename+'_'+str(index)+'_viewedMovie.plist','w')
         fp.write(data)
         fp.close()
         m = re.match(r".*?<key>title</key>.*?<string>(.*?)</string>",data,re.DOTALL)
         title = m.group(1)
         atv_notes = "Movie Title: "+title+" ; XML: "+str(index)+"_viewedMovie.plist"
      elif atv_type == "Viewed Grouping Page":
         data = extract_gzip(response)
         fp = open(filename+'_'+str(index)+'_viewGrouping.plist','w')
         fp.write(data)
         fp.close()
         atv_notes = "XML: "+str(index)+"_viewGrouping.plist"
      elif atv_type == "Related Items":
         data = extract_gzip(response)
         fp = open(filename+'_'+str(index)+'_relatedItems.plist','w')
         fp.write(data)
         fp.close()
         atv_notes = "XML: "+str(index)+"_relatedItems.plist"
      else:
         atv_notes = ''
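      # Honor -q: leave metrics submissions and image requests out of the CSV
      if not (quiet and atv_type in ("Submitted Metrics","Image")):
         csvfile.write(str(index)+","+atv_time+","+atv_ip+","+atv_mac+","+atv_ua+","+atv_type+","+atv_notes+","+atv_url+"\n")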
   index += 1

csvfile.close()