Thursday, April 11, 2019

Nest camera app (DFRWS2018 Challenge)

Last February, I started playing the DFRWS 2018 challenge:

This is a brief description of the challenge taken from the website:

"The DFRWS 2018 challenge is about Internet of Things (IoT), defined generally to include network and Internet connected devices usually for the purpose of monitoring and automation tasks. Consumer-grade “Smart” devices are increasing in popularity and scope. These devices and the data they collect are potentially interesting for digital investigations, but also come with a number of new investigation challenges."

Unfortunately for me, I soon had to dedicate my free time to something else and forgot about the challenge. Since the submission window is now closed (the submission deadline was March 20, 2019), I'd like to share my findings about an extension-less SQLite database named frame_database belonging to the Nest app for Android installed on the mobile device found at the crime scene. This post is not a full analysis of the app.

The file is located here:
  • Evidence: Samsung GSM_SM-G925F Galaxy S6 Edge
  • Physical image: blk0_sda.bin
  • Database path: P18\data\\cache\f315c6e2b5434a5381f1f5be6f73b4b3

The frame_raw_data_table table in the database caught my attention due to its name and I thought it could contain pictures or videos. The table contains 4548 records and its column names are:

frame_time | chunk_id | chunk_version | gop_start_rowid | sps_bytes | pps_bytes | frame_bytes | chunk_complete

A bit of googling revealed that gop, sps and pps are parameters related to H.264 video and stand for:

  • GOP = Group of pictures
  • SPS = Sequence Parameter Set
  • PPS = Picture Parameter Set

This post at stackoverflow was also useful to understand that a stream could look like this:

(AUD)(SPS)(PPS)(I-Slice)(PPS)(P-Slice)(PPS)(P-Slice) ... (AUD)(SPS)(PPS)(I-Slice)

The explanation of all these terms is beyond the scope of this article. The only thing to care about is that there's a data sequence that needs to be rebuilt in order to get meaningful data to analyze.

Having said that, this is a data snippet of what is stored in the database:

The chunk_id field is a timestamp. Frames belonging to the same group have the same chunk_id. The last frame of each group has the chunk_complete value set to 1. The frame_time field is a timestamp too, but I haven't investigated the relationship between this field and the chunk_id. I noticed there's a significant time difference between the two fields. For instance:

  • frame_time 1526289810036 is a Unix timestamp in milliseconds and translates to 14/05/2018 09:23:30 (UTC)
  • chunk_id 1526288836 is a Unix timestamp in seconds and translates to 14/05/2018 09:07:16 (UTC)
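The two conversions above can be verified with a couple of lines of Python (using the same utcfromtimestamp call as the extraction script further below; note that frame_time must first be reduced from milliseconds to seconds):

```python
from datetime import datetime

frame_time = 1526289810036   # milliseconds
chunk_id   = 1526288836      # seconds

# frame_time is in milliseconds, so drop the millisecond part first
print(datetime.utcfromtimestamp(frame_time // 1000))  # 2018-05-14 09:23:30
print(datetime.utcfromtimestamp(chunk_id))            # 2018-05-14 09:07:16
```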

After several failed data combination attempts, I found out that the following fields containing blob data have to be concatenated in this order to build a working group of frames:

sps_bytes || pps_bytes || frame_bytes (first) || frame_bytes (second) || ... || frame_bytes (last)

I achieved this operation by using a SQL query built with the concatenation operator || and the GROUP_CONCAT function:

SELECT chunk_id,
       ( Hex(sps_bytes) || Hex(pps_bytes) || frame_bytes ) AS frame_bytes
FROM   (SELECT chunk_id,
               sps_bytes,
               pps_bytes,
               GROUP_CONCAT(Hex(frame_bytes), '') AS frame_bytes
        FROM   frame_raw_data_table
        GROUP  BY chunk_id) 

Then I wrote a quick Python script to automate the data extraction and write data to binary files. To try the script, just copy the code below and customize the paths of db and output_path.

To make Python 3.7.2 correctly execute the SQL query, I had to download the latest version of the SQLite DLL from here (I used the 64-bit x64 DLL for SQLite version 3.27.2), save it to C:\Python37\DLLs and overwrite the existing one.

from datetime import datetime
import binascii
import sqlite3

db          = "C:\\temp\\frame_database"
output_path = "C:\\temp\\"

conn   = sqlite3.connect(db)
cursor = conn.execute(
    "SELECT chunk_id, (hex(sps_bytes) || hex(pps_bytes) || frame_bytes) AS frame_bytes "
    "FROM (SELECT chunk_id, sps_bytes, pps_bytes, "
    "GROUP_CONCAT(hex(frame_bytes), '') AS frame_bytes "
    "FROM frame_raw_data_table GROUP BY chunk_id)")

for row in cursor.fetchall():
    chunk_id    = row[0]
    frame_bytes = row[1]
    timestamp   = datetime.utcfromtimestamp(float(chunk_id))
    timestamp   = str(timestamp).replace(" ", "_").replace(":", "")
    filename    = output_path + str(chunk_id) + "_" + timestamp + ".h264"
    # The query returns one long hex string per chunk: decode it back
    # to raw bytes before writing the .h264 file
    with open(filename, "wb") as video:
        video.write(binascii.unhexlify(frame_bytes))

conn.close()

The resulting video files are very small (9 KB to 83 KB) and can be played with VLC media player (3.0.6):

There's always something new to learn.

Thanks DFRWS for the challenge!

Sunday, January 27, 2019

Using small details to add additional context to other artifacts

This is a quick post on some files that could add additional context to other artifacts.

VLC media player (tested version: 3.0.5 / 3.0.6)
Artifact: "restart the playback where left off"
So what?: instead of simply saying that a file was opened with VLC, it may be possible to prove that a media file was watched from the beginning up to a certain (milli)second.
Description: if a media file is partially played, VLC tracks the last played position to allow the user to resume playback when reopening the same file. Depending on the operating system, the last played position value is stored in the following files:
  • Windows: C:\Users\<username>\AppData\Roaming\vlc\vlc-qt-interface.ini
  • Ubuntu: /home/<username>/.config/vlc/vlc-qt-interface.conf
  • macOS : /Users/<username>/Library/Preferences/org.videolan.vlc.plist
VLC for Windows and Ubuntu stores the values in the [RecentsMRL] section within the vlc-qt-interface file and these values are expressed in milliseconds. VLC for macOS stores the values in the recentlyPlayedMedia array within a .plist file and the values are expressed in seconds. Based on my tests, a zero value may either mean that a media file has been fully played or that less than five percent of the file contents has been played. 

To automate the parsing, I wrote a Python 2.7 script. Here's an example of the script's output:

C:\> vlc-qt-interface.ini

Analyzing file: C:\temp\vlc-qt-interface.ini...

 VLC media player ('RecentsMRL' section)
 The entries are listed by default from
 the most recent to the oldest

# | Last Played Position (h:mm:ss) | Media file
1 | 0:04:05 | file:///C:/DATI/audio-video/file2.mp4
2 | 1:26:16 | file:///C:/DATI/audio-video/file1.avi

Output saved to: 20190127_150900_vlc.csv
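The millisecond-to-h:mm:ss conversion shown in the output above can be sketched in a few lines of Python. The sample ini content below is illustrative, and the [RecentsMRL] key names list and times reflect my understanding of how VLC lays out that section, so treat them as an assumption:

```python
import configparser
from datetime import timedelta

# Hypothetical sample mimicking a vlc-qt-interface.ini [RecentsMRL] section:
# "list" holds the MRLs, "times" the last played positions in milliseconds
ini_text = """
[RecentsMRL]
list=file:///C:/DATI/audio-video/file2.mp4, file:///C:/DATI/audio-video/file1.avi
times=245000, 5176000
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

mrls  = [m.strip() for m in config["RecentsMRL"]["list"].split(",")]
times = [int(t) for t in config["RecentsMRL"]["times"].split(",")]

for i, (mrl, ms) in enumerate(zip(mrls, times), 1):
    # timedelta renders milliseconds as h:mm:ss
    print("%d | %s | %s" % (i, timedelta(milliseconds=ms), mrl))
```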

Adblock Plus (3.4.2) add-on for Firefox (64.0.2)
Artifact: websites that have been whitelisted by the user
So what?: the user visited a website that required ad blockers to be turned off. The user had to manually disable Adblock Plus for that website.
Description: Adblock Plus is a popular ad blocker. The user can choose to allow a site to show ads by "whitelisting" it. When that happens, an entry is added to a storage.js file which is located at:
  • Windows: C:\Users\<username>\AppData\Roaming\Mozilla\Firefox\Profiles\<profileID>.default\browser-extension-data\{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}\storage.js
  • Ubuntu: /home/<username>/.mozilla/firefox/<profileID>.default/browser-extension-data/{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}/storage.js
  • macOS: /Users/<username>/Library/Application Support/Firefox/Profiles/<profileID>.default/browser-extension-data/{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}/storage.js
On a live machine, the list of all the websites that have been whitelisted can be viewed from Firefox by going to:
  • about:addons | Adblock Plus options | Whitelisted websites tab
That list can be directly extracted from the file mentioned above by using the following regular expression:
  • \[Subscription\]","url=~user~\d*","defaults=whitelist","","\[Subscription filters\]
The entries we're looking for are listed right after the string matching the regular expression. 
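As a quick sketch of how the regular expression can be applied: the storage.js sample below is entirely made up for illustration (only the regex comes from the analysis above), and the @@||domain^$document filter syntax for the whitelist entries is an assumption on my part:

```python
import re

# Hypothetical storage.js fragment: a user-defined whitelist subscription
# followed by its filter entries
storage_js = (
    '{"file:patterns.ini":["[Subscription]","url=~user~12345",'
    '"defaults=whitelist","","[Subscription filters]",'
    '"@@||example.com^$document","@@||test.org^$document","[Subscription]"]}'
)

pattern = r'\[Subscription\]","url=~user~\d*","defaults=whitelist","","\[Subscription filters\]'

match = re.search(pattern, storage_js)
entries = []
if match:
    # The whitelisted sites are listed right after the matched string
    tail = storage_js[match.end():]
    entries = re.findall(r'@@\|\|([^\^]+)\^\$document', tail)
print(entries)
```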
This is my script to automate the parsing. Here's an example of the output:

C:\> --default-path

# Analysis of Adblock Plus for Firefox #
File: C:/Users/xxxx/AppData/Roaming/Mozilla/Firefox/Profiles/

Whitelisted websites added by user: 2

I know that this add-on is also available for Chrome, but Chrome uses LevelDB databases instead of the storage.js file, and at the moment I don't know how to reliably extract what I need.

NoScript (10.2.1) add-on for Firefox (64.0.2)
Artifact: websites that have been manually set to "trusted" or "untrusted" by the user.
So what?: The user visited a website and, within the browser, manually trusted or untrusted the domain by changing the default settings through the NoScript icon.
Description: NoScript is a security add-on. When the user sets a domain to "trusted" or "untrusted", an entry containing the domain (not the full URL) is added to a .sqlite database named storage-sync.sqlite:
  • Windows: C:/Users/<username>/AppData/Roaming/Mozilla/Firefox/Profiles/<profileID>.default/storage-sync.sqlite
  • Ubuntu: /home/<username>/.mozilla/firefox/<profileID>.default/storage-sync.sqlite
  • macOS: /Users/<username>/Library/Application Support/Firefox/Profiles/<profileID>.default/storage-sync.sqlite
From Firefox, NoScript settings can be reviewed by going to:
  • about:addons | NoScript options | Per-site Permissions tab
The same data can be retrieved by directly analyzing the collection_data table within the storage-sync.sqlite file. The record having collection_name = default/{73a6fe31-595d-460b-a920-fcc0f8843232} and record_id = key-policy contains the entries we're looking for. The entries have no timestamp and are stored in JSON format in the record field in the sites object. Some domains are already in the file as part of NoScript default settings.
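The lookup described above can be sketched with Python's sqlite3 module. The collection_name and record_id values come from the analysis; the table is recreated in memory here, and the JSON layout of the record (a "policy" object holding a "sites" object) is a simplified, hypothetical illustration of what NoScript stores:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collection_data "
    "(collection_name TEXT, record_id TEXT, record TEXT)")

# Made-up record mimicking the described structure: no timestamps,
# just domains grouped by trust level inside a "sites" object
sample_record = json.dumps({
    "id": "policy",
    "policy": {"sites": {"trusted": ["example.com"],
                         "untrusted": ["ads.example.net"]}},
})
conn.execute(
    "INSERT INTO collection_data VALUES (?, ?, ?)",
    ("default/{73a6fe31-595d-460b-a920-fcc0f8843232}", "key-policy",
     sample_record))

row = conn.execute(
    "SELECT record FROM collection_data "
    "WHERE collection_name = 'default/{73a6fe31-595d-460b-a920-fcc0f8843232}' "
    "AND record_id = 'key-policy'").fetchone()

sites = json.loads(row[0])["policy"]["sites"]
print(sites["trusted"], sites["untrusted"])
```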

For instance, I navigated to several websites and set some domains to "trusted" and some others to "untrusted".

From the list alone extracted from the .sqlite file, we don't know if the entries refer to domains visited by the user or domains simply "trusted/untrusted" by the user when visiting something else.

After several trial-and-error attempts, I noticed it's possible to distinguish between "visited sites" and "sites not visited" by sending HTTP requests and observing the responses. It's possible that the user has visited a domain if the domain:
  • returns an HTTP response
  • doesn't redirect to another domain
  • doesn't have a very small Content-Length
Moreover, if a domain loads scripts residing on another domain, it's highly probable that the former is the domain that was directly visited by the user.

If all or some of the above conditions are not met, then it's possible that the user has set a domain to "trusted" or "untrusted" when visiting any of the visited domains.
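The decision part of that heuristic can be sketched as a small pure function (the networking is left to the caller). The 512-byte threshold and the helper name are my own illustration, not the author's script:

```python
from urllib.parse import urlparse

def looks_visited(requested_domain, final_url, content_length, min_length=512):
    """Heuristic: the domain answered on its own host (no redirect to a
    different domain) and served a body of non-trivial size."""
    final_host = urlparse(final_url).hostname or ""
    if not final_host.endswith(requested_domain):
        return False                      # redirected elsewhere
    return content_length >= min_length   # tiny pages look like placeholders

# A domain answering on its own host with a real page passes...
print(looks_visited("example.com", "http://www.example.com/", 4096))    # True
# ...while a redirect to a different domain does not.
print(looks_visited("oldsite.org", "http://parking.example.net/", 90))  # False
```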

The script I've developed can be downloaded here:
If the script is used with the "-r" option, it will send an HTTP request to each site found in the file and try to separate the results. The more entries there are in the .sqlite file, the more accurate the script output is.

C:\> --default-path -r

Firefox_NoScript v.20190127
Script to extract the permissions that have been manually added to NoScript add-on

Analyzing file: C:/Users/xxxx/AppData/Roaming/Mozilla/Firefox/Profiles/
xxxx.default/storage-sync.sqlite ...

Non-default permissions found: (6)

Sending HTTP requests to 6 domains found in the file...

   Based on the HTTP responses received, it's possible that:
     ==> the user directly visited 1 domain(s):

     ==> the trust level for 5 domain(s) was set by the user
     when visiting other domains:

Output saved to: 20190127_141136_NoScript.csv

The Tor Browser (8.0.4) comes with NoScript (10.2.1) pre-installed. The .sqlite file is located at:
  • C:\Users\<username>\Desktop\Tor Browser\Browser\TorBrowser\Data\Browser\profile.default\storage-sync.sqlite
Unlike Firefox, and based on my tests, the storage-sync.sqlite file retains NoScript's last-session settings after the Tor Browser is closed; the settings are lost when the Tor Browser is reopened.