International Journal on Advances in Intelligent Systems, vol 7 no 3&4, year 2014, http://www.iariajournals.org/intelligent_systems/
2014, © Copyright by authors, Published under agreement with IARIA - www.iaria.org
Representing and Publishing Cyber Forensic Data and its Provenance Metadata:
From Open to Closed Consumption
Tamer Fares Gayed, Hakim Lounis
Dépt. d’Informatique
Université du Québec à Montréal
Succursale Centre-ville, H3C 3P8,
Montréal, Canada
gayed.tamer@courrier.uqam.ca lounis.hakim@uqam.ca
Moncef Bari
Dépt. de Didactique
Université du Québec à Montréal
Succursale Centre-ville, H3C 3P8,
Montréal, Canada
bari.moncef@uqam.ca
Abstract—Role players of any forensic investigation process
chronologically record all forensic data resulting from their
investigation so that the data can be presented to juries in a court
of law. Once recorded and posted, these results are
called chains of custody (CoCs). The forensic data provided
within these documents play a vital role in the process of
forensic investigation, because they answer questions about
how evidence is collected, transported, analyzed, and
preserved from its seizure through its production in court.
Provenance metadata accompany these forensic data to answer
questions about the origin of these data and to build trust
between role players and juries, so that the tangible
CoCs are admissible in a court of law. Nowadays, with the
advent of the digital age, forensic investigation is applied not only
to physical crime but also to digital evidence. The
forensic data and their metadata presented in these tangible
documents must also undergo a radical transformation from
paper to electronic data in order to accommodate this
evolution. CoCs should also be readable and consumable not
only by humans but also by machines. The semantic web is a
fertile ground for representing and managing the tangible CoCs,
because it uses web principles known as Linked Data
Principles (LDP), which provide useful information in
Resource Description Framework (RDF) format upon Uniform
Resource Identifier (URI) resolution. In addition, it includes
different provenance vocabularies that can be useful to express
the forensic metadata. Generally, the power of LDP resides in
publishing data publicly without any access restriction on the
web. However, forensic data and their
metadata should not be open in this way. They should be subject to
access restrictions so that they are shared only between role
players and juries. A Public Key Infrastructure (PKI) can be
applied to restrict access to some or all resources of the
represented data, shifting LDP from open to closed
consumption while keeping the
restricted resources resolvable. Juries, in turn, will consume the restricted
represented data using different LDP consumption
applications. This paper provides the complete framework
explaining how forensic and provenance data are represented
and published using LDP, and how PKI can be used to restrict
these data/resources so that they are shared on a closed scale.
Evaluation of the framework through empirical
experiments is outside the scope of this paper.
Keywords-Linked Open Data, Linked Data Principles, Linked
Closed Data, Public Key Infrastructure, Digital Certificates,
Cyber Forensics, Chain of Custody.
I. INTRODUCTION
Forensic investigation dates back
thousands of years. Its task is to gather and
examine evidence about the past in order to prosecute
criminals in a court of law. With the advent
of Information and Communication Technology (ICT),
forensic investigation is no longer concerned only with physical
crime, but also with digital evidence. This gave rise to a new
type of forensic investigation known as
computer/cyber/digital forensics. It combines computer
science concepts including computer architecture, operating
systems, file systems, software engineering, and computer
networking, as well as legal procedures. At the most basic
level, the digital forensic process has three major phases:
extraction, analysis, and presentation. The extraction phase
(also known as acquisition) saves the state of the digital
source (e.g., laptops, desktop computers, mobile phones, or
any other digital device) and creates an image by saving all
digital values so that it can be analyzed later [1]. The analysis phase
takes the acquired data (e.g., file and directory contents and
recovered deleted content), examines it to identify
pieces of evidence, and draws conclusions based on the
evidence found. During the presentation phase, the
audience is typically the judges, to whom the
conclusions and corresponding evidence from the
investigation analysis are presented [2][3].
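The three phases above can be pictured as a chronological log in which each role player records an entry. The following Python fragment is only an illustrative sketch and not part of the framework proposed in this paper: the names Phase, CustodyEntry, and CustodyLog are hypothetical, and the rule that a log never moves back to an earlier phase is an assumption made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import IntEnum
from typing import List


class Phase(IntEnum):
    """The three basic phases named in the text, in process order."""
    EXTRACTION = 1      # also known as acquisition
    ANALYSIS = 2
    PRESENTATION = 3


@dataclass(frozen=True)
class CustodyEntry:
    """One recorded step (hypothetical structure, for illustration only)."""
    phase: Phase
    role_player: str
    note: str
    recorded_at: datetime


@dataclass
class CustodyLog:
    """Chronological list of entries made by role players."""
    entries: List[CustodyEntry] = field(default_factory=list)

    def record(self, phase: Phase, role_player: str, note: str) -> CustodyEntry:
        # Assumption of this sketch: a phase may repeat, but the log
        # never returns to an earlier phase once a later one has begun.
        if self.entries and phase < self.entries[-1].phase:
            raise ValueError(
                f"cannot record {phase.name} after {self.entries[-1].phase.name}"
            )
        entry = CustodyEntry(phase, role_player, note, datetime.now(timezone.utc))
        self.entries.append(entry)
        return entry


# Usage: one entry per phase, recorded in order.
log = CustodyLog()
log.record(Phase.EXTRACTION, "first responder", "created image of seized laptop")
log.record(Phase.ANALYSIS, "investigator", "examined files and recovered deleted content")
log.record(Phase.PRESENTATION, "expert witness", "presented conclusions and evidence")

for e in log.entries:
    print(e.recorded_at.isoformat(), e.phase.name, e.role_player, e.note)
```

Keeping the entries in a single append-only list mirrors the chronological recording described above; a real CoC would carry far richer forensic and provenance data than this sketch.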
However, other models of the cyber forensic
process exist, each relying on a consensus about
how to describe digital forensics and evidence [4][5].
Investigation models are numerous, and many works
explain and compare them [6][7][8][9].
Table I shows the current digital forensic models. Each row
of the table presents the name of the digital forensic process
model, while the columns present the processes included in
each of these models [5][10].
Role players such as first responders, investigators,
expert witnesses, prosecutors, and police officers may be
assigned one or more phases of the forensic process. They are
responsible for creating and recording their own
investigation results and posting them in tangible documents.