2016 International Conference on Informatics and Computing (ICIC)
A File Undelete with Aho-Corasick Algorithm
In File Recovery
Opim Salim Sitompul, Andrew Handoko, Romi Fadillah Rahmat
Department of Information Technology
Faculty of Computer Science and Information Technology
University of Sumatera Utara
Medan, Indonesia
opim@usu.ac.id, andrewhandoko@rocketmail.com, romi.fadillah@usu.ac.id
Abstract-In this research, a file undelete method is
proposed in which the file recovery system retrieves the
file metadata by parsing the master file table (MFT)
attributes. The process then continues with a filtering
step in which keywords are matched against file names
using the Aho-Corasick algorithm. The results show that
the proposed method is able to recover files that have
been deleted from the file system. The experiment was
performed four times on files in various conditions, with
0%, 18.98%, 32.21% and 59.77% of their original size
overwritten. The average file recovery success rate is
87.50%, and the average time required for string matching
on file names is 0.32 seconds.
Keywords: file undelete; file recovery; parsing process;
Aho-Corasick algorithm.
I. INTRODUCTION
A file can serve as authentic evidence in certain criminal
cases. In the form of a digital file (digital evidence), evidence
is any data stored or transmitted using a computer that supports
or refutes an account of how a criminal act happened, or that
reveals an important element of the act, such as a motive or an
alibi [1]. In digital forensics, there are several types of
procedure, such as file type identification [2] and file undelete
[3]. Most criminal tampering with digital evidence involves
deleting it, since a criminal can easily eliminate digital
evidence by deleting the file. In fact, file deletion may be done
merely by deleting the reference to the file from the system
table [4].
In some criminal cases, such as corruption, the digital evidence
may be a file that has already been deleted from the file system
but can in fact still be recovered. Files that have been deleted
from the file system cannot be accessed using a file manager.
However, the file remains intact on the hard disk and can be
recovered as long as the space allocated to it has not been
overwritten by other data, whether through deletion or hard disk
wiping [1]. Such a file can be recovered using a technique called
file undelete.
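To illustrate why a deleted file can remain recoverable, the following minimal sketch assumes an NTFS volume, where each MFT record begins with the signature "FILE" and carries a 16-bit flags field at byte offset 0x16 whose lowest bit marks the record as in use. Deleting a file clears that bit while the rest of the record stays in place. The record built here is synthetic, and the function names are hypothetical; this is an illustration, not the implementation used in this research.

```python
import struct

FLAG_IN_USE = 0x0001     # record describes a live (not deleted) entry
FLAG_DIRECTORY = 0x0002  # record describes a directory

def parse_record_flags(record: bytes) -> dict:
    """Read the signature and the flags field from an MFT record header."""
    if record[:4] != b"FILE":
        raise ValueError("not an MFT FILE record")
    # Flags are a little-endian 16-bit value at offset 0x16.
    (flags,) = struct.unpack_from("<H", record, 0x16)
    return {
        "in_use": bool(flags & FLAG_IN_USE),
        "is_directory": bool(flags & FLAG_DIRECTORY),
    }

def make_synthetic_record(flags: int, size: int = 1024) -> bytes:
    """Build a zero-filled fake record with only signature and flags set.

    Hypothetical helper for illustration; a real record also holds
    attributes such as the file name and data runs.
    """
    buf = bytearray(size)
    buf[0:4] = b"FILE"
    struct.pack_into("<H", buf, 0x16, flags)
    return bytes(buf)

live = make_synthetic_record(FLAG_IN_USE)
deleted = make_synthetic_record(0)  # in-use bit cleared, as after deletion

print(parse_record_flags(live))
print(parse_record_flags(deleted))
```

A record whose in-use bit is cleared is a candidate for undelete, provided the clusters it refers to have not been overwritten.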
978-1-5090-1648-8/16/$31.00 ©2016 IEEE
In this research, an implementation of file undelete with the
Aho-Corasick algorithm is proposed to recover deleted files. The
Aho-Corasick algorithm has been applied to various problems, such
as signature-based anti-virus applications [4],
structural-to-syntactic matching for similar documents [5], set
matching in the field of bioinformatics [6], text string search
in digital forensics [7], and text mining [8].
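As a minimal sketch of how the Aho-Corasick algorithm matches several keywords against a file name in a single pass, the code below builds the keyword trie with failure links and then scans each name once. The function names, keywords, and file names are hypothetical examples, and matching is case-sensitive here for brevity; this is not the implementation used in this research.

```python
from collections import deque

def build_automaton(keywords):
    """Build goto, failure and output tables for a set of keywords."""
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # failure link of each state
    out = [set()]  # keywords recognized on reaching each state
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(word)
    # Breadth-first pass: depth-1 states fail to the root, deeper
    # states follow their parent's failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out

def find_keywords(text, automaton):
    """Return (start_index, keyword) for every keyword occurrence in text."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits

def filter_names(names, keywords):
    """Keep only the names that contain at least one keyword."""
    automaton = build_automaton(keywords)
    return [name for name in names if find_keywords(name, automaton)]

names = ["annual_report.docx", "passport.jpg", "notes.txt"]
print(filter_names(names, ["report", "port", "doc"]))
```

Because the automaton is built once for the whole keyword set, each file name is scanned in time proportional to its length plus the number of matches, regardless of how many keywords are used.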
The rest of the paper is organized as follows. Section 2
describes previous research on file recovery. The proposed
method is presented in Section 3, and Section 4 discusses the
results. Finally, Section 5 concludes the research and gives
some recommendations for further research.
II. RELATED RESEARCH
Previous research on deleted file recovery has used various
methods, such as the Boyer-Moore algorithm [3], a carving
approach to recover multimedia files [9], and forensic
reconstruction of MP3 files [10].
In 2007, a carving method using the Boyer-Moore algorithm was
proposed to recover deleted files [3]. The results showed that
the carving process is resource-intensive, requiring a long
processing time and a very large storage capacity. Carving an
8GB target disk produced more than 1.1 million files totaling
250GB, in addition to a very large number of false positives.
Furthermore, applying the Boyer-Moore algorithm was considered
less than optimal for the matching process of file headers and
footers, which is O(m).
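The header and footer matching described above can be sketched as follows. This example uses the Horspool simplification of Boyer-Moore rather than the full algorithm, and it is not the implementation of [3]. The JPEG signatures (header FF D8 FF, footer FF D9) are standard, while the function names and the sample buffer are hypothetical.

```python
def horspool_find_all(data: bytes, pattern: bytes) -> list:
    """Return every offset at which pattern occurs in data (Horspool search)."""
    m, n = len(pattern), len(data)
    if m == 0 or m > n:
        return []
    # Shift table: distance from the last occurrence of each byte
    # (excluding the final pattern byte) to the end of the pattern.
    shift = {b: m - 1 - i for i, b in enumerate(pattern[:-1])}
    hits = []
    pos = 0
    while pos <= n - m:
        if data[pos:pos + m] == pattern:
            hits.append(pos)
        # Slide the window according to the byte under its last position.
        pos += shift.get(data[pos + m - 1], m)
    return hits

def carve(data: bytes, header: bytes, footer: bytes) -> list:
    """Cut out every span from a header to the first footer that follows it."""
    footers = horspool_find_all(data, footer)
    carved = []
    for h in horspool_find_all(data, header):
        end = next(
            (f + len(footer) for f in footers if f >= h + len(header)),
            None,
        )
        if end is not None:
            carved.append(data[h:end])
    return carved

JPEG_HEADER = b"\xff\xd8\xff"
JPEG_FOOTER = b"\xff\xd9"

# Synthetic disk fragment: padding, one fake JPEG, trailing bytes.
sample = b"\x00\x00" + b"\xff\xd8\xff\xe0JFIFdata\xff\xd9" + b"\x11\x22"
print(carve(sample, JPEG_HEADER, JPEG_FOOTER))
```

Signatures this short also occur by chance inside unrelated data, which is one reason carving tends to produce many false positives.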
Reference [10] reconstructed MP3 file fragments using Variable
Bit Rate (VBR). The proposed method improves the success rate of
finding the correct file fragments to be reconstructed. The
improvement is 49.20 - 69.42% for high quality MP3 files, 1.80 -
3.75% for medium quality files, and 41.2 - 100.00% for low
quality files. Success in finding the fragments of a file in
turn improves the performance of the carving process.
In subsequent research in 2011, a carving method was proposed
for multimedia files [9]. The proposed method can recover
multimedia files of types MP3, AVI, and WAV perfectly when the
files are contiguously allocated. Even if a file is