Multi-version Data Recovery for Cluster Identifier Forensics Filesystem with Identifier Integrity

Mohammed Alhussein, Duminda Wijesekera
Department of Computer Science
George Mason University
Fairfax, VA 22030, USA

Abstract: Recovering deleted information from a hard disk has been a long-standing problem. The computer forensics community has addressed information recovery through the development of file carving techniques. Two issues, however, still present significant challenges to these ongoing efforts: 1) building file carvers requires prior knowledge of file types, including file headers and footers, and 2) fragmentation prevents file carvers from recovering files successfully. As a solution, we propose a forensics file system that embeds a special identifier in every cluster that is currently allocated or was allocated in the past. The identifier keeps track of every cluster, mapping each cluster to a single file irrespective of the file's status (existing or deleted). We modified an exFAT implementation on FUSE to implement our forensics file system. We also propose a hashing mechanism that can detect malicious or accidental manipulation of a cluster’s identifier. In addition, we introduce the concept of multi-version recovery, where multiple instances of a file can be recovered based on cluster-specific timestamps inserted during the write operation. Finally, using controlled experiments, we verified that our proposed file system successfully recovers all deleted files in our test environment.

Keywords: computer forensics; data recovery

I. INTRODUCTION

Computer forensics is concerned with computer systems that are involved in crimes. These computer systems can be used to aid in criminal activity. For example, criminals can use Internet search engines to obtain information on how to commit a physical crime. Computer systems can also be the target of a crime, such as illegally accessing a system to delete information [1].
In both cases, computer forensics attempts to preserve, collect, recover, analyze, and present information from computer systems in a way that is acceptable in court [7].

One significant area of computer forensics is the recovery of evidence. An attacker can deliberately delete information, either as the goal of the attack, such as deleting incriminating documents or videos, or to hide the attack trail, such as deleting log files within the computer system. The data recovery process is also invoked in cases of accidental deletion, or when storage devices crash or become corrupted.

Many techniques are used to recover data from storage devices, depending on the nature of the deleted content and the associated file system. When a user deletes a file, the file system information linking to the deleted file is kept intact, so recovering the deleted file is a straightforward operation in most cases. In the File Allocation Table (FAT) file system, for example, deleting a file marks the corresponding cluster entries as empty, i.e., 0, in the FAT. However, the information pointing to the actual file will still be present until the corresponding entry in the FAT is associated with another file.

In cases where pointer information is no longer available in the FAT, or where the file system itself is corrupted, data recovery becomes more challenging. Without file metadata that can lead to the location of the file within the storage unit, more sophisticated techniques are required. Such techniques are referred to as file carving techniques. Pal et al. define file carving as “a forensics technique that recovers files based merely on file structure and content and without any matching file system meta-data” [2]. Some file carvers use file structure, such as the file header written by the user application, to recover data.
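The deletion and carving behaviour described above can be illustrated with a minimal sketch. This is a toy model, not the file system proposed in this paper: the cluster size, the `END_OF_CHAIN` stand-in, and the helper names `write_file`, `delete_file`, and `carve` are all assumptions made for illustration. Only the JPEG start-of-image and end-of-image byte signatures are taken from the real format.

```python
# Toy model only: a real FAT volume also has directory entries, reserved
# clusters, and an end-of-chain marker that depends on the FAT variant.
CLUSTER_SIZE = 8      # bytes per cluster (hypothetical, tiny for readability)
FREE = 0              # FAT value for an unallocated cluster
END_OF_CHAIN = -1     # stand-in for the real end-of-chain marker

JPEG_HEADER = b"\xff\xd8\xff"   # JPEG start-of-image signature
JPEG_FOOTER = b"\xff\xd9"       # JPEG end-of-image marker


def write_file(disk, fat, clusters, data):
    """Store data in the listed clusters and link them as a chain in the FAT."""
    for i, cluster in enumerate(clusters):
        chunk = data[i * CLUSTER_SIZE:(i + 1) * CLUSTER_SIZE]
        start = cluster * CLUSTER_SIZE
        disk[start:start + len(chunk)] = chunk
        is_last = i + 1 == len(clusters)
        fat[cluster] = END_OF_CHAIN if is_last else clusters[i + 1]


def delete_file(fat, first_cluster):
    """Mark every cluster of the chain as free; the data area is not touched."""
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        next_cluster = fat[cluster]
        fat[cluster] = FREE
        cluster = next_cluster


def carve(disk, header=JPEG_HEADER, footer=JPEG_FOOTER):
    """Return each byte run that starts at a header and ends at the next footer."""
    found = []
    pos = disk.find(header)
    while pos != -1:
        end = disk.find(footer, pos + len(header))
        if end == -1:
            break
        found.append(bytes(disk[pos:end + len(footer)]))
        pos = disk.find(header, end + len(footer))
    return found


photo = JPEG_HEADER + b"contiguous-body" + JPEG_FOOTER   # 20 bytes, 3 clusters

# Case 1: the file occupies consecutive clusters 2, 3, 4.
disk_a, fat_a = bytearray(CLUSTER_SIZE * 8), [FREE] * 8
write_file(disk_a, fat_a, [2, 3, 4], photo)
delete_file(fat_a, 2)                 # FAT entries are zeroed, data remains
contiguous_recovered = carve(disk_a) == [photo]

# Case 2: the same file is fragmented across clusters 2, 5, 3.
disk_b, fat_b = bytearray(CLUSTER_SIZE * 8), [FREE] * 8
write_file(disk_b, fat_b, [2, 5, 3], photo)
delete_file(fat_b, 2)
fragmented_recovered = carve(disk_b) == [photo]

print("contiguous file recovered:", contiguous_recovered)
print("fragmented file recovered:", fragmented_recovered)
```

In this toy setup, the contiguous file is carved back intact even though its FAT entries have been zeroed, while the fragmented copy is not, because the carver simply reads forward from the header to the next footer and has no way to know the original cluster order.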
More advanced file carvers, by contrast, use knowledge of the file content to recover data, employing statistical and/or artificial intelligence techniques.

Although file carving is a powerful technique for data recovery, we highlight two issues that present challenges for forensics investigators:

• Prior knowledge of file types and file content
• File fragmentation

Building a file carver requires knowledge of file types and their content. Although many current file carvers can handle numerous file types, this remains an issue when new file types are encountered. The second and more important problem is fragmentation. Recovering fragmented files presents significant challenges to file carving, as demonstrated in [3]. As the number of fragments increases, the problem becomes much more difficult.

Our system specifically tackles the above issues. We propose a forensics file system that can help forensics investigators

International Journal of Intelligent Computing Research (IJICR), Volume 4, Issue 3, September 2013 Copyright © 2013, Infonomics Society 348