File carving from raw images and unallocated space. Three carvers with different strengths — Foremost, Scalpel, PhotoRec — and the methodology rule that no single tool finds everything. Tool versions verified against Kali: foremost 1.5.7, photorec 7.2, scalpel 1.60.

Side: blue


Carving methods

Three header/structure-based recovery techniques. The right one depends on file format.

Method                    When                                          Examples (header ... footer)
Header / footer           format has reliable start and end markers     JPEG (FF D8 FF E0 ... FF D9), PNG (89 50 4E 47 ... 49 45 4E 44), GIF (47 49 46 38 ... 00 3B)
Header / maximum size     header reliable, footer absent or unreliable  TXT files, MP3 (49 44 33), most logs
Header / embedded length  format encodes its own size                   PE/EXE, ELF, ZIP, DOCX, PDF (length declared in structure)

Carvers cannot recover fragmented files in unallocated space — they can only string contiguous bytes together from a known signature forward. Plan for false-positive cleanup.
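
To make the header/footer method concrete, the sketch below locates a signature by byte offset and copies the contiguous run from the first header through the first footer that follows it. The function names find_sig and carve_first are hypothetical helpers, not part of any carver, and the approach hex-dumps the whole file into one string, so it only suits small test images.

```shell
# Print the byte offset of every occurrence of a hex signature in a file.
# $1 = file, $2 = signature as lowercase hex without spaces (e.g. ffd8ff).
# Sketch only: the whole file becomes one hex string, so keep inputs small.
find_sig() {
  od -An -v -tx1 "$1" | tr -d ' \n' | awk -v sig="$2" '{
    pos = 1
    while ((i = index(substr($0, pos), sig)) > 0) {
      nibble = pos + i - 2                    # zero-based offset in hex digits
      if (nibble % 2 == 0) print nibble / 2   # keep byte-aligned hits only
      pos += i                                # resume one digit past this hit
    }
  }'
}

# Header/footer carve: copy from the first header through the first footer
# after it. $1 = image, $2 = header hex, $3 = footer hex, $4 = output file.
carve_first() {
  start=$(find_sig "$1" "$2" | head -n 1)
  [ -n "$start" ] || return 1
  end=$(find_sig "$1" "$3" | awk -v s="$start" '$1 > s { print; exit }')
  [ -n "$end" ] || return 1
  len=$(( end + ${#3} / 2 - start ))          # include the footer bytes
  dd if="$1" of="$4" bs=1 skip="$start" count="$len" 2>/dev/null
}

# Usage (paths are placeholders):
#   find_sig image.dd ffd8ffe0
#   carve_first image.dd ffd8ffe0 ffd9 first.jpg
```

Because the copy is one contiguous run, a fragmented file would come out with foreign data in the middle, which is the limitation described above.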


Foremost

Header/footer carver with built-in defaults. Safe first pass when triaging an unknown image.

foremost -v -t all -i image.dd -o out_dir            # carve everything Foremost knows
foremost -v -t jpeg,pdf,doc -i image.dd -o out_dir   # specific types only
foremost -v -t all -d -i image.dd -o out_dir         # add -d for UNIX FS indirect blocks
foremost -a -t all -i image.dd -o out_dir            # write all headers, skip error checking
foremost -q -t all -i image.dd -o out_dir            # quick mode (512-byte boundaries only)

Flag        Purpose
-i <file>   input image (or stdin)
-o <dir>    output directory (created if missing; must be empty if it already exists)
-t <types>  file types: jpeg,pdf,doc, or all for built-ins
-c <file>   custom config (defaults to foremost.conf)
-d          UNIX FS indirect block detection
-a          write all headers; skip error detection (corrupted files)
-q          quick mode: search on 512-byte boundaries only
-w          audit log only, do not write recovered files
-v          verbose, log to screen

Output: one subdirectory per file type, plus audit.txt containing the execution log. Foremost refuses to start if the -o directory already exists and is not empty; use a fresh path, or add -T to timestamp the output directory name.

Custom signatures in /etc/foremost.conf:

# extension  case-sensitive  max-size  header (hex)               footer (hex)
gif          y               155000000 \x47\x49\x46\x38\x37\x61   \x00\x3b
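
Signatures are usually quoted as spaced hex (as in the carving methods table), while the config wants \x-escaped bytes. The sketch below converts between the two and emits a config line in the column order shown above. The helper names to_conf_hex and conf_line and the file name custom.conf are illustrative assumptions.

```shell
# Convert a spaced hex signature ("47 49 46 38") to \x-escaped config notation.
to_conf_hex() {
  printf '%s' "$1" | tr -d ' ' | tr 'A-F' 'a-f' | sed 's/../\\x&/g'
}

# Emit one tab-separated config line:
# $1 = extension, $2 = case-sensitive (y/n), $3 = max size,
# $4 = header as spaced hex, $5 = footer as spaced hex.
conf_line() {
  printf '%s\t%s\t%s\t%s\t%s\n' \
    "$1" "$2" "$3" "$(to_conf_hex "$4")" "$(to_conf_hex "$5")"
}

# Usage (custom.conf is a placeholder file name):
#   conf_line gif y 155000000 "47 49 46 38 37 61" "00 3B" > custom.conf
#   foremost -c custom.conf -i image.dd -o out_dir
```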

Scalpel

Foremost fork — faster, lower memory, no built-in defaults. Every file type must be enabled in the config first.

scalpel -o out_dir image.dd                          # default config
scalpel -c custom.conf -o out_dir image.dd           # custom config
scalpel -b -o out_dir image.dd                       # carve even if footer not found within max size
scalpel -p -o out_dir image.dd                       # preview only — audit log, no files written

Flag       Purpose
-o <dir>   output directory
-c <file>  config file (default /etc/scalpel/scalpel.conf)
-b         carve even when footer absent within max size
-p         preview mode (audit log only)

Default Kali config: every entry is commented out. Uncomment the file types you need, or replace the config:

sudo curl --output /etc/scalpel/scalpel.conf \
  https://raw.githubusercontent.com/sleuthkit/scalpel/master/scalpel.conf

Hard-coded limit: 100 active file types per run.
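
Uncommenting entries by hand is error-prone on a long config, so the step can be scripted. The sketch below assumes commented entries look like "#", optional whitespace, extension, y/n flag, then a numeric size, and it edits a working copy rather than the system file. The function name enable_types and the path custom.conf are hypothetical.

```shell
# Uncomment the config entries for the given extensions, in place.
# $1 = config file (use a working copy), remaining args = extensions.
# Only lines shaped like "# ext y|n <digits>..." are touched, so prose
# comments that merely mention an extension stay commented.
enable_types() {
  conf=$1
  shift
  for ext in "$@"; do
    sed "s/^#[[:space:]]*\(${ext}[[:space:]][[:space:]]*[yn][[:space:]][[:space:]]*[0-9]\)/\1/" \
      "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
  done
}

# Usage (custom.conf is a placeholder):
#   cp /etc/scalpel/scalpel.conf custom.conf
#   enable_types custom.conf jpg pdf
#   scalpel -c custom.conf -o out_dir image.dd
```

Working on a copy keeps the system default untouched and leaves a config file that can be preserved alongside the case notes.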


PhotoRec

Interactive TUI carver from the testdisk package. Widest built-in type list of the three; actively maintained.

photorec image.dd                              # interactive
photorec /log image.dd                         # also write photorec.log
photorec /d out_dir image.dd                   # specify output dir non-interactively, then enter TUI

Workflow inside the TUI:

  1. Confirm media or image file
  2. Select partition, or No partition / Whole disk for raw images
  3. File Opt — toggle which file types to search for
  4. Filesystem type (auto-detected, or Other for raw / unknown)
  5. Free (unallocated only) or Whole (everything)
  6. Output saved to <out_dir>.1/, <out_dir>.2/, etc. (increments to preserve prior results)

Each run produces report.xml listing carved files and offsets.
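
Since results spread across several numbered directories, a quick per-extension count helps decide where to look first. This is a minimal sketch; summarize_recovered is a hypothetical helper, and it assumes report.xml sits inside each output directory and should be left out of the count.

```shell
# Count recovered files per extension across every <out_dir>.N directory.
# $1 = the base name passed to /d (out_dir in the examples above).
# Files without an extension are not listed.
summarize_recovered() {
  find "$1".* -type f ! -name report.xml 2>/dev/null |
    sed -n 's/.*\.\([A-Za-z0-9]*\)$/\1/p' | sort | uniq -c | sort -rn
}

# Usage:
#   summarize_recovered out_dir
```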


Tool comparison

Tool      Default types         Config file                Interactive  Notes
Foremost  yes                   /etc/foremost.conf         no           oldest; reliable zero-config first pass
Scalpel   none (all commented)  /etc/scalpel/scalpel.conf  no           faster + lower memory; needs config edit
PhotoRec  yes (largest set)     n/a (TUI toggle)           yes          most signatures; best maintained
Autopsy   via PhotoRec          n/a                        yes (GUI)    carves unallocated space only

Each tool has a different signature database and detection algorithm. Same image, three carvers, three different result sets — overlap is partial. Always run at least two tools and compare.
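
One way to compare two result sets is by content hash rather than by file name, since each tool names its output differently. The sketch below counts files found by both runs and by only one; hash_set and compare_carves are hypothetical helpers, and sha256sum from coreutils is assumed to be available.

```shell
# Hash every carved file under a directory (tool logs excluded) into a
# sorted, de-duplicated list of SHA-256 digests.
hash_set() {
  find "$1" -type f ! -name audit.txt ! -name report.xml \
    -exec sha256sum {} + | awk '{ print $1 }' | sort -u
}

# Report how many distinct files two carve runs share and how many are
# unique to each. $1 and $2 = output directories of the two runs.
compare_carves() {
  a=$(mktemp) b=$(mktemp)
  hash_set "$1" > "$a"
  hash_set "$2" > "$b"
  printf 'both=%s only_first=%s only_second=%s\n' \
    "$(comm -12 "$a" "$b" | wc -l | tr -d ' ')" \
    "$(comm -23 "$a" "$b" | wc -l | tr -d ' ')" \
    "$(comm -13 "$a" "$b" | wc -l | tr -d ' ')"
  rm -f "$a" "$b"
}

# Usage (directory names are placeholders):
#   compare_carves foremost_out photorec_out.1
```

A hash match requires both tools to have carved the exact same byte range. The same file cut at a different length, for example by a maximum-size rule, counts as unique, so treat the "only" columns as candidates for manual review rather than as distinct evidence.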

Order of operations when triaging an image:

  1. PhotoRec first — widest net, broadest signature coverage
  2. Foremost second — reliable zero-config baseline; cross-check
  3. Scalpel only when speed matters on a large image, or for niche types not in Foremost’s defaults
  4. Autopsy when integrating carving into a full case workflow (carves only unallocated space)

Verifying carved output

Carvers produce false positives — bytes that look like a header but aren’t really a file. Always verify:

file out_dir/jpg/*.jpg | grep -v 'JPEG image data'         # spot mis-typed JPEGs
foremost -V && md5sum out_dir/audit.txt                    # tool-version + audit-trail provenance (-V prints version/copyright)
identify -verbose out_dir/jpg/00000123.jpg 2>&1 | head     # ImageMagick sanity check
exiftool out_dir/pdf/*.pdf | head                          # metadata for recovered docs

For PNG specifically, structure validation:

pngcheck out_dir/png/*.png                                  # corruption / extra chunks / bad CRC

A carver’s audit log (audit.txt) should be hashed and preserved — it documents which signatures fired at which offsets and is the only record of the carve operation if the output dir gets touched.
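
Hashing the audit log together with the carved files gives a single manifest that can be re-checked later. The sketch below is one way to do that with sha256sum; write_manifest and verify_manifest are hypothetical helper names, and the manifest is assumed to be stored outside the carve directory.

```shell
# Write a SHA-256 manifest covering the audit log and every carved file.
# $1 = carve output directory, $2 = manifest path (keep it outside $1).
write_manifest() {
  ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 ) > "$2"
}

# Re-check the manifest later. A non-zero exit status means at least one
# file changed or is missing.
verify_manifest() {
  ( cd "$1" && sha256sum -c >/dev/null 2>&1 ) < "$2"
}

# Usage (manifest name is a placeholder):
#   write_manifest out_dir case123.sha256
#   verify_manifest out_dir case123.sha256 && echo "carve output unchanged"
```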


Pitfalls

  • Foremost refuses to start if the -o directory already exists and is not empty. Use a fresh path, or add -T to timestamp the directory name.
  • Scalpel produces zero results on a default Kali install — config is fully commented out.
  • Fragmented files cannot be recovered by any of these tools. They reconstruct contiguous byte runs only.
  • Carved file timestamps are the carve time, not the original file time. Do not rely on mtime of recovered files.
  • PhotoRec output increments (out_dir.1, out_dir.2) — old results remain unless you delete them.
  • Carving large images is slow and IO-bound. Run on local SSD, not network share.
