File carving from raw images and unallocated space. Three carvers with different strengths — Foremost, Scalpel, PhotoRec — and the methodology rule that no single tool finds everything. Tool versions verified against Kali: foremost 1.5.7, photorec 7.2, scalpel 1.60.
Side: blue
Carving methods
Three header/structure-based recovery techniques. The right one depends on file format.
| Method | When | Examples |
|---|---|---|
| Header / footer | format has reliable start and end markers | JPEG (FF D8 FF…FF D9), PNG (89 50 4E 47…49 45 4E 44), GIF (47 49 46 38…00 3B), PDF (25 50 44 46…25 25 45 4F 46, i.e. %PDF…%%EOF) |
| Header / maximum size | header reliable, footer absent or unreliable | TXT files, MP3 (49 44 33), most logs |
| Header / embedded length | format encodes its own size or layout | PE/EXE, ELF, ZIP, DOCX (length derived from the file's own structure) |
Carvers generally cannot recover fragmented files from unallocated space: they string contiguous bytes together from a known signature forward, so a file split across non-adjacent clusters comes out truncated or corrupted. Plan for false-positive cleanup.
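To make the header/footer idea concrete, here is a minimal sketch that lists the byte offset of every JPEG start marker (FF D8 FF) in a raw image. It assumes GNU grep and is only an illustration of signature scanning, not how any of the carvers below are implemented; `find_jpeg_headers` is a hypothetical helper name.

```shell
# find_jpeg_headers is a hypothetical helper, not part of any carving tool.
# Usage: find_jpeg_headers image.dd
find_jpeg_headers() {
    image="$1"
    # Build the raw 3-byte signature FF D8 FF from octal escapes.
    sig=$(printf '\377\330\377')
    # -a: treat binary data as text, -b: print byte offsets,
    # -o: one output line per match, -F: fixed string instead of a regex.
    # LC_ALL=C keeps grep and cut byte-oriented.
    LC_ALL=C grep -aboF -- "$sig" "$image" | LC_ALL=C cut -d: -f1
}
```

Each offset is only a candidate. A header/footer carver would pair it with the next FF D9 and copy the bytes in between, which is exactly why fragmented files and stray signature bytes produce broken output.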
Foremost
Header/footer carver with built-in defaults. Safe first pass when triaging an unknown image.
foremost -v -t all -i image.dd -o out_dir # carve everything Foremost knows
foremost -v -t jpeg,pdf,doc -i image.dd -o out_dir # specific types only
foremost -v -t all -d -i image.dd -o out_dir # add -d for UNIX FS indirect blocks
foremost -a -t all -i image.dd -o out_dir # write all headers, skip error checking
foremost -q -t all -i image.dd -o out_dir # quick mode (512-byte boundaries only)
| Flag | Purpose |
|---|---|
| -i <file> | input image (or stdin) |
| -o <dir> | output directory (must be new or empty) |
| -t <types> | file types: jpeg,pdf,doc, or all for built-ins |
| -c <file> | custom config (defaults to foremost.conf) |
| -d | UNIX FS indirect block detection |
| -a | write all headers; skip error detection (corrupted files) |
| -q | quick mode: search on 512-byte boundaries only |
| -w | audit log only, do not write recovered files |
| -v | verbose, log to screen |
Output: one subdirectory per file type, plus audit.txt with the execution log. Foremost refuses to start if the -o directory already exists and is not empty; use a fresh path, or add -T to timestamp the output directory.
Custom signatures in /etc/foremost.conf:
# extension case-sensitive max-size header (hex) footer (hex)
gif y 155000000 \x47\x49\x46\x38\x37\x61 \x00\x3b
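Writing the `\x..` escapes by hand is error-prone, so the following sketch builds a config entry from space-separated hex bytes. It assumes the five-column layout shown in the comment line above; `hex_to_sig` and `conf_entry` are hypothetical helper names, not part of Foremost.

```shell
# hex_to_sig: "47 49 46 38" -> \x47\x49\x46\x38 (hypothetical helper)
hex_to_sig() {
    # Drop spaces and newlines, then prefix every byte pair with \x.
    printf '%s' "$1" | tr -d ' \n' | sed 's/../\\x&/g'
}

# conf_entry: print one tab-separated signature line (hypothetical helper)
# Usage: conf_entry <extension> <max-size> "<header hex>" "<footer hex>"
conf_entry() {
    # The second column is the case-sensitivity flag, fixed to "y" here.
    printf '%s\t%s\t%s\t%s\t%s\n' "$1" y "$2" "$(hex_to_sig "$3")" "$(hex_to_sig "$4")"
}
```

For example, `conf_entry gif 155000000 "47 49 46 38 37 61" "00 3b" >> custom.conf` appends the GIF87a entry shown above to a local file that can then be passed with -c.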
Scalpel
Foremost fork — faster, lower memory, no built-in defaults. Every file type must be enabled in the config first.
scalpel -o out_dir image.dd # default config
scalpel -c custom.conf -o out_dir image.dd # custom config
scalpel -b -o out_dir image.dd # carve even if footer not found within max size
scalpel -p -o out_dir image.dd # preview only: audit log, no files written
| Flag | Purpose |
|---|---|
| -o <dir> | output directory |
| -c <file> | config file (default /etc/scalpel/scalpel.conf) |
| -b | carve even when footer absent within max size |
| -p | preview mode (audit log only) |
Default Kali config: every entry is commented out. Uncomment the file types you need, or replace the config:
sudo curl --output /etc/scalpel/scalpel.conf \
https://raw.githubusercontent.com/sleuthkit/scalpel/master/scalpel.conf
Hard-coded limit: 100 active file types per run.
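Uncommenting entries by hand is easy to get wrong, so here is a minimal sketch that enables selected types and writes the result to stdout, leaving the system config untouched. It assumes each commented entry looks like `#`, optional whitespace, the extension, whitespace, then a `y` or `n` case flag; check this against your own scalpel.conf first. `enable_scalpel_types` is a hypothetical helper name.

```shell
# enable_scalpel_types is a hypothetical helper, not part of Scalpel.
# Usage: enable_scalpel_types <conf> <type> [<type> ...] > custom.conf
enable_scalpel_types() {
    conf="$1"
    shift
    # Join the remaining arguments into an alternation such as "jpg|png".
    types=$(printf '%s|' "$@")
    types="${types%|}"
    # Strip the leading "#" only from lines that look like signature entries;
    # prose comments such as "# GIF and JPG files" do not match.
    sed -E "s/^#[[:space:]]*((${types})[[:space:]]+[yn][[:space:]])/\1/" "$conf"
}
```

A typical use would be `enable_scalpel_types /etc/scalpel/scalpel.conf jpg png > custom.conf`, followed by `scalpel -c custom.conf -o out_dir image.dd`.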
PhotoRec
Interactive TUI carver from the testdisk package. Widest built-in type list of the three; actively maintained.
photorec image.dd # interactive
photorec /log image.dd # also write photorec.log
photorec /d out_dir image.dd # specify output dir non-interactively, then enter TUI
Workflow inside the TUI:
- Confirm media or image file
- Select partition, or No partition / Whole disk for raw images
- File Opt — toggle which file types to search for
- Filesystem type (auto-detected, or Other for raw / unknown)
- Free (unallocated only) or Whole (everything)
- Output saved to <out_dir>.1/, <out_dir>.2/, etc. (increments to preserve prior results)
Each run produces report.xml listing carved files and offsets.
Tool comparison
| Tool | Default types | Config file | Interactive | Notes |
|---|---|---|---|---|
| Foremost | yes | /etc/foremost.conf | no | oldest; reliable zero-config first pass |
| Scalpel | none — all commented | /etc/scalpel/scalpel.conf | no | faster + lower memory; needs config edit |
| PhotoRec | yes (largest set) | n/a (TUI toggle) | yes | most signatures; best maintained |
| Autopsy | via PhotoRec | n/a | yes (GUI) | carves unallocated space only |
Each tool has a different signature database and detection algorithm. Same image, three carvers, three different result sets — overlap is partial. Always run at least two tools and compare.
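One way to compare two result sets is by content hash rather than by filename, since each tool names its output differently. The sketch below assumes GNU coreutils (`sha256sum`, `comm`); `hash_set` and `carve_overlap` are hypothetical helper names.

```shell
# hash_set: sorted, unique SHA-256 digests of all carved files under a directory.
# audit.txt is skipped because it is a log, not a carved file.
hash_set() {
    find "$1" -type f ! -name audit.txt -exec sha256sum {} + | cut -d' ' -f1 | sort -u
}

# carve_overlap: print the digests that appear in both output trees.
# Usage: carve_overlap foremost_out photorec_out.1
carve_overlap() {
    left=$(mktemp)
    right=$(mktemp)
    hash_set "$1" > "$left"
    hash_set "$2" > "$right"
    # comm -12 keeps only lines common to both sorted inputs.
    comm -12 "$left" "$right"
    rm -f "$left" "$right"
}
```

Matching hashes understate the real overlap: two carvers can find the same file but cut it at different lengths, which yields different digests. Treat the result as a lower bound and inspect the unmatched files by hand.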
Order of operations when triaging an image:
- PhotoRec first — widest net, broadest signature coverage
- Foremost second — reliable zero-config baseline; cross-check
- Scalpel only when speed matters on a large image, or for niche types not in Foremost’s defaults
- Autopsy when integrating carving into a full case workflow (carves only unallocated space)
Verifying carved output
Carvers produce false positives — bytes that look like a header but aren’t really a file. Always verify:
file out_dir/jpg/*.jpg | grep -v 'JPEG image data' # spot mis-typed JPEGs
foremost -V; md5sum out_dir/audit.txt # tool-version + audit-trail provenance
identify -verbose out_dir/jpg/00000123.jpg 2>&1 | head # ImageMagick sanity check
exiftool out_dir/pdf/*.pdf | head # metadata for recovered docs
For PNG specifically, structure validation:
pngcheck out_dir/png/*.png # corruption / extra chunks / bad CRC
A carver’s audit log (audit.txt) should be hashed and preserved: it documents which signatures fired at which offsets and is the only record of the carve operation if the output dir gets touched.
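Preserving the audit log can be as simple as copying it out of the working directory and storing a digest next to it. This is a sketch under the assumption that SHA-256 is acceptable for your case notes; `preserve_audit` and the destination layout are hypothetical.

```shell
# preserve_audit is a hypothetical helper.
# Usage: preserve_audit out_dir/audit.txt evidence_dir
preserve_audit() {
    audit="$1"
    dest="$2"
    name=$(basename "$audit")
    mkdir -p "$dest"
    # -p keeps the original timestamps on the copy.
    cp -p "$audit" "$dest/$name"
    # Record the digest with a relative filename so it can be re-checked anywhere.
    ( cd "$dest" && sha256sum "$name" > "$name.sha256" )
    # Make both files read-only to discourage accidental edits.
    chmod 444 "$dest/$name" "$dest/$name.sha256"
}
```

To re-verify later, run `sha256sum -c audit.txt.sha256` from inside the destination directory.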
Pitfalls
- Foremost refuses to start if the -o directory already exists and is not empty. Use a fresh path.
- Scalpel produces zero results on a default Kali install because its config is fully commented out.
- Fragmented files generally cannot be recovered by these tools. They reconstruct contiguous byte runs only.
- Carved file timestamps are the carve time, not the original file time. Do not rely on the mtime of recovered files.
- PhotoRec output increments (out_dir.1, out_dir.2), so old results remain unless you delete them.
- Carving large images is slow and IO-bound. Run on local SSD, not network share.