Forensic image creation and hash verification. dd-family tools (dc3dd, dcfldd) for raw images with inline hashing, ewfacquire for E01 containers with case metadata, md5sum / sha256sum for image verification, ewfverify for E01 integrity check. Verified against dc3dd 7.2.646, ewfacquire 20140816, GNU coreutils.
Side: blue
Tool selection
| Need | Tool | Why |
|---|---|---|
| Raw image with inline hashing | dc3dd | DoD Cyber Crime Center (DC3) dd variant; cryptographic hash during write, verify-on-read, progress, error handling |
| Alternative raw with split output | dcfldd | DoD Computer Forensics Lab variant; logging, hash piecewise, split files |
| E01 container with case metadata | ewfacquire | embeds case number, examiner, evidence number, hashes inside the container |
| Plain hash of an existing file | sha256sum / md5sum | one-shot hash record |
| Verify an E01 has not been altered | ewfverify | re-hashes disk bytes inside the container against stored hash |
Plain dd works but lacks hashing, progress reporting, and error handling. Use dc3dd or dcfldd for any forensic use.
dd (the baseline, not for forensic use alone)
sudo dd if=/dev/sda of=~/sda.img bs=4M status=progress
sudo dd if=/dev/sda of=~/sda.img bs=4M conv=noerror,sync status=progress # tolerate read errors
| Operand | Meaning |
|---|---|
if= | input file or device |
of= | output file |
bs= | block size; 4M is fast for whole-disk imaging on modern hardware |
conv=noerror | continue past read errors |
conv=sync | pad each error block with zeros to keep offsets stable (use with noerror) |
status=progress | show throughput on stderr (newer coreutils only) |
dd does not hash. To validate the resulting image you must hash both source and copy separately and compare.
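That manual validation can be sketched as below; a regular file stands in for /dev/sda so the commands run without a device attached (assumption: the real workflow targets the block device and needs root).

```shell
# stand-in "source device" and its dd copy (real use: /dev/sda and sda.img)
dd if=/dev/urandom of=source.bin bs=1M count=4 2>/dev/null
dd if=source.bin of=sda.img bs=1M 2>/dev/null
# hash both sides separately, compare only the digest field
src=$(sha256sum source.bin | cut -d' ' -f1)
img=$(sha256sum sda.img    | cut -d' ' -f1)
[ "$src" = "$img" ] && echo MATCH || echo MISMATCH
```

This is two full read passes over the data, which is exactly the overhead dc3dd's inline hashing avoids.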
dc3dd (preferred forensic raw imaging)
# image a disk with inline SHA-256 + verify-on-write
sudo dc3dd if=/dev/sda hash=sha256 log=acquire.log of=disk.img
# multiple hashes, split output, more verbose log
sudo dc3dd if=/dev/sda hash=md5 hash=sha256 hash=sha1 \
ofsz=2G ofs=disk.000 log=acquire.log
# verify a previously-acquired image against the source
sudo dc3dd if=/dev/sda verb=on hash=sha256 vf=disk.img
| Operand | Purpose |
|---|---|
if= / of= | source / destination, same as dd |
hash=<algo> | compute hash inline. Repeat for multiple algorithms. Algorithms: md5, sha1, sha256, sha384, sha512 |
hlog=<file> | write hashes to a separate file
log=<file> | write everything (operands, hashes, errors, throughput) to one log |
ofsz=<size> | split output into segments of this size (2G, 4G, etc.) |
ofs=<base> | base name for split segments (disk.000, disk.001, …) |
vf=<file> | verify-from: re-read source and compare against this image, hash both |
verb=on | verbose progress |
nwspc=on | enable non-whitespace mode for log |
iflag=skip_bytes / oflag=seek_bytes | offset modes for partial reads/writes |
The log file becomes part of the evidence record — keep it with the image.
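Split segments concatenated in order hash identically to the unsplit image, so a segmented acquisition can still be checked against a single source hash. A minimal sketch, with split(1) standing in for dc3dd's ofsz=/ofs= segment output:

```shell
# stand-in whole image; split(1) mimics dc3dd's ofsz=/ofs= segment naming
dd if=/dev/urandom of=whole.img bs=1M count=3 2>/dev/null
split -b 1M -d -a 3 whole.img disk.        # disk.000 disk.001 disk.002
# segments concatenated in order hash the same as the unsplit image
whole=$(sha256sum whole.img | cut -d' ' -f1)
parts=$(cat disk.000 disk.001 disk.002 | sha256sum | cut -d' ' -f1)
[ "$whole" = "$parts" ] && echo MATCH
```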
dcfldd (alternative forensic dd)
sudo dcfldd if=/dev/sda hash=sha256 hashlog=hashes.txt of=disk.img bs=4M
# split output and hash each piece
sudo dcfldd if=/dev/sda hash=sha256 hashwindow=1G \
hashlog=hashes.txt split=1G splitformat=000 of=disk.img
| Operand | Purpose |
|---|---|
hash=<algos> | comma-separated list (e.g. hash=md5,sha256) |
hashwindow=<size> | hash every N bytes (piecewise hashing — useful for spotting where corruption occurred) |
hashlog=<file> | hash output destination |
split=<size> | split output into segments |
splitformat=<fmt> | naming style: 000 for disk.img.000, nnn for ascending numbers |
vf=<file> | verify-from |
Piecewise hashing (hashwindow) lets you compare images section-by-section. Useful when an image differs from the original — you can localise which region changed.
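The localisation idea can be demonstrated without dcfldd: split both images into fixed-size windows, hash each window, and diff the hash lists. The differing line number is the corrupted window (split + sha256sum stand in here for hashwindow= on a live device):

```shell
# build two 4 MiB images that differ by one byte inside the third 1 MiB window
dd if=/dev/zero of=orig.img bs=1M count=4 2>/dev/null
cp orig.img tampered.img
printf 'X' | dd of=tampered.img bs=1 seek=2500000 conv=notrunc 2>/dev/null
# piecewise-hash each image in 1 MiB windows
for f in orig tampered; do
    split -b 1M -d -a 3 "$f.img" "$f.part."
    sha256sum "$f".part.* | cut -d' ' -f1 > "$f.hashes"
done
# the diff line number is the window that changed (here: window 3)
diff orig.hashes tampered.hashes || true   # diff exits 1 when windows differ
```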
ewfacquire (E01 container)
E01 wraps the raw bitstream with case metadata, hashes, optional compression, and optional encryption.
sudo ewfacquire /dev/sda
# interactive prompts:
# Image path: /evidence/case-001
# Case number: <number>
# Description: <free text describing the system>
# Examiner name: <your name>
# Evidence number: <number>
# Notes: <free text>
# Compression level: best | fast | none
# Format: encase6 | encase5 | encase4 | encase3 | encase2 | encase1 | smart | ftk | linen5 | linen6 | linen7
# Bytes per sector: 512
# Sectors per chunk: 64
# Sectors to read: <total>
# Read error retries: 2
# Bytes per chunk: 65536
# Hash algorithms: md5 + sha1 (default), or md5+sha1+sha256
Non-interactive form
sudo ewfacquire \
-C 'CASE-2026-001' \
-D 'Acquired Dell Latitude 5520 SSD, 500 GB' \
-e 'analyst-name' \
-E '8' \
-N 'acquired post-incident, host powered off' \
-t /evidence/case-001/disk \
-d sha256 \
-c best \
-f encase6 \
/dev/sda
| Flag | Purpose |
|---|---|
-t <target> | output path prefix; produces target.E01, target.E02, … |
-C <case> | case number |
-D <desc> | description |
-e <name> | examiner name |
-E <num> | evidence number |
-N <notes> | free-text notes |
-d <hash> | additional digest type beyond the format default (e.g. sha256 adds to MD5+SHA1) |
-c <level> | compression: none / fast / best |
-f <format> | EWF format dialect |
After acquisition, validate with ewfverify.
Hashing
md5sum disk.img > disk.img.md5
sha1sum disk.img > disk.img.sha1
sha256sum disk.img > disk.img.sha256
# compute and verify in one operation
sha256sum -c disk.img.sha256 # checks the image against its recorded hash
Algorithms
| Algorithm | Output bits | Forensic role |
|---|---|---|
| MD5 | 128 | legacy. Still in every forensic format for backward compatibility. Cryptographically broken (collisions feasible), so it still detects accidental corruption but cannot be relied on alone against deliberate tampering |
| SHA-1 | 160 | broken since 2017 (SHAttered collision). Being phased out but still present in EnCase / FTK |
| SHA-256 | 256 | current default. Use as the primary hash for every new acquisition |
| SHA-384 | 384 | rarely needed for forensics; same family as SHA-256 |
| SHA-512 | 512 | similar to 384 |
Record at least two hashes per image (typically MD5 + SHA-256). MD5 stays in for tool-compat; SHA-256 is the cryptographic anchor.
Hash chains and evidence record
Every step that touches the image gets a hash logged:
- Source disk: hash before imaging (where physically possible — for live systems, skip this step and document that the source was live)
- Image after creation: hash recorded by the dc3dd / dcfldd / ewfacquire log
- Each copy made: re-hash on arrival; compare against acquisition hash
- Image after analysis: re-hash to prove no modification during analysis
# typical evidence-record sequence
sudo dc3dd if=/dev/sda hash=md5 hash=sha256 log=acquire.log of=disk.img
md5sum disk.img > disk.img.md5
sha256sum disk.img > disk.img.sha256
# later, on copy arrival
sha256sum -c disk.img.sha256 # exit 0 = match
ewfverify (E01 integrity check)
ewfverify disk.E01 # rehash internal data, compare to stored
ewfverify -d sha256 disk.E01 # also compute SHA-256 in addition to stored hashes
ewfverify -l verify.log disk.E01 # write log
Run on every E01 image before mounting or analysing. Stored vs calculated hash mismatch = container has been altered or corrupted; analysis is contaminated.
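When scripting the verify step, let the exit status decide whether analysis proceeds rather than a human reading the log. A sketch: verify_or_quarantine is a hypothetical helper, and `true` stands in for a passing ewfverify run so the snippet executes without an E01 to hand.

```shell
# run a verifier command; proceed only on exit status 0 (stored hash matched)
verify_or_quarantine() {
    if "$@"; then
        echo "intact: safe to mount"
    else
        echo "hash mismatch: quarantine image" >&2
        return 1
    fi
}
# real use: verify_or_quarantine ewfverify -l verify.log disk.E01
verify_or_quarantine true          # stand-in for a passing ewfverify
```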
Pre vs post acquisition hashing
Static (dead) acquisition
Source is powered off and write-blocked. Pre-acquisition hash should match post-acquisition hash exactly. Mismatch = imaging error or write-blocker failure.
# before
sudo sha256sum /dev/sda > source.sha256 # reading the device requires root
# image
sudo dc3dd if=/dev/sda hash=sha256 log=acquire.log of=disk.img
# after - source hash from acquire.log should equal source.sha256
Live acquisition
Source is running and writing while you image. Pre-acquisition hash will not match post-acquisition hash — data changes during the read. The image’s hash is taken at acquisition time and used downstream to verify the image has not changed since acquisition. Document explicitly in the case notes that pre-acquisition hash is not applicable for a live system.
Pitfalls
- Plain dd has no hashing or error reporting. Always use dc3dd / dcfldd for forensic captures.
- conv=noerror without conv=sync shifts all subsequent bytes after a bad block. Always pair them.
- dc3dd hash=md5 only hashes once; for multiple hashes repeat the operand: hash=md5 hash=sha256.
- ewfacquire interactive mode is convenient but not reproducible. Use the non-interactive form with explicit flags so the case record can recreate the acquisition.
- ewfverify on a 100 GiB image takes minutes (it re-hashes every byte). Plan time accordingly.
- Hash algorithms differ in CPU cost. SHA-256 is ~2x slower than MD5 on x86; on a modern host this is negligible relative to disk read speed but can matter on constrained capture hardware.
- Forgetting to hash a copy after transport is the single most common chain-of-custody failure. Always re-hash on arrival.
- Different tools format the same hash differently: md5sum outputs <hash>  <filename> (two spaces), some tools omit the filename. When automating verification, parse with care.
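One defensive pattern for that last pitfall: compare only the first whitespace-delimited field, so filename and spacing differences between tools cannot cause a false mismatch. Minimal sketch:

```shell
printf 'evidence bytes\n' > sample.img
sha256sum sample.img > sample.img.sha256   # "<hash>  <filename>" format
# take only field 1; filename and spacing variations no longer matter
recorded=$(awk '{print $1}' sample.img.sha256)
current=$(sha256sum sample.img | awk '{print $1}')
[ "$recorded" = "$current" ] && echo MATCH
```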
links:
Field Manual | Data Acquisition | Mount Disk Images | Forensic Tools