Flow-level network forensic analysis with YAF (PCAP → IPFIX flow records) and SiLK (CERT/CC’s flow storage and analysis suite). Useful when packet payloads are gone but flow metadata survives, or for triaging high-volume captures where per-packet inspection is impractical. Pair with tshark for packet-level confirmation.
Side: blue
Flow concepts
A flow is a unidirectional summary of traffic between two endpoints over a short window. Classic 5-tuple identifies a flow:
| # | Field |
|---|---|
| 1 | source IP |
| 2 | destination IP |
| 3 | source port |
| 4 | destination port |
| 5 | transport protocol (TCP / UDP / ICMP) |
Each record also carries aggregates: packet count, byte count, start time, duration, end time, and the union of TCP flags seen across all packets in the flow.
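To make the record layout concrete, here is a minimal Python sketch that folds packets into unidirectional flow records keyed by the 5-tuple. The `Packet` and `Flow` types are illustrative assumptions (the flag bits follow the TCP header layout); this is not how YAF is implemented, and it ignores idle and active timeouts.

```python
from dataclasses import dataclass

# TCP flag bits as they appear in the TCP header.
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10


@dataclass(frozen=True)
class Packet:  # hypothetical, simplified packet summary
    ts: float        # capture time in seconds
    sip: str
    dip: str
    sport: int
    dport: int
    proto: int       # IP protocol number, e.g. 6 = TCP
    length: int      # bytes on the wire
    flags: int = 0   # TCP flag bits, 0 for non-TCP


@dataclass
class Flow:  # hypothetical flow record
    packets: int = 0
    nbytes: int = 0
    stime: float = 0.0
    etime: float = 0.0
    flags: int = 0   # union (bitwise OR) of all flags seen

    @property
    def duration(self) -> float:
        return self.etime - self.stime


def aggregate(packets):
    """Group packets by 5-tuple; each direction is its own flow."""
    flows = {}
    for p in sorted(packets, key=lambda p: p.ts):
        key = (p.sip, p.dip, p.sport, p.dport, p.proto)
        f = flows.get(key)
        if f is None:
            f = flows[key] = Flow(stime=p.ts)
        f.packets += 1
        f.nbytes += p.length
        f.etime = p.ts
        f.flags |= p.flags
    return flows


pkts = [
    Packet(0.000, "10.0.0.5", "192.0.2.8", 40000, 443, 6, 60, SYN),
    Packet(0.020, "192.0.2.8", "10.0.0.5", 443, 40000, 6, 60, SYN | ACK),
    Packet(0.021, "10.0.0.5", "192.0.2.8", 40000, 443, 6, 52, ACK),
    Packet(1.500, "10.0.0.5", "192.0.2.8", 40000, 443, 6, 52, FIN | ACK),
]
flows = aggregate(pkts)
out = flows[("10.0.0.5", "192.0.2.8", 40000, 443, 6)]
print(len(flows), out.packets, out.nbytes, out.flags == SYN | ACK | FIN)
```

The four packets collapse into two flows, one per direction, and the outbound flow's flag field holds SYN, ACK, and FIN together even though no single packet carried all three.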
NetFlow v5 / v9 and IPFIX (RFC 7011) carry essentially the same fields from the analyst’s perspective, although v5 is IPv4-only with a fixed record layout. SiLK can ingest all three.
YAF — PCAP to IPFIX
YAF (Yet Another Flowmeter, CERT/CC) reads packets from a PCAP and emits bidirectional IPFIX records (biflows); SiLK splits each biflow back into two unidirectional records on import. Useful when you have a PCAP but no flow-exporting router.
yaf --in capture.pcap --out capture.yaf # PCAP -> IPFIX records
yafscii --in capture.yaf --out - | less # decode IPFIX to readable text on stdout
Install from source:
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig # libfixbuf must resolve
./configure --enable-applabel --enable-plugins
make
sudo make install
YAF is not packaged in standard distros; expect a from-source build.
SiLK — flow storage and analysis
rwipfix2silk capture.yaf > capture.rw # IPFIX -> SiLK binary
rwfileinfo capture.rw # file metadata: byte order, record count, file size
rwstats capture.rw --fields=sIP --values=bytes --top --count=10 # top talkers
rwsort --fields=sIP,proto,dIP capture.rw | rwscan --scan-model=2 # scan detection

| Tool | Purpose |
|---|---|
| rwipfix2silk | convert IPFIX (yaf output) to SiLK binary format (.rw) |
| rwfileinfo | print file metadata: byte order, record count, compression, file size |
| rwsort | order records by chosen fields (sIP, proto, stime, etc.) |
| rwstats | top-N / bottom-N / descriptive statistics by any key field |
| rwscan | apply scan-detection models to sorted flow records |
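For intuition about what the rwstats top-talker query computes, the following Python sketch sums bytes per source IP and keeps the N largest. The record tuples are made-up sample data, and the function is a conceptual stand-in, not SiLK's implementation.

```python
from collections import Counter


def top_talkers(records, n=10):
    """Sum bytes per source IP and return the n largest as (sip, bytes)."""
    totals = Counter()
    for sip, _dip, nbytes in records:
        totals[sip] += nbytes
    return totals.most_common(n)


# Hypothetical (sIP, dIP, bytes) flow records.
records = [
    ("10.0.0.5", "192.0.2.8", 1200),
    ("10.0.0.9", "192.0.2.8", 300),
    ("10.0.0.5", "198.51.100.7", 800),
    ("10.0.0.7", "192.0.2.8", 500),
]
for sip, nbytes in top_talkers(records, n=2):
    print(f"{sip:<14}{nbytes:>8}")
```

Swapping the key from the source IP to the destination IP or destination port gives the equivalent of changing --fields on the command line.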
Scan detection workflow
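rwsort --fields=sIP,proto,dIP capture.rw | rwscan --scan-model=2
Input must be sorted by source IP, protocol, and destination IP so that all records from one scanner reach the detector contiguously. Output lists each scanning source IP with its protocol, time span, and flow, packet, and byte counts. Confirm in Wireshark on the source IP.
--scan-model= selects the detection algorithm: 0 (the default) runs Threshold Random Walk (TRW) and Bayesian Logistic Regression (BLR) in series, 1 is TRW only, and 2 is BLR only. The TRW models also need --trw-internal-set= pointing at an IPset of internal addresses, so model 2 is the simplest choice for a standalone capture.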
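The reason for the sort can be shown with a small Python sketch: a streaming detector that only looks at runs of consecutive records needs each source's records to be adjacent. The fan-out threshold used here is a deliberately naive stand-in and is not the model rwscan applies; the record tuples are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def fan_out(records, min_targets=3):
    """Yield (sip, proto, distinct_dips) for each contiguous run of
    records sharing (sip, proto) that touches at least min_targets hosts."""
    for (sip, proto), run in groupby(records, key=itemgetter(0, 1)):
        targets = {dip for _sip, _proto, dip in run}
        if len(targets) >= min_targets:
            yield sip, proto, len(targets)


# Hypothetical (sIP, proto, dIP) records in arrival order.
records = [
    ("10.0.0.66", 6, "10.0.1.1"),
    ("10.0.0.5", 6, "192.0.2.8"),
    ("10.0.0.66", 6, "10.0.1.2"),
    ("10.0.0.5", 6, "192.0.2.8"),
    ("10.0.0.66", 6, "10.0.1.3"),
]

# Unsorted input fragments each source into short runs, so nothing is flagged.
print(list(fan_out(records)))
# Sorted input (string order is enough for grouping) gives one run per source.
print(list(fan_out(sorted(records))))
```

On the interleaved input the source 10.0.0.66 never appears in a run longer than one record; after sorting, its three targets land in a single run and the source is reported.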
Pipeline
capture.pcap # raw evidence
|
v yaf --in capture.pcap --out capture.yaf
capture.yaf # IPFIX records
|
v rwipfix2silk capture.yaf > capture.rw
capture.rw # SiLK binary
|
+-- rwfileinfo capture.rw # provenance metadata
+-- rwstats --fields=sIP --values=bytes --top --count=10 # top talkers
+-- rwsort --fields=sIP,proto,dIP | rwscan --scan-model=2 # scan detection
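The conversion chain can also be scripted. The Python sketch below only assembles the command lines shown in this note, so they can be logged or reviewed before anything runs; the function name and the file-naming scheme are assumptions, and actually executing the commands requires yaf and SiLK on the PATH.

```python
import shlex
from pathlib import Path


def build_pipeline(pcap):
    """Return the conversion commands for one PCAP as shell strings."""
    pcap = Path(pcap)
    yaf_file = pcap.with_suffix(".yaf")  # IPFIX output next to the PCAP
    rw_file = pcap.with_suffix(".rw")    # SiLK binary output
    q = shlex.quote                      # protect paths containing spaces
    return [
        f"yaf --in {q(str(pcap))} --out {q(str(yaf_file))}",
        f"rwipfix2silk {q(str(yaf_file))} > {q(str(rw_file))}",
        f"rwfileinfo {q(str(rw_file))}",
    ]


for cmd in build_pipeline("capture.pcap"):
    print(cmd)
```

Keeping the commands as reviewable strings also gives a record of exactly how each derived file was produced from the raw evidence.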
Caveats
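- Sampling: high-throughput exporters often sample 1-in-N packets. Packet and byte totals can be estimated by multiplying by N, but flow counts cannot, because short flows are often missed entirely, and per-flow counts are noisy. Always confirm the sampling rate before drawing volume conclusions.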
- Asymmetric routing: the two directions of a conversation may be exported by different routers if return traffic takes a different path. Expect some flows with no visible reverse direction.
- Timestamp granularity: flow timestamps typically have 1 ms resolution. Sub-millisecond inference about TCP state changes is not reliable.
- TCP flags are a union: the flags field ORs every TCP flag seen in the flow. A long-lived connection shows SAFR even when those flags arrived in different packets.
- No payload: flows have packet/byte counts but not payload. For content evidence, pivot to PCAP via tshark using the 5-tuple as the filter.
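Pivoting from a flow record back to packets comes down to turning the 5-tuple into a Wireshark display filter. A minimal Python sketch, assuming IPv4 and TCP or UDP only; the helper name is hypothetical, while the field names (ip.src, tcp.srcport and so on) are standard Wireshark display-filter fields.

```python
PROTO_NAMES = {6: "tcp", 17: "udp"}  # IP protocol number -> filter prefix


def flow_to_display_filter(sip, dip, sport, dport, proto):
    """Build a display filter matching one direction of a flow."""
    try:
        l4 = PROTO_NAMES[proto]
    except KeyError:
        raise ValueError(f"unsupported protocol number: {proto}") from None
    return (
        f"ip.src == {sip} && ip.dst == {dip} && "
        f"{l4}.srcport == {sport} && {l4}.dstport == {dport}"
    )


flt = flow_to_display_filter("10.0.0.5", "192.0.2.8", 40000, 443, 6)
print(f'tshark -r capture.pcap -Y "{flt}"')
```

Because flow records are unidirectional, matching the whole conversation means building the filter a second time with source and destination swapped and joining the two with `||`.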
links:
Field Manual | Network Logging and Flow Analysis | Network Forensics | tshark