Flow-level network forensic analysis with YAF (PCAP → IPFIX flow records) and SiLK (CERT/CC’s flow storage and analysis suite). Useful when packet payloads are gone but flow metadata survives, or for triaging high-volume captures where per-packet inspection is impractical. Pair with tshark for packet-level confirmation.

Side: blue


Flow concepts

A flow is a unidirectional summary of the packets sent from one endpoint to another within a bounded time window. The classic 5-tuple identifies a flow:

#  Field
1  source IP
2  destination IP
3  source port
4  destination port
5  transport protocol (TCP / UDP / ICMP)

Each record also carries aggregates: packet count, byte count, start time, duration, end time, and the union of TCP flags seen across all packets in the flow.
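
As a rough illustration of the aggregation described above, the following sketch groups packets by 5-tuple and accumulates the per-flow counters. The Packet tuple and the aggregate() helper are hypothetical names made up for this example; they are not YAF or SiLK structures, and real flow meters additionally split flows on idle and active timeouts, which this sketch ignores.

```python
from collections import namedtuple

# Hypothetical packet record for illustration; not a YAF or SiLK structure.
Packet = namedtuple("Packet", "ts sip dip sport dport proto length flags")

# TCP flag bit values as defined in the TCP header.
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def aggregate(packets):
    """Group packets into unidirectional flows keyed by the 5-tuple."""
    flows = {}
    for p in packets:
        key = (p.sip, p.dip, p.sport, p.dport, p.proto)
        f = flows.setdefault(key, {"packets": 0, "bytes": 0,
                                   "start": p.ts, "end": p.ts, "flags": 0})
        f["packets"] += 1
        f["bytes"] += p.length
        f["start"] = min(f["start"], p.ts)
        f["end"] = max(f["end"], p.ts)
        f["flags"] |= p.flags          # union of every flag seen in the flow
    for f in flows.values():
        f["duration"] = f["end"] - f["start"]
    return flows


# Four packets of one TCP conversation: three client-to-server, one reply.
packets = [
    Packet(0.000, "10.0.0.5", "192.0.2.9", 40000, 80, 6, 60, SYN),
    Packet(0.020, "192.0.2.9", "10.0.0.5", 80, 40000, 6, 60, SYN | ACK),
    Packet(0.021, "10.0.0.5", "192.0.2.9", 40000, 80, 6, 52, ACK),
    Packet(0.500, "10.0.0.5", "192.0.2.9", 40000, 80, 6, 52, FIN | ACK),
]
flows = aggregate(packets)
```

The conversation yields two unidirectional records, one per direction. The client-to-server record carries SYN, ACK, and FIN in its flags field although no single packet had all three set.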

NetFlow v5 / v9 and IPFIX (RFC 7011) are largely interchangeable from the analyst’s perspective: all three carry the same core 5-tuple, counters, and timestamps. SiLK can ingest all of them.


YAF — PCAP to IPFIX

YAF (Yet Another Flowmeter, CERT/CC) reads packets from a PCAP and emits bidirectional IPFIX records. Useful when you have a PCAP but no flow export from a router or sensor.

yaf --in capture.pcap --out capture.yaf                   # PCAP -> IPFIX records
yafscii --in capture.yaf --out - | less                   # decode IPFIX to readable text on stdout

Install from source:

export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig            # libfixbuf must resolve
./configure --enable-applabel --enable-plugins
make
sudo make install

YAF is not packaged in standard distros; expect a from-source build.


SiLK — flow storage and analysis

rwipfix2silk capture.yaf > capture.rw                     # IPFIX -> SiLK binary
rwfileinfo capture.rw                                     # file metadata: byte order, record count, file size
rwstats capture.rw --fields=sIP --values=bytes --top --count=10        # top talkers
rwsort --fields=sIP,proto,dIP capture.rw | rwscan --scan-model=2       # scan detection

Tool          Purpose
rwipfix2silk  convert IPFIX (yaf output) to SiLK binary format (.rw)
rwfileinfo    print file metadata: byte order, record count, compression, file size
rwsort        order records by chosen fields (sIP, proto, stime, etc.)
rwstats       top-N / bottom-N / descriptive statistics by any key field
rwscan        apply scan-detection models to sorted flow records
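
To make the "top talkers" idea concrete, here is a minimal sketch of the computation: sum bytes per source IP and keep the largest N. It is a conceptual stand-in only, not how rwstats is implemented; top_talkers() and the sample records are invented for illustration.

```python
from collections import Counter


def top_talkers(records, n=10):
    """Return the n source IPs with the largest byte totals.

    `records` is an iterable of (source_ip, byte_count) pairs, a simplified
    stand-in for flow records.
    """
    totals = Counter()
    for sip, nbytes in records:
        totals[sip] += nbytes
    return totals.most_common(n)


# Made-up sample data: 10.0.0.5 appears in two flows.
records = [
    ("10.0.0.5", 1200),
    ("10.0.0.7", 300),
    ("10.0.0.5", 800),
    ("10.0.0.9", 5000),
]
top = top_talkers(records, n=2)
```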

Scan detection workflow

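rwsort --fields=sIP,proto,dIP capture.rw | rwscan --scan-model=2

Input must be sorted by source IP, protocol, and destination IP (the order the rwscan documentation expects) so that all records from one source reach the detector contiguously. Output lists each suspected scanner's IP and protocol together with the start and end times and the flow, packet, and byte counts of the scan. Confirm in Wireshark by filtering on the source IP.

--scan-model= selects the detection algorithm: 0 (the default) runs Threshold Random Walk (TRW) and Bayesian Logistic Regression (BLR) in series, 1 uses TRW only, and 2 uses BLR only. The TRW models need an IPset of internal addresses (--trw-internal-set), so model 2 is the convenient choice for a standalone capture with no such set.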
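
The sorting requirement can be illustrated with a small sketch. It is not rwscan's detection logic; it only shows why a single-pass detector needs each source's records to arrive contiguously. fan_out() and the sample records are hypothetical, and counting distinct destinations is a deliberately naive stand-in for a real scan model.

```python
from itertools import groupby
from operator import itemgetter


def fan_out(records):
    """Yield (sIP, proto, distinct destination count) per contiguous run.

    `records` are (sIP, proto, dIP) tuples. groupby only merges adjacent
    records, so the counts are correct only when the input is sorted.
    """
    for (sip, proto), group in groupby(records, key=itemgetter(0, 1)):
        yield sip, proto, len({dip for _, _, dip in group})


# Made-up records: 10.0.0.5 touches three hosts, interleaved with 10.0.0.7.
records = [
    ("10.0.0.5", 6, "192.0.2.1"),
    ("10.0.0.7", 6, "192.0.2.1"),
    ("10.0.0.5", 6, "192.0.2.2"),
    ("10.0.0.5", 6, "192.0.2.3"),
]

unsorted_result = list(fan_out(records))
# Plain tuple sort compares IPs as strings; good enough for grouping here.
sorted_result = list(fan_out(sorted(records)))
```

On the unsorted input, 10.0.0.5 is split into two runs and its fan-out is understated; after sorting, it appears once with all three destinations.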


Pipeline

capture.pcap                                          # raw evidence
   |
   v  yaf --in capture.pcap --out capture.yaf
capture.yaf                                           # IPFIX records
   |
   v  rwipfix2silk capture.yaf > capture.rw
capture.rw                                            # SiLK binary
   |
   +-- rwfileinfo capture.rw                          # provenance metadata
   +-- rwstats --fields=sIP --values=bytes --top      # top talkers
   +-- rwsort --fields=sIP,proto,dIP | rwscan         # scan detection
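
For repeatable casework, the conversion steps of the pipeline can be scripted. The sketch below assumes yaf, rwipfix2silk, and rwfileinfo are on PATH and derives the intermediate file names from the PCAP name; pipeline_commands() and run_pipeline() are hypothetical helper names, and rwipfix2silk's --silk-output switch is used here in place of shell redirection.

```python
import subprocess
from pathlib import Path


def pipeline_commands(pcap):
    """Build the argv lists for PCAP -> IPFIX -> SiLK, then file metadata."""
    pcap = Path(pcap)
    yaf_file = pcap.with_suffix(".yaf")   # IPFIX records
    rw_file = pcap.with_suffix(".rw")     # SiLK binary
    return [
        ["yaf", "--in", str(pcap), "--out", str(yaf_file)],
        ["rwipfix2silk", f"--silk-output={rw_file}", str(yaf_file)],
        ["rwfileinfo", str(rw_file)],
    ]


def run_pipeline(pcap):
    """Run each step in order, stopping at the first failure."""
    for argv in pipeline_commands(pcap):
        subprocess.run(argv, check=True)


# Building the commands does not execute anything.
commands = pipeline_commands("capture.pcap")
```

Keeping command construction separate from execution makes it easy to log the exact commands in case notes before running them with run_pipeline("capture.pcap").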

Caveats

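  • Sampling: high-throughput exporters often sample 1-in-N packets. Packet and byte totals can be estimated by multiplying by N, but flow counts cannot, because short flows are often missed entirely and each surviving record undercounts its own packets and bytes. Always confirm the sampling rate before drawing volume conclusions.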
  • Asymmetric routing: a flow may be split across two routers if return traffic takes a different path. Expect some single-direction flows.
  • Timestamp granularity: flow timestamps typically have millisecond resolution, so sub-millisecond inference about TCP state changes is not reliable.
  • TCP flags are a union: the flags field ORs every TCP flag seen in the flow. A long-lived connection can show SYN, ACK, FIN, and RST together even though those flags arrived in different packets, so the field alone does not reveal packet order.
  • No payload: flows have packet/byte counts but not payload. For content evidence, pivot to PCAP via tshark using the 5-tuple as the filter.
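
The pivot from a flow record back to packets can be sketched as a small helper that turns a 5-tuple into a Wireshark display filter. tshark_filter() is a hypothetical helper, limited to IPv4 TCP and UDP; the field names (ip.src, ip.dst, tcp.srcport, and so on) are standard Wireshark display-filter fields.

```python
# IANA protocol numbers mapped to Wireshark display-filter prefixes.
PROTO_NAMES = {6: "tcp", 17: "udp"}


def tshark_filter(sip, dip, sport, dport, proto, both_directions=True):
    """Build a display filter matching one flow, optionally with its reverse."""
    if proto not in PROTO_NAMES:
        raise ValueError(f"unsupported protocol number: {proto}")
    name = PROTO_NAMES[proto]

    def one_way(src, dst, src_port, dst_port):
        return (f"ip.src == {src} && ip.dst == {dst} && "
                f"{name}.srcport == {src_port} && {name}.dstport == {dst_port}")

    forward = one_way(sip, dip, sport, dport)
    if not both_directions:
        return forward
    # Flows are unidirectional, so the reply traffic needs the swapped tuple.
    reverse = one_way(dip, sip, dport, sport)
    return f"({forward}) || ({reverse})"


flt = tshark_filter("10.0.0.5", "192.0.2.9", 40000, 80, 6)
```

The resulting string can be passed to tshark as a display filter, for example tshark -r capture.pcap -Y "<filter>".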
