Zeek is a passive network metadata logger and scripting framework. It reads packets, reconstructs sessions, decodes application protocols, and writes one structured log per protocol into the output directory. Investigations start in conn.log and pivot via the connection uid. Pair with tshark for packet-level confirmation.

Side: blue


Run modes

zeek -r capture.pcap                                  # offline: process PCAP, write log dir to cwd
zeek -r capture.pcap LogAscii::use_json=T             # emit JSON instead of TSV (ELK / Splunk friendly)
zeek -i eth0                                          # live capture on interface
zeek -C -r capture.pcap                               # ignore checksum errors (synthetic / replayed PCAPs)

zeek -r writes to the current working directory — always mkdir case && cd case first or you scatter logs. Output is a directory of .log files plus packet_filter.log, loaded_scripts.log, reporter.log (operational metadata).
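
The per-case habit can be wrapped in a small shell function. This is a minimal sketch: the name zeek_case and the one-directory-per-capture layout are illustrative assumptions, not part of Zeek; only zeek -r itself comes from the tool.

```shell
# zeek_case PCAP CASE_DIR [extra zeek args...]
# Hypothetical helper: runs "zeek -r" inside a fresh per-case directory
# so the logs of different captures never mix.
zeek_case() {
    pcap=$1
    case_dir=$2
    shift 2
    # Resolve the PCAP to an absolute path before changing directory.
    case $pcap in
        /*) ;;
        *)  pcap=$PWD/$pcap ;;
    esac
    # mkdir without -p fails if the directory exists, so an old case is never reused.
    mkdir "$case_dir" || return 1
    # Subshell: the caller's working directory is left untouched.
    ( cd "$case_dir" && zeek -r "$pcap" "$@" )
}
```

Usage: zeek_case capture.pcap case-001 LogAscii::use_json=T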


Standard log files

Each protocol the dissectors recognise produces its own log. Connection logs first; pivot via uid.

Log                              Contents                                                                                 Forensic use
conn.log                         every TCP/UDP/ICMP connection: 5-tuple, duration, bytes, history flags                   flow baseline, top talkers, beacon detection, lateral-movement candidates
dns.log                          every DNS query and response                                                             DGAs, tunnelling, spoofed answers, slow exfil
http.log                         requests + replies: method, host, URI, user-agent, MIME, status                          drive-by, exploit kits, IOC harvest
ssl.log                          TLS handshakes: SNI, version, cipher, certificate chain (JA3/JA3S with the ja3 package)  encrypted-traffic triage, JA3 fingerprinting
files.log                        every file extracted or seen in transit: hash, MIME, source/dest, transferring proto     malware drop tracking, hash lookup
dhcp.log                         DHCP transactions including hostname (Opt 12)                                            host identification at a given timestamp
kerberos.log                     AS-REQ / AS-REP / TGS-REQ / TGS-REP                                                      Kerberos timeline, AS-REP roasting, ticket abuse
ntlm.log                         NTLMSSP messages with usernames + challenge/response context                             NTLM relay detection, lateral-movement auth chains
smb_files.log / smb_mapping.log  SMB file ops + tree connects                                                             admin-share access, ransomware spread, tool drops
ssh.log                          SSH client + server versions, auth result inferred from byte pattern                     brute-force candidates, anomalous client banners
ldap.log                         LDAP search/bind activity                                                                AD enumeration
notice.log                       alerts from default detection scripts (scan detection, SSL anomalies, etc.)              script-driven alert channel
weird.log                        protocol parser surprises (fragmentation, malformed records)                             first stop for “something is off”

Log format

TSV with a self-describing header. #fields declares column names; #types declares column types.

#separator	\x09
#path	dns
#fields	ts	uid	id.orig_h	id.orig_p	id.resp_h	id.resp_p	proto	query	...
#types	time	string	addr	port	addr	port	enum	string	...
1591367999.305988	Caz0hH2qDUiJTWMCY	192.168.4.76	36844	192.168.4.1	53	udp	testmyids.com	...

id.orig_h = originator IP (the side that initiated the connection, usually the client), id.resp_h = responder IP. The same convention holds across every Zeek log. Timestamps are epoch seconds with microsecond precision.


zeek-cut (field access)

cat dns.log | zeek-cut id.orig_h query answers
cat dns.log | zeek-cut -d ts id.orig_h query answers       # -d converts epoch to ISO-8601 with TZ
cat conn.log | zeek-cut -d ts id.orig_h id.resp_h id.resp_p duration orig_bytes resp_bytes

zeek-cut parses the #fields header so columns are referenced by name. Faster and more readable than awk-by-index. Field numbers differ per log type — awk '$22' works only when you know the exact log.
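
To make the name-based lookup concrete, here is a rough awk approximation of the idea. It is a sketch for illustration, not how zeek-cut is implemented: the function name fields_by_name is hypothetical, and printing "-" for an unknown column is an assumption of this sketch.

```shell
# fields_by_name FIELD...: read a Zeek TSV log on stdin and print the named
# columns, tab-separated. Illustrative stand-in for zeek-cut, not a replacement.
fields_by_name() {
    awk -F'\t' -v want="$*" '
        BEGIN { OFS = "\t"; n = split(want, names, " ") }
        # The #fields line maps names to positions; $1 is the literal "#fields",
        # so the data column for header field i is i - 1.
        $1 == "#fields" { for (i = 2; i <= NF; i++) col[$i] = i - 1; next }
        /^#/ { next }                      # skip the remaining header lines
        {
            line = ""
            for (j = 1; j <= n; j++) {
                v = (names[j] in col) ? $(col[names[j]]) : "-"
                line = line (j > 1 ? OFS : "") v
            }
            print line
        }'
}
```

Usage: cat dns.log | fields_by_name id.orig_h query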


uid pivot — the core forensic pattern

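Protocol logs carry the uid of the connection that produced them (a base62 string like Caz0hH2qDUiJTWMCY). A few logs store it under a different column name, such as uids in dhcp.log, which a plain grep still matches. Grep across every log to assemble a single connection’s full footprint.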

grep "Caz0hH2qDUiJTWMCY" *.log                            # everything the connection touched

Workflow:

  1. zeek-cut over conn.log → identify suspicious flow, copy its uid
  2. grep <uid> *.log → see every dissected protocol that connection produced
  3. cross-reference files.log for any binary it transferred, dns.log for queries it made
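
Step 2 of the workflow can be summarised per log before reading the raw rows. A minimal sketch, assuming the logs sit in one directory as *.log; the helper name uid_pivot is hypothetical.

```shell
# uid_pivot UID [DIR]: list each log in DIR (default: current directory) that
# mentions UID, with the number of matching rows. Hypothetical helper name.
uid_pivot() {
    uid=$1
    dir=${2:-.}
    for log in "$dir"/*.log; do
        [ -f "$log" ] || continue
        # -F: match the uid as a fixed string; -c: count matching lines.
        # grep exits non-zero on zero matches, hence the "|| true".
        n=$(grep -F -c -- "$uid" "$log" || true)
        if [ "$n" -gt 0 ]; then
            printf '%s\t%s\n' "$(basename "$log")" "$n"
        fi
    done
    return 0
}
```

Usage: uid_pivot Caz0hH2qDUiJTWMCY prints one line per log that saw the connection, which shows at a glance which dissectors fired.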

Reduction idioms

Every line below is cat <log> | zeek-cut <fields> | <pipe>.

Question                                         Pipeline
top originator IPs                               cat conn.log | zeek-cut id.orig_h | sort | uniq -c | sort -rn | head
top destination ports per source                 cat conn.log | zeek-cut id.orig_h id.resp_p | sort | uniq -c | sort -rn | head
longest-duration connections (exfil candidates)  cat conn.log | zeek-cut id.orig_h id.resp_h duration | sort -k3 -rn | head
HTTP request timeline                            cat http.log | zeek-cut -d ts uid method host uri | head
files seen on the wire                           cat files.log | zeek-cut tx_hosts mime_type filename total_bytes md5 sha1
DNS queries by source                            cat dns.log | zeek-cut id.orig_h query | sort | uniq -c | sort -rn | head -20
SSL SNI by source                                cat ssl.log | zeek-cut id.orig_h server_name | sort -u
Kerberos principals seen                         cat kerberos.log | zeek-cut client service | sort -u
NTLM authenticated identities                    cat ntlm.log | zeek-cut id.orig_h username domainname hostname | sort -u
SMB file activity                                cat smb_files.log | zeek-cut id.orig_h path name action | sort -u
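
The idioms above count rows. For volume questions (who sent the most bytes), the same pipeline shape needs a sum instead of uniq -c. A minimal sketch, assuming two tab-separated input columns such as the output of zeek-cut id.orig_h orig_bytes; the helper name sum_bytes is hypothetical.

```shell
# sum_bytes: stdin is two tab-separated columns (key, byte count), e.g. from
#   cat conn.log | zeek-cut id.orig_h orig_bytes
# Prints "total<TAB>key", largest total first. Unset counts ("-") are skipped.
sum_bytes() {
    awk -F'\t' '$2 != "-" { total[$1] += $2 }
                END { for (k in total) printf "%.0f\t%s\n", total[k], k }' |
        sort -t "$(printf '\t')" -k1,1nr -k2,2
}
```

Usage: cat conn.log | zeek-cut id.orig_h orig_bytes | sum_bytes | head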

Connection history flag decode

The history field in conn.log summarises TCP connection lifecycle as a string of letters. Capital = originator, lowercase = responder.

Letter    Meaning
S / s     SYN
H / h     SYN-ACK
A / a     pure ACK
D / d     data packet
F / f     FIN
R / r     RST
T / t     retransmit

Common signatures:

  • S only — half-open SYN (scan, half-open beacon)
  • Sh — SYN, server SYN-ACK, client never replied (port open, target abandoned)
  • ShAdDaFf — full TCP session with bidirectional data and graceful close
  • ShAFf — full TCP session with no data — odd, possibly handshake-only probing
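
Reading long history strings by eye is error-prone. A small decoder sketch follows; it covers only the letters tabulated above and reports anything else as "other" (Zeek defines further letters that are not listed here). The function name decode_history is hypothetical.

```shell
# decode_history STRING: print one line per history letter, naming the side
# (orig = capital, resp = lowercase) and the event.
decode_history() {
    printf '%s\n' "$1" | fold -w 1 | while IFS= read -r c; do
        case $c in
            [[:upper:]]) side=orig ;;
            [[:lower:]]) side=resp ;;
            *)           side=- ;;
        esac
        case $c in
            S|s) ev=SYN ;;
            H|h) ev=SYN-ACK ;;
            A|a) ev=ACK ;;
            D|d) ev=data ;;
            F|f) ev=FIN ;;
            R|r) ev=RST ;;
            T|t) ev=retransmit ;;
            *)   ev=other ;;
        esac
        printf '%s %s\n' "$side" "$ev"
    done
}
```

Usage: decode_history Sh prints "orig SYN" then "resp SYN-ACK", matching the second signature above.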

notice.log

Default detection scripts emit alerts here. Treat as the Snort-equivalent alert channel for behaviour Zeek recognises out of the box.

cat notice.log | zeek-cut -d ts note msg uid
cat notice.log | zeek-cut note | sort | uniq -c | sort -rn       # alert-type frequency

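Common notice types: Scan::Port_Scan, Scan::Address_Scan, SSL::Invalid_Server_Cert, Conn::Content_Gap, Weird::Activity. Which notices fire depends on which scripts are loaded, typically the @load lines in site/local.zeek (location varies by distro). A bare zeek -r loads only the base scripts, so append local to the command line (zeek -r capture.pcap local) to pull in the site policy.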


Pitfalls

  • zeek -r writes logs to cwd. Run from a clean per-case directory or output mixes between captures.
  • zeek -i eth0 requires packet-capture privileges: run as root, or grant the binary cap_net_raw,cap_net_admin=eip with setcap so it can capture as an unprivileged user.
  • TSV uses literal tabs as separators (\x09). awk's default FS splits on any whitespace, so without -F'\t' it silently splits on spaces inside string fields.
  • Long-lived connections show up late: Zeek writes a conn.log row only when the connection ends (close, timeout, or end of input); until then the flow exists in memory only. When sessions are abruptly killed, look for low duration together with conn_state S0 (S0 is a conn_state value, not a history letter).
  • uid collisions can happen across very long captures — but the same uid plus id.orig_h:id.orig_p -> id.resp_h:id.resp_p is unique. Disambiguate when needed.
  • weird.log flags parser disagreements; an attacker actively confusing the dissectors will appear here before anywhere else. Check first when traffic looks “wrong”.
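
The tab-separator pitfall is easy to reproduce without Zeek. A small demonstration, using a made-up three-column row (method, host, user agent) as a stand-in for an http.log record:

```shell
# One tab-separated row whose last field contains spaces (illustrative data).
row=$(printf 'GET\texample.org\tMozilla/5.0 (X11; Linux x86_64)')

# Default FS splits on any whitespace, so the user agent is cut at its first space.
printf '%s\n' "$row" | awk '{ print $3 }'            # prints: Mozilla/5.0

# A tab FS keeps the field whole.
printf '%s\n' "$row" | awk -F'\t' '{ print $3 }'     # prints: Mozilla/5.0 (X11; Linux x86_64)
```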
