Zeek is a passive network metadata logger and scripting framework. It reads packets, reconstructs sessions, decodes application protocols, and emits one structured log per protocol into a single output directory. Investigations start in conn.log and pivot via the connection uid. Pair with tshark for packet-level confirmation.
Side: blue
Run modes
zeek -r capture.pcap # offline: process PCAP, write log dir to cwd
zeek -r capture.pcap LogAscii::use_json=T # emit JSON instead of TSV (ELK / Splunk friendly)
zeek -i eth0 # live capture on interface
zeek -C -r capture.pcap # ignore checksum errors (synthetic / replayed PCAPs)
zeek -r writes to the current working directory, so always mkdir case && cd case first or you scatter logs. Output is a directory of .log files plus packet_filter.log, loaded_scripts.log, reporter.log (operational metadata).
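The per-case habit above can be wrapped in a small helper. This is a minimal sketch: zeek_case is a hypothetical function name, not part of Zeek, and it assumes zeek is on PATH. Extra arguments are passed through to zeek unchanged.

```shell
# Hypothetical helper (not shipped with Zeek): process a PCAP inside its own
# case directory so logs from different captures never mix.
zeek_case() {
  case_dir=$1
  pcap=$2
  shift 2
  # Resolve the PCAP to an absolute path before changing directory.
  pcap_dir=$(cd "$(dirname "$pcap")" && pwd) || return 1
  pcap_abs="$pcap_dir/$(basename "$pcap")"
  mkdir -p "$case_dir" || return 1
  # The subshell keeps the caller's working directory unchanged.
  ( cd "$case_dir" && zeek -r "$pcap_abs" "$@" )
}

# Usage:
#   zeek_case cases/incident-01 capture.pcap -C
#   zeek_case cases/incident-01-json capture.pcap LogAscii::use_json=T
```

Resolving the absolute path first matters because a relative PCAP path would no longer point at the file after the cd.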
Standard log files
Each protocol the dissectors recognise produces its own log. Connection logs first; pivot via uid.
| Log | Contents | Forensic use |
|---|---|---|
| conn.log | every TCP/UDP/ICMP connection: 5-tuple, duration, bytes, history flags | flow baseline, top talkers, beacon detection, lateral-movement candidates |
| dns.log | every DNS query and response | DGAs, tunnelling, spoofed answers, slow exfil |
| http.log | requests + replies: method, host, URI, user-agent, MIME, status | drive-by, exploit kits, IOC harvest |
| ssl.log | TLS handshakes: SNI, version, cipher, certificate chain, plus JA3/JA3S when the ja3 package is loaded | encrypted-traffic triage, JA3 fingerprinting |
| files.log | every file extracted or seen in transit: hash, MIME, source/dest, transferring proto | malware drop tracking, hash lookup |
| dhcp.log | DHCP transactions including hostname (Opt 12) | host identification at a given timestamp |
| kerberos.log | AS-REQ / AS-REP / TGS-REQ / TGS-REP | Kerberos timeline, AS-REP roasting, ticket abuse |
| ntlm.log | NTLMSSP messages with usernames + challenge/response context | NTLM relay detection, lateral-movement auth chains |
| smb_files.log / smb_mapping.log | SMB file ops + tree connects | admin-share access, ransomware spread, tool drops |
| ssh.log | SSH client + server versions, auth result inferred from byte pattern | brute-force candidates, anomalous client banners |
| ldap.log | LDAP search/bind activity | AD enumeration |
| notice.log | alerts from default detection scripts (scan detection, SSL anomalies, etc.) | script-driven alert channel |
| weird.log | protocol parser surprises (fragmentation, malformed records) | first stop for “something is off” |
Log format
TSV with a self-describing header. #fields declares column names; #types declares column types.
#separator \x09
#path dns
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto query ...
#types time string addr port addr port enum string ...
1591367999.305988 Caz0hH2qDUiJTWMCY 192.168.4.76 36844 192.168.4.1 53 udp testmyids.com ...
id.orig_h = source IP, id.resp_h = destination IP. Same convention across every Zeek log. Timestamps are epoch seconds with microsecond precision.
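Because the header is self-describing, a column can be looked up by name even on a host without zeek-cut. The sketch below is an illustration of that idea using awk; zeek_col is a hypothetical helper name, and it assumes the default tab separator shown in the #separator line.

```shell
# Hypothetical helper: print one column of a Zeek TSV log, selected by the
# name declared in its #fields header.
zeek_col() {
  # usage: zeek_col <field-name> <log-file>
  awk -F'\t' -v want="$1" '
    # "#fields" is itself the first tab-separated token on the header line,
    # so header position i maps to data column i - 1.
    /^#fields/ { for (i = 2; i <= NF; i++) if ($i == want) col = i - 1; next }
    /^#/       { next }          # skip the remaining header/footer lines
    col        { print $col }    # prints nothing if the field was not found
  ' "$2"
}

# Usage:
#   zeek_col query dns.log | sort | uniq -c | sort -rn | head
```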
zeek-cut (field access)
cat dns.log | zeek-cut id.orig_h query answers
cat dns.log | zeek-cut -d ts id.orig_h query answers # -d converts epoch to ISO-8601 with TZ
cat conn.log | zeek-cut -d ts id.orig_h id.resp_h id.resp_p duration orig_bytes resp_bytes
zeek-cut parses the #fields header so columns are referenced by name. Faster and more readable than awk-by-index. Field numbers differ per log type, so awk '$22' works only when you know the exact log.
uid pivot — the core forensic pattern
Zeek’s per-connection protocol logs inherit the originating connection’s uid (a base62 string like Caz0hH2qDUiJTWMCY). Grep across every log to assemble a single connection’s full footprint.
grep "Caz0hH2qDUiJTWMCY" *.log # everything the connection touched
Workflow:
- zeek-cut over conn.log → identify suspicious flow, copy its uid
- grep <uid> *.log → see every dissected protocol that connection produced
- cross-reference files.log for any binary it transferred, dns.log for queries it made
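The second step of the workflow can be condensed into a per-log summary before reading the raw rows. A minimal sketch, assuming all logs sit in one directory; uid_footprint is a hypothetical helper name.

```shell
# Hypothetical helper: list which logs mention a uid and how many lines match.
uid_footprint() {
  # usage: uid_footprint <uid> [log-dir]
  # /dev/null is added so grep always prefixes counts with the file name,
  # even when only one .log file exists.
  grep -c -F -- "$1" "${2:-.}"/*.log /dev/null | awk -F: '$NF > 0'
}

# Usage:
#   uid_footprint Caz0hH2qDUiJTWMCY
#   uid_footprint Caz0hH2qDUiJTWMCY /cases/incident-01
```

The counts are matching lines per log, which gives a quick sense of which protocols the connection exercised before opening each file.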
Reduction idioms
Every line below is cat <log> | zeek-cut <fields> | <pipe>.
| Question | Pipeline |
|---|---|
| top originator IPs | cat conn.log \| zeek-cut id.orig_h \| sort \| uniq -c \| sort -rn \| head |
| top destination ports per source | cat conn.log \| zeek-cut id.orig_h id.resp_p \| sort \| uniq -c \| sort -rn \| head |
| longest-duration connections (exfil candidates) | cat conn.log \| zeek-cut id.orig_h id.resp_h duration \| sort -k3 -rn \| head |
| HTTP requests timeline | cat http.log \| zeek-cut -d ts uid method host uri \| head |
| files seen on the wire | cat files.log \| zeek-cut tx_hosts mime_type filename total_bytes md5 sha1 |
| DNS queries by source | cat dns.log \| zeek-cut id.orig_h query \| sort \| uniq -c \| sort -rn \| head -20 |
| SSL SNI by source | cat ssl.log \| zeek-cut id.orig_h server_name \| sort -u |
| Kerberos principals seen | cat kerberos.log \| zeek-cut client service \| sort -u |
| NTLM authenticated identities | cat ntlm.log \| zeek-cut id.orig_h username domainname hostname \| sort -u |
| SMB files written | cat smb_files.log \| zeek-cut id.orig_h path name action \| sort -u |
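When a count from uniq -c is not enough, the same pipeline shape can end in an awk aggregation. The sketch below sums a numeric column per key, for example bytes sent per originator; sum_by_key is a hypothetical helper name, and treating the unset marker "-" as zero is an assumption of this example.

```shell
# Hypothetical helper: sum the second column per distinct first column of
# tab-separated input, largest total first. Unset values ("-") count as zero.
sum_by_key() {
  awk -F'\t' '
    { total[$1] += ($2 == "-" ? 0 : $2) }
    END { for (k in total) printf "%s\t%.0f\n", k, total[k] }
  ' | sort -t "$(printf '\t')" -k2,2nr
}

# Intended use (needs zeek-cut and a conn.log in the current directory):
#   cat conn.log | zeek-cut id.orig_h orig_bytes | sum_by_key | head
```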
Connection history flag decode
The history field in conn.log summarises TCP connection lifecycle as a string of letters. Capital = originator, lowercase = responder.
| Letter | Meaning |
|---|---|
| S / s | SYN |
| H / h | SYN-ACK |
| A / a | pure ACK |
| D / d | data packet |
| F / f | FIN |
| R / r | RST |
| T / t | retransmit |
Common signatures:
- S only: half-open SYN (scan, half-open beacon)
- Sh: SYN, server SYN-ACK, client never replied (port open, target abandoned)
- ShAdDaFf: full TCP session with bidirectional data and graceful close
- ShAFf: full TCP session with no data; odd, possibly handshake-only probing
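Reading long history strings by eye is error-prone, so the table above can be turned into a small decoder. A minimal sketch covering only the letters listed in the table; decode_history is a hypothetical helper name, and any other character is reported as unknown rather than guessed.

```shell
# Hypothetical helper: expand a conn.log history string using the letter table
# above. Uppercase is reported as orig (originator), lowercase as resp.
decode_history() {
  printf '%s\n' "$1" | awk '
    BEGIN {
      m["S"] = "SYN"; m["H"] = "SYN-ACK"; m["A"] = "ACK"; m["D"] = "data"
      m["F"] = "FIN"; m["R"] = "RST";     m["T"] = "retransmit"
    }
    {
      out = ""
      for (i = 1; i <= length($0); i++) {
        c = substr($0, i, 1)
        u = toupper(c)
        side = (c == u) ? "orig" : "resp"
        name = (u in m) ? m[u] : "unknown(" c ")"
        out = out (i > 1 ? " " : "") side ":" name
      }
      print out
    }'
}

decode_history ShAdDaFf
# prints: orig:SYN resp:SYN-ACK orig:ACK resp:data orig:data resp:ACK orig:FIN resp:FIN
```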
notice.log
Default detection scripts emit alerts here. Treat as the Snort-equivalent alert channel for behaviour Zeek recognises out of the box.
cat notice.log | zeek-cut -d ts note msg uid
cat notice.log | zeek-cut note | sort | uniq -c | sort -rn # alert-type frequency
Common notice types: Scan::Port_Scan, Scan::Address_Scan, SSL::Invalid_Server_Cert, Conn::Content_Gap, Weird::Activity. Which of these fire depends on the policy scripts loaded, typically through site/local.zeek; /usr/share/zeek/policy/protocols/conn/known-services.zeek (or distro equivalent) is one such policy script, and it feeds known_services.log rather than notice.log.
Pitfalls
- zeek -r writes logs to cwd. Run from a clean per-case directory or output mixes between captures.
- zeek -i eth0 requires capture privileges, usually granted with setcap cap_net_raw,cap_net_admin=eip on the binary rather than by running as root.
- TSV uses literal tabs as separators (\x09). awk without -F'\t' falls back to its default FS of any whitespace and silently splits on spaces inside string fields.
- Connections that never closed: Zeek writes a conn.log row when the connection ends; until then the flow is in memory only. Look for low duration + conn_state S0 when sessions are abruptly killed.
- uid collisions can happen across very long captures, but the same uid plus id.orig_h:id.orig_p -> id.resp_h:id.resp_p is unique. Disambiguate when needed.
- weird.log flags parser disagreements; an attacker actively confusing the dissectors will appear here before anywhere else. Check first when traffic looks “wrong”.
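The tab-separator pitfall is easy to reproduce. The row below is synthetic (ts, uid, a user-agent containing spaces, status code) and exists only to show how the two field separators disagree about column 4.

```shell
# Synthetic http.log-style row: ts, uid, user_agent, status_code (tab-separated).
row=$(printf '1591367999.305988\tCaz0hH2qDUiJTWMCY\tMozilla/5.0 (X11; Linux)\t200')

# Default FS splits on the spaces inside user_agent, so $4 is a fragment of it.
default_fs=$(printf '%s\n' "$row" | awk '{ print $4 }')

# Tab FS keeps user_agent intact, so $4 is the status code.
tab_fs=$(printf '%s\n' "$row" | awk -F'\t' '{ print $4 }')

printf 'default FS: %s\ntab FS: %s\n' "$default_fs" "$tab_fs"
# prints:
#   default FS: (X11;
#   tab FS: 200
```

Neither command reports an error, which is why the wrong column goes unnoticed until the numbers stop making sense.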
links:
Field Manual | Network Security Monitoring | Network Forensics | tshark