Chapter 9: Packet Analysis and Protocol Investigation

Learning Objectives

Extract files and artifacts from TCP streams using PCAP captures and Wireshark
Identify intrusion key elements from PCAP data: source/destination addresses, ports, protocols, and payloads
Interpret protocol headers for intrusion analysis across Ethernet, IPv4/IPv6, TCP, UDP, ICMP, DNS, HTTP, SMTP, and ARP
Apply basic regular expressions to filter and identify security artifacts in Wireshark and SIEM platforms

Section 1: PCAP Analysis and File Extraction

A PCAP (Packet CAPture) file is a binary file containing timestamped network frames. Two formats are common: the original libpcap format and PCAPNG (Wireshark's default), which supports multiple interfaces and extended metadata.
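
The two formats can be told apart from the first four bytes of the file. The sketch below is illustrative only: the function name `detect_capture_format` is hypothetical, and it checks only the magic numbers published for libpcap and pcapng rather than validating the rest of the header.

```python
# Magic numbers from the libpcap and pcapng file format definitions.
PCAP_MAGICS = {
    b"\xa1\xb2\xc3\xd4": "pcap (big-endian, microsecond timestamps)",
    b"\xd4\xc3\xb2\xa1": "pcap (little-endian, microsecond timestamps)",
    b"\xa1\xb2\x3c\x4d": "pcap (big-endian, nanosecond timestamps)",
    b"\x4d\x3c\xb2\xa1": "pcap (little-endian, nanosecond timestamps)",
}
PCAPNG_SHB = b"\x0a\x0d\x0d\x0a"  # block type of the pcapng Section Header Block


def detect_capture_format(header: bytes) -> str:
    """Classify a capture file from its first four bytes (hypothetical helper)."""
    magic = header[:4]
    if magic == PCAPNG_SHB:
        return "pcapng"
    return PCAP_MAGICS.get(magic, "unknown")
```

In practice the header would come from something like `open(path, "rb").read(4)`; checking it before analysis catches truncated or mislabeled evidence files early.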

Capture Tools

Tool	Role	Primary Use Case
`tcpdump`	CLI capture and filter	Headless servers, scripted capture, IR triage
Wireshark	GUI capture and analysis	Deep inspection, stream reassembly, file extraction
`tshark`	CLI version of Wireshark	Scripted analysis, automation, SIEM integration
NetworkMiner	Passive analysis	Artifact extraction, host profiling

Common tcpdump Capture Commands

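# Capture all traffic on eth0, starting a new file roughly every 100 MB
# (-C counts in units of 1,000,000 bytes; later files get a numeric suffix.
#  strftime patterns such as %Y%m%d in the -w name are only expanded with -G)
tcpdump -i eth0 -w /evidence/capture.pcap -C 100 -Z analyst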

# Capture only traffic to/from a suspected C2 server
tcpdump -i eth0 host 203.0.113.55 -w /evidence/c2-traffic.pcap

# Capture DNS and HTTP for exfiltration analysis
tcpdump -i eth0 'port 53 or port 80' -w /evidence/dns-http.pcap

TCP Stream Reassembly in Wireshark

TCP is stream-oriented — application data is segmented across multiple packets. Wireshark's reassembly engine merges segments in sequence number order.
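
To make the idea of sequence-number ordering concrete, here is a deliberately simplified sketch. It is not Wireshark's implementation: the function `reassemble` is hypothetical, it handles one direction of one connection, and it ignores sequence-number wraparound.

```python
def reassemble(segments, initial_seq):
    """Rebuild a byte stream from (seq, payload) pairs given in arrival order.

    Simplified sketch: stops at the first gap and drops retransmitted bytes.
    """
    stream = bytearray()
    next_seq = initial_seq  # sequence number of the next byte we expect
    for seq, data in sorted(segments, key=lambda s: s[0]):
        if seq + len(data) <= next_seq:
            continue  # everything in this segment was already seen
        if seq > next_seq:
            break  # a segment is missing, so the stream cannot continue
        stream += data[next_seq - seq:]  # skip any overlapping prefix
        next_seq = seq + len(data)
    return bytes(stream)
```

Out-of-order and duplicated segments are common in real captures, which is why the payloads are sorted and de-duplicated before being concatenated.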

Following a TCP Stream: Right-click any packet → Follow > TCP Stream. Switch to Raw or Hex mode for binary data; in Raw mode the stream can be saved to disk for further analysis. A stream that begins with the bytes MZ (4D 5A) is most likely a Windows PE executable.

Exporting HTTP Objects

For HTTP traffic: File > Export Objects > HTTP lists every reassembled HTTP object — HTML pages, images, executables, ZIP archives. This is far more efficient than manual stream following.

Encrypted Traffic Limitations

Approach	Requirement	Notes
Pre-master secret log	Browser key logging pre-configured	Set `SSLKEYLOGFILE` env variable
Server private key	Access to TLS private key	Only works for non-PFS cipher suites
Endpoint inspection	EDR/agent on the host	Captures data before encryption
JA3/JA3S fingerprinting	Metadata only	Identifies TLS client/server fingerprints

[Figure: PCAP Incident Response Triage Workflow]

Key Points — Section 1

PCAP vs PCAPNG: PCAPNG is Wireshark's extended format supporting multi-interface captures and metadata comments.
TCP Reassembly: Wireshark automatically merges segments; right-click any packet and select Follow > TCP Stream to see the full conversation.
MZ magic bytes (4D 5A) at the start of a TCP stream payload identify a Windows PE executable being transferred.
File > Export Objects > HTTP recovers all reassembled HTTP objects (files, scripts, archives) from a PCAP in one step.
Recovering plaintext from encrypted traffic requires key material (SSLKEYLOGFILE or the server private key) or endpoint visibility (EDR); without either, fall back to metadata such as JA3 fingerprints.

Pre-Check — Section 1

1. What are the magic bytes (hex) that identify a Windows PE executable at the start of a TCP stream payload?

50 4B 03 04
4D 5A
FF D8 FF
25 50 44 46

2. Which Wireshark menu path lets you extract all HTTP objects (files, images, scripts) from a PCAP in one step?

Edit > Find Packet
Analyze > Follow > HTTP Stream
File > Export Objects > HTTP
Statistics > Protocol Hierarchy

Section 2: Intrusion Key Elements in Packet Data

The 5-Tuple

Every network session is uniquely identified by its 5-tuple:

Element	Description	Example
Source IP	Originating host address	192.168.1.105
Destination IP	Target host address	203.0.113.55
Source Port	Originating application port	54231 (ephemeral)
Destination Port	Target service port	443 (HTTPS)
Protocol	Layer 4 protocol	TCP (6)
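
When correlating events across tools, it helps to treat both directions of a session as the same key. The following is a minimal sketch under that assumption; `FiveTuple` and `session_key` are hypothetical names, and the addresses are the example values from the table above.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IANA protocol number, e.g. 6 = TCP, 17 = UDP


def session_key(ft: FiveTuple) -> tuple:
    """Direction-independent key: request and reply map to the same value."""
    endpoint_a = (ft.src_ip, ft.src_port)
    endpoint_b = (ft.dst_ip, ft.dst_port)
    low, high = sorted([endpoint_a, endpoint_b])
    return (low, high, ft.protocol)


request = FiveTuple("192.168.1.105", "203.0.113.55", 54231, 443, 6)
reply = FiveTuple("203.0.113.55", "192.168.1.105", 443, 54231, 6)
```

Here `session_key(request)` and `session_key(reply)` are equal, so packets, firewall log entries, and IDS alerts for the same conversation can be grouped together.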

Port/Service Anomalies

Scenario	What to Look For
Non-standard port for common service	HTTP on port 8888, SSH on port 2222
Legitimate port, wrong protocol	IRC traffic on port 443, DNS tunneling on port 53
High ephemeral-range destination port	Possible reverse shell or C2 callback
Sequential port scanning pattern	SYN packets to consecutive destination ports
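
The last row of the table can be checked programmatically once SYN packets have been exported from a capture. This is a rough sketch, not a production detector: the function name, the input tuple layout, and the `min_run` threshold of 5 are all assumptions chosen for illustration.

```python
from collections import defaultdict


def find_sequential_scans(syn_events, min_run=5):
    """Find sources that sent SYNs to runs of consecutive destination ports.

    syn_events: iterable of (src_ip, dst_ip, dst_port) for SYN-only packets.
    Returns {(src_ip, dst_ip): longest_run} for runs of at least min_run ports.
    """
    ports = defaultdict(set)
    for src, dst, port in syn_events:
        ports[(src, dst)].add(port)

    hits = {}
    for pair, seen in ports.items():
        longest = run = 0
        prev = None
        for port in sorted(seen):
            # Extend the run if this port directly follows the previous one.
            run = run + 1 if prev is not None and port == prev + 1 else 1
            longest = max(longest, run)
            prev = port
        if longest >= min_run:
            hits[pair] = longest
    return hits
```

Scanners that randomize port order over time still end up covering consecutive ports, which is why the ports are collected into a set before looking for runs.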

Payload Inspection — File Magic Bytes

Magic Bytes (Hex)	File Type
`4D 5A`	Windows PE executable (EXE/DLL)
`50 4B 03 04`	ZIP archive
`25 50 44 46`	PDF document
`FF D8 FF`	JPEG image
`89 50 4E 47`	PNG image
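
The table above maps directly onto a prefix lookup. The helper below is a small sketch (the name `identify_payload` is hypothetical) that covers only the five signatures listed in this section.

```python
# Signatures taken from the magic-byte table in this section.
MAGIC_BYTES = [
    (bytes.fromhex("4D5A"), "Windows PE executable (EXE/DLL)"),
    (bytes.fromhex("504B0304"), "ZIP archive"),
    (bytes.fromhex("25504446"), "PDF document"),
    (bytes.fromhex("FFD8FF"), "JPEG image"),
    (bytes.fromhex("89504E47"), "PNG image"),
]


def identify_payload(payload: bytes) -> str:
    """Return the file type whose signature matches the start of the payload."""
    for magic, label in MAGIC_BYTES:
        if payload.startswith(magic):
            return label
    return "unknown"
```

A match is a hint rather than proof: ZIP-based formats such as DOCX and JAR share the `50 4B 03 04` signature, so follow up with a tool like `file` on the extracted object.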

Protocol Anomalies

ICMP packet with 1,400-byte payload (a default ping carries 32–56 bytes) → possible ICMP tunneling or data exfiltration
DNS response > 512 bytes over UDP without EDNS0 → malformed response or amplification attempt
HTTP/1.1 GET with no Host: header → scanner or RFC violation (HTTP/1.1 requires Host)
TVqQ at the start of an HTTP response body → Base64-encoded MZ header (PE executable delivery)
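
The TVqQ indicator follows from how Base64 encodes the first three bytes of a typical PE file. The sketch below shows the derivation; the helper name `looks_like_base64_pe` is hypothetical, and the sample header bytes are the common start of a DOS header rather than data from a real capture.

```python
import base64

# Any payload starting with 4D 5A ("MZ") encodes to one of these prefixes.
# The third character depends on the top two bits of the byte after "MZ".
MZ_B64_PREFIXES = ("TVo", "TVp", "TVq", "TVr")


def looks_like_base64_pe(text: str) -> bool:
    """True if the text starts like a Base64-encoded MZ header."""
    return text.lstrip().startswith(MZ_B64_PREFIXES)


typical_header = b"MZ\x90\x00\x03\x00"  # common first bytes of a PE file
encoded = base64.b64encode(typical_header).decode("ascii")
```

With this header `encoded` begins with `TVqQ`, which is why that string is the usual search term; matching on the shorter prefixes also catches PE files whose third byte differs.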

Key Points — Section 2

The 5-tuple (src IP, dst IP, src port, dst port, protocol) is the universal session anchor across Wireshark, firewall logs, IDS alerts, and SIEM events.
Combining IP reputation, port anomaly detection, and payload magic-byte inspection builds a picture of attacker intent; the first two still work when the payload is encrypted.
Base64 string TVqQ in an HTTP body signals a PE executable is being delivered encoded.
Wireshark filter to narrow traffic toward a 5-tuple: ip.addr == 203.0.113.55 && tcp.port == 443; add ip.src, ip.dst, tcp.srcport, and tcp.dstport terms to pin down a single session.
Multiple internal hosts connecting to the same external IP can indicate centralized C2 or lateral movement infrastructure.

Pre-Check — Section 2

3. An analyst sees an ICMP Echo Request with a 1,400-byte payload. What does this most likely indicate?

Normal network diagnostics
Data exfiltration via ICMP covert channel
A fragmentation attack
A SYN flood in progress

4. The five elements of a network session 5-tuple are: src IP, dst IP, src port, dst port, and ___?

VLAN ID
TTL value
Protocol (Layer 4)
MAC address

Section 3: Protocol Header Analysis

Ethernet and ARP

Field	Size	Intrusion Relevance
Destination MAC	6 bytes	Broadcast (`FF:FF:FF:FF:FF:FF`) used in ARP scans
Source MAC	6 bytes	Spoofed MACs indicate MAC flooding or evasion
EtherType	2 bytes	`0x0800`=IPv4; `0x86DD`=IPv6; `0x0806`=ARP
VLAN Tag (802.1Q)	4 bytes (optional)	Double-tagging (`0x8100`) = VLAN hopping attack

ARP Spoofing: An attacker broadcasts gratuitous ARP replies claiming a victim IP maps to the attacker's MAC. Detection: multiple ARP replies for the same IP from different MACs in rapid succession.
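
The detection rule in the previous paragraph can be sketched as a pass over exported ARP replies. Everything here is illustrative: the function name, the `(timestamp, sender_ip, sender_mac)` input layout, and the 10-second window are assumptions, not values from a standard.

```python
from collections import defaultdict


def find_arp_conflicts(arp_replies, window=10.0):
    """Flag IPs claimed by different MACs within `window` seconds.

    arp_replies: iterable of (timestamp, sender_ip, sender_mac) tuples.
    Returns {ip: sorted list of conflicting MAC addresses}.
    """
    last_seen = {}  # ip -> (timestamp, mac) of the most recent reply
    conflicts = defaultdict(set)
    for ts, ip, mac in sorted(arp_replies):
        mac = mac.lower()
        if ip in last_seen:
            prev_ts, prev_mac = last_seen[ip]
            if prev_mac != mac and ts - prev_ts <= window:
                conflicts[ip].update({prev_mac, mac})
        last_seen[ip] = (ts, mac)
    return {ip: sorted(macs) for ip, macs in conflicts.items()}
```

A hit is not always malicious (failover pairs and DHCP reassignments can look similar), so confirm against known gateway MAC addresses before escalating.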

IPv4 TTL Baseline

OS	Default TTL
Linux/Unix	64
Windows	128
Cisco IOS	255
Solaris	255

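Routers decrement the TTL by one per hop, so the observed value is the sender's default minus the hop count. A packet arriving with TTL=64 that claims to come from a Windows machine (default 128) is suspicious: it may be spoofed, or the OS fingerprint may be masked.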
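
Because each router decrements the TTL by one, an observed value can be mapped back to the nearest default at or above it. The sketch below applies that reasoning to the baseline table; the function name `infer_initial_ttl` is hypothetical, and the result is an estimate, since TTL defaults can be changed by an administrator.

```python
# Default initial TTL values from the baseline table in this section.
DEFAULT_TTLS = {64: "Linux/Unix", 128: "Windows", 255: "Cisco IOS / Solaris"}


def infer_initial_ttl(observed_ttl: int):
    """Return (initial_ttl, os_guess, hop_count) for an observed TTL.

    Picks the smallest default TTL that is >= the observed value.
    """
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    for initial in sorted(DEFAULT_TTLS):
        if observed_ttl <= initial:
            return initial, DEFAULT_TTLS[initial], initial - observed_ttl
```

For example, an observed TTL of 117 suggests a Windows sender about 11 hops away, while an observed TTL of exactly 64 from a host that should be running Windows would imply an implausible 64-hop path.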

TCP Flag Analysis

Flag Pattern	Legitimate Use	Malicious Use
SYN only	Connection initiation	SYN flood (DoS), SYN scan (Nmap -sS)
SYN + FIN	Never valid	Stealth scan — evade stateful FW
No flags (NULL)	Never valid	NULL scan (Nmap -sN)
FIN only	Never valid in isolation	FIN scan (Nmap -sF)
URG+PSH+FIN (Xmas)	Never valid	Xmas scan (Nmap -sX)
RST flood	Error condition	DoS against TCP sessions
ACK only	Established session	ACK scan for firewall mapping

# SYN flood detection
tcp.flags.syn == 1 && tcp.flags.ack == 0

# Stealth scan (invalid SYN+FIN)
tcp.flags.syn == 1 && tcp.flags.fin == 1

# NULL scan
tcp.flags == 0x000

# Xmas scan
tcp.flags == 0x029
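
The hex values in these filters are bitmasks over the TCP flags field. The following sketch decodes them in the same way; `classify_flags` is a hypothetical helper, and unlike the display filters above it masks off everything except the six classic flag bits before comparing.

```python
# Bit positions of the six classic TCP flags.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def classify_flags(flags: int) -> str:
    """Map a TCP flags value to the scan signatures discussed in this section."""
    flags &= 0x3F  # ignore ECN-related bits above URG
    if flags == 0:
        return "NULL scan"
    if flags == (FIN | PSH | URG):
        return "Xmas scan"
    if flags & SYN and flags & FIN:
        return "invalid SYN+FIN"
    if flags == SYN:
        return "SYN only (connection attempt, SYN scan, or SYN flood)"
    if flags == FIN:
        return "FIN only (possible FIN scan)"
    return "no scan signature"
```

Note that `FIN | PSH | URG` evaluates to 0x29, which is where the `tcp.flags == 0x029` filter value comes from.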

[Figure: TCP Flag Combinations: Legitimate vs. Malicious]

Key Points — Section 3

SYN+FIN, NULL, and Xmas flag combinations are never valid in legitimate TCP — their presence in a PCAP is an immediate alert.
TTL baseline analysis: Linux defaults to TTL=64, Windows to TTL=128. A mismatch between claimed OS and observed TTL suggests spoofing.
ARP spoofing detection: Multiple ARP replies for the same IP from different MACs in rapid succession — use filter arp.duplicate-address-detected.
ICMP tunneling is detected by payload size anomalies: icmp.type == 8 && data.len > 100.
VLAN double-tagging (0x8100 0x8100 EtherType) is the hallmark of a VLAN hopping attack.

Pre-Check — Section 3

5. A packet arrives claiming to originate from a Windows system, but its IP TTL is 64. What does this indicate?

The packet traveled through exactly 64 routers
The packet may be spoofed; the Windows default TTL is 128, not 64
The packet is fragmented
The source IP is on a different subnet

6. Which TCP flag combination is used by Nmap's Xmas scan?

SYN + ACK
SYN + FIN
URG + PSH + FIN
No flags (NULL)

Section 4: Application Protocol Analysis

DNS Tunneling Detection

DNS (UDP/TCP port 53) is almost universally allowed through firewalls, making it a favored channel for data exfiltration and C2. Data is encoded into DNS query subdomain labels:

aGVsbG8gd29ybGQ.attacker-c2.com
dGhpcyBpcyBleGZpbA.attacker-c2.com

Indicator	Normal DNS	DNS Tunneling
Query name length	< 30 characters	Often > 50 characters
Query frequency	Low, irregular	High, regular (beaconing)
Record types	A, AAAA, MX, CNAME	TXT, NULL, heavy TXT responses
Unique subdomains	Few per domain	Hundreds of unique subdomains
Query names	Human-readable	Random/encoded strings

# Long DNS query names (key heuristic for tunneling/DGA)
dns.qry.name.len > 36

# TXT record queries (commonly abused for tunneling)
dns.qry.type == 16
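
Several indicators from the table (unique subdomains, long names, TXT usage) only become visible when queries are grouped per domain. The sketch below does that grouping; the function name is hypothetical, and taking the last two labels as the base domain is a simplifying assumption that breaks for suffixes such as co.uk.

```python
from collections import defaultdict


def summarize_dns_queries(queries, name_len_threshold=36):
    """Aggregate tunneling indicators per base domain.

    queries: iterable of (qname, qtype) pairs, qtype as the numeric record type.
    """
    stats = defaultdict(
        lambda: {"unique_names": set(), "long_names": 0, "txt_queries": 0}
    )
    for qname, qtype in queries:
        name = qname.rstrip(".")
        # Naive base domain: the last two labels of the query name.
        base = ".".join(name.lower().split(".")[-2:])
        entry = stats[base]
        entry["unique_names"].add(name)
        if len(name) > name_len_threshold:
            entry["long_names"] += 1
        if qtype == 16:  # TXT record
            entry["txt_queries"] += 1
    return {
        base: {
            "unique_names": len(entry["unique_names"]),
            "long_names": entry["long_names"],
            "txt_queries": entry["txt_queries"],
        }
        for base, entry in stats.items()
    }
```

A domain with hundreds of unique names, many of them long, stands out immediately in this summary even when no single query looks alarming on its own.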

HTTP Header Analysis

Header	Legitimate Value	Suspicious Value
`User-Agent`	Browser string (Mozilla/5.0...)	Missing, empty, or tool signature (sqlmap, Nikto)
`Host`	Target domain name	IP address directly, or missing entirely
`Referer`	Previous page URL	Missing when expected, URL injection attempts
`X-Forwarded-For`	Legitimate proxy chain	Spoofed IPs, multiple forged headers

SMTP Exfiltration Extraction

# Step 1: Filter SMTP traffic
smtp

# Step 2: Follow TCP Stream, find Base64 attachment block
# Step 3: Decode the attachment
base64 -d attachment.b64 > attachment.bin
file attachment.bin

C2 Beaconing Pattern

Regular intervals between identical-length packets (e.g., every 60 seconds, 200-byte POST)
Fixed destination IP and port
Minimal variation in packet size (automated, not human-driven)
Detection in Wireshark: use Statistics > I/O Graph — beaconing shows as regular spikes
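
Regularity can also be quantified rather than judged by eye. The sketch below uses the coefficient of variation of inter-arrival times as a rough measure; this metric, the function names, and the 0.1 cutoff are assumptions for illustration, and real implants often add jitter that calls for a looser threshold.

```python
from statistics import mean, pstdev


def beacon_score(timestamps):
    """Coefficient of variation of inter-arrival times (near 0 = very regular)."""
    times = sorted(timestamps)
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    if len(intervals) < 2:
        return None  # not enough data to judge regularity
    average = mean(intervals)
    if average == 0:
        return None
    return pstdev(intervals) / average


def is_beacon_like(timestamps, max_cv=0.1):
    """True when connection times are regular enough to resemble beaconing."""
    score = beacon_score(timestamps)
    return score is not None and score <= max_cv
```

The timestamps would typically be the start times of connections that share a destination IP and port, exported from the capture or from proxy logs.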

[Figure: DNS Tunneling: Data Exfiltration Flow]

Key Points — Section 4

DNS tunneling detection: dns.qry.name.len > 36 and dns.qry.type == 16 (TXT queries) are the primary Wireshark filters.
A missing or empty User-Agent HTTP header is a strong indicator of an automated scanner, bot, or custom C2 client.
C2 beaconing produces statistically regular traffic intervals — visible as uniform spikes in Wireshark's I/O graph.
SMTP attachments in PCAP can be extracted by following the TCP stream, locating the Base64 block, and decoding with base64 -d.
Unencrypted POP3 (port 110) and IMAP (port 143) transmit credentials in cleartext — filtering pop || imap may reveal harvested credentials.

Pre-Check — Section 4

7. Which Wireshark filter is the primary heuristic for detecting DNS tunneling based on query name length?

dns.qry.type == 1
dns.qry.name.len > 36
dns.flags.response == 0
dns.count.answers > 5

8. In HTTP traffic, what does a missing or empty User-Agent header typically indicate?

The client is using HTTP/2
The request is from a mobile device
An automated scanner, bot, or custom C2 client
The connection is encrypted with TLS

Section 5: Regular Expressions for Security Analysis

Core Regex Syntax

Metacharacter	Meaning	Example	Matches
`.`	Any single character	`a.c`	abc, a1c, a_c
`*`	Zero or more of preceding	`ab*c`	ac, abc, abbc
`+`	One or more of preceding	`ab+c`	abc, abbc (not ac)
`?`	Zero or one of preceding	`colou?r`	color, colour
`^`	Start of string/line	`^GET`	Lines starting with GET
`$`	End of string/line	`\.php$`	Strings ending in .php
`[]`	Character class	`[0-9]`	Any digit
`{n,m}`	Quantifier range	`\d{1,3}`	1 to 3 digits
`(?i)`	Case-insensitive flag	`(?i)malware`	MALWARE, Malware, malware

Security-Relevant Patterns

# IPv4 address (strict)
\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b

# IPv4 for log extraction (practical)
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}

# URL pattern
https?://[^\s"'<>]+

# Suspicious TLD domains (phishing/malware)
\b\w+\.(xyz|top|club|work|gq|tk|ml|cf|ga)\b

# Email address
[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}
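
The difference between the strict and practical IPv4 patterns is easiest to see by running both over the same text. The snippet below is a small illustration using Python's `re` module; the log line and the host name `bad.example` are made up for the example.

```python
import re

# Patterns copied from the list above.
IPV4_STRICT = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b"
)
IPV4_LOOSE = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")
URL = re.compile(r"https?://[^\s\"'<>]+")

# Hypothetical log line containing one value that is not a valid address.
log_line = 'src=10.1.2.50 dst=203.0.113.55 note=999.1.1.1 url="http://bad.example/a.exe"'

strict_hits = IPV4_STRICT.findall(log_line)
loose_hits = IPV4_LOOSE.findall(log_line)
url_hits = URL.findall(log_line)
```

The loose pattern also returns `999.1.1.1`, while the strict pattern rejects it; the loose form is usually fine for extraction from trusted log formats, and the strict form is better for validation.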

Wireshark PCRE Display Filters

Note: recent Wireshark releases reject unrecognized backslash escapes such as \. inside double-quoted strings. If a filter below fails to parse, double the backslash ("\\.") or use a raw string (r"\.").

# C2 keywords across all frame bytes
frame matches "(?i)(beacon|c2|callback|malware)"

# Encoded PowerShell in HTTP URIs
http.request.uri matches "(?i)(powershell|cmd\.exe|base64)"

# Suspicious file extensions
http.request.uri matches "\.(exe|dll|bat|ps1|vbs|jar)$"

# DNS tunneling: long subdomain names
dns.qry.name matches "^[a-zA-Z0-9]{30,}\."

# SQL injection attempts
http.request.uri matches "(?i)(union.*select|or\s+1=1|drop\s+table)"

# Empty User-Agent (scanner/bot); for a missing header use: http.request && !http.user_agent
http.user_agent matches "^$"

SIEM (Splunk) Rex Integration

# Extract src IPs from firewall logs
index=firewall sourcetype=asa
| rex field=_raw "src=(?P<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| stats count by src_ip

# Web shell access patterns
index=web_logs
| regex uri="(?i)\.(php|aspx|jsp)\?.*cmd="

# Beaconing detection (flags more than 10 requests per minute from one host to one domain)
index=proxy
| rex field=url "^https?://(?P<domain>[^/]+)"
| bucket _time span=1m
| stats count by _time, domain, src_ip
| where count > 10
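
Python's `re` module accepts the same `(?P<name>...)` syntax, which makes it convenient for prototyping a rex pattern before deploying it. The sketch below mirrors the first search above; `count_by_src_ip` is a hypothetical helper and the sample lines in the usage note are invented.

```python
import re
from collections import Counter

# Same named capture group as the Splunk rex example.
SRC_IP = re.compile(r"src=(?P<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})")


def count_by_src_ip(lines):
    """Rough equivalent of: rex field=_raw "src=..." | stats count by src_ip"""
    counts = Counter()
    for line in lines:
        match = SRC_IP.search(line)
        if match:
            counts[match.group("src_ip")] += 1
    return counts
```

Called with lines such as `"Deny tcp src=10.1.2.50 dst=203.0.113.55"`, it returns a counter keyed by source address, and lines without a `src=` field are skipped.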

Threat Hunting Patterns

# C2 beaconing — regular PHP POST intervals
http.request.method == "POST" && http.request.uri matches "^/[a-z]{4,8}\.php$"

# Credential harvesting — cleartext
http.authorization matches "Basic "
ftp.request.command == "PASS"
pop.request.command == "PASS" || imap.request matches "LOGIN"

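# Lateral movement: SMB between internal hosts (10.0.0.0/8 shown; adjust to your ranges)
# Parentheses group the protocols; IP fields are compared with CIDR notation, not regex
(smb || smb2) && ip.src == 10.0.0.0/8 && ip.dst == 10.0.0.0/8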

Key Points — Section 5

Wireshark uses PCRE (Perl-Compatible Regular Expressions) via the matches operator (also ~); the filter bar turns green for valid syntax.
The (?i) flag enables case-insensitive matching — critical since attackers vary casing to evade string-based signatures.
Named capture groups (?P<name>...) in Splunk rex extract fields for correlation and statistical analysis.
Regex finds patterns; combining with statistical thresholds (count, rate, frequency) converts patterns into actionable alerts.
Patterns developed in Wireshark carry over to SIEM platforms with minor syntax adjustments, so one pattern can serve both packet analysis and log searching.

Post-Check — Sections 1–5

9. What Wireshark filter detects suspicious file downloads by matching executable extensions in HTTP request URIs?

http.request.method == "GET"
http.request.uri matches "\.(exe|dll|bat|ps1|vbs|jar)$"
tcp.port == 80 && data.len > 1000
frame matches "MZ"

10. In Splunk's rex command, what is the purpose of named capture groups like (?P<src_ip>...)?

They make the regex case-insensitive
They extract matched substrings as named fields for further analysis
They anchor the pattern to the start of the string
They define alternation between two patterns

11. Which of the following is the correct Wireshark filter to detect SQL injection attempts in HTTP URIs?

http.request.uri contains "select"
http.request.uri matches "(?i)(union.*select|or\s+1=1|drop\s+table)"
http.response.code == 500
tcp.payload matches "sql"

12. An analyst wants to extract all IP addresses from Splunk firewall logs. Which Splunk command should they use?

stats count by src
regex _raw="ip"
rex field=_raw "src=(?P<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
iplocation src_ip

13. What does the Wireshark filter tcp.flags == 0x000 detect?

A SYN flood attack
An RST flood
A NULL scan (Nmap -sN): no TCP flags set
An Xmas scan

14. A host at 10.1.2.50 is generating 847 DNS queries in 30 minutes, all to subdomains of c2.evildomain.xyz, with subdomain labels 40–60 characters long. What attack is most likely occurring?

DNS amplification DDoS
Cache poisoning attack
DNS tunneling for data exfiltration
BGP route hijacking

15. To decrypt HTTPS traffic in Wireshark when using a browser, which approach should an analyst set up before capturing?

Set the SSLKEYLOGFILE environment variable to capture the pre-master secret
Use JA3 fingerprinting
Follow the TLS stream in Wireshark
Export objects from the HTTPS session