Chapter 3: Security Monitoring and Data Visibility
Learning Objectives
Compare attack surface analysis with vulnerability assessment methodologies
Identify data types generated by tcpdump, NetFlow, NGFW, stateful firewalls, and content filters
Explain how NAT, tunneling, encryption, TOR, and P2P impact security monitoring visibility
Describe uses of full packet capture, session data, metadata, and alert data in security monitoring
Pre-Quiz — Test Your Current Knowledge
1. An analyst wants to know which internal host initiated a suspicious outbound connection, but the firewall only logs a public IP. What additional data source is required?
Full packet capture of the session
NAT/PAT translation table correlated with the timestamp
URL filtering log for the destination domain
NetFlow byte count for the flow
2. Which statement best describes the difference between attack surface analysis and vulnerability assessment?
Attack surface analysis enumerates all potential entry points; vulnerability assessment evaluates which are exploitable
Both are identical processes; the terms are interchangeable
Vulnerability assessment is performed first, then attack surface analysis refines the results
3. A SOC analyst needs to reconstruct the exact contents of a file that was exfiltrated over HTTP. Which data type must be available?
NetFlow IPFIX records for the session
NGFW application log showing the URL category
Full packet capture (PCAP) of the session
Session metadata from Zeek conn.log
4. What capability distinguishes a next-generation firewall from a stateful firewall?
Stateful firewalls track TCP connection state; NGFWs do not
NGFWs enforce policy at Layer 7 using DPI and application identity; stateful firewalls only inspect Layer 3/4
NGFWs block all encrypted traffic; stateful firewalls allow it through
Stateful firewalls are hardware-only; NGFWs are software-only
5. Why does TLS encryption present the greatest visibility challenge in modern SOC operations?
TLS prevents NetFlow from recording flow statistics
Over 95% of web traffic is now HTTPS, hiding Layer 7 payload content from network monitoring tools
TLS randomizes source and destination ports on every connection
TLS prevents the SIEM from correlating log timestamps
6. What data does NetFlow capture that makes it suitable for detecting C2 beaconing behavior?
Full application payload of each C2 command
TLS certificate details of the C2 server
Flow timing, byte counts, and destination IP across many sessions over time
HTTP User-Agent string used by the implant
7. An employee uploads 2 GB of source code to personal cloud storage over HTTPS. Without SSL inspection, what does a SOC analyst see in NetFlow?
The file names and types transferred
A large outbound flow to a cloud provider IP — suspicious but with no content detail
A DLP alert identifying the data as Confidential
The HTTP POST body containing the file data
8. Which TOR traffic attribute CAN be observed by a SOC analyst on the enterprise perimeter?
The destination .onion address being contacted
The contents of messages passed through the circuit
The connection from the internal workstation to a known TOR entry guard IP
The identity of the exit node serving the session
9. Which data type has the longest practical retention period and supports User and Entity Behavior Analytics (UEBA)?
Full packet capture
Session/transaction data
Raw syslog from switches
Endpoint process memory dumps
10. DNS tunneling is difficult to block at the firewall level because:
DNS uses TCP port 443, which is universally allowed
DNS port 53 UDP is almost universally permitted outbound, so blocking it breaks legitimate name resolution
DNS traffic is always encrypted end-to-end
Firewalls cannot inspect UDP traffic
11. AVC (Application Visibility and Control) in an NGFW identifies applications by:
Port number alone — port 80 always means HTTP
Behavioral signatures, protocol structure, and DPI — regardless of port used
User-supplied application name in the packet header
Comparing source IP against a known application server list
12. In the three-tier SOC workflow, full packet capture is used at which stage?
Tier 1 — continuously, as the primary baselining tool
Tier 2 — to classify initial alert severity
Tier 3 — on-demand for forensic investigation after an incident is confirmed suspicious
It is not used in tiered workflows — only in standalone forensics labs
13. What is the primary security risk introduced by load balancers for SOC visibility?
Load balancers encrypt all traffic, preventing inspection
Perimeter monitoring sees the VIP address, not which backend server processed the request
Load balancers block NGFW DPI capabilities
Load balancers prevent NetFlow from recording flow statistics
14. JA3/JA3S hashes help SOC analysts because they:
Decrypt TLS sessions without requiring the private key
Identify client and server TLS implementations from ClientHello parameters, flagging known malware families without decryption
Provide the full URL of HTTPS requests
Block all connections with self-signed certificates
15. DLP is superior to NetFlow alone for detecting insider exfiltration because:
DLP captures more bytes per session record than NetFlow
DLP (with SSL inspection) classifies content type and sensitivity — identifying what is being sent, not just that a large upload occurred
DLP prevents all outbound HTTPS traffic by default
NetFlow cannot detect upload sessions
Section 1: Attack Surface and Vulnerability Analysis
1.1 Defining the Attack Surface
The attack surface is the complete set of entry points — physical, logical, and human — through which an adversary could attempt unauthorized access. It falls into three categories:
| Category | Examples |
| --- | --- |
| Network attack surface | Open ports, exposed services, external-facing APIs, wireless access points |
| Software attack surface | Web applications, firmware, OS services, third-party libraries |
| Human attack surface | Phishing targets, social engineering vectors, insider access |
Attack surface analysis answers: "What can an attacker reach?" It is a discovery exercise, not an assessment of weakness. The goal is enumeration of all potential entry points, then prioritization based on exposure, asset value, and likelihood of targeting.
1.2 Vulnerability Assessment vs. Penetration Testing
Every element of the attack surface must have a corresponding monitoring data source. Unmapped assets create blind spots. The pipeline is:
```mermaid
flowchart LR
A["Attack Surface Map\n(Enumerate all entry points:\nports, services, APIs,\nwireless, humans)"] --> B["Visibility Gap Analysis\n(Cross-reference each asset\nagainst existing sensor coverage;\nidentify unmonitored elements)"]
B --> C["Sensor Placement\n(Deploy NetFlow, PCAP taps,\nNGFW, DLP, email filter\nto close gaps)"]
C --> D["Monitored Perimeter\n(Every surface element\nhas a data source;\nblind spots eliminated)"]
style A fill:#0a1628,stroke:#58a6ff,color:#c9d1d9
style B fill:#0a1628,stroke:#388bfd,color:#c9d1d9
style C fill:#0a1628,stroke:#1f6feb,color:#c9d1d9
style D fill:#0a200a,stroke:#56d364,color:#c9d1d9
```
Key Points — Section 1
Attack surface analysis enumerates entry points; vulnerability assessment evaluates exploitability — these are distinct, sequential activities.
Penetration testing goes further than vulnerability scanning: it chains weaknesses and simulates adversary goals, requiring high skill and carrying moderate production risk.
Every unmonitored attack surface element is a blind spot — sensor placement strategy should follow the attack surface map.
NetFlow at WAN egress misses east-west lateral movement; PCAP at web servers misses internal-only threats — coverage must match surface topology.
Attack surface is dynamic: every new service, API, or user account expands it and requires re-evaluation of monitoring coverage.
Section 2: Security Monitoring Technologies
2.1 tcpdump and Full Packet Capture
tcpdump is a command-line packet analyzer that captures complete packets — Layer 2 frame through full application payload — writing to PCAP files. Wireshark provides a GUI for the same data. Together they represent the forensic gold standard.
PCAP captures everything: headers, TCP flags, sequence numbers, and full application-layer content. An HTTP session PCAP contains exact request/response bytes including credentials, file contents, and commands. The tradeoff is storage: a busy 1 Gbps link generates 100–500 GB/day, making PCAP impractical as a continuous baseline — it is a targeted, on-demand forensic instrument.
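Because the classic PCAP file format is simple and fixed, even the standard library can inspect a capture file's 24-byte global header. A minimal sketch, assuming the little-endian classic libpcap format (magic 0xa1b2c3d4) rather than the newer pcapng:

```python
import struct

PCAP_MAGIC = 0xa1b2c3d4  # classic libpcap, microsecond timestamps

def read_pcap_header(data: bytes) -> dict:
    """Parse the 24-byte libpcap global header that tools like
    tcpdump -w write at the start of every capture file."""
    magic, vmaj, vmin, _thiszone, _sigfigs, snaplen, linktype = (
        struct.unpack("<IHHiIII", data[:24]))
    if magic != PCAP_MAGIC:
        raise ValueError("not a little-endian classic pcap file")
    return {"version": (vmaj, vmin), "snaplen": snaplen,
            "linktype": linktype}
```

The snaplen field records how many bytes of each packet were captured; a snaplen below the MTU means the capture was truncated and full payload reconstruction is impossible.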
2.2 NetFlow and IPFIX
NetFlow captures flow-level summaries defined by the five-tuple: src IP, dst IP, src port, dst port, IP protocol. For every unique five-tuple, the exporter records aggregate statistics: byte count, packet count, start/end timestamps.
| Field | Description |
| --- | --- |
| srcaddr / dstaddr | Source and destination IPv4 addresses |
| srcport / dstport | Source and destination Layer 4 ports |
| prot | IP protocol (6=TCP, 17=UDP, 1=ICMP) |
| dPkts / dOctets | Packet count and byte count |
| First / Last | Flow start and end timestamps |
| tcp_flags | Cumulative OR of TCP flags in the flow |
NetFlow does not capture payload. At ~200 bytes/record it offers months of retention. IPFIX (RFC 7011) is the IETF-standardized successor to NetFlow v9, extending fields to include application IDs, URLs, and vendor-specific elements.
Security use cases: detecting port scans, top talkers (DDoS indicators), lateral movement (unusual internal-to-internal connections), baselines for anomaly detection, and beaconing (regular low-volume flows to a single external IP).
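As an illustration of scan detection on flow data alone, the sketch below groups hypothetical flow records (plain dicts reusing the v5 field names above) by source address and flags any source touching an unusual number of distinct destination sockets. The threshold of 100 is an arbitrary illustration, not a vendor default:

```python
from collections import defaultdict

def detect_port_scans(flows, port_threshold=100):
    """Flag source IPs that contact an unusually high number of
    distinct (destination IP, destination port) pairs -- a classic
    scan signature visible in NetFlow without any payload."""
    targets_by_src = defaultdict(set)
    for f in flows:  # each flow: dict with srcaddr/dstaddr/dstport
        targets_by_src[f["srcaddr"]].add((f["dstaddr"], f["dstport"]))
    return {src: len(targets)
            for src, targets in targets_by_src.items()
            if len(targets) >= port_threshold}
```

In production this logic runs continuously inside a flow collector, but the core idea is exactly this set-cardinality test per source.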
2.3 NGFW vs. Stateful Firewalls
A stateful firewall tracks TCP session state and enforces rules based on Layer 3/4 information only. It cannot distinguish legitimate HTTPS from malware over port 443.
A next-generation firewall (NGFW) extends to Layer 7 via Deep Packet Inspection (DPI), identifying the actual application regardless of port. Application Visibility and Control (AVC) classifies Facebook, BitTorrent, Dropbox, and thousands of apps by behavioral signatures.
Key Points — Section 2
PCAP = forensic ground truth; NetFlow = scalable flow summaries; NGFW logs = application and threat context at Layer 7. All three are needed.
PCAP storage cost (100–500 GB/day on 1 Gbps) makes it a targeted instrument, not a continuous baseline tool.
NetFlow enables beaconing and scan detection by analyzing flow patterns over time, even though it contains no payload.
A stateful firewall cannot distinguish legitimate HTTPS from malware on port 443 — that requires NGFW DPI/AVC.
NGFW AVC identifies applications by behavioral signature — a BitTorrent client on port 443 is still identified as BitTorrent.
Section 3: Content Filtering and Visibility
3.1 Web Content Filtering
Web content filters intercept HTTP/HTTPS requests and enforce policy based on URL categories, reputation scores, and content analysis. Every request generates a log entry: timestamp, user, destination URL, category, action, bytes transferred.
```mermaid
sequenceDiagram
participant User as User Workstation
participant Filter as Web Filter / Proxy
participant Intel as URL Category Engine
participant Log as SIEM / Log Store
participant Web as External Web Server
User->>Filter: HTTPS request (SNI: example.com)
Filter->>Intel: Classify URL + check reputation
Intel-->>Filter: Category: Malware C2 / Score: High Risk
alt Blocked by Policy
Filter-->>User: Block page returned
Filter->>Log: BLOCKED | user | URL | category | timestamp
else Allowed by Policy
Filter->>Web: Forward request (SSL re-encrypted)
Web-->>Filter: Response
Filter-->>User: Response forwarded
Filter->>Log: ALLOWED | user | URL | category | bytes | timestamp
end
```
A URL filter log showing 50 requests to "Command and Control" category URLs over 10 minutes is a high-fidelity indicator of compromise that no NetFlow record alone would reveal. Because most web traffic is now HTTPS, filters must perform SSL inspection (MITM via enterprise CA) to classify encrypted content; without it, only SNI and certificate details are visible.
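The 50-requests-in-10-minutes indicator can be checked mechanically against filter logs. A minimal sketch, assuming log entries are available as dicts with `timestamp` and `category` fields (an assumed schema, not any specific product's):

```python
from datetime import datetime, timedelta

def c2_burst(entries, window=timedelta(minutes=10), threshold=50):
    """Return True if any sliding window of `window` length contains
    at least `threshold` requests categorized as Command and Control."""
    times = sorted(e["timestamp"] for e in entries
                   if e["category"] == "Command and Control")
    for start in times:
        hits = sum(1 for t in times if start <= t < start + window)
        if hits >= threshold:
            return True
    return False
```

A SIEM correlation rule expresses the same logic declaratively; the point is that category plus time, not payload, carries the detection.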
3.2 Email Filtering
Email security gateways examine sender authentication (SPF, DKIM, DMARC), reputation, content, and rewrite embedded links for click-time analysis. Email filter logs often provide the first-stage incident timeline: when a phishing email arrived, whether it was delivered or quarantined, and whether any user clicked the link.
| Event Type | Security Value |
| --- | --- |
| Blocked phishing email | High — blocked initial access attempt |
| Malicious attachment quarantined | High — evidence of targeted delivery |
| URL click after delivery | Critical — user engaged with potentially malicious link |
| Mass delivery from new sender | Medium — BEC or campaign indicator |
3.3 Data Loss Prevention (DLP)
DLP monitors data in motion, at rest, and in use to detect unauthorized exfiltration. Integrated with NGFWs and content filters, DLP adds semantic context that packet and flow data alone cannot provide.
| Data State | Monitoring Focus |
| --- | --- |
| Data at rest | Endpoint and file server scans for sensitive data patterns |
| Data in use | Clipboard, print, USB transfer activity on endpoints |
Detection methods include pattern matching (regex for SSNs, credit cards), document fingerprinting, classification label enforcement, and machine learning. Without DLP, an insider exfiltrating 2 GB of source code to personal cloud storage appears as a large upload to a cloud provider IP — suspicious but ambiguous. With DLP and SSL inspection, the session is classified as Google Drive, file types identified, content flagged as Confidential, and the session blocked with a SIEM alert generated.
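Pattern matching is usually paired with checksum validation to cut false positives. The sketch below uses a deliberately simplified card-number regex plus the standard Luhn algorithm, which DLP engines commonly apply before raising an alert:

```python
import re

# Simplified pattern: 13-16 digits, optionally separated by spaces/dashes
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_valid(number: str) -> bool:
    """Luhn checksum over the digits of `number` (separators ignored).
    A raw regex hit that fails Luhn is almost certainly not a card."""
    digits = [int(d) for d in number if d.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text: str):
    """Return regex matches that also pass the Luhn check."""
    return [m.group() for m in CARD_RE.finditer(text)
            if luhn_valid(m.group())]
```

Production DLP adds document fingerprinting and label checks on top; this shows only the pattern-plus-validation layer.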
Key Points — Section 3
Web filter logs add semantic context — not just that traffic occurred, but what category and what action was taken (blocked C2, allowed social media).
Without SSL inspection, web filters can only see SNI and certificate details for HTTPS traffic — URL path and content remain hidden.
Email filter logs provide the initial access timeline for phishing incidents: delivery time, quarantine status, and click-through events.
DLP with SSL inspection transforms an ambiguous "large upload to cloud" NetFlow entry into a confirmed Confidential data exfiltration alert.
Content filtering layers together transform raw network data into actionable intelligence about what the traffic means, not just that it occurred.
Section 4: Data Visibility Challenges
4.1 NAT and PAT
PAT (NAT overload) maps all internal hosts to a single public IP, differentiated only by source port. External logs show all enterprise traffic originating from one IP. Forensic attribution requires a four-step chain:
```mermaid
flowchart LR
A["External Alert\nSuspicious connection from\n203.0.113.10:4501 at 02:14:33"] --> B["Step 1: Query Firewall Log\nConfirm public IP, port,\ntimestamp, direction"]
B --> C["Step 2: Query PAT Table\nPublic 203.0.113.10:4501\n→ Private 10.1.1.55:52341"]
C --> D["Step 3: Query DHCP Leases\nPrivate 10.1.1.55\n→ MAC aa:bb:cc:dd:ee:ff\n→ Hostname DESKTOP-FINANCE03"]
D --> E["Attribution Complete\nInternal host identified"]
style A fill:#2d1a1a,stroke:#f85149,color:#c9d1d9
style B fill:#0a1628,stroke:#58a6ff,color:#c9d1d9
style C fill:#0a1628,stroke:#388bfd,color:#c9d1d9
style D fill:#0a1628,stroke:#1f6feb,color:#c9d1d9
style E fill:#0a200a,stroke:#56d364,color:#c9d1d9
```
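The same chain reduces to a pair of lookups. The tables below are hypothetical stand-ins for firewall PAT logs and DHCP lease records exported to a SIEM, populated with the values from the diagram:

```python
# Hypothetical lookup tables standing in for exported PAT logs
# and DHCP lease records; keys/values mirror the diagram above.
pat_table = {
    ("203.0.113.10", 4501, "2024-03-01T02:14:33"): ("10.1.1.55", 52341),
}
dhcp_leases = {
    "10.1.1.55": ("aa:bb:cc:dd:ee:ff", "DESKTOP-FINANCE03"),
}

def attribute(public_ip, public_port, timestamp):
    """Walk the attribution chain: public socket + time -> private IP
    -> MAC/hostname. Returns None if any link in the chain is missing,
    illustrating why a single lost log source breaks attribution."""
    translation = pat_table.get((public_ip, public_port, timestamp))
    if translation is None:
        return None
    private_ip, _private_port = translation
    lease = dhcp_leases.get(private_ip)
    return (private_ip, *lease) if lease else None
```

Real PAT and DHCP correlation must also handle lease churn and port reuse, so the timestamp match is range-based rather than exact; the structure of the chain is the same.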
4.2 Tunneling and Protocol Encapsulation
DNS tunneling exploits the near-universal allowance of port 53 UDP outbound. Attackers encode commands and exfiltrated data as DNS query strings to an attacker-controlled authoritative server. Detection indicators include: high query volume to a single domain, extremely long query strings, high-entropy query names (base64), uncommon record types (TXT, NULL), and low TTL with frequent re-queries.
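The length and entropy indicators can be combined into a simple per-query heuristic. Both thresholds below are illustrative assumptions, not tuned production values:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character; base64-encoded payloads score
    far higher than dictionary-word hostnames."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def looks_like_tunnel(qname: str,
                      max_label_len=40, entropy_threshold=3.5) -> bool:
    """Flag a query whose leftmost label is both very long and
    high-entropy -- the signature of data encoded into DNS names."""
    label = qname.split(".")[0]
    return (len(label) > max_label_len
            and shannon_entropy(label) > entropy_threshold)
```

Volume, record type, and TTL indicators from the list above would be layered on top; a single query is suggestive, a sustained pattern is the detection.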
HTTP/HTTPS tunneling sends C2 traffic over port 80/443. Without SSL inspection, NGFW and IDS tools see only connection metadata — the payload is invisible. NetFlow shows the connection; nothing shows the content.
4.3 Encryption — SSL/TLS
As of 2024, over 95% of web traffic is HTTPS. Network monitoring tools can observe the destination IP/port, TLS handshake metadata, SNI field, certificate details, and flow-level statistics. They cannot observe HTTP paths, API bodies, file contents, or authentication tokens without active decryption.
NGFW SSL inspection acts as a TLS proxy: it terminates the client session, inspects the plaintext, and re-encrypts to the server. This requires deploying a trusted enterprise CA certificate to all managed endpoints, and it cannot be applied to sessions that use certificate pinning.
Encrypted Traffic Analysis (ETA) / JA3 fingerprinting infers application type and threat presence from packet size distributions, inter-arrival timing, flow duration, and TLS ClientHello parameters — without decryption.
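A JA3 fingerprint is simply an MD5 digest over the ClientHello parameters, serialized with fields comma-separated and values within a field dash-separated. A minimal sketch (real implementations also strip GREASE values before hashing):

```python
import hashlib

def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Serialize ClientHello parameters as
    'Version,Ciphers,Extensions,Curves,PointFormats' (values within
    a field joined by '-') and return the MD5 hex digest."""
    fields = [str(version)] + [
        "-".join(str(v) for v in vals)
        for vals in (ciphers, extensions, curves, point_formats)
    ]
    return hashlib.md5(",".join(fields).encode()).hexdigest()
```

Because a malware family's TLS library rarely changes, its JA3 hash stays stable across campaigns even as IPs and domains rotate, which is what makes the fingerprint useful for blocklists.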
4.4 TOR (The Onion Router)
TOR routes traffic through three relay hops (entry guard, middle relay, exit node) with layered encryption so no single relay knows both source and destination.
4.5 P2P Traffic
P2P networks use dynamic ports and distributed architecture. BitTorrent traffic resembles certain malware behaviors (many short flows to many IPs), creating false positives. Sophisticated attackers use P2P protocols as covert channels to disguise exfiltration as ordinary file transfer traffic. NGFW AVC detects P2P by behavioral signatures regardless of port.
4.6 Load Balancing
Load balancers distribute connections across backends, translating external connections to a VIP. Perimeter NetFlow and PCAP see only the VIP, not which backend server processed a specific request. Visibility requires sensors on backend segments, load balancer correlation logs, or per-server application logging.
Key Points — Section 4
PAT attribution requires a four-step chain: firewall log → PAT table → DHCP lease → hostname. Any missing link breaks attribution.
DNS tunneling exploits universal port 53 allowance; detection relies on behavioral analysis (query length, entropy, volume) not port blocking.
SSL/TLS hides Layer 7 content; SSL inspection (NGFW as TLS proxy) restores visibility but requires enterprise CA deployment and is inapplicable to pinned certificates.
TOR's only observable indicator at the perimeter is the connection to a known entry guard IP — destination, content, and relay chain are all hidden.
No single mitigation closes all visibility gaps — layered approaches (SSL inspection + behavioral analytics + NAT log correlation) are required.
Section 5: Security Data Types and Their Uses
5.1 Full Packet Capture
Full PCAP is the forensic ground truth: every bit of every packet including headers and payload. Storage: 100–500 GB/day. Retention: 7–30 days. Analysis requires expert tools (Wireshark, tcpdump, NetworkMiner, Zeek).
When PCAP is required: malware payload extraction and reverse engineering, file reconstruction from sessions, validating true positives (was data actually transferred?), attack chain reconstruction for legal proceedings, and decrypting captured TLS sessions when keys are available.
5.2 Session and Transaction Data
Session data (e.g., Zeek conn.log) captures summary attributes without payload: source/destination IP and port, service, bytes in/out, connection duration, and state. HTTP sessions include method, URI, host, user agent, and response code. Record size: under 100 bytes. Retention: 12+ months. Supports UEBA (User and Entity Behavior Analytics) to identify unusual session patterns over time.
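Session records like Zeek's conn.log are tab-separated text, so triage tooling can consume them directly. A sketch that maps the leading columns of the default conn.log layout into a dict:

```python
def parse_conn_line(line: str) -> dict:
    """Parse one tab-separated Zeek conn.log record into the summary
    fields used for triage. Key names follow Zeek's default conn.log
    column order for the leading fields."""
    fields = line.rstrip("\n").split("\t")
    keys = ["ts", "uid", "id.orig_h", "id.orig_p",
            "id.resp_h", "id.resp_p", "proto", "service"]
    return dict(zip(keys, fields))
```

The `uid` field links the same connection across Zeek's other logs (http.log, ssl.log, dns.log), which is what makes session data such an efficient pivot point.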
5.3 Statistical and Flow Data (NetFlow/IPFIX)
Flow-level statistical data enables: volumetric anomaly detection, scan detection (port sweep patterns), beaconing detection (regular low-volume flows to external IP at consistent intervals), and top-N bandwidth analysis.
Beaconing example: A C2 implant checks in every 300 seconds with 200-byte HTTP GET requests. NetFlow shows 288 flows/day, each 200 bytes, to the same external IP, evenly spaced. Invisible in a single log view but immediately apparent when plotted over time.
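One way to quantify "evenly spaced" is the coefficient of variation of inter-arrival times. The sketch below assumes flow start times as plain numbers (e.g., epoch seconds); treating a score near zero as machine-regular check-in behavior is an illustrative heuristic, not a fixed standard:

```python
from statistics import mean, pstdev

def beacon_score(timestamps):
    """Coefficient of variation (stdev / mean) of gaps between flow
    start times. Near 0 means machine-regular intervals (likely
    beaconing); human-driven traffic scores much higher."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not gaps or mean(gaps) == 0:
        return None
    return pstdev(gaps) / mean(gaps)
```

For the 300-second implant above, every gap is identical, so the score is exactly 0; real implants add jitter, which raises the score slightly but still leaves it far below browsing traffic.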
5.4 Metadata
Metadata = data about communications that describes structure and behavior without revealing content. Includes TLS certificate details, DNS query/response pairs, HTTP header fields (User-Agent, Content-Type), and SMTP envelope information.
Why metadata matters even without decryption: JA3/JA3S TLS fingerprints identify malware families from ClientHello parameters; certificate SANs can reveal malicious infrastructure; HTTP User-Agent strings identify attacker frameworks (Cobalt Strike default profiles); DNS timing supports tunneling detection.
5.5 Alert Data
Alert data is generated when detection rules, signatures, or anomaly thresholds fire. Components: alert ID, timestamp, triggering rule, severity and confidence, source/destination attributes, and evidence snippet. Alert data triggers the SOC workflow tiers.
5.6 Data Type Summary
| Data Type | Content | Size/Record | Retention | Primary SOC Use |
| --- | --- | --- | --- | --- |
| Full PCAP | Complete packets + payload | ~1500 bytes avg | 7–30 days | Forensics, validation |
| NetFlow/IPFIX | 5-tuple + stats, no payload | ~150–200 bytes | Months | Baselining, anomaly detection |
| Session/transaction | Connection summary, no payload | <100 bytes | 12+ months | Triage, UEBA |
| Metadata | Headers, certs, query data | Variable (small) | 12+ months | Encrypted traffic analysis |
| Alert data | Security event, signature match | <1 KB | Years | Immediate triage |
Key Points — Section 5
PCAP is forensic ground truth but cannot scale as a continuous tool — it is retrieved on-demand after Tier 2 triage confirms a suspicious event.
Session data (<100 bytes) and NetFlow (~200 bytes) enable 12-month+ retention, making long-term behavioral analysis (UEBA, beaconing) practical.
Metadata enables encrypted traffic analysis: JA3 fingerprints identify malware TLS implementations; DNS timing detects tunneling — all without decryption.
Alert data is the trigger, not the investigation — it points analysts to the right data sources; session and PCAP data provide the answers.
The three-tier model (continuous monitoring → alert triage → forensic PCAP) balances cost, scalability, and depth for each phase of the SOC workflow.
Post-Quiz — Test Your Understanding
1. An analyst wants to know which internal host initiated a suspicious outbound connection, but the firewall only logs a public IP. What additional data source is required?
Full packet capture of the session
NAT/PAT translation table correlated with the timestamp
URL filtering log for the destination domain
NetFlow byte count for the flow
2. Which statement best describes the difference between attack surface analysis and vulnerability assessment?