Chapter 3: Security Monitoring and Data Visibility
Learning Objectives
Compare attack surface analysis with vulnerability assessment methodologies
Identify data types generated by tcpdump, NetFlow, NGFW, stateful firewalls, and content filters
Explain how NAT, tunneling, encryption, TOR, and P2P impact security monitoring visibility
Describe uses of full packet capture, session data, metadata, and alert data in security monitoring
Pre-Quiz — Test Your Current Knowledge
1. An analyst wants to know which internal host initiated a suspicious outbound connection, but the firewall only logs a public IP. What additional data source is required?
Full packet capture of the session
NAT/PAT translation table correlated with the timestamp
URL filtering log for the destination domain
NetFlow byte count for the flow
2. Which statement best describes the difference between attack surface analysis and vulnerability assessment?
Attack surface analysis enumerates all potential entry points; vulnerability assessment evaluates which are exploitable
Both are identical processes; the terms are interchangeable
Vulnerability assessment is performed first, then attack surface analysis refines the results
3. A SOC analyst needs to reconstruct the exact contents of a file that was exfiltrated over HTTP. Which data type must be available?
NetFlow IPFIX records for the session
NGFW application log showing the URL category
Full packet capture (PCAP) of the session
Session metadata from Zeek conn.log
4. What capability distinguishes a next-generation firewall from a stateful firewall?
Stateful firewalls track TCP connection state; NGFWs do not
NGFWs enforce policy at Layer 7 using DPI and application identity; stateful firewalls only inspect Layer 3/4
NGFWs block all encrypted traffic; stateful firewalls allow it through
Stateful firewalls are hardware-only; NGFWs are software-only
5. Why does TLS encryption present the greatest visibility challenge in modern SOC operations?
TLS prevents NetFlow from recording flow statistics
Over 95% of web traffic is now HTTPS, hiding Layer 7 payload content from network monitoring tools
TLS randomizes source and destination ports on every connection
TLS prevents the SIEM from correlating log timestamps
6. What data does NetFlow capture that makes it suitable for detecting C2 beaconing behavior?
Full application payload of each C2 command
TLS certificate details of the C2 server
Flow timing, byte counts, and destination IP across many sessions over time
HTTP User-Agent string used by the implant
7. An employee uploads 2 GB of source code to personal cloud storage over HTTPS. Without SSL inspection, what does a SOC analyst see in NetFlow?
The file names and types transferred
A large outbound flow to a cloud provider IP — suspicious but with no content detail
A DLP alert identifying the data as Confidential
The HTTP POST body containing the file data
8. Which TOR traffic attribute CAN be observed by a SOC analyst on the enterprise perimeter?
The destination .onion address being contacted
The contents of messages passed through the circuit
The connection from the internal workstation to a known TOR entry guard IP
The identity of the exit node serving the session
9. Which data type has the longest practical retention period and supports User and Entity Behavior Analytics (UEBA)?
Full packet capture
Session/transaction data
Raw syslog from switches
Endpoint process memory dumps
10. DNS tunneling is difficult to block at the firewall level because:
DNS uses TCP port 443, which is universally allowed
DNS port 53 UDP is almost universally permitted outbound, so blocking it breaks legitimate name resolution
DNS traffic is always encrypted end-to-end
Firewalls cannot inspect UDP traffic
11. AVC (Application Visibility and Control) in an NGFW identifies applications by:
Port number alone — port 80 always means HTTP
Behavioral signatures, protocol structure, and DPI — regardless of port used
User-supplied application name in the packet header
Comparing source IP against a known application server list
12. In the three-tier SOC workflow, full packet capture is used at which stage?
Tier 1 — continuously, as the primary baselining tool
Tier 2 — to classify initial alert severity
Tier 3 — on-demand for forensic investigation after an incident is confirmed suspicious
It is not used in tiered workflows — only in standalone forensics labs
13. What is the primary security risk introduced by load balancers for SOC visibility?
Load balancers encrypt all traffic, preventing inspection
Perimeter monitoring sees the VIP address, not which backend server processed the request
Load balancers block NGFW DPI capabilities
Load balancers prevent NetFlow from recording flow statistics
14. JA3/JA3S hashes help SOC analysts because they:
Decrypt TLS sessions without requiring the private key
Identify client and server TLS implementations from ClientHello parameters, flagging known malware families without decryption
Provide the full URL of HTTPS requests
Block all connections with self-signed certificates
15. DLP is superior to NetFlow alone for detecting insider exfiltration because:
DLP captures more bytes per session record than NetFlow
DLP (with SSL inspection) classifies content type and sensitivity — identifying what is being sent, not just that a large upload occurred
DLP prevents all outbound HTTPS traffic by default
NetFlow cannot detect upload sessions
Section 1: Attack Surface and Vulnerability Analysis
1.1 Defining the Attack Surface
The attack surface is the complete set of entry points — physical, logical, and human — through which an adversary could attempt unauthorized access. It falls into three categories:
| Category | Examples |
| --- | --- |
| Network attack surface | Open ports, exposed services, external-facing APIs, wireless access points |
| Software attack surface | Web applications, firmware, OS services, third-party libraries |
| Human attack surface | Phishing targets, social engineering vectors, insider access |
Attack surface analysis answers: "What can an attacker reach?" It is a discovery exercise, not an assessment of weakness. The goal is enumeration of all potential entry points, then prioritization based on exposure, asset value, and likelihood of targeting.
1.2 Vulnerability Assessment vs. Penetration Testing
Every element of the attack surface must have a corresponding monitoring data source. Unmapped assets create blind spots. The pipeline is:
```mermaid
flowchart LR
A["Attack Surface Map\n(Enumerate all entry points:\nports, services, APIs,\nwireless, humans)"] --> B["Visibility Gap Analysis\n(Cross-reference each asset\nagainst existing sensor coverage;\nidentify unmonitored elements)"]
B --> C["Sensor Placement\n(Deploy NetFlow, PCAP taps,\nNGFW, DLP, email filter\nto close gaps)"]
C --> D["Monitored Perimeter\n(Every surface element\nhas a data source;\nblind spots eliminated)"]
style A fill:#0a1628,stroke:#58a6ff,color:#c9d1d9
style B fill:#0a1628,stroke:#388bfd,color:#c9d1d9
style C fill:#0a1628,stroke:#1f6feb,color:#c9d1d9
style D fill:#0a200a,stroke:#56d364,color:#c9d1d9
```
Key Points — Section 1
Attack surface analysis enumerates entry points; vulnerability assessment evaluates exploitability — these are distinct, sequential activities.
Penetration testing goes further than vulnerability scanning: it chains weaknesses and simulates adversary goals, requiring high skill and carrying moderate production risk.
Every unmonitored attack surface element is a blind spot — sensor placement strategy should follow the attack surface map.
NetFlow at WAN egress misses east-west lateral movement; PCAP at web servers misses internal-only threats — coverage must match surface topology.
Attack surface is dynamic: every new service, API, or user account expands it and requires re-evaluation of monitoring coverage.
Section 2: Security Monitoring Technologies
2.1 tcpdump and Full Packet Capture
tcpdump is a command-line packet analyzer that captures complete packets — Layer 2 frame through full application payload — writing to PCAP files. Wireshark provides a GUI for the same data. Together they represent the forensic gold standard.
PCAP captures everything: headers, TCP flags, sequence numbers, and full application-layer content. An HTTP session PCAP contains exact request/response bytes including credentials, file contents, and commands. The tradeoff is storage: a busy 1 Gbps link generates 100–500 GB/day, making PCAP impractical as a continuous baseline — it is a targeted, on-demand forensic instrument.
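Because the classic PCAP file format is simple and fixed, even the standard library can inspect a capture file's 24-byte global header. A minimal sketch, assuming the little-endian classic libpcap format (magic 0xa1b2c3d4) rather than the newer pcapng:

```python
import struct

PCAP_MAGIC = 0xa1b2c3d4  # classic libpcap, microsecond timestamps

def read_pcap_header(data: bytes) -> dict:
    """Parse the 24-byte libpcap global header that tools like
    tcpdump -w write at the start of every capture file."""
    magic, vmaj, vmin, _thiszone, _sigfigs, snaplen, linktype = (
        struct.unpack("<IHHiIII", data[:24]))
    if magic != PCAP_MAGIC:
        raise ValueError("not a little-endian classic pcap file")
    return {"version": (vmaj, vmin), "snaplen": snaplen,
            "linktype": linktype}
```

The snaplen field records how many bytes of each packet were captured; a snaplen below the MTU means the capture was truncated and full payload reconstruction is impossible.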
2.2 NetFlow and IPFIX
NetFlow captures flow-level summaries defined by the five-tuple: src IP, dst IP, src port, dst port, IP protocol. For every unique five-tuple, the exporter records aggregate statistics: byte count, packet count, start/end timestamps.
| Field | Description |
| --- | --- |
| srcaddr / dstaddr | Source and destination IPv4 addresses |
| srcport / dstport | Source and destination Layer 4 ports |
| prot | IP protocol (6=TCP, 17=UDP, 1=ICMP) |
| dPkts / dOctets | Packet count and byte count |
| First / Last | Flow start and end timestamps |
| tcp_flags | Cumulative OR of TCP flags in the flow |
NetFlow does not capture payload. At ~200 bytes/record it offers months of retention. IPFIX (RFC 7011) is the IETF-standardized successor to NetFlow v9, extending fields to include application IDs, URLs, and vendor-specific elements.
Security use cases: detecting port scans, top talkers (DDoS indicators), lateral movement (unusual internal-to-internal connections), baselines for anomaly detection, and beaconing (regular low-volume flows to a single external IP).
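As an illustration of scan detection on flow data alone, the sketch below groups hypothetical flow records (plain dicts reusing the v5 field names above) by source address and flags any source touching an unusual number of distinct destination sockets. The threshold of 100 is an arbitrary illustration, not a vendor default:

```python
from collections import defaultdict

def detect_port_scans(flows, port_threshold=100):
    """Flag source IPs that contact an unusually high number of
    distinct (destination IP, destination port) pairs -- a classic
    scan signature visible in NetFlow without any payload."""
    targets_by_src = defaultdict(set)
    for f in flows:  # each flow: dict with srcaddr/dstaddr/dstport
        targets_by_src[f["srcaddr"]].add((f["dstaddr"], f["dstport"]))
    return {src: len(targets)
            for src, targets in targets_by_src.items()
            if len(targets) >= port_threshold}
```

In production this logic runs continuously inside a flow collector, but the core idea is exactly this set-cardinality test per source.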
2.3 NGFW vs. Stateful Firewalls
A stateful firewall tracks TCP session state and enforces rules based on Layer 3/4 information only. It cannot distinguish legitimate HTTPS from malware over port 443.
A next-generation firewall (NGFW) extends to Layer 7 via Deep Packet Inspection (DPI), identifying the actual application regardless of port. Application Visibility and Control (AVC) classifies Facebook, BitTorrent, Dropbox, and thousands of apps by behavioral signatures.
Key Points — Section 2
PCAP = forensic ground truth; NetFlow = scalable flow summaries; NGFW logs = application and threat context at Layer 7. All three are needed.
PCAP storage cost (100–500 GB/day on 1 Gbps) makes it a targeted instrument, not a continuous baseline tool.
NetFlow enables beaconing and scan detection by analyzing flow patterns over time, even though it contains no payload.
A stateful firewall cannot distinguish legitimate HTTPS from malware on port 443 — that requires NGFW DPI/AVC.
NGFW AVC identifies applications by behavioral signature — a BitTorrent client on port 443 is still identified as BitTorrent.
Section 3: Content Filtering and Visibility
3.1 Web Content Filtering
Web content filters intercept HTTP/HTTPS requests and enforce policy based on URL categories, reputation scores, and content analysis. Every request generates a log entry: timestamp, user, destination URL, category, action, bytes transferred.
```mermaid
sequenceDiagram
participant User as User Workstation
participant Filter as Web Filter / Proxy
participant Intel as URL Category Engine
participant Log as SIEM / Log Store
participant Web as External Web Server
User->>Filter: HTTPS request (SNI: example.com)
Filter->>Intel: Classify URL + check reputation
Intel-->>Filter: Category: Malware C2 / Score: High Risk
alt Blocked by Policy
Filter-->>User: Block page returned
Filter->>Log: BLOCKED | user | URL | category | timestamp
else Allowed by Policy
Filter->>Web: Forward request (SSL re-encrypted)
Web-->>Filter: Response
Filter-->>User: Response forwarded
Filter->>Log: ALLOWED | user | URL | category | bytes | timestamp
end
```
A URL filter log showing 50 requests to "Command and Control" category URLs over 10 minutes is a high-fidelity indicator of compromise that no NetFlow record alone would reveal. Because most web traffic is now HTTPS, filters must perform SSL inspection (MITM via enterprise CA) to classify encrypted content; without it, only SNI and certificate details are visible.
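The 50-requests-in-10-minutes indicator can be checked mechanically against filter logs. A minimal sketch, assuming log entries are available as dicts with `timestamp` and `category` fields (an assumed schema, not any specific product's):

```python
from datetime import datetime, timedelta

def c2_burst(entries, window=timedelta(minutes=10), threshold=50):
    """Return True if any sliding window of `window` length contains
    at least `threshold` requests categorized as Command and Control."""
    times = sorted(e["timestamp"] for e in entries
                   if e["category"] == "Command and Control")
    for start in times:
        hits = sum(1 for t in times if start <= t < start + window)
        if hits >= threshold:
            return True
    return False
```

A SIEM correlation rule expresses the same logic declaratively; the point is that category plus time, not payload, carries the detection.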
3.2 Email Filtering
Email security gateways examine sender authentication (SPF, DKIM, DMARC), reputation, content, and rewrite embedded links for click-time analysis. Email filter logs often provide the first-stage incident timeline: when a phishing email arrived, whether it was delivered or quarantined, and whether any user clicked the link.
| Event Type | Security Value |
| --- | --- |
| Blocked phishing email | High — blocked initial access attempt |
| Malicious attachment quarantined | High — evidence of targeted delivery |
| URL click after delivery | Critical — user engaged with potentially malicious link |
| Mass delivery from new sender | Medium — BEC or campaign indicator |
3.3 Data Loss Prevention (DLP)
DLP monitors data in motion, at rest, and in use to detect unauthorized exfiltration. Integrated with NGFWs and content filters, DLP adds semantic context that packet and flow data alone cannot provide.
| Data State | Monitoring Focus |
| --- | --- |
| Data at rest | Endpoint and file server scans for sensitive data patterns |
| Data in use | Clipboard, print, USB transfer activity on endpoints |
Detection methods include pattern matching (regex for SSNs, credit cards), document fingerprinting, classification label enforcement, and machine learning. Without DLP, an insider exfiltrating 2 GB of source code to personal cloud storage appears as a large upload to a cloud provider IP — suspicious but ambiguous. With DLP and SSL inspection, the session is classified as Google Drive, file types identified, content flagged as Confidential, and the session blocked with a SIEM alert generated.
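Pattern matching is usually paired with checksum validation to cut false positives. The sketch below uses a deliberately simplified card-number regex plus the standard Luhn algorithm, which DLP engines commonly apply before raising an alert:

```python
import re

# Simplified pattern: 13-16 digits, optionally separated by spaces/dashes
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_valid(number: str) -> bool:
    """Luhn checksum over the digits of `number` (separators ignored).
    A raw regex hit that fails Luhn is almost certainly not a card."""
    digits = [int(d) for d in number if d.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text: str):
    """Return regex matches that also pass the Luhn check."""
    return [m.group() for m in CARD_RE.finditer(text)
            if luhn_valid(m.group())]
```

Production DLP adds document fingerprinting and label checks on top; this shows only the pattern-plus-validation layer.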
Key Points — Section 3
Web filter logs add semantic context — not just that traffic occurred, but what category and what action was taken (blocked C2, allowed social media).
Without SSL inspection, web filters can only see SNI and certificate details for HTTPS traffic — URL path and content remain hidden.
Email filter logs provide the initial access timeline for phishing incidents: delivery time, quarantine status, and click-through events.
DLP with SSL inspection transforms an ambiguous "large upload to cloud" NetFlow entry into a confirmed Confidential data exfiltration alert.
Content filtering layers together transform raw network data into actionable intelligence about what the traffic means, not just that it occurred.
Section 4: Data Visibility Challenges
4.1 NAT and PAT
PAT (NAT overload) maps all internal hosts to a single public IP, differentiated only by source port. External logs show all enterprise traffic originating from one IP. Forensic attribution requires a four-step chain:
```mermaid
flowchart LR
A["External Alert\nSuspicious connection from\n203.0.113.10:4501 at 02:14:33"] --> B["Step 1: Query Firewall Log\nConfirm public IP, port,\ntimestamp, direction"]
B --> C["Step 2: Query PAT Table\nPublic 203.0.113.10:4501\n→ Private 10.1.1.55:52341"]
C --> D["Step 3: Query DHCP Leases\nPrivate 10.1.1.55\n→ MAC aa:bb:cc:dd:ee:ff\n→ Hostname DESKTOP-FINANCE03"]
D --> E["Attribution Complete\nInternal host identified"]
style A fill:#2d1a1a,stroke:#f85149,color:#c9d1d9
style B fill:#0a1628,stroke:#58a6ff,color:#c9d1d9
style C fill:#0a1628,stroke:#388bfd,color:#c9d1d9
style D fill:#0a1628,stroke:#1f6feb,color:#c9d1d9
style E fill:#0a200a,stroke:#56d364,color:#c9d1d9
```
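The same chain reduces to a pair of lookups. The tables below are hypothetical stand-ins for firewall PAT logs and DHCP lease records exported to a SIEM, populated with the values from the diagram:

```python
# Hypothetical lookup tables standing in for exported PAT logs
# and DHCP lease records; keys/values mirror the diagram above.
pat_table = {
    ("203.0.113.10", 4501, "2024-03-01T02:14:33"): ("10.1.1.55", 52341),
}
dhcp_leases = {
    "10.1.1.55": ("aa:bb:cc:dd:ee:ff", "DESKTOP-FINANCE03"),
}

def attribute(public_ip, public_port, timestamp):
    """Walk the attribution chain: public socket + time -> private IP
    -> MAC/hostname. Returns None if any link in the chain is missing,
    illustrating why a single lost log source breaks attribution."""
    translation = pat_table.get((public_ip, public_port, timestamp))
    if translation is None:
        return None
    private_ip, _private_port = translation
    lease = dhcp_leases.get(private_ip)
    return (private_ip, *lease) if lease else None
```

Real PAT and DHCP correlation must also handle lease churn and port reuse, so the timestamp match is range-based rather than exact; the structure of the chain is the same.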
4.2 Tunneling and Protocol Encapsulation
DNS tunneling exploits the near-universal allowance of port 53 UDP outbound. Attackers encode commands and exfiltrated data as DNS query strings to an attacker-controlled authoritative server. Detection indicators include: high query volume to a single domain, extremely long query strings, high-entropy query names (base64), uncommon record types (TXT, NULL), and low TTL with frequent re-queries.
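The length and entropy indicators can be combined into a simple per-query heuristic. Both thresholds below are illustrative assumptions, not tuned production values:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character; base64-encoded payloads score
    far higher than dictionary-word hostnames."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def looks_like_tunnel(qname: str,
                      max_label_len=40, entropy_threshold=3.5) -> bool:
    """Flag a query whose leftmost label is both very long and
    high-entropy -- the signature of data encoded into DNS names."""
    label = qname.split(".")[0]
    return (len(label) > max_label_len
            and shannon_entropy(label) > entropy_threshold)
```

Volume, record type, and TTL indicators from the list above would be layered on top; a single query is suggestive, a sustained pattern is the detection.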
HTTP/HTTPS tunneling sends C2 traffic over port 80/443. Without SSL inspection, NGFW and IDS tools see only connection metadata — the payload is invisible. NetFlow shows the connection; nothing shows the content.
4.3 Encryption — SSL/TLS
As of 2024, over 95% of web traffic is HTTPS. Network monitoring tools can observe the destination IP/port, TLS handshake metadata, SNI field, certificate details, and flow-level statistics. They cannot observe HTTP paths, API bodies, file contents, or authentication tokens without active decryption.
NGFW SSL inspection acts as a TLS proxy: it terminates the client session, inspects the plaintext, and re-encrypts to the server. This requires deploying a trusted enterprise CA certificate to all managed endpoints, and it cannot be applied to sessions that use certificate pinning.
Encrypted Traffic Analysis (ETA) / JA3 fingerprinting infers application type and threat presence from packet size distributions, inter-arrival timing, flow duration, and TLS ClientHello parameters — without decryption.
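A JA3 fingerprint is simply an MD5 digest over the ClientHello parameters, serialized with fields comma-separated and values within a field dash-separated. A minimal sketch (real implementations also strip GREASE values before hashing):

```python
import hashlib

def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Serialize ClientHello parameters as
    'Version,Ciphers,Extensions,Curves,PointFormats' (values within
    a field joined by '-') and return the MD5 hex digest."""
    fields = [str(version)] + [
        "-".join(str(v) for v in vals)
        for vals in (ciphers, extensions, curves, point_formats)
    ]
    return hashlib.md5(",".join(fields).encode()).hexdigest()
```

Because a malware family's TLS library rarely changes, its JA3 hash stays stable across campaigns even as IPs and domains rotate, which is what makes the fingerprint useful for blocklists.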
4.4 TOR (The Onion Router)
TOR routes traffic through three relay hops (entry guard, middle relay, exit node) with layered encryption so no single relay knows both source and destination.
4.5 P2P Traffic
P2P networks use dynamic ports and distributed architecture. BitTorrent traffic resembles certain malware behaviors (many short flows to many IPs), creating false positives. Sophisticated attackers use P2P protocols as covert channels to disguise exfiltration as ordinary file transfer traffic. NGFW AVC detects P2P by behavioral signatures regardless of port.
4.6 Load Balancing
Load balancers distribute connections across backends, translating external connections to a VIP. Perimeter NetFlow and PCAP see only the VIP, not which backend server processed a specific request. Visibility requires sensors on backend segments, load balancer correlation logs, or per-server application logging.
Key Points — Section 4
PAT attribution requires a four-step chain: firewall log → PAT table → DHCP lease → hostname. Any missing link breaks attribution.
DNS tunneling exploits universal port 53 allowance; detection relies on behavioral analysis (query length, entropy, volume) not port blocking.
SSL/TLS hides Layer 7 content; SSL inspection (NGFW as TLS proxy) restores visibility but requires enterprise CA deployment and is inapplicable to pinned certificates.
TOR's only observable indicator at the perimeter is the connection to a known entry guard IP — destination, content, and relay chain are all hidden.
No single mitigation closes all visibility gaps — layered approaches (SSL inspection + behavioral analytics + NAT log correlation) are required.
Section 5: Security Data Types and Their Uses
5.1 Full Packet Capture
Full PCAP is the forensic ground truth: every bit of every packet including headers and payload. Storage: 100–500 GB/day. Retention: 7–30 days. Analysis requires expert tools (Wireshark, tcpdump, NetworkMiner, Zeek).
When PCAP is required: malware payload extraction and reverse engineering, file reconstruction from sessions, validating true positives (was data actually transferred?), attack chain reconstruction for legal proceedings, and decrypting captured TLS sessions when keys are available.
5.2 Session and Transaction Data
Session data (e.g., Zeek conn.log) captures summary attributes without payload: source/destination IP and port, service, bytes in/out, connection duration, and state. HTTP sessions include method, URI, host, user agent, and response code. Record size: under 100 bytes. Retention: 12+ months. Supports UEBA (User and Entity Behavior Analytics) to identify unusual session patterns over time.
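Session records like Zeek's conn.log are tab-separated text, so triage tooling can consume them directly. A sketch that maps the leading columns of the default conn.log layout into a dict:

```python
def parse_conn_line(line: str) -> dict:
    """Parse one tab-separated Zeek conn.log record into the summary
    fields used for triage. Key names follow Zeek's default conn.log
    column order for the leading fields."""
    fields = line.rstrip("\n").split("\t")
    keys = ["ts", "uid", "id.orig_h", "id.orig_p",
            "id.resp_h", "id.resp_p", "proto", "service"]
    return dict(zip(keys, fields))
```

The `uid` field links the same connection across Zeek's other logs (http.log, ssl.log, dns.log), which is what makes session data such an efficient pivot point.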
5.3 Statistical and Flow Data (NetFlow/IPFIX)
Flow-level statistical data enables: volumetric anomaly detection, scan detection (port sweep patterns), beaconing detection (regular low-volume flows to external IP at consistent intervals), and top-N bandwidth analysis.
Beaconing example: A C2 implant checks in every 300 seconds with 200-byte HTTP GET requests. NetFlow shows 288 flows/day, each 200 bytes, to the same external IP, evenly spaced. Invisible in a single log view but immediately apparent when plotted over time.
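One way to quantify "evenly spaced" is the coefficient of variation of inter-arrival times. The sketch below assumes flow start times as plain numbers (e.g., epoch seconds); treating a score near zero as machine-regular check-in behavior is an illustrative heuristic, not a fixed standard:

```python
from statistics import mean, pstdev

def beacon_score(timestamps):
    """Coefficient of variation (stdev / mean) of gaps between flow
    start times. Near 0 means machine-regular intervals (likely
    beaconing); human-driven traffic scores much higher."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not gaps or mean(gaps) == 0:
        return None
    return pstdev(gaps) / mean(gaps)
```

For the 300-second implant above, every gap is identical, so the score is exactly 0; real implants add jitter, which raises the score slightly but still leaves it far below browsing traffic.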
5.4 Metadata
Metadata = data about communications that describes structure and behavior without revealing content. Includes TLS certificate details, DNS query/response pairs, HTTP header fields (User-Agent, Content-Type), and SMTP envelope information.
Why metadata matters even without decryption: JA3/JA3S TLS fingerprints identify malware families from ClientHello parameters; certificate SANs can reveal malicious infrastructure; HTTP User-Agent strings identify attacker frameworks (Cobalt Strike default profiles); DNS timing supports tunneling detection.
5.5 Alert Data
Alert data is generated when detection rules, signatures, or anomaly thresholds fire. Components: alert ID, timestamp, triggering rule, severity and confidence, source/destination attributes, and evidence snippet. Alert data triggers the SOC workflow tiers.
5.6 Data Type Summary
| Data Type | Content | Size/Record | Retention | Primary SOC Use |
| --- | --- | --- | --- | --- |
| Full PCAP | Complete packets + payload | ~1500 bytes avg | 7–30 days | Forensics, validation |
| NetFlow/IPFIX | 5-tuple + stats, no payload | ~150–200 bytes | Months | Baselining, anomaly detection |
| Session/transaction | Connection summary, no payload | <100 bytes | 12+ months | Triage, UEBA |
| Metadata | Headers, certs, query data | Variable (small) | 12+ months | Encrypted traffic analysis |
| Alert data | Security event, signature match | <1 KB | Years | Immediate triage |
Key Points — Section 5
PCAP is forensic ground truth but cannot scale as a continuous tool — it is retrieved on-demand after Tier 2 triage confirms a suspicious event.
Session data (<100 bytes) and NetFlow (~200 bytes) enable 12-month+ retention, making long-term behavioral analysis (UEBA, beaconing) practical.
Metadata enables encrypted traffic analysis: JA3 fingerprints identify malware TLS implementations; DNS timing detects tunneling — all without decryption.
Alert data is the trigger, not the investigation — it points analysts to the right data sources; session and PCAP data provide the answers.
The three-tier model (continuous monitoring → alert triage → forensic PCAP) balances cost, scalability, and depth for each phase of the SOC workflow.
Post-Quiz — Test Your Understanding
1. An analyst wants to know which internal host initiated a suspicious outbound connection, but the firewall only logs a public IP. What additional data source is required?
Full packet capture of the session
NAT/PAT translation table correlated with the timestamp
URL filtering log for the destination domain
NetFlow byte count for the flow
2. Which statement best describes the difference between attack surface analysis and vulnerability assessment?