Chapter 2: Generative AI and AI Use Cases

Learning Objectives

Section 1: Generative AI Deep Dive

Pre-Quiz: Generative AI Deep Dive

1. What mechanism allows a transformer model to weigh the relevance of every word in an input against every other word simultaneously?

Feed-forward network
Self-attention
Positional encoding
Token embedding

2. Which generative AI challenge refers to the model producing confident-sounding but factually incorrect output?

Latency
Data governance
Hallucination
Cost overrun

3. Compared to traditional ML, modern generative AI inference is primarily:

Compute-bound and lightweight
Memory-bound and autoregressive
Storage-bound and batch-oriented
Network-bound and synchronous

4. According to the chapter, AI racks with hardware like NVIDIA's GB200 can consume more than how much power per rack?

20 kW
50 kW
100 kW
200 kW

5. Which mitigation strategy helps reduce LLM hallucination by retrieving relevant documents from an external knowledge base before generation?

Model distillation
Prompt engineering
Retrieval-augmented generation (RAG)
Federated learning

Key Points: Generative AI Deep Dive

The Transformer Architecture and LLMs

Every modern generative AI system traces its lineage to the 2017 paper "Attention Is All You Need." The transformer replaced the sequential processing of earlier recurrent neural networks with self-attention, which allows the model to weigh the relevance of every word against every other word simultaneously.

A transformer consists of two primary blocks: the Encoder (reads input and builds a rich internal representation) and the Decoder (uses that representation to generate output one token at a time).

Transformer Architecture Data Flow

```mermaid
flowchart LR
    A["Input Sequence\n(Tokens)"] --> B["Token\nEmbedding"]
    B --> C["Positional\nEncoding"]
    C --> D["ENCODER\nSelf-Attention +\nFeed-Forward Layers"]
    D --> E["Rich Internal\nRepresentation"]
    E --> F["DECODER\nMasked Self-Attention +\nCross-Attention +\nFeed-Forward Layers"]
    F --> G["Softmax\nOutput Layer"]
    G --> H["Predicted\nNext Token"]
    H -.->|"Appended to input\n(autoregressive loop)"| F
    style D fill:#2563eb,color:#fff,stroke:#1e40af
    style F fill:#7c3aed,color:#fff,stroke:#5b21b6
    style G fill:#059669,color:#fff,stroke:#047857
```
| Component | Role | Real-World Analogy |
|---|---|---|
| Token embedding | Converts words into numerical vectors | Assigning GPS coordinates to every word to measure distances between meanings |
| Self-attention | Weighs relationships between all tokens | A conference call where every participant hears every other simultaneously |
| Feed-forward network | Transforms attention outputs through nonlinear layers | A skilled editor refining raw notes from a conference call |
| Positional encoding | Injects word-order information | Page numbers on a manuscript |
| Softmax output layer | Produces probability distribution over vocabulary | A ranked shortlist of the most likely next words |
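To make self-attention concrete, here is a minimal NumPy sketch of scaled dot-product attention. The random weight matrices stand in for learned parameters; real transformers also use multiple attention heads, residual connections, and layer normalization.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    # Every token's query is scored against every token's key at once
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # one attention distribution per token
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                        # 4 tokens, embedding dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
print(out.shape)             # (4, 8): one updated vector per token
print(weights.sum(axis=-1))  # each token's attention weights sum to 1.0
```

Each output row is a weighted mix of all value vectors, which is exactly the "every word weighed against every other word" behavior described above.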

LLM Autoregressive Text Generation

At inference time, the model receives a prompt, processes it through dozens of transformer layers, and predicts the next token. It appends that token to the input and repeats -- an autoregressive loop that continues until a stopping condition, such as an end-of-sequence token or a maximum length, is met.

```mermaid
flowchart TD
    A["User Prompt"] --> B["Tokenize Input"]
    B --> C["Process Through\nTransformer Layers"]
    C --> D["Predict Next Token\n(Probability Distribution)"]
    D --> E{"Stopping Condition\nMet?"}
    E -->|"No"| F["Append Token\nto Sequence"]
    F --> C
    E -->|"Yes"| G["Return Complete\nGenerated Text"]
    style A fill:#2563eb,color:#fff,stroke:#1e40af
    style D fill:#7c3aed,color:#fff,stroke:#5b21b6
    style E fill:#d97706,color:#fff,stroke:#b45309
    style G fill:#059669,color:#fff,stroke:#047857
```
Animation: Step-by-step walkthrough of a transformer processing input tokens through encoder layers, self-attention, and decoder generation loop.
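The autoregressive loop can be sketched in a few lines of Python. A toy bigram lookup table stands in for the transformer's next-token prediction; the loop structure (predict, check the stopping condition, append, repeat) is the same one a real LLM runs.

```python
# Toy stand-in "model": a bigram table instead of transformer layers.
BIGRAMS = {
    "<start>": "the",
    "the": "switch",
    "switch": "rebooted",
    "rebooted": "<eos>",
}

def generate(prompt_token="<start>", max_tokens=10):
    sequence = [prompt_token]
    while len(sequence) < max_tokens:
        # "Predict" the next token from the current sequence tail
        next_token = BIGRAMS.get(sequence[-1], "<eos>")
        if next_token == "<eos>":      # stopping condition met
            break
        sequence.append(next_token)    # append and feed back in (autoregression)
    return " ".join(sequence[1:])

print(generate())  # the switch rebooted
```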

Challenges of Generative AI

Hallucination: LLMs can produce factually incorrect text because they predict statistically likely token sequences rather than retrieve verified facts. Mitigations include RAG, human-in-the-loop validation, and grounding outputs against authoritative sources.

Cost: Training a frontier LLM costs tens of millions of dollars. AI racks (e.g., NVIDIA GB200) consume over 100 kW per rack -- five times the 20 kW standard for traditional cloud racks.

Latency: Generative AI workloads are memory-bound and may not run efficiently on classic GPU architectures, resulting in slower token generation.

Resource Consumption: Data centers supporting AI workloads consume enormous water quantities (up to 500,000 gallons/day). Dense deployments require liquid cooling or microfluidics.

Data Governance: Privacy protection and regulatory compliance add complexity across the entire AI lifecycle.

| Challenge | Infrastructure Impact | Mitigation |
|---|---|---|
| Hallucination | Risk of incorrect configurations in production | RAG pipelines, human-in-the-loop, grounding |
| Cost | High CapEx (GPU clusters) and OpEx (power, cooling) | Right-sizing, spot instances, model distillation |
| Latency | Slow inference degrades real-time automation | Edge inference, quantized models, AI accelerators |
| Resource consumption | Water and power strain on local utilities | Liquid cooling, on-site renewables, microfluidics |
| Data governance | Compliance risk across jurisdictions | Data classification, audit trails, federated learning |
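A minimal sketch of the retrieval step in a RAG pipeline, using hand-made three-dimensional vectors as stand-ins for real embeddings. The document names, vectors, and prompt template are illustrative only; in practice an embedding model produces the vectors and a vector database performs the search.

```python
import numpy as np

# Stand-in "embeddings" for documents in a knowledge base
DOCS = {
    "vlan-guide":   np.array([0.9, 0.1, 0.0]),
    "bgp-runbook":  np.array([0.1, 0.9, 0.2]),
    "cooling-spec": np.array([0.0, 0.2, 0.9]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec, k=1):
    # Rank documents by cosine similarity to the query embedding
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d]), reverse=True)
    return ranked[:k]

def grounded_prompt(question, query_vec):
    # Prepend retrieved context so the LLM answers from sources, not memory
    context = ", ".join(retrieve(query_vec))
    return f"Using only these sources: [{context}], answer: {question}"

q = np.array([0.85, 0.15, 0.05])  # hypothetical embedding of a VLAN question
print(grounded_prompt("How do I trunk VLAN 20?", q))
```

Grounding the prompt in retrieved documents is what reduces hallucination: the model is steered toward verified material instead of free-running token prediction.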

Traditional vs. Modern AI

| Dimension | Traditional AI / ML | Modern Generative AI |
|---|---|---|
| Model type | Task-specific (decision trees, SVMs) | General-purpose foundation models (transformers) |
| Training data | Curated, labeled datasets | Massive unlabeled corpora (billions of tokens) |
| Training cost | Hours to days on CPU/GPU | Weeks to months on thousands of GPUs |
| Inference pattern | Low-latency, lightweight | Memory-bound, autoregressive |
| Infrastructure | Standard servers, modest GPU | Dense GPU racks (100 kW+), liquid cooling |
| Output | Structured (labels, scores) | Unstructured (text, images, code) |
| Key risk | Bias, overfitting | Hallucination, prompt injection |

Future Trends

AI-Dedicated Data Centers: Active capacity is projected to expand from 11.5 GW (2026) to 43.6 GW (2031). The industry is moving toward multipurpose data centers with dedicated "AI zones" and AI-as-a-Service models.

Innovative Cooling: Operators are exploring on-site renewables, natural gas microturbines, and microfluidics where coolant is delivered directly to chip surfaces.

New Chip Architectures: Purpose-built AI inference accelerators, chiplet-based architectures, and in-memory computing target the memory-bound nature of generative AI.

Sustainability Mandates: Regulatory pressure will require water recycling, carbon offsets, and transparent energy reporting.

Animation: Side-by-side comparison showing traditional ML inference (lightweight single-GPU server) vs. modern generative AI inference (dense GPU rack with liquid cooling and high-bandwidth fabric).

Post-Quiz: Generative AI Deep Dive

1. What mechanism allows a transformer model to weigh the relevance of every word in an input against every other word simultaneously?

Feed-forward network
Self-attention
Positional encoding
Token embedding

2. Which generative AI challenge refers to the model producing confident-sounding but factually incorrect output?

Latency
Data governance
Hallucination
Cost overrun

3. Compared to traditional ML, modern generative AI inference is primarily:

Compute-bound and lightweight
Memory-bound and autoregressive
Storage-bound and batch-oriented
Network-bound and synchronous

4. According to the chapter, AI racks with hardware like NVIDIA's GB200 can consume more than how much power per rack?

20 kW
50 kW
100 kW
200 kW

5. Which mitigation strategy helps reduce LLM hallucination by retrieving relevant documents from an external knowledge base before generation?

Model distillation
Prompt engineering
Retrieval-augmented generation (RAG)
Federated learning

Section 2: Enterprise AI Use Cases

Pre-Quiz: Enterprise AI Use Cases

1. In AI-driven network monitoring, what is the first step before anomalies can be detected?

Deploying automated response playbooks
Establishing a baseline of normal behavior
Installing signature-based detection rules
Configuring manual alert thresholds

2. In the intelligent automation maturity model, at which level does AI detect, decide, and remediate without human intervention?

Level 0 -- Manual
Level 1 -- Alert-driven
Level 2 -- Semi-automated
Level 3 -- Fully automated

3. In the switch failure prediction example, which two correlated anomalies did the AI model detect?

High memory utilization and CRC errors
Rising CPU temperature and declining fan RPM
Voltage fluctuations and packet drops
Increased latency and BGP flaps

4. Which Cisco product applies behavioral modeling to identify threats in encrypted traffic without decryption?

Cisco Catalyst Center
Cisco Secure Network Analytics (formerly Stealthwatch)
Cisco Meraki Dashboard
Cisco ISE

5. What type of data sources does AI network monitoring typically ingest? (Choose the most complete answer.)

Only syslog messages and SNMP traps
SNMP traps, NetFlow records, syslog messages, and gNMI streaming telemetry
Only NetFlow records and packet captures
Only streaming telemetry via gRPC

Key Points: Enterprise AI Use Cases

AI for Network Management and Security

AI-powered network management replaces reactive "break-fix" workflows with continuous, intelligent monitoring. The system ingests telemetry from switches, routers, firewalls, and servers, then applies ML models to identify patterns humans would miss.

AI-Driven Network Monitoring Pipeline

```mermaid
flowchart TD
    subgraph Sources["Data Sources"]
        S1["SNMP Traps"]
        S2["NetFlow Records"]
        S3["Syslog Messages"]
        S4["gNMI/gRPC\nStreaming Telemetry"]
    end
    Sources --> DL["Centralized\nData Lake"]
    DL --> BL["Baseline Establishment\n(Days to Weeks of\nNormal Behavior)"]
    BL --> RT["Real-Time Analysis\n(Compare Against Baseline)"]
    RT --> DET{"Anomaly\nDetected?"}
    DET -->|"No"| RT
    DET -->|"Yes"| SCORE["Score by\nRisk Severity"]
    SCORE --> AR["Automated Response"]
    subgraph Actions["Response Actions"]
        A1["Block Malicious\nTraffic"]
        A2["Isolate Device\n(ACL / VLAN)"]
        A3["Alert SOC"]
    end
    AR --> Actions
    style DL fill:#2563eb,color:#fff,stroke:#1e40af
    style BL fill:#7c3aed,color:#fff,stroke:#5b21b6
    style DET fill:#d97706,color:#fff,stroke:#b45309
    style AR fill:#dc2626,color:#fff,stroke:#b91c1c
```

The pipeline works in four stages: (1) Data ingestion from SNMP, NetFlow, syslog, and gNMI; (2) Baseline establishment over days to weeks; (3) Real-time analysis comparing incoming telemetry against baseline; (4) Automated response -- blocking traffic, isolating devices, or alerting the SOC. This reduces containment time from minutes to seconds.
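The baseline-and-score stages can be illustrated with a simple z-score model. The telemetry values and thresholds below are hypothetical; production systems maintain richer statistical or ML baselines per metric, but the shape of the pipeline is the same.

```python
import statistics

def baseline(samples):
    # Stage 2: learn "normal" from historical telemetry
    return statistics.mean(samples), statistics.stdev(samples)

def anomaly_score(value, mean, stdev):
    # Stage 3: distance from baseline in standard deviations (z-score)
    return abs(value - mean) / stdev

def respond(score, warn=3.0, critical=6.0):
    # Stage 4: map severity to a response action
    if score >= critical:
        return "isolate-device"
    if score >= warn:
        return "alert-soc"
    return "no-action"

history = [980, 1010, 995, 1005, 990, 1002, 998, 1015]  # e.g. flows/minute
mean, stdev = baseline(history)
print(respond(anomaly_score(1004, mean, stdev)))  # no-action
print(respond(anomaly_score(4000, mean, stdev)))  # isolate-device
```

Because the scoring and response rules run continuously in software, a deviation can be contained in seconds rather than waiting on a human reading dashboards.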

Animation: Animated pipeline showing telemetry data flowing from network devices through baseline analysis to anomaly detection and automated response actions.

Predictive Analytics and Anomaly Detection

Predictive analytics forecasts what will happen next by analyzing historical trends -- failure rates, traffic patterns, and degradation signals. The chapter's worked example shows a Nexus 9000 switch where the AI detects two correlated anomalies: rising CPU temperature (52 °C to 59 °C) and declining Fan 3 speed (4,800 to 3,900 RPM). The system forecasts failure within 10-14 days and proactively schedules maintenance -- zero downtime, zero packet loss.

Predictive Analytics: Switch Failure Forecasting

```mermaid
flowchart TD
    A["Nexus 9000 Switch\nTelemetry Collection\n(6 months of metrics)"] --> B["AI Model Analyzes\nHistorical Trends"]
    B --> C["Correlated Anomalies Detected"]
    C --> D["CPU Temp Rising:\n52 to 54 to 57 to 59 C"]
    C --> E["Fan 3 RPM Declining:\n4800 to 4500 to 4200 to 3900"]
    D --> F["Forecast: CPU Exceeds\nSafe Limit in 14 Days"]
    E --> G["Forecast: Fan 3 Below\nThreshold in 10 Days"]
    F --> H["Generate Proactive\nMaintenance Ticket"]
    G --> H
    H --> I["Schedule Fan Tray\nReplacement"]
    I --> J["Pre-Stage\nReplacement Part"]
    J --> K["Zero Downtime\nZero Packet Loss"]
    style C fill:#d97706,color:#fff,stroke:#b45309
    style F fill:#dc2626,color:#fff,stroke:#b91c1c
    style G fill:#dc2626,color:#fff,stroke:#b91c1c
    style K fill:#059669,color:#fff,stroke:#047857
```

Anomaly detection uses ML to establish behavioral baselines and flag deviations scored by risk severity. This is particularly effective at catching zero-day exploits, insider threats, and slow-and-low data exfiltration that signature-based systems miss.

Intelligent Automation

Intelligent automation is the glue connecting detection and analytics to real-world remediation. Without it, AI insights are dashboards; with it, they become closed-loop actions.

| Level | Description | Example |
|---|---|---|
| Level 0 -- Manual | Human detects and remediates | Engineer notices high CPU via CLI, manually investigates |
| Level 1 -- Alert-driven | AI detects, human remediates | AI flags anomalous BGP flap; engineer fixes |
| Level 2 -- Semi-automated | AI detects and recommends; human approves | AI detects DDoS, recommends rate-limit ACL; engineer approves |
| Level 3 -- Fully automated | AI detects, decides, and remediates | AI detects compromised host, isolates to quarantine VLAN |

Intelligent Automation Maturity Levels

```mermaid
stateDiagram-v2
    direction LR
    L0: Level 0 -- Manual\nHuman detects\nHuman remediates
    L1: Level 1 -- Alert-Driven\nAI detects\nHuman remediates
    L2: Level 2 -- Semi-Automated\nAI detects + recommends\nHuman approves
    L3: Level 3 -- Fully Automated\nAI detects + decides\nAI remediates
    [*] --> L0
    L0 --> L1: Add AI monitoring
    L1 --> L2: Add AI recommendations
    L2 --> L3: Add autonomous execution
    L3 --> [*]
```
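The maturity levels map naturally onto a dispatch function: the same detected event is routed differently depending on how much autonomy the organization has granted. This is a schematic sketch, not a real decision engine; the event names and response strings are illustrative.

```python
def handle_event(event, level):
    """Route a detected anomaly according to automation maturity level."""
    if level == 0:
        return f"human investigates {event} and remediates manually"
    if level == 1:
        return f"alert raised for {event}; human remediates"
    if level == 2:
        return f"remediation recommended for {event}; awaiting human approval"
    # Level 3: closed loop -- detection flows straight into remediation
    return f"auto-remediation executed for {event}"

print(handle_event("compromised-host", level=3))
print(handle_event("ddos-spike", level=2))
```

The key design point is that moving up a level changes only the routing policy, not the detection pipeline, so organizations can graduate incrementally.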

Infrastructure Requirements by Use Case

| Use Case | Compute Needs | Network Needs | Storage Needs |
|---|---|---|---|
| AI network monitoring | Moderate (inference on telemetry) | Low-latency telemetry pipeline | Time-series database for baselines |
| Predictive analytics | Moderate to high (training) | Bulk data transfer for training | Large historical dataset storage |
| Anomaly detection | Moderate (real-time scoring) | Inline or tap-based traffic access | Short-term flow cache, long-term archive |
| Intelligent automation | Low (decision engine) | API access (RESTCONF, gNMI) | Playbook and policy repository |
Animation: Visual progression through automation Levels 0-3 showing the expanding role of AI at each maturity stage, from fully manual to fully autonomous remediation.

Post-Quiz: Enterprise AI Use Cases

1. In AI-driven network monitoring, what is the first step before anomalies can be detected?

Deploying automated response playbooks
Establishing a baseline of normal behavior
Installing signature-based detection rules
Configuring manual alert thresholds

2. In the intelligent automation maturity model, at which level does AI detect, decide, and remediate without human intervention?

Level 0 -- Manual
Level 1 -- Alert-driven
Level 2 -- Semi-automated
Level 3 -- Fully automated

3. In the switch failure prediction example, which two correlated anomalies did the AI model detect?

High memory utilization and CRC errors
Rising CPU temperature and declining fan RPM
Voltage fluctuations and packet drops
Increased latency and BGP flaps

4. Which Cisco product applies behavioral modeling to identify threats in encrypted traffic without decryption?

Cisco Catalyst Center
Cisco Secure Network Analytics (formerly Stealthwatch)
Cisco Meraki Dashboard
Cisco ISE

5. What type of data sources does AI network monitoring typically ingest? (Choose the most complete answer.)

Only syslog messages and SNMP traps
SNMP traps, NetFlow records, syslog messages, and gNMI streaming telemetry
Only NetFlow records and packet captures
Only streaming telemetry via gRPC

Section 3: AI Toolset Mastery -- Jupyter Notebook

Pre-Quiz: AI Toolset Mastery -- Jupyter Notebook

1. Which magic command in Jupyter AI sends a natural-language prompt to an LLM to generate code?

%run
%%ai
%load
%%python

2. What is the correct command to install the Jupyter AI extension?

pip install jupyterlab-ai
pip install jupyter-ai
conda install jupyter-ai-magics
npm install @jupyter/ai

3. In the three-cell AI-assisted workflow pattern, what are the three stages in order?

Execute, Generate, Analyze
Generate, Analyze, Execute
Generate, Execute, Analyze
Analyze, Generate, Execute

4. Which Python library is used for gNMI/gRPC streaming telemetry in network automation notebooks?

Netmiko
NAPALM
pyGNMI
Paramiko

5. Which Jupyter AI chat command generates an entire runbook notebook from a text description?

/ask
/learn
/generate
/create

Key Points: AI Toolset Mastery -- Jupyter Notebook

JupyterLab for Network Automation

Jupyter Notebook (and its modern interface, JupyterLab) is an open-source web application for creating documents with live code, visualizations, and narrative text. Documents are organized into cells -- executable code (Python) or formatted text (Markdown).

This cell-based structure is ideal for network automation because you can write and test scripts in one cell, display output below, add Markdown documentation, and share the entire notebook as a reproducible workflow.

Setting up JupyterLab for network automation:

```shell
# Install JupyterLab and key network automation libraries
pip install jupyterlab netmiko napalm pygnmi pandas nornir

# Launch JupyterLab
jupyter lab --ip=0.0.0.0 --port=8888 --no-browser
```

For data center work, you typically connect to network devices using Netmiko (SSH-based), NAPALM (vendor-neutral abstraction), or pyGNMI (gNMI/gRPC streaming telemetry).
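As a small illustration of the notebook workflow, the function below parses sample `show vlan brief` output into a pandas DataFrame. On a live device you would obtain the raw text with Netmiko's `send_command()`; the hard-coded sample here keeps the sketch self-contained, and the VLAN names are invented.

```python
import pandas as pd

# Sample "show vlan brief" output; on a live device, collect this with
# Netmiko's send_command() instead of a hard-coded string.
RAW = """\
VLAN Name                             Status    Ports
---- -------------------------------- --------- --------------------------
1    default                          active    Eth1/1, Eth1/2
10   users                            active    Eth1/3
20   servers                          active    Eth1/4, Eth1/5
"""

def parse_vlan_brief(raw):
    rows = []
    for line in raw.splitlines():
        parts = line.split()
        # Data rows start with a numeric VLAN ID; headers and rules do not
        if parts and parts[0].isdigit():
            rows.append({"vlan_id": int(parts[0]),
                         "name": parts[1],
                         "status": parts[2]})
    return pd.DataFrame(rows)

df = parse_vlan_brief(RAW)
print(df)
```

Once the output lands in a DataFrame, the next cells can filter, join, and plot it like any other dataset, which is exactly the Generate-Execute-Analyze pattern this section builds toward.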

Python Code with AI Assistance

Jupyter AI is an open-source extension that connects LLMs directly to JupyterLab via a chat interface and magic commands. Install and activate it:

```shell
# Install the extension
pip install jupyter-ai
```

```python
# In a notebook cell, load the magic extension:
%load_ext jupyter_ai_magics
```

Then use the %%ai magic command to interact with an LLM:

```python
%%ai anthropic:claude-sonnet
Write a Python function using Netmiko that connects to a Cisco Nexus 9000
switch via SSH, retrieves "show vlan brief", and returns a pandas DataFrame.
```

Jupyter AI-Assisted Network Automation Workflow

```mermaid
flowchart LR
    subgraph Step1["Cell 1: Generate"]
        A1["Engineer describes\ntask in natural language"] --> A2["%%ai magic sends\nprompt to LLM"]
        A2 --> A3["LLM returns\ngenerated code"]
    end
    subgraph Step2["Cell 2: Execute"]
        B1["Run generated code\nagainst live devices"] --> B2["Collect telemetry\nor config output"]
        B2 --> B3["Store results in\npandas DataFrame"]
    end
    subgraph Step3["Cell 3: Analyze"]
        C1["Visualize data\nwith matplotlib"] --> C2["Identify patterns\nor anomalies"]
        C2 --> C3["Share notebook\nas documentation"]
    end
    Step1 --> Step2 --> Step3
    Step3 -.->|"Iterate and refine"| Step1
    style Step1 fill:#eff6ff,stroke:#2563eb
    style Step2 fill:#f5f3ff,stroke:#7c3aed
    style Step3 fill:#ecfdf5,stroke:#059669
```

AI Models for Productivity

Beyond code generation, Jupyter AI supports several productivity workflows:

| Capability | How to Use It | Use Case |
|---|---|---|
| Code generation | %%ai magic command | Generate Netmiko/NAPALM/pyGNMI scripts from natural language |
| Code explanation | %%ai with "explain this code" prompt | Understand inherited automation scripts |
| Error debugging | Paste traceback into %%ai cell | Diagnose why a RESTCONF call returned 400 |
| Content summarization | %%ai with "summarize" prompt | Condense a 200-page vendor release note |
| Notebook generation | Jupyter AI chat: /generate | Create an entire runbook notebook from text description |
| File Q&A | Jupyter AI chat: /learn then /ask | Ask questions about local config files or logs |

Exam tips: Know how to install/activate Jupyter AI, understand the %%ai provider:model-name syntax, be familiar with pandas DataFrames for telemetry data, and recognize that Jupyter AI supports multiple LLM providers configured via environment variables or settings UI.

Animation: Walkthrough of the three-cell Jupyter workflow -- typing a natural language prompt in Cell 1, watching AI-generated code appear, executing it in Cell 2 with live device output, and visualizing results in Cell 3 with matplotlib charts.

Post-Quiz: AI Toolset Mastery -- Jupyter Notebook

1. Which magic command in Jupyter AI sends a natural-language prompt to an LLM to generate code?

%run
%%ai
%load
%%python

2. What is the correct command to install the Jupyter AI extension?

pip install jupyterlab-ai
pip install jupyter-ai
conda install jupyter-ai-magics
npm install @jupyter/ai

3. In the three-cell AI-assisted workflow pattern, what are the three stages in order?

Execute, Generate, Analyze
Generate, Analyze, Execute
Generate, Execute, Analyze
Analyze, Generate, Execute

4. Which Python library is used for gNMI/gRPC streaming telemetry in network automation notebooks?

Netmiko
NAPALM
pyGNMI
Paramiko

5. Which Jupyter AI chat command generates an entire runbook notebook from a text description?

/ask
/learn
/generate
/create

Answer Explanations