12.1.1 The Cisco Catalyst SD-WAN Fabric
Cisco Catalyst SD-WAN separates the WAN into four functional planes, each handled by a dedicated component (three controllers plus the WAN Edge routers):
| Plane | Controller | Role |
| --- | --- | --- |
| Management | vManage (SD-WAN Manager) | Central GUI/API — all automation targets this controller |
| Control | vSmart (SD-WAN Controller) | Distributes routing, TLOC, and policy to all devices via OMP |
| Orchestration | vBond (SD-WAN Validator) | NAT traversal broker; authenticates devices during onboarding |
| Data | vEdge / cEdge (WAN Edge) | Forwards user traffic; runs BFD and OMP |
graph TD
A["vManage\nSD-WAN Manager\nManagement Plane"] -->|REST API / NETCONF| B["vSmart\nSD-WAN Controller\nControl Plane"]
A -->|Orchestration| C["vBond\nSD-WAN Validator\nOrchestration Plane"]
B -->|OMP: routes, TLOCs, policies| D["WAN Edge 1\ncEdge / vEdge\nData Plane"]
B -->|OMP: routes, TLOCs, policies| E["WAN Edge 2\ncEdge / vEdge\nData Plane"]
C -->|NAT traversal / auth| D
C -->|NAT traversal / auth| E
D <-->|IPsec + BFD tunnels| E
subgraph "Automation Target"
A
end
subgraph "Control and Orchestration Planes"
B
C
end
subgraph "Data Plane"
D
E
end
The data-plane fabric is built from encrypted IPsec tunnels. Each endpoint is identified by a TLOC — a three-tuple of (system-IP, color, encapsulation). Colors represent logical transport labels: mpls, biz-internet, lte, etc.
- OMP (Overlay Management Protocol): a TCP-based control-plane protocol (similar to BGP) that runs between each WAN Edge device and vSmart. It distributes routes, TLOCs, and service-chain reachability.
- BFD (Bidirectional Forwarding Detection): runs inside every tunnel between WAN Edge pairs, on each transport color. It detects data-plane tunnel liveness at a configurable hello interval (1 second by default; sub-second values are supported).
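The TLOC three-tuple maps naturally onto an immutable record. The sketch below is illustrative only: the `Tloc` class, the `candidate_tunnels` helper, and the sample system-IPs are hypothetical, and it assumes the simple case in which an edge attempts a tunnel from each local TLOC to each remote TLOC with the same encapsulation (no color restrictions configured).

```python
from itertools import product
from typing import List, NamedTuple, Tuple


class Tloc(NamedTuple):
    """A transport locator: (system-IP, color, encapsulation)."""
    system_ip: str
    color: str
    encap: str = "ipsec"


def candidate_tunnels(local: List[Tloc], remote: List[Tloc]) -> List[Tuple[Tloc, Tloc]]:
    """Pair every local TLOC with every remote TLOC that uses the same encapsulation."""
    return [(a, b) for a, b in product(local, remote) if a.encap == b.encap]


# Hypothetical two-site example: each edge has two transports.
edge1 = [Tloc("10.0.0.1", "mpls"), Tloc("10.0.0.1", "biz-internet")]
edge2 = [Tloc("10.0.0.2", "mpls"), Tloc("10.0.0.2", "lte")]
tunnels = candidate_tunnels(edge1, edge2)
print(len(tunnels))  # 2 local TLOCs x 2 remote TLOCs = 4
```

Because BFD runs per tunnel, the length of this list is also a rough count of the BFD sessions to expect between the two sites under that assumption.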
12.1.2 The vManage REST API & Authentication
Every vManage instance ships with a Swagger UI at https://<vManage-IP>:8443/apidocs. All API requests target paths under /dataservice. Authentication requires two steps:
- Session Cookie: POST credentials to /j_security_check to receive a JSESSIONID cookie.
- CSRF Token: GET /dataservice/client/token and set the result as the X-XSRF-TOKEN request header for all subsequent write operations.
sequenceDiagram
participant Client as Python Client
participant vM as vManage API
Client->>vM: POST /j_security_check (j_username, j_password)
vM-->>Client: HTTP 200 + Set-Cookie: JSESSIONID
Client->>vM: GET /dataservice/client/token (Cookie: JSESSIONID)
vM-->>Client: HTTP 200 + X-XSRF-TOKEN value
Note over Client: session.headers["X-XSRF-TOKEN"] = token
Client->>vM: GET/POST /dataservice/endpoint (Cookie + X-XSRF-TOKEN)
vM-->>Client: JSON response data
vManage Authentication Flow
1. POST /j_security_check: send j_username and j_password as form data.
2. Response: HTTP 200 with Set-Cookie: JSESSIONID.
3. GET /dataservice/client/token, sending the JSESSIONID cookie.
4. Response: the X-XSRF-TOKEN value as plain text.
5. All subsequent calls: Cookie + X-XSRF-TOKEN header + Content-Type: application/json.
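The login flow above could be sketched as a single helper. This is a minimal sketch, not a definitive client: `vmanage_login` is a hypothetical name, and `session` is assumed to be a `requests.Session`-like object (anything with `post`, `get`, and a `headers` mapping). The HTML check reflects the commonly observed behaviour that a rejected login can still return HTTP 200 with a login page as the body.

```python
def vmanage_login(session, base_url: str, username: str, password: str) -> str:
    """Run the two-step login on a requests.Session-like object; return the CSRF token."""
    # Step 1: form-encoded POST; the server sets the JSESSIONID cookie on the session.
    resp = session.post(
        f"{base_url}/j_security_check",
        data={"j_username": username, "j_password": password},
    )
    # Assumption: an HTML body means the login page was returned, i.e. login failed.
    if resp.status_code != 200 or "<html" in resp.text.lower():
        raise PermissionError("vManage login failed")

    # Step 2: fetch the plain-text CSRF token using the session cookie.
    token_resp = session.get(f"{base_url}/dataservice/client/token")
    if token_resp.status_code != 200:
        raise RuntimeError(f"token request failed: HTTP {token_resp.status_code}")
    token = token_resp.text.strip()

    # Step 3: attach the token and JSON content type to every later request.
    session.headers["X-XSRF-TOKEN"] = token
    session.headers["Content-Type"] = "application/json"
    return token
```

With the `requests` library installed, `session = requests.Session()` followed by `vmanage_login(session, "https://<vManage-IP>:8443", user, password)` would leave the session ready for calls under /dataservice.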
API Response Patterns
| Pattern | When Used | How to Handle |
| --- | --- | --- |
| JSON data block | GET list/detail operations | Parse response.json()["data"] |
| Object ID | POST creating new objects | Parse response.json()["policyId"], ["listId"], etc. |
| Async task ID | Template attach, policy activate | Poll GET /device/action/status/<id> until done or failure |
| Empty body (HTTP 200) | Update/delete operations | Check response.status_code == 200 |
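One way to handle the four patterns in the table is a small dispatcher that inspects the decoded body. The sketch below is an assumption-laden illustration: `interpret_response` is a hypothetical helper, and the `id` key used to recognise an async task is assumed rather than taken from the text above.

```python
from typing import Any, Optional, Tuple


def interpret_response(status_code: int, body: Optional[dict]) -> Tuple[str, Any]:
    """Classify a decoded vManage response into one of the four patterns."""
    if status_code != 200:
        raise RuntimeError(f"unexpected HTTP status {status_code}")
    if not body:
        return "empty", None             # update/delete: HTTP 200 with no body
    if "data" in body:
        return "data", body["data"]      # GET list/detail operations
    if "id" in body:
        return "task", body["id"]        # async action to poll (key name assumed)
    for key, value in body.items():
        if key.endswith("Id"):
            return "object", value       # e.g. policyId, listId
    return "unknown", body


print(interpret_response(200, {"policyId": "abc-123"}))
```

Callers can branch on the first element of the returned tuple, for example starting a status poll only when it equals `"task"`.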
12.2.1 Feature Templates vs. Device Templates
Feature templates define a single feature's configuration — a VPN interface, BGP routing, NTP settings — with parameterized fields. Some values are hardcoded; others are marked as device-specific variables (e.g., vipType: "variableName"). Device templates assemble multiple feature templates into a complete device configuration blueprint.
graph TD
FT1[Feature Template: cisco_system] --> DT
FT2[Feature Template: cisco_vpn VPN 0] --> DT
FT3[Feature Template: cisco_vpn_interface] --> FT2
FT4[Feature Template: cisco_vpn VPN 512] --> DT
FT5[Feature Template: cisco_ntp] --> DT
DT[Device Template: Branch-C1111-Standard]
DT -->|attach with variables| D1[Branch Site 1]
DT -->|attach with variables| D2[Branch Site 2]
DT -->|attach with variables| D3[Branch Site N]
style DT fill:#0055aa,color:#ffffff
style FT1 fill:#0077cc,color:#ffffff
style FT2 fill:#0077cc,color:#ffffff
style FT3 fill:#0077cc,color:#ffffff
style FT4 fill:#0077cc,color:#ffffff
style FT5 fill:#0077cc,color:#ffffff
12.2.2–12.2.3 Template API Operations
Feature templates are managed at /template/feature; device templates at /template/device. Both follow the same CRUD pattern: GET to list, POST to create, PUT to update, DELETE to remove. A feature template creation payload requires templateType (e.g., cisco_vpn_interface), deviceType, and a templateDefinition object in which each field declares its vipType as either constant or variableName. A device template references the IDs of previously created feature templates inside generalTemplates, with optional subTemplates for nested components.
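A feature template payload of the kind described above might be assembled as follows. Treat this as a sketch under stated assumptions: the helper names, the `templateName` and `templateDescription` keys, the `vipObjectType`, `vipValue`, and `vipVariableName` keys, the device type string, and the field names inside `templateDefinition` are all illustrative and should be checked against the Swagger UI of the target vManage release.

```python
import json


def constant(value):
    """A hardcoded field value shared by every device that uses the template."""
    return {"vipObjectType": "object", "vipType": "constant", "vipValue": value}


def variable(name: str):
    """A device-specific field, supplied at attach time under this variable name."""
    return {"vipObjectType": "object", "vipType": "variableName", "vipVariableName": name}


def build_feature_template(name, template_type, device_types, definition):
    """Assemble the body for POST /template/feature (key names partly assumed)."""
    return {
        "templateName": name,
        "templateDescription": name,
        "templateType": template_type,
        "deviceType": device_types,
        "templateDefinition": definition,
    }


payload = build_feature_template(
    "Branch-WAN-Interface",
    "cisco_vpn_interface",
    ["vedge-C1111-8P"],                                   # hypothetical device type
    {
        "if-name": constant("GigabitEthernet0/0/0"),      # hypothetical field names
        "description": variable("wan_if_description"),
    },
)
print(json.dumps(payload, indent=2))
```

Keeping `constant` and `variable` as separate helpers makes it obvious at a glance which fields will later appear in the per-device variable schema.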
12.2.4 Template Attachment Workflow
Template attachment is a three-phase asynchronous transaction:
flowchart TD
A([Start: Select Template + Target Devices]) --> B
B["Phase 1 — Generate Variables\nPOST /template/device/config/input\nreturns variable schema per device"]
B --> C["Fill in device-specific values\nsystem-IP, hostname, interface IPs, site-ID"]
C --> D["Phase 2 — Preview Configuration\nPOST /template/device/config/preview\nreturns rendered CLI config"]
D --> E{Config correct?}
E -- No --> C
E -- Yes --> F["Phase 3 — Attach Template\nPOST /template/device/config/attachfeature\nreturns action_id"]
F --> G["Poll Task Status\nGET /device/action/status/{action_id}\nevery 10 seconds"]
G --> H{status?}
H -- done --> I([Attachment successful])
H -- failure --> J["Log error\nCall detachfeature to rollback"]
H -- in-progress --> G
If attachment fails, use POST /template/device/config/detachfeature to return the device to CLI mode. This is the rollback mechanism.
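The polling loop at the end of the workflow could look like the sketch below. It is deliberately transport-agnostic: `get_status` is a hypothetical callable that wraps GET /device/action/status/{action_id} and returns the status string, and the `sleep` parameter exists only so the loop can be exercised without waiting. The status values `done` and `failure` come from the flowchart; any other value is treated as still in progress.

```python
import time
from typing import Callable


def wait_for_action(
    get_status: Callable[[str], str],
    action_id: str,
    interval: float = 10.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the action reports 'done' (True) or 'failure' (False)."""
    waited = 0.0
    while waited <= timeout:
        status = get_status(action_id)
        if status == "done":
            return True
        if status == "failure":
            return False
        sleep(interval)      # anything else counts as in progress
        waited += interval
    raise TimeoutError(f"action {action_id} still running after {timeout} s")
```

A `False` result is the point at which a caller would log the error and issue the detachfeature rollback; a `TimeoutError` leaves that decision to the operator.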
12.3.1 Centralized vs. Localized Policies
| Dimension | Centralized Policy (vSmart) | Localized Policy (vEdge/cEdge) |
| --- | --- | --- |
| Enforced by | vSmart controllers | Individual WAN Edge routers |
| Scope | Fabric-wide: all devices in listed sites/VPNs | Per-device |
| Use cases | AAR, traffic engineering, control/data policies | QoS, ACLs, route policies, ZBF |
| API family | /template/policy/vsmart | /template/policy/vedge |
12.3.2 Policy Building Blocks and AAR
A centralized policy is built in a layered hierarchy — each layer produces an ID referenced by the next:
Centralized Policy Build Order
- Step 1: Policy Lists. SLA classes, site lists, VPN lists, app lists. Endpoints: /template/policy/list/sla, /template/policy/list/site
- Step 2: Policy Definitions. AAR rules reference List IDs. Endpoint: /template/policy/definition/approute
- Step 3: Policy Assembly. The vSmart policy references Definition IDs plus site/VPN scope. Endpoint: POST /template/policy/vsmart
- Step 4: Activation. Push to vSmart controllers; poll the action ID. Endpoint: POST /template/policy/vsmart/activate/<id>?confirm=true
flowchart TD
L1["Policy Lists\n/template/policy/list/sla\n/template/policy/list/site\n/template/policy/list/vpn"]
L1 --> L2["Policy Definitions\n/template/policy/definition/approute\n(AAR sequences referencing List IDs)"]
L2 --> L3["Policy Assembly\nPOST /template/policy/vsmart\nReferences definition IDs + site/VPN scope"]
L3 --> L4["Policy Activation\nPOST /template/policy/vsmart/activate/{id}?confirm=true"]
L4 --> L5["Poll Activation Task\nGET /device/action/status/{action_id}"]
L5 --> L6{Done?}
L6 -- Yes --> L7([Policy ACTIVE on vSmart])
L6 -- Failure --> L8([Deactivate + review])
L6 -- No --> L5
style L1 fill:#1a7a1a,color:#ffffff
style L2 fill:#1a7a1a,color:#ffffff
style L3 fill:#0055aa,color:#ffffff
style L4 fill:#0055aa,color:#ffffff
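The build order is essentially a chain of IDs, which the following sketch makes explicit. It assumes a hypothetical `post(endpoint, payload)` callable that sends the request and returns the decoded JSON body; `make_definition` and `make_policy` are likewise hypothetical payload builders. The `listId` and `policyId` keys follow the response patterns described earlier, while `definitionId` and `id` are assumed key names.

```python
from typing import Callable

Post = Callable[[str, dict], dict]  # (endpoint, payload) -> decoded JSON response


def build_and_activate(
    post: Post,
    sla: dict,
    sites: dict,
    vpns: dict,
    make_definition: Callable[[str], dict],
    make_policy: Callable[[str, str, str], dict],
) -> str:
    """Create lists, definition, and policy in dependency order; return the task ID."""
    # Step 1: lists come first because every later object refers to their IDs.
    sla_id = post("/template/policy/list/sla", sla)["listId"]
    site_id = post("/template/policy/list/site", sites)["listId"]
    vpn_id = post("/template/policy/list/vpn", vpns)["listId"]
    # Step 2: the AAR definition references the SLA list ID.
    definition = make_definition(sla_id)
    definition_id = post("/template/policy/definition/approute", definition)["definitionId"]
    # Step 3: the vSmart policy references the definition plus its site/VPN scope.
    policy = make_policy(definition_id, site_id, vpn_id)
    policy_id = post("/template/policy/vsmart", policy)["policyId"]
    # Step 4: activation is asynchronous and yields a task ID to poll.
    return post(f"/template/policy/vsmart/activate/{policy_id}?confirm=true", {})["id"]
```

Passing the payload builders in as callables keeps the ordering logic separate from the payload details, which vary between releases.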
12.3.3 AAR Policy Structure
An AAR policy definition (type: "appRoute") contains sequences. Each sequence has a match (on an app list) and actions (set preferred color, apply SLA class with fallbackToBestPath). The SLA class definition specifies latency, loss, and jitter thresholds.
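To make the structure concrete, here is a sketch of a single-sequence AAR definition as a Python dictionary. Only `type: "appRoute"`, the match/actions split, `preferredColor`, and `fallbackToBestPath` come from the description above; the remaining key names, the way the fallback flag is encoded, and the `build_aar_definition` helper itself are assumptions to verify against a definition exported from vManage.

```python
def build_aar_definition(name, app_list_id, sla_class_id, preferred_color):
    """Assemble an appRoute definition with one sequence (key layout partly assumed)."""
    return {
        "name": name,
        "type": "appRoute",
        "description": name,
        "sequences": [
            {
                "sequenceId": 1,
                "sequenceName": "App Route",
                "sequenceType": "appRoute",
                # Match: traffic belonging to a previously created app list.
                "match": {"entries": [{"field": "appList", "ref": app_list_id}]},
                # Action: apply an SLA class and steer to a preferred color.
                "actions": [
                    {
                        "type": "slaClass",
                        "parameter": [
                            {"field": "name", "ref": sla_class_id},
                            {"field": "preferredColor", "value": preferred_color},
                            {"field": "fallbackToBestPath"},  # encoding is an assumption
                        ],
                    }
                ],
            }
        ],
    }


definition = build_aar_definition("Voice-AAR", "app-list-id", "sla-class-id", "mpls")
```

The two `ref` values are the IDs returned when the app list and SLA class were created, which is why lists must exist before the definition.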
12.3.4 Modifying Active AAR Policies
AAR policies can be updated in-place without recreation: GET the current definition, modify the preferredColor parameter in the target sequence, then PUT the updated object back. This is the GET-modify-PUT pattern — no deactivation required for definition edits.
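The modify step of GET-modify-PUT can be isolated as a pure function. The sketch below assumes the fetched definition nests `preferredColor` under `sequences` → `actions` → `parameter`, which is an assumed layout; `set_preferred_color` and the sample `current` object are hypothetical.

```python
import copy


def set_preferred_color(definition: dict, sequence_id: int, new_color: str) -> dict:
    """Return a copy of an AAR definition with preferredColor changed in one sequence."""
    updated = copy.deepcopy(definition)  # leave the fetched object untouched
    for sequence in updated.get("sequences", []):
        if sequence.get("sequenceId") != sequence_id:
            continue
        for action in sequence.get("actions", []):
            for param in action.get("parameter", []):
                if isinstance(param, dict) and param.get("field") == "preferredColor":
                    param["value"] = new_color
                    return updated
    raise KeyError(f"no preferredColor found in sequence {sequence_id}")


# Stand-in for the body returned by GET /template/policy/definition/approute/<id>.
current = {
    "sequences": [
        {
            "sequenceId": 1,
            "actions": [
                {"type": "slaClass", "parameter": [{"field": "preferredColor", "value": "mpls"}]}
            ],
        }
    ]
}
updated = set_preferred_color(current, 1, "biz-internet")
```

The `updated` object is what would be sent back with PUT; working on a deep copy keeps the original available for comparison or rollback.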
Complete Policy Lifecycle Reference
| Operation | Method | Endpoint |
| --- | --- | --- |
| Create centralized policy | POST | /template/policy/vsmart |
| Activate centralized policy | POST | /template/policy/vsmart/activate/<id>?confirm=true |
| Deactivate centralized policy | POST | /template/policy/vsmart/deactivate/<id>?confirm=true |
| Create SLA class | POST | /template/policy/list/sla |
| Create site list | POST | /template/policy/list/site |
| Create VPN list | POST | /template/policy/list/vpn |
| Create AAR definition | POST | /template/policy/definition/approute |
| Update AAR definition | PUT | /template/policy/definition/approute/<id> |
12.4.1–12.4.2 Monitoring Architecture and Quick Health Check
The vManage monitoring API supports two access patterns:
- Real-time device queries: GET with a deviceId query parameter (the device's system-IP). Returns live operational state.
- Statistics aggregation queries — POST with a structured query payload to the statistics endpoints. Queries the vManage time-series database for historical metrics.
The fastest health check is GET /device/counters?deviceId=<system-ip>, which returns a composite summary including bfdSessionsUp, bfdSessionsDown, ompPeersUp, ompPeersDown, controlConnections, and certValidationStatus.
12.4.3–12.4.5 BFD, OMP, and Tunnel Monitoring
| Endpoint | Data Returned |
| --- | --- |
| GET /device/bfd/sessions?deviceId=<ip> | Per-session state, peer IPs, TLOC colors, uptime |
| GET /device/bfd/history?deviceId=<ip> | State transitions; use for flap detection |
| GET /device/omp/peers?deviceId=<ip> | OMP peer sessions, state, routes/TLOCs exchanged |
| GET /device/omp/routes/received?deviceId=<ip> | Routes received from vSmart |
| GET /device/omp/tlocs/received?deviceId=<ip> | TLOC entries received via OMP |
| GET /device/tunnel/statistics?deviceId=<ip> | Per-tunnel latency, loss%, jitter |
| GET /device/control/connections?deviceId=<ip> | Active control-plane connections (vManage, vSmart, vBond) |
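Flap detection from the BFD history endpoint amounts to counting transitions to the down state per tunnel. The sketch below works on the already-decoded `data` list; the record keys `system-ip`, `color`, and `state`, the threshold of three, and the `count_flaps` helper are all assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List


def count_flaps(history: List[dict], threshold: int = 3) -> Dict[str, int]:
    """Count 'down' transitions per peer/color and keep those at or above threshold."""
    downs = Counter(
        f'{entry.get("system-ip")}/{entry.get("color")}'
        for entry in history
        if entry.get("state") == "down"
    )
    return {tunnel: n for tunnel, n in downs.items() if n >= threshold}


# Hypothetical records shaped like GET /device/bfd/history output.
sample = [
    {"system-ip": "10.0.0.2", "color": "mpls", "state": "down"},
    {"system-ip": "10.0.0.2", "color": "mpls", "state": "up"},
    {"system-ip": "10.0.0.2", "color": "mpls", "state": "down"},
    {"system-ip": "10.0.0.2", "color": "mpls", "state": "down"},
    {"system-ip": "10.0.0.3", "color": "lte", "state": "down"},
]
print(count_flaps(sample))
```

A tunnel that appears in the result has gone down repeatedly within the queried window, which usually points at an unstable transport rather than a one-off outage.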
12.4.6–12.4.7 AAR Statistics and Alarm Management
For trend analysis, use POST /statistics/approute/fec/aggregation with a structured query payload specifying time range, aggregation fields, histogram interval, and metrics (latency avg, loss_percentage avg, jitter avg, vqoe_score avg). The vqoe_score (0–10) is a composite application experience indicator.
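A query payload with the four parts named above (time range, aggregation fields, histogram interval, metrics) might be built like this. The overall key layout, the `last_n_hours` operator, the `entry_time` and `local_color` property names, and the `approute_aggregation_query` helper are assumptions; the four metric names are the ones listed in the text.

```python
from typing import Sequence


def approute_aggregation_query(hours: int = 24, group_by: Sequence[str] = ("local_color",)) -> dict:
    """Build a body for POST /statistics/approute/fec/aggregation (layout assumed)."""
    metrics = ("latency", "loss_percentage", "jitter", "vqoe_score")
    return {
        # Time range: only records from the last N hours.
        "query": {
            "condition": "AND",
            "rules": [
                {"field": "entry_time", "type": "date",
                 "operator": "last_n_hours", "value": [str(hours)]}
            ],
        },
        "aggregation": {
            # Grouping fields, in order.
            "field": [{"property": p, "sequence": i} for i, p in enumerate(group_by, start=1)],
            # One bucket per hour, oldest first.
            "histogram": {"property": "entry_time", "type": "hour", "interval": 1, "order": "asc"},
            # Average of each metric within a bucket.
            "metrics": [{"property": m, "type": "avg"} for m in metrics],
        },
    }


query = approute_aggregation_query(hours=12)
```

Widening `hours` while keeping the one-hour histogram gives a trend line per group, which is the form most dashboards expect.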
Alarms follow a four-tier severity model: Critical (I), Major (II), Medium (III), Minor (IV). Query with POST /alarms using a structured JSON body with time-range and severity filter rules. Common alarm types:
- BFD_TLOC_DOWN: data-plane tunnel lost
- OMP_PEER_DOWN: control-plane OMP session dropped
- SLA_VIOLATION: path metrics exceeded SLA thresholds
- TUNNEL_DOWN: IPsec tunnel formation failure
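An alarm query with a time-range rule and a severity rule, plus a tally of the returned alarms by type, could be sketched as follows. The rule layout, the `last_n_hours` and `in` operators, the `type` key on each alarm record, and both helper names are assumptions rather than confirmed API details.

```python
from collections import Counter
from typing import List, Sequence


def alarm_query(hours: int = 1, severities: Sequence[str] = ("Critical", "Major")) -> dict:
    """Build a body for POST /alarms filtering on time range and severity (layout assumed)."""
    return {
        "query": {
            "condition": "AND",
            "rules": [
                {"field": "entry_time", "type": "date",
                 "operator": "last_n_hours", "value": [str(hours)]},
                {"field": "severity", "type": "string",
                 "operator": "in", "value": list(severities)},
            ],
        }
    }


def count_by_type(alarms: List[dict]) -> Counter:
    """Tally alarms by their type, e.g. BFD_TLOC_DOWN or OMP_PEER_DOWN."""
    return Counter(alarm.get("type", "unknown") for alarm in alarms)


# Hypothetical records standing in for the 'data' list of the response.
alarms = [{"type": "BFD_TLOC_DOWN"}, {"type": "BFD_TLOC_DOWN"}, {"type": "OMP_PEER_DOWN"}]
print(count_by_type(alarms).most_common())
```

Sorting by count puts the dominant failure mode first, which helps separate a single-site transport problem from a control-plane issue.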
12.4.8 Webhooks for Real-Time Alarm Delivery
For real-time incident response, configure vManage webhooks under Administration → Settings → Alarm Notifications. vManage POSTs alarm events to your URL when they fire. The target URL must be reachable from vManage's transport interface (VPN 0) on port 443 and must respond with HTTP 200.
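On the receiving side, the only hard requirements stated above are reachability and an HTTP 200 reply. The sketch below is a minimal standard-library receiver under several assumptions: the alarm body is taken to be JSON, the listener uses a hypothetical port 8080 with TLS on port 443 terminated by something in front of it, and `parse_alarm`, `AlarmWebhookHandler`, and `serve` are illustrative names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # parsed alarm events, newest last


def parse_alarm(body: bytes) -> dict:
    """Decode one webhook POST body; non-JSON bodies are kept as raw text."""
    try:
        event = json.loads(body or b"{}")
    except ValueError:  # covers JSON and Unicode decoding errors
        event = {"raw": body.decode("utf-8", errors="replace")}
    return event if isinstance(event, dict) else {"payload": event}


class AlarmWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received.append(parse_alarm(self.rfile.read(length)))
        self.send_response(200)  # acknowledge so the delivery counts as successful
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Blocking receiver; TLS termination is assumed to happen upstream."""
    HTTPServer(("0.0.0.0", port), AlarmWebhookHandler).serve_forever()
```

Replying with 200 before doing any slow processing is the safer design: queue the event and handle it elsewhere so the sender is never kept waiting.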
flowchart TD
A([Start Health Report]) --> B["GET /device — full device inventory"]
B --> C["Filter: reachability == reachable"]
C --> D{For each device}
D --> E["GET /device/counters?deviceId={system_ip}"]
E --> F{bfdSessionsDown > 0?}
F -- Yes --> G["WARN: BFD sessions down"]
F -- No --> H{ompPeersDown > 0?}
G --> H
H -- Yes --> I["CRITICAL: OMP peers down"]
H -- No --> J{More devices?}
I --> J
J -- Yes --> D
J -- No --> K["POST /alarms — severity Critical/Major, last 1 hour"]
K --> L{Alarms found?}
L -- Yes --> M["ALERT: N critical/major alarms"]
L -- No --> N["Print all collected issues"]
M --> N
N --> O([Report complete])
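The health report flowchart translates into a short loop. In this sketch `get` and `post` are hypothetical callables that perform the request and return the decoded `data` list; the `system-ip` inventory key, the alarm filter layout, and the `health_report` name are assumptions, while `reachability`, `bfdSessionsDown`, and `ompPeersDown` come from the text.

```python
from typing import Callable, List


def health_report(get: Callable[[str], list], post: Callable[[str, dict], list]) -> List[str]:
    """Check counters on every reachable device, then summarise recent alarms."""
    issues = []
    for device in get("/device"):
        if device.get("reachability") != "reachable":
            continue  # skip devices that cannot answer real-time queries
        system_ip = device["system-ip"]
        rows = get(f"/device/counters?deviceId={system_ip}")
        counters = rows[0] if rows else {}
        if int(counters.get("bfdSessionsDown", 0)) > 0:
            issues.append(f"WARN {system_ip}: BFD sessions down")
        if int(counters.get("ompPeersDown", 0)) > 0:
            issues.append(f"CRITICAL {system_ip}: OMP peers down")

    # Critical/Major alarms from the last hour (filter layout assumed).
    alarm_filter = {"query": {"condition": "AND", "rules": [
        {"field": "entry_time", "type": "date", "operator": "last_n_hours", "value": ["1"]},
        {"field": "severity", "type": "string", "operator": "in", "value": ["Critical", "Major"]},
    ]}}
    alarms = post("/alarms", alarm_filter)
    if alarms:
        issues.append(f"ALERT: {len(alarms)} critical/major alarms")
    return issues
```

Returning a list of strings instead of printing directly lets the same function feed a console report, a chat notification, or a ticketing integration.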