Chapter 8: On-Box Automation — EEM, Guest Shell, and Python

Section 1: Embedded Event Manager (EEM)

Pre-Reading Quiz — Section 1: EEM

1. EEM applet action statements are executed in which order?

2. Which EEM event detector fires a policy repeatedly at a fixed time interval?

3. What is the default maximum execution time for an EEM policy?

4. Which EEM event trigger is useful for testing applets manually without waiting for a real network event?

5. How do you manually execute an EEM applet that uses event none?

1.1 Architecture and the Event-Action Model

The Embedded Event Manager (EEM) is a publish-subscribe subsystem built into IOS XE. Specialized event detectors monitor subsystems such as syslog, interfaces, SNMP, CLI input, and timers. When a defined condition is met, the detector publishes an event to the EEM server, which matches it against registered policies and dispatches the appropriate applet or Tcl script for execution.

IOS XE supports more than 20 event detectors. The architecture operates in three layers: event detectors, the EEM server (publish/subscribe engine), and registered policies that carry out actions.

flowchart TD
    subgraph Detectors["Event Detectors"]
        D1[syslog]
        D2[timer]
        D3[CLI]
        D4[interface]
        D5[SNMP]
        D6[OIR / hardware]
    end
    subgraph EEM["EEM Server"]
        ES[Publish / Subscribe Engine]
    end
    subgraph Policies["Registered Policies"]
        P1[Applets]
        P2[Tcl Scripts]
    end
    subgraph Actions["Action Execution"]
        A1[CLI commands]
        A2[Syslog messages]
        A3[guestshell run python3]
        A4[SNMP trap / email]
    end
    D1 -->|event published| ES
    D2 -->|event published| ES
    D3 -->|event published| ES
    D4 -->|event published| ES
    D5 -->|event published| ES
    D6 -->|event published| ES
    ES -->|pattern match| P1
    ES -->|pattern match| P2
    P1 -->|dispatches| A1
    P1 -->|dispatches| A2
    P1 -->|dispatches| A3
    P2 -->|dispatches| A1
    P2 -->|dispatches| A4

Figure: EEM Event Flow. Signal travels from detector through server to policy: Network Event (e.g. intf down) -> Syslog Detector -> EEM Server (Pub/Sub) -> Applet Fires
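
The detector-to-server-to-policy flow can be modeled in a few lines of Python. This is only an illustrative sketch of the publish-subscribe idea, not how IOS XE implements EEM internally; the class and method names (EemServer, register, publish) are hypothetical.

```python
import re


class EemServer:
    """Toy publish/subscribe engine: policies subscribe with a regex pattern."""

    def __init__(self):
        self._policies = []  # list of (name, compiled pattern, action callable)

    def register(self, name, pattern, action):
        self._policies.append((name, re.compile(pattern), action))

    def publish(self, message):
        """Called by a detector; runs every policy whose pattern matches."""
        fired = []
        for name, pattern, action in self._policies:
            if pattern.search(message):
                action(message)
                fired.append(name)
        return fired


server = EemServer()
log = []  # stands in for whatever the policy's action would do
server.register("INTERFACE_DOWN", r"LINEPROTO-5-UPDOWN.*changed state to down", log.append)

# A syslog detector would publish each message it sees.
fired = server.publish(
    "%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/1, changed state to down"
)
print(fired)  # ['INTERFACE_DOWN']
```

The point of the model is the decoupling: detectors know nothing about policies, and policies only declare the pattern they care about.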

1.2 Event Detectors Reference

Detector | Trigger Condition | Common Use Case
event syslog | Matches a syslog message by regex | Interface down/up reactions, error pattern detection
event cli | A specific CLI command is entered | Auditing, blocking unauthorized commands
event timer watchdog | Recurring interval (fires repeatedly) | Periodic health checks, heartbeat scripts
event timer countdown | Fires once after a delay | Deferred configuration, one-time remediation
event interface | Interface counter crosses a threshold | Bandwidth alerting, error rate remediation
event snmp | SNMP OID value crosses a threshold | Performance-based automation
event oir | Hardware insertion or removal | Automatic port provisioning
event none | Never fires automatically | Policy testing, on-demand execution

1.3 Applets: Inline Event-Driven Policies

An applet is an EEM policy defined entirely within the IOS XE running configuration. A basic applet has one event trigger, one or more action steps (executed in alphanumeric order of their labels), and optional set statements for EEM variables.

event manager applet INTERFACE_DOWN
 event syslog pattern "LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/1, changed state to down"
 action 1.0 syslog msg "EEM: Interface down detected - attempting remediation"
 action 2.0 cli command "enable"
 action 3.0 cli command "configure terminal"
 action 4.0 cli command "interface GigabitEthernet0/1"
 action 5.0 cli command "no shutdown"
 action 6.0 cli command "end"

1.4 Action Label Ordering — A Critical Pitfall

Actions execute in alphanumeric sort order of their labels. Unpadded integer labels sort incorrectly: labels 1, 2, 10, 20 sort as 1, 10, 2, 20. Always use zero-padded labels to guarantee correct sequence.

flowchart LR
    subgraph WRONG["Unpadded: Wrong Order"]
        W1["action 1"] --> W2["action 10"] --> W3["action 2"] --> W4["action 20"]
    end
    subgraph RIGHT["Zero-Padded: Correct Order"]
        R1["action 01"] --> R2["action 02"] --> R3["action 10"] --> R4["action 20"]
    end
    WRONG --->|"Fix: add zero padding"| RIGHT
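
The ordering pitfall is easy to reproduce off-box, because Python's sorted() compares strings the same character-by-character way. The sketch below is only an analogy for how labels are ordered; zero_pad is a hypothetical helper, not part of any Cisco tooling.

```python
def zero_pad(labels):
    """Left-pad every label with zeros to the width of the longest one."""
    width = max(len(label) for label in labels)
    return [label.zfill(width) for label in labels]


labels = ["1", "2", "10", "20"]

# Labels are compared as strings, so "10" sorts before "2".
unpadded_order = sorted(labels)
padded_order = sorted(zero_pad(labels))

print(unpadded_order)  # ['1', '10', '2', '20']
print(padded_order)    # ['01', '02', '10', '20']
```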

1.5 Key Parameters: maxrun and ratelimit

The default maximum execution time for any EEM policy is 20 seconds. Add maxrun <seconds> to the event line to extend this limit when a policy calls Guest Shell Python scripts, which often need longer. Add ratelimit <seconds> to the event line to prevent rapid re-execution when the trigger event fires in bursts (e.g., a flapping interface generating a stream of syslog messages).

event manager applet OSPF_NEIGHBOR_DOWN
 event syslog pattern ".*OSPF-5-ADJCHG.*to DOWN" maxrun 120
 action 1.0 syslog msg "EEM: OSPF neighbor down - invoking Python remediation"
 action 1.5 cli command "enable"
 action 2.0 cli command "guestshell run python3 /flash/guest-share/ospf_remediation.py"
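
To see why rate limiting matters for bursty triggers, the following sketch models it as "at most one run per window". This is a simplified model under that assumption, not EEM's actual algorithm; the function name runs_allowed and its ratelimit parameter are hypothetical.

```python
def runs_allowed(event_times, ratelimit):
    """Return the event timestamps (in seconds) at which a policy would run,
    assuming a run is suppressed when the previous run happened less than
    `ratelimit` seconds earlier."""
    allowed = []
    last_run = None
    for t in sorted(event_times):
        if last_run is None or t - last_run >= ratelimit:
            allowed.append(t)
            last_run = t
    return allowed


# A flapping interface: six matching messages within 12 seconds, one much later
burst = [0.0, 0.5, 1.0, 3.0, 10.0, 12.0, 75.0]
print(runs_allowed(burst, ratelimit=60))  # [0.0, 75.0]
```

Without a limit every one of the seven messages would start a policy run, and each run may itself take up to the maxrun value.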

1.6 Tcl Scripts for Complex Logic

When applet action statements are insufficient — because the logic requires loops, conditionals, or complex string manipulation — EEM supports Tcl scripts stored on flash and registered with event manager policy. Tcl scripts use the ::cisco::eem namespace to register triggers and cli_open / cli_exec / cli_close to issue commands.

1.7 Verification Commands

show event manager policy registered    ! List all registered policies
show event manager history events       ! Recent event history (what fired, when)
debug event manager action cli          ! Real-time CLI action debug
event manager run APPLET_NAME           ! Manually run an applet registered with event none

Section 2: Guest Shell on IOS XE

Pre-Reading Quiz — Section 2: Guest Shell

6. What type of container technology is Guest Shell based on?

7. Which IOS XE command must be configured BEFORE enabling Guest Shell?

8. What is the shared filesystem path accessible from both IOS XE CLI and the Guest Shell container?

9. Starting with IOS XE Amsterdam 17.3.1, which Python version was removed from Guest Shell?

10. What privilege level is required to access Guest Shell on IOS XE?

2.1 Architecture: A Linux Container Inside Your Router

Guest Shell is a Linux Container (LXC) that runs inside Cisco IOS XE, managed by IOx — Cisco's application hosting framework. It provides a full Python 3.6+ interpreter, bash shell, pip, and standard Linux utilities, all running on the device hardware. It communicates with IOS XE via an internal loopback interface.

Guest Shell Architecture — Each layer builds on the one below it

Python Scripts + cli Module (your automation logic)
Guest Shell LXC Container (Python 3.6+, bash, pip)
IOx Application Hosting Framework (container lifecycle)
IOS XE Host OS (routing, switching, control plane)
Physical Hardware (CPU / RAM / Flash / NICs)

graph TD
    HW["Physical Hardware: CPU / RAM / Flash / NICs"]
    HW --> Kernel["Linux Kernel (shared with host OS)"]
    Kernel --> IOSXE["IOS XE Host OS"]
    IOSXE --> IOx["IOx Application Hosting Framework"]
    IOx --> GS["Guest Shell: LXC Container"]
    GS --> Py["Python 3.6+ Interpreter"]
    GS --> CLI_MOD["cli Python Module"]
    GS --> FS["/flash/guest-share/ (shared filesystem)"]
    CLI_MOD -->|"internal loopback"| IOSXE_CLI["IOS XE CLI Engine"]
    FS -->|"also visible as flash:guest-share/"| IOSXE

2.2 Enabling Guest Shell: Step-by-Step

! Step 1: Enable IOx (container management framework)
Router(config)# iox

! Step 2: Verify IOx is running (all 4 services must show Running)
Router# show iox-service

! Step 3: Enable Guest Shell
Router# guestshell enable

! Step 4: Verify Guest Shell state
Router# show app-hosting list
! Expected output:
! App id        State
! guestshell    RUNNING

! Step 5: Access the bash prompt
Router# guestshell
[guestshell@guestshell ~]$

flowchart TD
    A([Start]) --> B["Step 1: Enable IOx - Router config# iox"]
    B --> C{"show iox-service - All 4 services Running?"}
    C -- No --> D["Check platform support / Reload if needed"]
    D --> C
    C -- Yes --> E["Step 3: guestshell enable (~30-60 sec)"]
    E --> F{"show app-hosting list - guestshell = RUNNING?"}
    F -- No --> G["Check flash space and RAM / Review IOx logs"]
    G --> E
    F -- Yes --> H["Step 5: Router# guestshell"]
    H --> I(["guestshell@guestshell ~$ - Ready"])

2.3 Python Version and Shared Storage

Python 2.7 was removed from Guest Shell in IOS XE Amsterdam 17.3.1, so always use python3 in scripts and EEM applets on that release and later. The /flash/guest-share/ directory inside the container is visible from IOS XE as flash:guest-share/; copy scripts to this path via SCP or TFTP to make them accessible inside the container.
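
A script can check both assumptions (Python 3 and the shared directory) before doing any real work. The following is a minimal sketch; the function name environment_problems is hypothetical, and the path is the one given in the text above.

```python
import os
import sys

GUEST_SHARE = "/flash/guest-share"  # shared path as described in the text


def environment_problems(version_info=sys.version_info, share_path=GUEST_SHARE):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if version_info[0] < 3:
        problems.append("Python 3 is required; run the script with python3")
    if not os.path.isdir(share_path):
        problems.append("shared directory not found: " + share_path)
    return problems


if __name__ == "__main__":
    for problem in environment_problems():
        print("WARNING:", problem)
```

Passing the version and path as parameters keeps the check testable on a workstation, where /flash/guest-share does not exist.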

2.4 Security Considerations

Guest Shell access requires privilege level 15. The guestshell Linux user has sudo rights within the container, and the cli module can issue any IOS XE configuration command. Treat Guest Shell access as equivalent to full privileged CLI access — do not leave sensitive scripts world-readable in guest-share.
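
One way to audit the shared directory is to list files whose permission bits allow any user to read them. This is a sketch only: world_readable_files is a hypothetical helper, and whether permission bits are meaningful depends on how the shared filesystem is mounted on a given platform.

```python
import os
import stat


def world_readable_files(directory):
    """Return sorted paths of regular files under `directory` that any user can read."""
    found = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            mode = os.lstat(path).st_mode  # lstat: do not follow symlinks
            if stat.S_ISREG(mode) and mode & stat.S_IROTH:
                found.append(path)
    return sorted(found)


if __name__ == "__main__":
    for path in world_readable_files("/flash/guest-share"):
        print("world-readable:", path)
```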

Section 3: On-Box Python Automation

Pre-Reading Quiz — Section 3: On-Box Python

11. Which cli module function runs an exec-mode command and returns its output as a Python string?

12. In the EEM + Guest Shell integration pattern, what does the EEM applet contribute and what does Python contribute?

13. How does the cli Python module communicate with IOS XE from inside Guest Shell?

3.1 The cli Python Module

The cli module is pre-installed in Guest Shell and provides a clean API for issuing exec-mode and configuration commands to IOS XE. It communicates over the internal loopback between the container and the host OS.

Function | Mode | Returns | Description
cli.execute(cmd) | Exec | String | Run a show/exec command; return output as a string
cli.executep(cmd) | Exec | None | Same as execute, but print to stdout
cli.configure(cmds) | Config | List | Run config commands (newline-separated); return result list
cli.configurep(cmds) | Config | None | Same as configure, but print to stdout
cli.clip(cmd) | Exec | None | Execute and print directly to console (CLI-mode output)

import cli

# Read interface status and return it as a string
output = cli.execute("show ip interface brief")
print(output)

# Apply a configuration change (config lines separated by newlines)
cli.configure("interface GigabitEthernet1\n description Configured by Python\n no shutdown")

# Conditional logic: check BGP state and react.
# In the summary, an established neighbor shows a prefix count in the
# State/PfxRcd column; a neighbor that is not established shows a state name.
bgp_status = cli.execute("show ip bgp summary")
if any(state in bgp_status for state in ("Idle", "Active", "Connect")):
    # "clear" is an exec-mode command, so use execute(), not configure()
    cli.execute("clear ip bgp * soft")
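
Because cli.execute returns plain text, most on-box scripts end up parsing command output. The sketch below turns "show ip interface brief" output into a list of dictionaries; parse_ip_interface_brief is a hypothetical helper, and the sample text is illustrative rather than captured from a device.

```python
def parse_ip_interface_brief(output):
    """Parse 'show ip interface brief' text into a list of dicts."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # Skip blank lines and the header row
        if len(parts) < 6 or parts[0] == "Interface":
            continue
        rows.append({
            "interface": parts[0],
            "ip_address": parts[1],
            # Status may be two words ("administratively down")
            "status": " ".join(parts[4:-1]),
            "protocol": parts[-1],
        })
    return rows


SAMPLE = """\
Interface              IP-Address      OK? Method Status                Protocol
GigabitEthernet1       10.0.0.1        YES NVRAM  up                    up
GigabitEthernet2       unassigned      YES unset  administratively down down
"""

down = [r["interface"] for r in parse_ip_interface_brief(SAMPLE) if r["protocol"] != "up"]
print(down)  # ['GigabitEthernet2']

# On the device the text would come from the cli module:
#   import cli
#   rows = parse_ip_interface_brief(cli.execute("show ip interface brief"))
```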

3.2 EEM + Guest Shell: The Canonical Closed-Loop Pattern

The most powerful on-box architecture combines EEM (event detection) with Guest Shell Python (complex logic and action). EEM handles the "what happened" layer; Python handles the "what to do about it" layer.

! EEM applet: detect event, invoke Python handler
event manager applet OSPF_NEIGHBOR_DOWN
 event syslog pattern ".*OSPF-5-ADJCHG.*to DOWN" maxrun 120
 action 1.0 syslog msg "EEM: OSPF neighbor down - invoking Python remediation"
 action 1.5 cli command "enable"
 action 2.0 cli command "guestshell run python3 /flash/guest-share/ospf_remediation.py"
 action 3.0 syslog msg "EEM: OSPF remediation script completed"

sequenceDiagram
    participant NW as Network Event (OSPF)
    participant IOS as IOS XE Syslog Engine
    participant EEM as EEM Server
    participant APP as EEM Applet
    participant GS as Guest Shell Python
    participant CLI as IOS XE CLI Engine
    NW->>IOS: OSPF adjacency drops
    IOS->>EEM: syslog: OSPF-5-ADJCHG ... to DOWN
    EEM->>APP: Pattern matched, dispatch applet
    APP->>IOS: action 1.0: syslog notification
    APP->>GS: action 2.0: guestshell run python3 ospf_remediation.py
    GS->>CLI: cli.execute("show ip ospf neighbor")
    CLI-->>GS: neighbor state output
    GS->>CLI: cli.configure("do clear ip ospf process")
    CLI-->>GS: process cleared
    GS-->>APP: script exits (return code 0)
    APP->>IOS: action 3.0: syslog completion message
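
The source does not show the contents of ospf_remediation.py, so the following is only a sketch of what the decision step of such a script might look like. The function names and the sample output are hypothetical, and the sketch deliberately stops at reporting; any remediation command would be added where the comment indicates.

```python
def neighbors_not_full(output):
    """Return neighbor IDs whose State column does not start with FULL."""
    not_full = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows start with a dotted-quad neighbor ID; skip headers and blanks
        if len(parts) < 3 or parts[0].count(".") != 3:
            continue
        state = parts[2]  # e.g. "FULL/DR" or "INIT/DROTHER"
        if not state.startswith("FULL"):
            not_full.append(parts[0])
    return not_full


def main(run):
    """`run` is a callable that executes an exec command and returns its text."""
    stuck = neighbors_not_full(run("show ip ospf neighbor"))
    for neighbor in stuck:
        print("OSPF neighbor not FULL:", neighbor)
        # A remediation step would go here.
    return 1 if stuck else 0


SAMPLE = """\
Neighbor ID     Pri   State           Dead Time   Address         Interface
10.0.0.2          1   FULL/DR         00:00:35    192.168.1.2     GigabitEthernet1
10.0.0.3          1   INIT/DROTHER    00:00:31    192.168.1.3     GigabitEthernet1
"""

exit_code = main(lambda command: SAMPLE)  # off-box dry run with canned output

# On the device:
#   import cli, sys
#   sys.exit(main(cli.execute))
```

Injecting the command runner as a parameter lets the parsing logic be exercised on a workstation before the script is copied to guest-share.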

3.3 Triggering Scripts from IOS XE CLI

! Run a script directly from IOS XE exec mode
Router# guestshell run python3 /flash/guest-share/health_check.py

! Enter Guest Shell for interactive work
Router# guestshell
[guestshell@guestshell ~]$ python3 /flash/guest-share/health_check.py

Section 4: Troubleshooting Device-Level Automation

Pre-Reading Quiz — Section 4: Troubleshooting

14. What is the first command you should run when troubleshooting NETCONF or RESTCONF failures on IOS XE?

15. A NETCONF client reports a lock-denied error when attempting to write configuration. What is the most likely cause?

16. What HTTP status code does RESTCONF return when authentication fails?

17. What is YANG model aliasing and what problem does it cause?

18. Which configuration line must be removed to fix a legacy NETCONF conflict on IOS XE?

4.1 The Model-Driven Programmability Stack

NETCONF, RESTCONF, and YANG form a layered stack. The confd daemon is the foundational process — if it is not running, nothing above it works. Understanding this hierarchy is the key to systematic troubleshooting.

graph TD
    CLIENT["Management Client: ncclient / curl / Ansible / NSO"]
    CLIENT -->|"TCP 830 / SSH"| NETCONF["NETCONF Protocol Layer (RFC 6241)"]
    CLIENT -->|"TCP 443 / HTTPS"| RESTCONF["RESTCONF Protocol Layer (RFC 8040)"]
    NETCONF --> CONFD["confd daemon: yang-management process group"]
    RESTCONF --> NGINX["nginx / dmiauthd: HTTPS termination + auth"]
    NGINX --> CONFD
    CONFD --> YANG["YANG Data Models: Cisco-IOS-XE-native / ietf-interfaces / openconfig-*"]
    YANG --> CFGDB["IOS XE Configuration Database"]

4.2 Primary Health Check

Always start here. Every yang-management process must show Running.

Router# show platform software yang-management process

confd            : Running
nesd             : Running
syncfd           : Running
ncsshd           : Running
dmiauthd         : Running
nginx            : Running
ndbmand          : Running
pubd             : Running
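
The health check lends itself to automation: a script can scan the output and report any process that is not Running. A minimal sketch follows, assuming the "name : state" layout shown above; the not_running helper and the "Not Running" state string in the sample are illustrative assumptions.

```python
def not_running(output):
    """Return names of processes whose state is anything other than 'Running'."""
    problems = []
    for line in output.splitlines():
        name, sep, state = line.partition(":")
        # Ignore lines that are not "name : state" pairs
        if not sep or not name.strip():
            continue
        if state.strip() != "Running":
            problems.append(name.strip())
    return problems


SAMPLE = """\
confd            : Running
nesd             : Running
ncsshd           : Not Running
nginx            : Running
"""

print(not_running(SAMPLE))  # ['ncsshd']

# On the device:
#   import cli
#   bad = not_running(cli.execute("show platform software yang-management process"))
```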

4.3 Common Issues and Fixes

Issue | Symptom | Fix
Legacy NETCONF conflict | NETCONF clients fail to connect; capabilities exchange fails | no netconf legacy
Stuck session holding config lock | <rpc-error> with lock-denied error-tag | show netconf-yang sessions then clear netconf-yang session <id>
Candidate datastore restart | All NETCONF sessions drop after enabling candidate datastore | Schedule during maintenance window; pre-notify clients
YANG model side effects | NSO reports device out-of-sync after non-destructive NETCONF operation | Use <validate> RPC before <commit>; test in lab first
YANG model aliasing | Out-of-sync alerts despite successful operations; phantom config diffs | Standardize on one YANG module family per device type; do not mix native + ietf
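
For the stuck-lock case, the rpc-error itself usually says which session holds the lock: RFC 6241 defines a session-id element inside error-info for the lock-denied error-tag. The sketch below extracts it with the standard library; lock_holder is a hypothetical helper, and the sample reply is modeled on the RFC's format rather than captured from a device.

```python
import xml.etree.ElementTree as ET

NC = "{urn:ietf:params:xml:ns:netconf:base:1.0}"  # NETCONF base namespace


def lock_holder(rpc_reply_xml):
    """Return the session-id holding the lock, or None if there is no lock-denied error."""
    root = ET.fromstring(rpc_reply_xml)
    for error in root.iter(NC + "rpc-error"):
        if error.findtext(NC + "error-tag") == "lock-denied":
            return error.findtext(NC + "error-info/" + NC + "session-id")
    return None


SAMPLE = """\
<rpc-reply message-id="101" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <rpc-error>
    <error-type>protocol</error-type>
    <error-tag>lock-denied</error-tag>
    <error-severity>error</error-severity>
    <error-message>Lock failed, lock is already held</error-message>
    <error-info>
      <session-id>454</session-id>
    </error-info>
  </rpc-error>
</rpc-reply>
"""

print(lock_holder(SAMPLE))  # 454
```

The returned value is the session to look for in show netconf-yang sessions. Per the RFC, a value of 0 means the lock is held by a non-NETCONF entity.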

4.4 NETCONF Troubleshooting Commands

show platform software yang-management process  ! Primary health check
show netconf-yang sessions                       ! List active sessions
show netconf-yang sessions detail                ! Full session details
show netconf-yang datastores                     ! running/candidate/startup state
clear netconf-yang session <id>                 ! Clear stuck session + release lock
show running-config | format netconf-xml         ! Translate config to XML for payload building
show running-config | format restconf-json       ! Translate config to JSON

4.5 RESTCONF HTTP Status Codes

HTTP Code | Meaning | Likely Cause
401 | Unauthorized | Wrong credentials or insufficient privilege level
404 | Not Found | YANG path incorrect; wrong module name or revision
409 | Conflict | Resource state does not permit the requested operation
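
A client script can turn these codes into actionable hints instead of a bare number. The mapping below simply restates the table; explain_status is a hypothetical helper.

```python
RESTCONF_HINTS = {
    401: "Unauthorized: check credentials and the user's privilege level",
    404: "Not Found: check the YANG path, module name, and revision",
    409: "Conflict: resource state does not permit the requested operation",
}


def explain_status(code):
    """Map an HTTP status code from a RESTCONF response to a troubleshooting hint."""
    if 200 <= code < 300:
        return "Success"
    fallback = "Unexpected status {}; inspect the response body".format(code)
    return RESTCONF_HINTS.get(code, fallback)


print(explain_status(404))
```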

4.6 YANG Model Discovery

Discover supported modules via RESTCONF before writing automation:

curl -k -u admin:Cisco123 \
  -H "Accept: application/yang-data+json" \
  https://192.168.1.1/restconf/data/ietf-yang-library:modules-state
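
The same discovery request can be made from Python using only the standard library. This is a sketch under stated assumptions: fetch_modules_state and module_names are hypothetical helpers, certificate verification is disabled to mirror curl -k (lab use only), and the revisions in the sample are illustrative.

```python
import base64
import json
import ssl
import urllib.request


def fetch_modules_state(host, username, password):
    """GET the ietf-yang-library modules-state resource and return parsed JSON."""
    url = "https://{}/restconf/data/ietf-yang-library:modules-state".format(host)
    token = base64.b64encode("{}:{}".format(username, password).encode()).decode()
    request = urllib.request.Request(url, headers={
        "Accept": "application/yang-data+json",
        "Authorization": "Basic " + token,
    })
    # Equivalent of curl -k: skip certificate verification (lab use only)
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return json.load(response)


def module_names(modules_state):
    """Return sorted 'name@revision' strings from a modules-state document."""
    modules = modules_state["ietf-yang-library:modules-state"]["module"]
    return sorted("{}@{}".format(m["name"], m.get("revision", "")) for m in modules)


# Offline example of the response shape (illustrative values)
SAMPLE = {"ietf-yang-library:modules-state": {"module": [
    {"name": "ietf-interfaces", "revision": "2014-05-08"},
    {"name": "Cisco-IOS-XE-native", "revision": "2019-11-01"},
]}}
print(module_names(SAMPLE))
```

Against a live device, module_names(fetch_modules_state(host, user, password)) would list every module the device advertises, which is a quick way to confirm a module name before building a RESTCONF path.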
