Chapter 8 — Metrics Pipeline: Bridging OpenTelemetry and Prometheus

Learning Objectives

Section 1 — OpenTelemetry Metrics Data Model

Pre-Reading Check — Sections 1 & 2

1. Which OpenTelemetry instrument type is best suited for measuring the number of in-flight HTTP requests at any moment?

A) Counter
B) UpDownCounter
C) Histogram
D) ObservableCounter

2. What is the primary advantage of an exponential histogram over a classic explicit-bucket histogram?

A) It uses no memory at all.
B) It automatically picks bucket boundaries via a scale parameter and downscales to stay bounded.
C) It supports more attribute keys than other histograms.
D) It works without a Meter.

3. In OpenTelemetry, what is a View?

A) A dashboard widget for visualizing metrics.
B) An SDK configuration that intercepts measurements and changes aggregation, name, or attributes before export.
C) A query language for Prometheus.
D) A type of OTel resource attribute.

4. Which aggregation temporality is the natural fit for Prometheus?

A) Delta — per-interval increments.
B) Cumulative — total since process start.
C) Both work equally well.
D) Neither — Prometheus computes temporality itself.

5. If a single OTLP export carrying delta points never arrives at the Collector, what happens?

A) The next cumulative sample makes the data recoverable.
B) The events in that interval are lost forever.
C) Prometheus automatically retries delivery.
D) The SDK retransmits all data since process start.

OpenTelemetry's metrics data model is intentionally richer than the Prometheus text exposition format. That richness — six instrument types, configurable aggregations via Views, exponential histograms, attributes-plus-resource-attributes — is precisely why the bridge is nontrivial.

1.1 The Six Core Instruments

Instruments are the API surface your code calls. They split along two axes: synchronous vs. observable (does the app push, or does the SDK pull a callback?) and monotonic vs. non-monotonic (can the value ever decrease?).

| Instrument | Sync? | Monotonic? | Typical use |
| --- | --- | --- | --- |
| Counter | Yes | Yes | Requests served, bytes sent |
| UpDownCounter | Yes | No | In-flight requests, queue depth |
| Histogram | Yes | n/a | Request duration, payload size |
| ObservableCounter | No (callback) | Yes | CPU seconds, GC bytes |
| ObservableUpDownCounter | No (callback) | No | Memory in use, thread pool size |
| ObservableGauge | No (callback) | n/a | Temperature, load average |

Think of synchronous instruments as ringing a bell on every event and observable instruments as a thermostat the SDK reads on a schedule. Both produce time series; the cost models differ.
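The monotonic vs. non-monotonic split can be illustrated with a small standard-library model. This is only a sketch: the `Counter` and `UpDownCounter` types below are toy stand-ins invented for illustration, not the OpenTelemetry API, and the rejection of negative increments is a simplification of how a real SDK treats invalid input.

```go
package main

import (
	"errors"
	"fmt"
)

// Counter is a toy stand-in for a monotonic instrument: it only goes up.
type Counter struct{ total int64 }

// Add rejects negative increments, which is what "monotonic" means here.
func (c *Counter) Add(n int64) error {
	if n < 0 {
		return errors.New("counter is monotonic: negative increment rejected")
	}
	c.total += n
	return nil
}

// UpDownCounter is a toy stand-in for a non-monotonic sum.
type UpDownCounter struct{ value int64 }

// Add accepts both signs, so the value can track in-flight work.
func (u *UpDownCounter) Add(n int64) { u.value += n }

func main() {
	var served Counter
	var inFlight UpDownCounter

	for i := 0; i < 3; i++ {
		inFlight.Add(1)   // request starts
		_ = served.Add(1) // request is counted once
		inFlight.Add(-1)  // request ends
	}
	inFlight.Add(1) // one request is still running

	fmt.Println("served:", served.total, "in flight:", inFlight.value)
	fmt.Println("negative add rejected:", served.Add(-1) != nil)
}
```

The served total can only grow, while the in-flight value returns toward zero as requests finish, which is why in-flight requests belong on an UpDownCounter.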

1.2 Meter to Exporter — the OTel pipeline

flowchart TD
    Meter[Meter]
    Meter --> Sync[Synchronous Instruments]
    Meter --> Obs[Observable Instruments]
    Sync --> C[Counter]
    Sync --> UDC[UpDownCounter]
    Sync --> H[Histogram]
    Obs --> OC[ObservableCounter]
    Obs --> OUDC[ObservableUpDownCounter]
    Obs --> OG[ObservableGauge]
    C --> View["View<br/>rename, filter,<br/>change aggregation"]
    UDC --> View
    H --> View
    OC --> View
    OUDC --> View
    OG --> View
    View --> Agg["Aggregation<br/>Sum / LastValue /<br/>Histogram / ExpHistogram"]
    Agg --> DP["Data Point<br/>value + attributes +<br/>timestamp + temporality"]
    DP --> Exp["Exporter<br/>OTLP / Prometheus / stdout"]

Figure 8.1 above shows the SDK pipeline: an instrument emits a measurement; a View may rewrite or re-aggregate it; the aggregation builds a data point with a temporality stamp; an exporter pushes or exposes that point.

1.3 Views — the SDK's escape hatch

A View intercepts measurements before export. Views let you rename a metric stream, filter its attributes, and change its aggregation.

Views are how you serve two backends from one instrument: an explicit-bucket histogram on the Prometheus scrape path, and an exponential histogram on the OTLP path.

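// Go SDK: switch a histogram to exponential.
// import sdkmetric "go.opentelemetry.io/otel/sdk/metric"
view := sdkmetric.NewView(
    sdkmetric.Instrument{Name: "request_duration_seconds"},
    sdkmetric.Stream{
        // The Go SDK names this aggregation Base2ExponentialHistogram.
        Aggregation: sdkmetric.AggregationBase2ExponentialHistogram{
            MaxSize:  160, // bucket budget before the aggregator downscales
            MaxScale: 10,  // highest scale (finest resolution) to start from
        },
    },
)
// Register it with sdkmetric.NewMeterProvider(sdkmetric.WithView(view)).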

1.4 Exponential histograms

OTel's ExponentialHistogram uses base-2 buckets controlled by a scale s: bucket i covers (2^(i/2^s), 2^((i+1)/2^s)], lower bound exclusive and upper bound inclusive. Higher s = more buckets per power of two = more resolution.
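To make the scale parameter concrete, the following sketch maps a positive value to its bucket index and back to the bucket bounds, using the data model's lower-exclusive, upper-inclusive convention. The function names `bucketIndex` and `bucketBounds` are hypothetical, and the logarithm-based mapping is one simple approach; floating-point rounding can put a value in the neighboring bucket when it sits exactly on a boundary that is not a power of two.

```go
package main

import (
	"fmt"
	"math"
)

// bucketIndex returns the bucket index for a positive value v at the given
// scale, using index = ceil(log2(v) * 2^scale) - 1.
func bucketIndex(v float64, scale int) int {
	return int(math.Ceil(math.Log2(v)*math.Ldexp(1, scale))) - 1
}

// bucketBounds returns the (lower, upper] bounds of bucket i at the given scale.
func bucketBounds(i, scale int) (lower, upper float64) {
	f := math.Ldexp(1, -scale) // 2^(-scale)
	return math.Exp2(float64(i) * f), math.Exp2(float64(i+1) * f)
}

func main() {
	// The same observation lands in a narrower bucket as the scale grows.
	for _, scale := range []int{0, 1, 3} {
		i := bucketIndex(3.0, scale)
		lo, hi := bucketBounds(i, scale)
		fmt.Printf("scale=%d index=%d bucket=(%.3f, %.3f]\n", scale, i, lo, hi)
	}
}
```

At scale 0 the value 3.0 falls in the bucket (2, 4]; at scale 3 it falls in a bucket roughly 0.26 wide, which is the extra resolution that a higher scale buys.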

Three properties make exponential histograms much better than fixed-bucket histograms:

  1. Automatic dynamic range — no need to pre-pick bucket boundaries.
  2. Bounded memory — when bucket count would exceed MaxSize, the aggregator downscales, merging neighbors. Memory stays constant; precision degrades gracefully.
  3. Separate positive, negative, and zero buckets — supports signed observations.

Contrast with a classic histogram: when latency suddenly grows past your last bucket, every observation piles into the +Inf overflow bucket and your quantiles become meaningless.
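The downscaling step in property 2 can be sketched as follows. Reducing the scale by one merges each pair of neighboring buckets (2k, 2k+1) into bucket k. The `downscale` helper and the sample counts are hypothetical; a real aggregator stores buckets in a dense array with an offset rather than a map.

```go
package main

import "fmt"

// downscale merges bucket counts from scale s to scale s-1.
// An arithmetic right shift halves the index and rounds toward negative
// infinity, so it also works for negative bucket indexes.
func downscale(buckets map[int]uint64) map[int]uint64 {
	merged := make(map[int]uint64, (len(buckets)+1)/2)
	for idx, count := range buckets {
		merged[idx>>1] += count
	}
	return merged
}

func main() {
	// Hypothetical counts at scale 1, keyed by bucket index.
	scale1 := map[int]uint64{-2: 1, -1: 4, 2: 7, 3: 5, 4: 2}
	scale0 := downscale(scale1)
	fmt.Println(len(scale1), "buckets ->", len(scale0), "buckets:", scale0)
}
```

The total count is preserved; only the resolution drops, which is the graceful degradation the list describes.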

Key Points — Section 1

Section 2 — Aggregation Temporality

Aggregation temporality is the single most common source of OTel-to-Prometheus bugs. It is the difference between a counter that grows forever and one that resets every export — and Prometheus' query language assumes the former.

2.1 Cumulative vs. delta — what each means

Temporality applies to sums (counters) and histograms. Gauges are instantaneous and ignore temporality. Crucially, temporality is a property of the exported time series, not the instrument: the same Counter can be exported as cumulative to one backend and delta to another.

Figure: Cumulative vs. delta, the same events in two shapes

| Temporality | t1 | t2 | t3 | In Prometheus |
| --- | --- | --- | --- | --- |
| Cumulative (Prometheus-friendly) | 100 | 220 | 350 | Prometheus expects this |
| Delta (per-interval increment) | +100 | +120 | +130 | Misread as a running total, so rate() and increase() report wrong values |

Both rows show the same underlying event stream (100, then 120, then 130 events per interval). The cumulative series totals 100 → 220 → 350; the delta series reports each interval's increment independently. Prometheus' rate() and increase() assume cumulative.
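The relationship between the two shapes is a running sum in one direction and a first difference in the other. The sketch below uses the same three intervals; the helper names are hypothetical, and `toDelta` assumes the cumulative series never resets.

```go
package main

import "fmt"

// toCumulative turns per-interval deltas into running totals.
func toCumulative(deltas []int64) []int64 {
	out := make([]int64, len(deltas))
	var total int64
	for i, d := range deltas {
		total += d
		out[i] = total
	}
	return out
}

// toDelta turns running totals back into per-interval increments.
// It assumes the series starts at zero and never resets.
func toDelta(cumulative []int64) []int64 {
	out := make([]int64, len(cumulative))
	var prev int64
	for i, c := range cumulative {
		out[i] = c - prev
		prev = c
	}
	return out
}

func main() {
	deltas := []int64{100, 120, 130}
	cumulative := toCumulative(deltas)
	fmt.Println(cumulative)          // [100 220 350]
	fmt.Println(toDelta(cumulative)) // [100 120 130]
}
```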

2.2 What happens on restart and missed exports

| Aspect | Cumulative | Delta |
| --- | --- | --- |
| Value at time T | Total since start | Change since last export |
| Lost export | Backend computes a longer-window rate | Data for that interval is lost forever |
| Process restart | Backend must detect reset | Each export is independent; no reset concept |
| Aligns with Prometheus? | Natural fit | Not supported natively |
| Typical OTLP push guidance | Supported, less common | Often favored |
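The "lost export" row is easy to verify with a small simulation. This is a sketch with a hypothetical `observedTotal` helper: it replays the same three intervals, drops the second export, and reports what a backend could reconstruct under each temporality.

```go
package main

import "fmt"

// observedTotal returns the total a backend can reconstruct from the exports
// that actually arrived. arrived[i] == false means export i was lost.
func observedTotal(deltas []int64, arrived []bool, cumulative bool) int64 {
	var running, seen int64
	for i, d := range deltas {
		running += d // what the SDK has counted so far
		if !arrived[i] {
			continue
		}
		if cumulative {
			seen = running // the latest running total replaces the old one
		} else {
			seen += d // only delivered increments are ever counted
		}
	}
	return seen
}

func main() {
	deltas := []int64{100, 120, 130}
	arrived := []bool{true, false, true} // the t2 export is lost
	fmt.Println("cumulative:", observedTotal(deltas, arrived, true))  // 350
	fmt.Println("delta:     ", observedTotal(deltas, arrived, false)) // 230
}
```

With cumulative points the next successful export carries the full total, so only time resolution is lost. With delta points the 120 events from the dropped interval never reach the backend.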

2.3 The Collector as temporality translator

In real pipelines, you rarely want to pick one temporality and force every backend to live with it. The dominant 2025 pattern is to let the OpenTelemetry Collector convert temporality per exporter:

  1. App SDK exports OTLP with AggregationTemporality = DELTA.
  2. Collector receives delta points.
  3. Collector forwards delta to an OTLP vendor backend that prefers deltas.
  4. Collector accumulates delta into cumulative for prometheusremotewrite or the /metrics endpoint Prometheus scrapes.

flowchart LR
    App[Application SDK]
    App -->|OTLP delta| Coll[OpenTelemetry Collector]
    Coll -->|cumulative| PromExp[prometheus exporter]
    Coll -->|delta| OTLPExp[otlp exporter]
    PromExp -->|scrape| Prom[(Prometheus)]
    OTLPExp -->|push| Vendor[(Vendor OTLP backend)]

The delta-to-cumulative step maintains per-series state to sum deltas into a running cumulative total, whether it runs inside a Prometheus exporter or in a dedicated processor such as deltatocumulative placed ahead of it. This is what makes the “delta-from-apps, cumulative-to-Prometheus” pattern work end to end.
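A minimal sketch of that per-series state follows. It is not the Collector's implementation: the `accumulator` type, the `seriesKey` format, and the metric name are assumptions chosen for illustration. The point it shows is that each distinct name-plus-labels combination needs its own running total.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// accumulator keeps one running total per series identity.
type accumulator struct {
	totals map[string]float64
}

func newAccumulator() *accumulator {
	return &accumulator{totals: make(map[string]float64)}
}

// seriesKey builds a stable identity from the metric name and sorted labels.
func seriesKey(name string, labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteString(name)
	for _, k := range keys {
		fmt.Fprintf(&b, ",%s=%q", k, labels[k])
	}
	return b.String()
}

// addDelta folds one delta point into its series and returns the cumulative value.
func (a *accumulator) addDelta(name string, labels map[string]string, delta float64) float64 {
	key := seriesKey(name, labels)
	a.totals[key] += delta
	return a.totals[key]
}

func main() {
	acc := newAccumulator()
	get := map[string]string{"method": "GET"}
	post := map[string]string{"method": "POST"}
	fmt.Println(acc.addDelta("http_requests_total", get, 100))  // 100
	fmt.Println(acc.addDelta("http_requests_total", post, 7))   // 7
	fmt.Println(acc.addDelta("http_requests_total", get, 120))  // 220
}
```

Two consequences follow from this design: memory grows with the number of live series, and the totals restart from zero if the component holding the state restarts.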

2.4 Symptoms of getting it wrong

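Misconfigured temporality produces distinctive symptoms in Prometheus. The most visible is a counter that appears to reset on every export instead of growing, so rate() and increase() return values that do not match real traffic.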

The fix is always the same: ensure the exporter that feeds Prometheus emits cumulative, regardless of what the SDK and intermediate Collectors use.

Key Points — Section 2

Section 3 — Bridging to Prometheus

Pre-Reading Check — Sections 3 & 4

6. Which exporter pushes metrics into Cortex, Mimir, Thanos Receive, or VictoriaMetrics?

A) The SDK Prometheus exporter.
B) The Collector prometheus exporter.
C) The Collector prometheusremotewrite exporter.
D) The Prometheus OTLP receiver.

7. Can you point the prometheusremotewrite exporter at a vanilla Prometheus server?

A) Yes — all Prometheus servers accept remote-write by default.
B) No — vanilla Prometheus is a remote-write client, not a receiver.
C) Only after enabling --web.enable-remote-write-receiver.
D) Only over HTTPS.

8. According to the chapter, what is the recommended 2025 bridging pattern for a Prometheus shop adopting OTel?

A) SDK Prometheus exporter on every app, no Collector.
B) App → OTLP → Collector → prometheus exporter → Prometheus scrape.
C) Stop using Prometheus entirely.
D) Push directly into Prometheus' OTLP receiver from every app.

9. What does the OTel metric name http.server.request.duration with unit ms become in Prometheus?

A) http.server.request.duration.ms
B) http_server_request_duration_milliseconds
C) http_server_request_duration_seconds (values multiplied by 0.001)
D) httpServerRequestDurationMs

10. Why is it dangerous to attach k8s.pod.uid as a Prometheus label?

A) It violates UTF-8 encoding rules.
B) It can produce one new series per pod, exploding cardinality.
C) Prometheus refuses to scrape any series containing it.
D) It conflicts with the le histogram bucket label.

Four practical bridges get OTel metrics into a Prometheus-based stack. Each trades coupling, push-vs-pull semantics, and number of moving parts.

flowchart TD
    App1["App with OTel SDK<br/>+ Prometheus exporter"]
    App2[App with OTel SDK]
    App3[App with OTel SDK]
    App4[App with OTel SDK]
    Coll2["OTel Collector<br/>prometheus exporter"]
    Coll3["OTel Collector<br/>prometheusremotewrite"]
    App1 -->|expose /metrics| P1Slash[/metrics endpoint/]
    P1Slash -->|scrape| Prom1[(Prometheus)]
    App2 -->|OTLP push| Coll2
    Coll2 -->|expose /metrics| P2Slash[/metrics endpoint/]
    P2Slash -->|scrape| Prom2[(Prometheus)]
    App3 -->|OTLP push| Coll3
    Coll3 -->|remote_write push| Remote[("Mimir / Cortex /<br/>Thanos / VictoriaMetrics")]
    App4 -->|OTLP push| Prom4[("Prometheus<br/>OTLP receiver")]

Figure: Three bridge paths, one app and three destinations. (1) The SDK exposes /metrics and Prometheus scrapes it directly; (2) the app pushes OTLP to a Collector that exposes /metrics for Prometheus to scrape; (3) the app pushes OTLP to a Collector that pushes remote-write to a horizontally scalable backend.

3.1 Option 1 — SDK Prometheus exporter

Attach a Prometheus exporter directly to the OTel SDK inside your app. The SDK accumulates measurements internally and exposes them in Prometheus text format on an HTTP endpoint.

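// Imports: go.opentelemetry.io/otel, .../otel/exporters/prometheus, sdkmetric ".../otel/sdk/metric",
// and github.com/prometheus/client_golang/prometheus/promhttp.
exporter, err := prometheus.New() // registers with the default Prometheus registry
if err != nil {
    log.Fatal(err)
}
provider := sdkmetric.NewMeterProvider(sdkmetric.WithReader(exporter))
otel.SetMeterProvider(provider)
http.Handle("/metrics", promhttp.Handler()) // the exporter is a Reader, not an http.Handler
go http.ListenAndServe(":9464", nil)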

Pros: familiar; no Collector required. Cons: resource attributes flatten into labels (or are lost), exponential histograms are down-converted to explicit-bucket, every app couples to Prometheus' wire format.
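To show what "Prometheus text format on an HTTP endpoint" looks like on the wire, here is a standard-library sketch that renders one counter and serves it from a handler. It is a toy, not the SDK exporter: `renderCounter`, `metricsHandler`, and the metric name `demo_requests_total` are hypothetical, and the value is fixed rather than read from SDK state.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// renderCounter formats one unlabeled counter in the Prometheus text
// exposition format: a TYPE line followed by a sample line.
func renderCounter(name string, value int64) string {
	return fmt.Sprintf("# TYPE %s counter\n%s %d\n", name, name, value)
}

// metricsHandler serves a fixed demo value; a real exporter would read
// the SDK's accumulated state on every scrape.
func metricsHandler(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, renderCounter("demo_requests_total", 3))
}

func main() {
	// Exercise the handler in-process instead of opening a port.
	rec := httptest.NewRecorder()
	metricsHandler(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	fmt.Print(rec.Body.String())
}
```

Every app that takes this route has to produce this format itself, which is the coupling the cons above refer to.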

3.2 Option 2 — Collector prometheus exporter (recommended)

App pushes OTLP to a Collector; the Collector hosts a Prometheus exporter on a port; Prometheus scrapes the Collector. You get OTLP push from apps, Prometheus pull at the storage boundary, and a Collector chokepoint to manage temporality, naming, and cardinality.

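# Collector
receivers:
  otlp:                      # apps push OTLP here
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"
      http:
        endpoint: "0.0.0.0:4318"
processors:
  batch: {}
exporters:
  prometheus:
    endpoint: "0.0.0.0:9464" # Prometheus scrapes this port
    namespace: otel
    const_labels:
      source: otel-collector
service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]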

3.3 Option 3 — prometheusremotewrite exporter

The Collector pushes metrics over the Prometheus remote-write protocol. This is the dominant pattern for shipping into Cortex, Mimir, Thanos Receive, and VictoriaMetrics. It cannot target a vanilla Prometheus server out of the box: by default Prometheus is a remote-write client, and it accepts remote-write only when started with --web.enable-remote-write-receiver.

This path can preserve OTel exponential histograms by mapping them to Prometheus native histograms over the wire, making it the highest-fidelity option in 2025.

3.4 Option 4 — Prometheus OTLP receiver

Recent Prometheus versions (2.47+, more complete in 3.x) include an OTLP receiver that accepts pushed OTLP metrics directly into the TSDB. This collapses the pipeline to two components, but you lose Prometheus service discovery, and OTLP-specific semantics map onto Prometheus equivalents with varying maturity.

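# Prometheus 3.x: start the server with --web.enable-otlp-receiver
# (2.47+ used --enable-feature=otlp-write-receiver). Apps then push
# OTLP over HTTP to /api/v1/otlp/v1/metrics on the Prometheus web port.
# prometheus.yml
otlp:
  promote_resource_attributes:
    - service.instance.id
    - k8s.namespace.name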

3.5 Choosing between the four

| Option | Model | Best for | Main limitation |
| --- | --- | --- | --- |
| SDK Prom exporter | Pull | Small Prom shops | Couples apps to Prom wire format |
| Collector prometheus | Push-in, pull-out | Prom shops adopting OTel | Extra hop |
| prometheusremotewrite | Push to backend | Large multi-cluster Cortex/Mimir | Vanilla Prom needs the remote-write receiver flag |
| Prom OTLP receiver | Push into Prom | OTel-first, fewer components | Newer; less mature mappings |

Key Points — Section 3

Section 4 — Naming and Label Mapping

Once the wire path is sorted out, the next failure mode is semantic. OTel uses dotted, case-sensitive names plus explicit units. Prometheus uses underscored names with base-unit suffixes and the _total convention. A faithful bridge has to translate without quietly losing meaning.

4.1 Name conversion — the deterministic transform

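  1. Replace . with _: http.server.request.duration → http_server_request_duration.
  2. Replace other invalid characters (hyphens, slashes) with _.
  3. Append the unit as a spelled-out suffix (e.g., s → _seconds, ms → _milliseconds). Under the OpenTelemetry Prometheus compatibility rules the exporter renames the unit but does not rescale values, so record in base units (seconds, bytes) if you want base-unit series.
  4. If the instrument is a monotonic counter, append _total.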

Worked example: http.server.request.duration (ms, Histogram) → Prometheus name
  0. OTel name: http.server.request.duration · unit: ms · type: Histogram
  1. Replace . with _: http_server_request_duration
  2. Sanitize other invalid chars: http_server_request_duration (none here)
  3. Unit ms: append _milliseconds; values are exported unchanged
  4. Result: http_server_request_duration_milliseconds (Histogram, so no _total)
Four-step deterministic transform. For a monotonic Counter, the exporter would also append _total. Declaring a unit that does not match the values you record is a common silent bug: values land in Prometheus off by 1,000×.

| OTel name | Unit | Instrument | Prometheus name |
| --- | --- | --- | --- |
| http.server.request.duration | s | Histogram | http_server_request_duration_seconds |
| http.server.active_requests | {request} | UpDownCounter | http_server_active_requests |
| http.client.request.body.size | By | Histogram | http_client_request_body_size_bytes |
| process.cpu.time | s | ObservableCounter | process_cpu_time_seconds_total |
| system.memory.usage | By | ObservableUpDownCounter | system_memory_usage_bytes |
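The rows in the table above can be reproduced with a short function. This is a sketch of the naming steps only, not any exporter's actual code: `promName` is a hypothetical helper, and its unit map deliberately covers just the two units that appear in the table (s and By), treating anything else, including annotations such as {request}, as producing no suffix.

```go
package main

import (
	"fmt"
	"strings"
)

// unitSuffix covers only the units used in the table (an assumption of this sketch).
var unitSuffix = map[string]string{"s": "seconds", "By": "bytes"}

// promName applies the naming steps: sanitize characters, append the unit
// suffix, then append _total for monotonic counters.
func promName(otelName, unit string, monotonicCounter bool) string {
	var b strings.Builder
	for i, r := range otelName {
		valid := r == '_' || r == ':' ||
			(r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') ||
			(r >= '0' && r <= '9' && i > 0) // digits are invalid in first position
		if valid {
			b.WriteRune(r)
		} else {
			b.WriteByte('_')
		}
	}
	name := b.String()
	if suffix, ok := unitSuffix[unit]; ok && !strings.HasSuffix(name, "_"+suffix) {
		name += "_" + suffix
	}
	if monotonicCounter && !strings.HasSuffix(name, "_total") {
		name += "_total"
	}
	return name
}

func main() {
	fmt.Println(promName("http.server.request.duration", "s", false))
	fmt.Println(promName("http.server.active_requests", "{request}", false))
	fmt.Println(promName("process.cpu.time", "s", true))
}
```

The suffix checks avoid doubling a suffix when the instrument name already ends in the unit or in _total.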

4.2 Unit suffixes

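OTel uses UCUM unit codes; Prometheus names carry spelled-out unit suffixes. The exporter derives the suffix from the unit you set on the instrument. It does not rescale values, so prefer base units (seconds, bytes) at the instrument.

| OTel unit | Prom suffix | Value conversion |
| --- | --- | --- |
| s | _seconds | None |
| ms | _milliseconds | None |
| us | _microseconds | None |
| By | _bytes | None |
| KiBy | _kibibytes | None |
| 1, {request} | (none) | None |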

4.3 Attributes → labels

OTel attributes (per-measurement) and resource attributes (per-SDK) both become Prometheus labels. Attribute keys go through the same dot-to-underscore conversion. The Prom exporter typically promotes service.name → job and service.instance.id → instance to mirror Prometheus' service-discovery model.

| OTel attribute | Prometheus label |
| --- | --- |
| service.name | job (and/or service_name) |
| service.instance.id | instance |
| http.response.status_code | http_response_status_code |
| k8s.namespace.name | k8s_namespace_name |
| net.peer.name | net_peer_name |
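The mapping in the table can be sketched as a lookup for the two promoted keys plus a dot-to-underscore fallback. The `toLabels` helper and the sample attribute values are hypothetical, and real exporters differ in details such as whether service_name is also kept.

```go
package main

import (
	"fmt"
	"strings"
)

// promoted lists the two promotions named in the text.
var promoted = map[string]string{
	"service.name":        "job",
	"service.instance.id": "instance",
}

// toLabels converts OTel attribute keys to Prometheus label names.
func toLabels(attrs map[string]string) map[string]string {
	labels := make(map[string]string, len(attrs))
	for k, v := range attrs {
		if name, ok := promoted[k]; ok {
			labels[name] = v
			continue
		}
		labels[strings.ReplaceAll(k, ".", "_")] = v
	}
	return labels
}

func main() {
	fmt.Println(toLabels(map[string]string{
		"service.name":              "checkout",
		"service.instance.id":       "pod-7f9c",
		"http.response.status_code": "200",
		"k8s.namespace.name":        "shop",
	}))
}
```

Every distinct combination of label values becomes its own series, so any attribute passed through this mapping contributes directly to cardinality.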

4.4 UTF-8 names in Prometheus 3.x

Prometheus 3.x supports UTF-8 metric and label names through the OpenMetrics/native protocols, which means the OTel dotted form can in principle be preserved end-to-end by quoting the metric name. In practice, dashboards, alerting rules, and recording rules predating 2024 still expect underscored names. Treat UTF-8 names as forward-looking; expect translation back to underscored names anywhere a tool was written before 2024.

4.5 Common pitfalls

Key Points — Section 4
