Skip to content

Observe layer: forge-observe

GitHub

OpenTelemetry instrumentation for the reasoning layer. Every observability platform in 2026 captures tool call spans and agent graph nodes, but the actual thinking block content, budget consumption, and preserve_thinking utilization are invisible to all of them. forge-observe instruments that layer.

What it instruments

  • Thinking budget burn rate: how fast the 128K+ budget is being consumed
  • Think vs. response token split: what fraction of output tokens are thinking vs. user-facing
  • preserve_thinking utilization: is the model reusing prior thinking across turns, or regenerating
  • Mode switch events: when the router flips between thinking and instruct mode, and why
  • Backend flag normalization: which backend path was used (vLLM nested, DashScope top-level, llama.cpp server-side)
  • Sampling param swaps: logs when the atomic think/instruct parameter swap fires

Install

pip install forge-observe
# with OTLP exporter:
pip install forge-observe[otlp]

Usage

Auto-instrumentation

from forge_observe import instrument

instrument()  # patches qwen-think sessions automatically

from qwen_think import ThinkingSession
session = ThinkingSession(client=your_client)
response = session.chat("Implement a binary search tree")
# Spans and metrics are emitted to your configured OTel backend

Manual tracer

from forge_observe import ForgeTracer

tracer = ForgeTracer()
with tracer.thinking_span("complex-query") as span:
    span.set_attribute("forge.thinking.budget_remaining", 180_000)
    span.set_attribute("forge.thinking.mode", "thinking")
    # ... your inference call ...

OTel integration

Standard OTel SDK. Emits spans and metrics that can be collected by any OTel-compatible backend. If you already have observability infra, forge-observe plugs into it:

  • Jaeger
  • Grafana Tempo
  • Datadog

Example configs for each backend are included in the repo under examples/.

Status

forge-observe is built and functional. It is not yet published on PyPI. See roadmap for details.