Session Date: 2026-01-20
Project: Claude Code Dev Environment
Focus: Observability framework status and capabilities overview
Session Type: Completion verification

Executive Summary

The Claude Code observability framework is now production-ready with all four implementation phases complete. The system provides unified tracing, metrics, and logging for Claude Code hooks using OpenTelemetry as the foundation, Langtrace for LLM-specific instrumentation, and SigNoz Cloud as the observability backend.

The framework instruments 12 hook types across session, prompt, tool, stop, and error events. It exports telemetry via a dual export pattern: local JSONL files for offline analysis, and OTLP to SigNoz Cloud for real-time dashboards. Key reliability features include circuit breaker protection, gzip compression, and configurable trace sampling.

Key Metrics:

Metric                | Value
Instrumented Hooks    | 12
SigNoz Dashboards     | 8
Implementation Phases | 4/4 Complete
Metrics Tracked       | 15+
LLM Span Attributes   | 11 auto-instrumented

Architecture

Claude Code Hooks
        |
        v
  HookMonitor (otel-monitor.ts)
  - Initializes OTel SDK
  - Creates root span
  - Records metrics/logs
        |
        v
+-------------------------------------------+
|         Dual Export Pattern               |
+-------------------------------------------+
|  Local File Export    |   Remote OTLP     |
|  (FileSpanExporter)   |   (SigNoz Cloud)  |
|  JSONL Format         |   TLS + Auth      |
|  ~/.claude/telemetry/ |   ingestion key   |
+-------------------------------------------+
        |                       |
        v                       v
   Local Cache           SigNoz Dashboard
   (JSONL files)         (traces, metrics, logs)

Core Components

1. OpenTelemetry Core (hooks/lib/otel.ts)

Feature                      | Status
NodeSDK initialization       | ✅ Complete
FileSpanExporter (JSONL)     | ✅ Complete
FileLogExporter (JSONL)      | ✅ Complete
OTLP trace/metric/log export | ✅ Complete
SigNoz auth headers          | ✅ Complete
Graceful shutdown            | ✅ Complete
Trace URL/ID helpers         | ✅ Complete
OTLP gzip compression        | ✅ Complete
Resource detectors           | ✅ Complete
Debug mode                   | ✅ Complete
Span links                   | ✅ Complete
Circuit breaker              | ✅ Complete
Trace sampling               | ✅ Complete

Key Functions:

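import {
  initTelemetry, shutdown, withSpan,
  recordMetric, recordGauge, logger,
  getTraceUrl, getTraceId
} from './lib/otel';

// Initialize the SDK once, before any spans or metrics are recorded.
initTelemetry();

// Run an operation inside a named span with the given attributes.
await withSpan('operation-name', { 'attr.key': 'value' }, async (span) => {
  // ... traced work goes here ...
});

// Record a metric value together with its attributes.
recordMetric('operation.duration', 150, { 'operation.type': 'fetch' });
const traceUrl = getTraceUrl();  // https://tight-ladybird.us.signoz.cloud/trace/<id>

// Shut down gracefully before the process exits.
await shutdown();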
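
The comment on getTraceUrl() shows the shape of the returned link: the SigNoz base URL followed by /trace/ and the trace ID. A minimal sketch of how such a link could be assembled is below. buildTraceUrl and isValidTraceId are hypothetical names, not the helpers exported by otel.ts; the only outside fact used is the W3C Trace Context rule that a trace ID is 32 lowercase hex characters and must not be all zeros.

```typescript
// W3C Trace Context: a trace ID is 32 lowercase hex characters, not all zeros.
const TRACE_ID_PATTERN = /^[0-9a-f]{32}$/;

function isValidTraceId(traceId: string): boolean {
  return TRACE_ID_PATTERN.test(traceId) && !/^0+$/.test(traceId);
}

// Hypothetical helper: returns undefined instead of a broken link
// when there is no valid trace ID to point at.
function buildTraceUrl(baseUrl: string, traceId: string): string | undefined {
  if (!isValidTraceId(traceId)) {
    return undefined;
  }
  // Strip trailing slashes so the result never contains "//trace".
  const trimmedBase = baseUrl.replace(/\/+$/, "");
  return trimmedBase + "/trace/" + traceId;
}

const exampleUrl = buildTraceUrl(
  "https://tight-ladybird.us.signoz.cloud/",
  "4bf92f3577b34da6a3ce929d0e0e4736"
);
```

Returning undefined for an invalid ID is a design choice of this sketch; it keeps callers from logging links that cannot resolve.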

2. Hook Monitor (hooks/lib/otel-monitor.ts)

Feature               | Status
HookMonitor class     | ✅ Complete
HookContext interface | ✅ Complete
instrumentHook helper | ✅ Complete
Legacy log compat     | ✅ Complete
Hook type inference   | ✅ Complete

HookContext Methods:

  • addAttribute(key, value) / addAttributes(attrs)
  • recordEvent(name, attrs)
  • startChildSpan(name, attrs) / startLinkedSpan(name, linkedSpans, attrs)
  • recordMetric(name, value, attrs)
  • logger.{trace,debug,info,warn,error}
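
The method list above suggests a context object that is handed to each hook body. The sketch below models that idea with an in-memory recorder. It is an illustration only: InMemoryHookContext, HookRecord, and runInstrumented are hypothetical names, the real HookContext also exposes child spans, metrics, and a logger, and the wrapper is synchronous here purely for brevity.

```typescript
type AttributeValue = string | number | boolean;
type Attributes = { [key: string]: AttributeValue };

interface RecordedEvent {
  name: string;
  attributes: Attributes;
}

// Hypothetical summary of one instrumented hook run.
interface HookRecord {
  hookName: string;
  status: "success" | "error";
  attributes: Attributes;
  events: RecordedEvent[];
}

// In-memory stand-in for the context a hook receives.
class InMemoryHookContext {
  readonly attributes: Attributes = {};
  readonly events: RecordedEvent[] = [];

  addAttribute(key: string, value: AttributeValue): void {
    this.attributes[key] = value;
  }

  addAttributes(attrs: Attributes): void {
    for (const key of Object.keys(attrs)) {
      this.attributes[key] = attrs[key];
    }
  }

  recordEvent(name: string, attrs: Attributes = {}): void {
    this.events.push({ name: name, attributes: attrs });
  }
}

// Runs a hook body with a fresh context and always returns a record,
// marking the run as "error" if the body throws.
function runInstrumented(
  hookName: string,
  body: (ctx: InMemoryHookContext) => void
): HookRecord {
  const ctx = new InMemoryHookContext();
  let status: "success" | "error" = "success";
  try {
    body(ctx);
  } catch (err) {
    status = "error";
    ctx.recordEvent("exception", { "exception.message": String(err) });
  }
  return { hookName: hookName, status: status, attributes: ctx.attributes, events: ctx.events };
}

const hookRecord = runInstrumented("mcp-pre-tool", (ctx) => {
  ctx.addAttributes({ "mcp.server": "example-server", "mcp.tool": "example-tool" });
  ctx.recordEvent("tool.validated");
});
```

The sketch swallows the exception after recording it so that the result can be inspected; a production wrapper would more likely rethrow once the span has been closed.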

3. Langtrace Integration (hooks/lib/langtrace.ts)

Feature                 | Status
SDK initialization      | ✅ Complete
SigNoz Cloud routing    | ✅ Complete
Local file export       | ✅ Complete
Instrumentation toggles | ✅ Complete
recordLLMEvent()        | ✅ Complete
withLLMTrace()          | ✅ Complete
PII redaction           | ✅ Complete
Custom processors       | ✅ Complete

PII Redaction Patterns:

Pattern             | Replacement
Email addresses     | [EMAIL]
Phone numbers       | [PHONE]
SSN                 | [SSN]
Credit card numbers | [CREDIT_CARD]
API keys            | [API_KEY]
AWS access keys     | [AWS_KEY]
IPv4 addresses      | [IP_ADDRESS]
JWT tokens          | [JWT_TOKEN]
Bearer tokens       | Bearer [TOKEN]
Generic secrets     | [REDACTED_SECRET]
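
A rule-based redactor for a subset of these patterns might look like the sketch below. The regular expressions are simplified assumptions written for this example, not the patterns used in langtrace.ts, and they will produce both false positives (for instance, dotted version numbers matching the IPv4 rule) and misses. Rule order matters: the Bearer rule runs before the JWT rule so that a bearer token keeps its "Bearer " prefix.

```typescript
interface RedactionRule {
  label: string;
  pattern: RegExp;
  replacement: string;
}

// Simplified, illustrative patterns. Order matters: earlier rules win.
const REDACTION_RULES: RedactionRule[] = [
  { label: "bearer", pattern: /\bBearer\s+[A-Za-z0-9._~+\/-]+=*/g, replacement: "Bearer [TOKEN]" },
  { label: "jwt", pattern: /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g, replacement: "[JWT_TOKEN]" },
  { label: "aws-key", pattern: /\bAKIA[0-9A-Z]{16}\b/g, replacement: "[AWS_KEY]" },
  { label: "email", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, replacement: "[EMAIL]" },
  { label: "ssn", pattern: /\b\d{3}-\d{2}-\d{4}\b/g, replacement: "[SSN]" },
  { label: "ipv4", pattern: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g, replacement: "[IP_ADDRESS]" },
];

// Applies every rule in order and returns the redacted text.
function redactPii(text: string): string {
  let result = text;
  for (const rule of REDACTION_RULES) {
    result = result.replace(rule.pattern, rule.replacement);
  }
  return result;
}

const redactedSample = redactPii(
  "Contact alice@example.com from 192.168.1.10 with key AKIAIOSFODNN7EXAMPLE, " +
  "SSN 123-45-6789, header Authorization: Bearer abc.def-123"
);
```

Keeping the rules in a data table rather than in code makes it straightforward to add the remaining pattern types or to disable individual ones.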

4. Token/Cost Metrics (hooks/lib/token-metrics.ts)

Feature                      | Status
Token usage counter          | ✅ Complete
Cost counter (USD)           | ✅ Complete
Operation duration histogram | ✅ Complete
Operation counter            | ✅ Complete
Model pricing table          | ✅ Complete
LLMMetricsTracker class      | ✅ Complete
GenAI semantic conventions   | ✅ Complete
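
The cost counter depends on a pricing table keyed by model. A minimal sketch of that calculation follows. The model names and prices are placeholders invented for the example, not real pricing and not the contents of token-metrics.ts; UsageTracker is a hypothetical stand-in and does not reproduce LLMMetricsTracker.

```typescript
// USD per one million tokens. PLACEHOLDER values and model names for illustration only.
interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

const PRICING: { [model: string]: ModelPricing } = {
  "example-small": { inputPerMillion: 1, outputPerMillion: 5 },
  "example-large": { inputPerMillion: 10, outputPerMillion: 40 },
};

interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

interface UsageTotals {
  tokens: number;
  costUsd: number;
}

function hasKey(table: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(table, key);
}

// Returns undefined for unknown models rather than guessing a price.
function estimateCostUsd(model: string, usage: TokenUsage): number | undefined {
  if (!hasKey(PRICING, model)) {
    return undefined;
  }
  const pricing = PRICING[model];
  const micros =
    usage.inputTokens * pricing.inputPerMillion +
    usage.outputTokens * pricing.outputPerMillion;
  return micros / 1000000;
}

// Accumulates per-model totals, similar in spirit to a pair of counters.
class UsageTracker {
  private totals: { [model: string]: UsageTotals } = {};

  record(model: string, usage: TokenUsage): void {
    const cost = estimateCostUsd(model, usage);
    if (cost === undefined) {
      return; // Unknown model: this sketch skips it entirely.
    }
    const current = this.totalFor(model);
    this.totals[model] = {
      tokens: current.tokens + usage.inputTokens + usage.outputTokens,
      costUsd: current.costUsd + cost,
    };
  }

  totalFor(model: string): UsageTotals {
    return hasKey(this.totals, model) ? this.totals[model] : { tokens: 0, costUsd: 0 };
  }
}

const tracker = new UsageTracker();
tracker.record("example-large", { inputTokens: 500000, outputTokens: 250000 });
tracker.record("example-large", { inputTokens: 500000, outputTokens: 250000 });
```

Input and output tokens are priced separately because the gen_ai.token.type attribute in the metrics reference distinguishes them.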

Instrumented Hooks

Hook File                       | Event                 | Status
session-start-otel.ts           | SessionStart          |
mcp-pre-tool-otel.ts            | PreToolUse (MCP)      |
mcp-post-tool-otel.ts           | PostToolUse (MCP)     |
plugin-pre-tool-otel.ts         | PreToolUse (Plugin)   |
plugin-post-tool-otel.ts        | PostToolUse (Plugin)  |
agent-pre-tool-otel.ts          | PreToolUse (Agent)    |
agent-post-tool-otel.ts         | PostToolUse (Agent)   |
skill-activation-prompt-otel.ts | UserPromptSubmit      |
tsc-check-otel.ts               | PostToolUse           |
stop-build-check-otel.ts        | Stop                  |
error-handling-reminder-otel.ts | Stop                  |
hook-runner.ts                  | Unified router        |

Metrics Reference

Hook Metrics

Metric              | Type          | Attributes
hook.duration       | Histogram     | hook.name, hook.status
hook.duration.gauge | UpDownCounter | hook.name, hook.status
hook.executions     | Counter       | hook.name, hook.status
hook.duration.max   | Gauge         | hook.name
hook.duration.min   | Gauge         | hook.name

Tool/Agent Metrics

Metric             | Type    | Attributes
mcp.invocations    | Counter | mcp.server, mcp.tool
agent.invocations  | Counter | agent.type, agent.category, agent.model
plugin.invocations | Counter | plugin.server, plugin.tool

Build Metrics

Metric               | Type      | Attributes
build.check.duration | Histogram | build.repo
build.errors         | Gauge     | build.repo

GenAI Metrics

Metric                           | Type      | Attributes
gen_ai.client.token.usage        | Counter   | gen_ai.request.model, gen_ai.token.type
gen_ai.client.cost               | Counter   | gen_ai.request.model
gen_ai.client.operation.duration | Histogram | gen_ai.request.model, gen_ai.operation.name

Langtrace Auto-Instrumented Attributes

Attribute                     | Type   | Description
gen_ai.system                 | string | Provider (anthropic, openai, etc.)
gen_ai.request.model          | string | Model identifier
gen_ai.request.max_tokens     | int    | Max tokens requested
gen_ai.request.temperature    | float  | Temperature setting
gen_ai.request.top_p          | float  | Top-p sampling
gen_ai.usage.input_tokens     | int    | Input tokens used
gen_ai.usage.output_tokens    | int    | Output tokens used
gen_ai.response.finish_reason | string | stop, length, tool_calls
llm.request.type              | string | chat, completion, embedding
gen_ai.prompt                 | string | Prompt (PII redacted)
gen_ai.completion             | string | Response (PII redacted)

SigNoz Dashboards

Dashboard                        | UUID         | Description
Claude Code Hooks Observability  | 019ba681-... | Core hook performance
Claude Code Hooks Performance    | 019bd87e-... | Duration and status distribution
Token Usage & Cost Efficiency    | 019bdcb8-... | LLM consumption and costs
Tool & MCP Usage Analytics       | 019bdcdd-... | MCP server/tool usage
Error & Anomaly Detection        | 019bdce0-... | Error monitoring
Build & Type Check Performance   | 019bddd3-... | TSC/Python metrics
Subagent Analytics               | 019bddd7-... | Agent invocations
Session Health Overview          | 019bddd8-... | Session activity

Output Locations

Local Files:

~/.claude/telemetry/
  traces-YYYY-MM-DD.jsonl     # OpenTelemetry spans
  logs-YYYY-MM-DD.jsonl       # OpenTelemetry logs
  llm-events-YYYY-MM-DD.jsonl # Langtrace LLM events

~/.claude/logs/
  hook-performance.log        # Legacy performance log
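
The date-stamped names above imply one file per day and one JSON document per line. The sketch below shows how such names and lines could be produced and read back. dailyFileName, toJsonLine, and parseJsonLines are hypothetical helpers, and using the UTC date for the file name is an assumption; the framework may rotate on local time instead.

```typescript
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Builds names such as "traces-2026-01-20.jsonl" from the UTC date (assumption).
function dailyFileName(kind: "traces" | "logs" | "llm-events", date: Date): string {
  const day =
    date.getUTCFullYear() + "-" + pad2(date.getUTCMonth() + 1) + "-" + pad2(date.getUTCDate());
  return kind + "-" + day + ".jsonl";
}

// JSONL: exactly one JSON document per line, newline-terminated.
function toJsonLine(record: object): string {
  return JSON.stringify(record) + "\n";
}

// Parses JSONL content, skipping blank lines such as the trailing one.
function parseJsonLines(content: string): unknown[] {
  return content
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

const exampleFileName = dailyFileName("traces", new Date(Date.UTC(2026, 0, 20)));
const exampleContent =
  toJsonLine({ name: "session-start", durationMs: 12 }) +
  toJsonLine({ name: "mcp-pre-tool", durationMs: 3 });
const parsedRecords = parseJsonLines(exampleContent);
```

Because every line is an independent document, a file can be appended to safely and analyzed offline with line-oriented tools.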

Remote: https://tight-ladybird.us.signoz.cloud/

Environment Configuration

# Infrastructure paths
export CLAUDE_CONFIG_DIR="$HOME/.claude"
export CLAUDE_TELEMETRY_DIR="$CLAUDE_CONFIG_DIR/telemetry"

# OpenTelemetry
export OTEL_ENABLED="true"
export OTEL_EXPORTER_OTLP_ENDPOINT="https://ingest.us.signoz.cloud"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
export OTEL_EXPORTER_OTLP_COMPRESSION="gzip"
export OTEL_SERVICE_NAME="claude-code-hooks"

# SigNoz Cloud
export SIGNOZ_ENABLED="true"
export SIGNOZ_INGESTION_KEY="<from-doppler>"

# Langtrace
export LANGTRACE_API_KEY="<from-doppler>"
export LANGTRACE_WRITE_TO_FILE="true"
export LANGTRACE_PII_REDACTION="true"
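
On the TypeScript side, these variables could be read into a typed configuration object, as in the sketch below. TelemetryConfig, loadTelemetryConfig, and configWarnings are hypothetical names, and the defaults are assumptions made for the example, except that PII redaction defaults to enabled as recorded under Key Decisions. The environment is passed in as a plain object so the function stays testable; a hook would pass process.env.

```typescript
type Env = { [name: string]: string | undefined };

interface TelemetryConfig {
  otelEnabled: boolean;
  endpoint: string | undefined;
  compression: string;
  serviceName: string;
  signozEnabled: boolean;
  hasIngestionKey: boolean;
  piiRedaction: boolean;
}

// Treats "true" (any case, surrounding spaces ignored) as true; unset or empty uses the default.
function parseFlag(value: string | undefined, defaultValue: boolean): boolean {
  if (value === undefined || value.trim() === "") {
    return defaultValue;
  }
  return value.trim().toLowerCase() === "true";
}

function loadTelemetryConfig(env: Env): TelemetryConfig {
  const ingestionKey = env["SIGNOZ_INGESTION_KEY"];
  return {
    otelEnabled: parseFlag(env["OTEL_ENABLED"], false), // assumed default
    endpoint: env["OTEL_EXPORTER_OTLP_ENDPOINT"],
    compression: env["OTEL_EXPORTER_OTLP_COMPRESSION"] || "none", // assumed default
    serviceName: env["OTEL_SERVICE_NAME"] || "unknown_service", // assumed default
    signozEnabled: parseFlag(env["SIGNOZ_ENABLED"], false), // assumed default
    hasIngestionKey: ingestionKey !== undefined && ingestionKey.trim() !== "",
    piiRedaction: parseFlag(env["LANGTRACE_PII_REDACTION"], true), // enabled by default
  };
}

// Reports inconsistent settings instead of failing, so a hook can still run.
function configWarnings(config: TelemetryConfig): string[] {
  const warnings: string[] = [];
  if (config.otelEnabled && config.endpoint === undefined) {
    warnings.push("OTEL_ENABLED is true but OTEL_EXPORTER_OTLP_ENDPOINT is not set");
  }
  if (config.signozEnabled && !config.hasIngestionKey) {
    warnings.push("SIGNOZ_ENABLED is true but SIGNOZ_INGESTION_KEY is empty");
  }
  return warnings;
}

const exampleConfig = loadTelemetryConfig({
  OTEL_ENABLED: "true",
  OTEL_EXPORTER_OTLP_ENDPOINT: "https://ingest.us.signoz.cloud",
  OTEL_EXPORTER_OTLP_COMPRESSION: "gzip",
  OTEL_SERVICE_NAME: "claude-code-hooks",
  SIGNOZ_ENABLED: "true",
  LANGTRACE_PII_REDACTION: "false",
});
const exampleWarnings = configWarnings(exampleConfig);
```

In this example the ingestion key is deliberately omitted, so exampleWarnings contains a single entry about SIGNOZ_INGESTION_KEY.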

Implementation Phases

Phase 1: Quick Wins - COMPLETE

  • OTLP gzip compression
  • getTraceUrl() and getTraceId() helpers
  • Debug mode documentation
  • Resource detectors for host/OS/process
  • startLinkedSpan() for operation correlation
  • Sampling configuration docs
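
The list above mentions sampling configuration without showing how a sampling decision is made. One common approach is a deterministic ratio sampler keyed on the trace ID, sketched below. shouldSample is a hypothetical function written for this illustration; it follows the same general idea as OpenTelemetry's trace-ID-ratio sampler but does not reproduce its algorithm, and nothing here confirms which sampler the framework configures.

```typescript
// 2^32: the number of distinct values in the first 8 hex characters of a trace ID.
const BUCKET_COUNT = 4294967296;

// Deterministic ratio sampling: the same trace ID always gets the same decision,
// so all spans of a trace are kept or dropped together.
function shouldSample(traceId: string, ratio: number): boolean {
  if (isNaN(ratio) || ratio <= 0) {
    return false;
  }
  if (ratio >= 1) {
    return true;
  }
  // Interpret the first 8 hex characters as an unsigned 32-bit bucket.
  const bucket = parseInt(traceId.slice(0, 8), 16);
  if (isNaN(bucket)) {
    return false; // Not a hex trace ID.
  }
  return bucket < ratio * BUCKET_COUNT;
}

const keepLow = shouldSample("00000000aaaaaaaaaaaaaaaaaaaaaaaa", 0.5);
const keepHigh = shouldSample("ffffffffaaaaaaaaaaaaaaaaaaaaaaaa", 0.5);
```

With a ratio of 0.5, IDs whose leading bucket falls in the lower half of the range are kept and the rest are dropped, so roughly half of uniformly distributed trace IDs are sampled.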

Phase 2: Reliability - COMPLETE

  • Circuit breaker for OTLP export (3 failures, 60s reset)
  • Export timeout configuration (OTEL_EXPORTER_OTLP_TIMEOUT=5000)
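
The circuit breaker parameters above (3 failures, 60s reset) can be illustrated with a small state machine. This is a sketch under stated assumptions, not the implementation in otel.ts: CircuitBreaker and exportWithBreaker are hypothetical names, the half-open state is one common way to handle the reset, and the clock is passed in explicitly so the behavior can be checked without waiting.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failureThreshold: number;
  private resetAfterMs: number;
  private consecutiveFailures = 0;
  private openedAt: number | undefined = undefined;

  constructor(failureThreshold: number = 3, resetAfterMs: number = 60000) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
  }

  // "open" blocks calls; after the reset window a single trial call is allowed.
  state(now: number): BreakerState {
    if (this.openedAt === undefined) {
      return "closed";
    }
    return now - this.openedAt >= this.resetAfterMs ? "half-open" : "open";
  }

  canAttempt(now: number): boolean {
    return this.state(now) !== "open";
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAt = undefined;
  }

  recordFailure(now: number): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.failureThreshold) {
      this.openedAt = now; // Open, or re-open after a failed trial call.
    }
  }
}

// Wraps one export attempt: skips immediately while the breaker is open.
function exportWithBreaker(
  breaker: CircuitBreaker,
  now: number,
  send: () => void
): "sent" | "skipped" | "failed" {
  if (!breaker.canAttempt(now)) {
    return "skipped";
  }
  try {
    send();
    breaker.recordSuccess();
    return "sent";
  } catch (err) {
    breaker.recordFailure(now);
    return "failed";
  }
}

const breaker = new CircuitBreaker(3, 60000);
const failingSend = () => { throw new Error("SigNoz unreachable"); };
const attempts = [0, 1, 2, 3].map((t) => exportWithBreaker(breaker, t, failingSend));
```

The fourth attempt is skipped without calling the sender, which is what keeps hooks fast while the backend is unreachable; local JSONL export is unaffected because it does not pass through the breaker.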

Phase 3: Langtrace Enhancements - COMPLETE

  • PII redaction processor with 10 pattern types
  • Streaming instrumentation verified (SDK supports 6 providers)
  • Langtrace metrics documented in reference

Phase 4: Dashboards & Alerts - COMPLETE

  • 8 SigNoz dashboard templates
  • 5 alert rule types documented
  • MCP query format documented

Key Decisions

Decision 1: Dual Export Pattern

Choice: Export to both local JSONL files and SigNoz Cloud
Rationale: Local files enable offline analysis and debugging; cloud enables real-time dashboards
Trade-off: Slight storage overhead for redundancy

Decision 2: PII Redaction Default Enabled

Choice: Enable PII redaction by default in Langtrace
Rationale: Prevents accidental sensitive data exposure in LLM traces
Trade-off: Minor processing overhead; redaction can be disabled with LANGTRACE_PII_REDACTION=false

Decision 3: Circuit Breaker for Exports

Choice: Fail fast after 3 consecutive export failures, reset after 60s
Rationale: Prevents hook slowdown when SigNoz is unreachable
Trade-off: May miss telemetry during outages (local files still capture)

References

Code Files

  • hooks/lib/otel.ts - OpenTelemetry core
  • hooks/lib/otel-monitor.ts - Hook instrumentation
  • hooks/lib/langtrace.ts - LLM tracing
  • hooks/lib/token-metrics.ts - Cost tracking

Documentation

  • docs/observability-framework-current.md - Full technical reference
  • docs/signoz-cloud-setup.md - SigNoz configuration