EU AI Act: Observability Requirements for LLM/GenAI Systems
Document Version: 1.2 | Created: 2026-01-29 | Updated: 2026-01-31 | Source: EU AI Act (Regulation (EU) 2024/1689)
Overview
The EU AI Act entered into force on August 1, 2024, with a phased implementation timeline. This document summarizes the observability, logging, and documentation requirements relevant to LLM and GenAI systems.
Implementation Timeline
| Date | Requirements |
|---|---|
| Aug 2024 | Act enters into force |
| Feb 2025 | Prohibited AI practices apply |
| Aug 2025 | GPAI obligations (Articles 53, 55) |
| Aug 2026 | High-risk AI system requirements (Articles 12, 19) |
General-Purpose AI (GPAI) Requirements
Article 53: GPAI Provider Obligations
Effective: August 2, 2025
All GPAI model providers must:
- Maintain technical documentation per Annex XI
- Provide information to downstream providers per Annex XII
- Establish copyright compliance policies
- Publish training data summaries
Article 55: Systemic Risk GPAI Obligations
Effective: August 2, 2025
Models trained with >10^25 FLOPs additionally require:
- Model evaluation using standardized protocols
- Adversarial testing (red teaming)
- Systemic risk tracking and mitigation
- Cybersecurity protection
- Incident reporting to EU AI Office
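As a rough illustration of the compute threshold above, the following sketch compares an estimated training budget against 10^25 FLOPs. The "6 × parameters × training tokens" estimate is a common rule of thumb for dense transformers and is an assumption here, not something the Act prescribes; the function names and the example model are hypothetical.

```javascript
// Threshold quoted in the section above: more than 10^25 FLOPs of training compute.
const SYSTEMIC_RISK_FLOP_THRESHOLD = 1e25;

// Assumption: the common "6 * parameters * training tokens" heuristic for
// dense transformer training compute. The Act does not prescribe this formula.
function estimateTrainingFlops(parameterCount, trainingTokens) {
  return 6 * parameterCount * trainingTokens;
}

// Strictly greater than the threshold, matching the ">10^25" wording.
function exceedsSystemicRiskThreshold(trainingFlops) {
  return trainingFlops > SYSTEMIC_RISK_FLOP_THRESHOLD;
}

// Hypothetical model: 70e9 parameters trained on 15e12 tokens (about 6.3e24 FLOPs).
const flops = estimateTrainingFlops(70e9, 15e12);
const needsArticle55Review = exceedsSystemicRiskThreshold(flops);
```

An estimate this close to the threshold would still be worth documenting, since the figure is only as good as the heuristic behind it.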
Annex XI: GPAI Technical Documentation Requirements
Applies to: All GPAI providers (including LLM providers)
Section 1: All GPAI Providers
1. General Description
| Element | Description |
|---|---|
| Tasks | Intended tasks and AI system integration types |
| Acceptable Use | Policies governing permitted uses |
| Release Info | Date and distribution methods |
| Architecture | Model architecture and parameter count |
| I/O Format | Input/output modalities and formats |
| License | Licensing terms |
2. Design & Training Process
| Element | Description |
|---|---|
| Technical Means | Infrastructure, tools, usage instructions for integration |
| Design Specifications | Training methodologies, key design choices, rationale, assumptions |
| Optimization | What the model optimizes for, parameter relevance |
3. Data Documentation
| Element | Description |
|---|---|
| Data Sources | Type and provenance of training/test/validation data |
| Curation Methods | Cleaning, filtering, preprocessing techniques |
| Data Points | Number, scope, and main characteristics |
| Data Selection | How data was obtained and selected |
| Bias Detection | Methods to identify unsuitable sources and biases |
4. Compute & Energy
| Element | Description |
|---|---|
| Compute Resources | FLOPs used for training |
| Training Time | Duration of training process |
| Energy Consumption | Known or estimated (can estimate from compute) |
Section 2: Systemic Risk GPAI (Additional)
| Element | Description |
|---|---|
| Evaluation Strategies | Criteria, metrics, methodology for identifying limitations |
| Adversarial Testing | Red teaming, alignment, fine-tuning measures |
| System Architecture | Software component interactions, processing flow |
High-Risk AI System Requirements
Article 12: Record-Keeping
Effective: August 2, 2026
Core Requirements
- Automatic Logging Capability
  - Systems must technically enable automatic event recording (logs)
  - Logging must persist over the system’s entire lifetime
- Required Log Events
  - Situations that may present risk (per Article 79(1))
  - Substantial modifications to the system
  - Events relevant to post-market monitoring (Article 72)
  - Operational monitoring events (Article 26(5))
- Biometric Identification Systems (Annex III, point 1(a))
  - Session timestamps (start/end of each use)
  - Reference database against which input was checked
  - Input data that produced matches
  - Identity of humans who verified results (per Article 14(5))
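The four biometric-system fields listed above could be captured in a single usage record. The sketch below is one possible shape, not a format defined by the Act or by any library; every field name is hypothetical.

```javascript
// Hypothetical record shape covering the four biometric-system fields listed above.
// Throws if a required field is missing so that incomplete records are never stored.
function buildUsageRecord({ startedAt, endedAt, referenceDatabase, matchedInputs, verifiedBy }) {
  const missing = [];
  if (!(startedAt instanceof Date)) missing.push('startedAt');
  if (!(endedAt instanceof Date)) missing.push('endedAt');
  if (!referenceDatabase) missing.push('referenceDatabase');
  if (!Array.isArray(matchedInputs)) missing.push('matchedInputs');
  if (!Array.isArray(verifiedBy) || verifiedBy.length === 0) missing.push('verifiedBy');
  if (missing.length > 0) {
    throw new Error(`Incomplete usage record, missing: ${missing.join(', ')}`);
  }
  if (endedAt < startedAt) {
    throw new Error('endedAt precedes startedAt');
  }
  return {
    session: { start: startedAt.toISOString(), end: endedAt.toISOString() },
    referenceDatabase,
    matchedInputs,
    verifiedBy,
  };
}

// Example with made-up identifiers.
const usageRecord = buildUsageRecord({
  startedAt: new Date('2026-08-02T10:00:00Z'),
  endedAt: new Date('2026-08-02T10:05:00Z'),
  referenceDatabase: 'example-reference-db',
  matchedInputs: ['input-001'],
  verifiedBy: ['reviewer-a', 'reviewer-b'],
});
```

Rejecting incomplete records at write time keeps gaps from surfacing only during an audit.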
Rationale (Recital 71)
“Having comprehensible information on how high-risk AI systems have been developed and how they perform throughout their lifetime is essential to enable traceability of those systems, verify compliance with the requirements under this Regulation, as well as monitoring of their operations and post market monitoring.”
Key points:
- Technical documentation must be kept up to date throughout lifetime
- Enables traceability and compliance verification
- Supports post-market surveillance
Article 19: Automatically Generated Logs
Effective: August 2, 2026
- Providers must retain logs generated by high-risk AI systems
- Minimum retention period: 6 months (unless otherwise specified by law)
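- The obligation applies to the extent the logs are under the provider's control; deployers have a parallel duty to keep logs under their own control (Article 26(6))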
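A fixed day count can undershoot a six-month minimum, because six calendar months span 181 to 184 days. The sketch below counts calendar months instead. It is one conservative reading of the retention period, not legal guidance, and the function names are hypothetical.

```javascript
// Returns the earliest instant at which a log written at `loggedAt` may be deleted,
// counting calendar months in UTC and clamping to the last day of shorter months.
function retainUntil(loggedAt, months = 6) {
  const year = loggedAt.getUTCFullYear();
  const month = loggedAt.getUTCMonth() + months; // Date.UTC rolls over into the next year
  // Day 0 of the following month is the last day of the target month.
  const lastDayOfTarget = new Date(Date.UTC(year, month + 1, 0)).getUTCDate();
  const day = Math.min(loggedAt.getUTCDate(), lastDayOfTarget);
  return new Date(Date.UTC(
    year, month, day,
    loggedAt.getUTCHours(), loggedAt.getUTCMinutes(), loggedAt.getUTCSeconds()
  ));
}

// True once the minimum retention period has fully elapsed.
function mayDelete(loggedAt, now) {
  return now.getTime() >= retainUntil(loggedAt).getTime();
}

// A log written on 1 March is 180 days old on 28 August, but six months end on 1 September.
const loggedAt = new Date('2026-03-01T00:00:00Z');
const after180Days = mayDelete(loggedAt, new Date('2026-08-28T00:00:00Z'));
const afterSixMonths = mayDelete(loggedAt, new Date('2026-09-01T00:00:00Z'));
```

Where other Union or national law sets a longer period, the `months` argument would need to reflect that.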
Observability Implementation Mapping
OTel GenAI Semantic Conventions Alignment
| EU AI Act Requirement | OTel GenAI Attribute/Event |
|---|---|
| Session timestamps | gen_ai.conversation.id + span timestamps |
| Model identification | gen_ai.response.model |
| Input logging | gen_ai.content.prompt event |
| Output logging | gen_ai.content.completion event |
| Tool/database references | gen_ai.tool.name, gen_ai.tool.call.id |
| Token usage | gen_ai.usage.input_tokens, gen_ai.usage.output_tokens |
| Request parameters | gen_ai.request.temperature, gen_ai.request.max_tokens |
| Finish reasons | gen_ai.response.finish_reasons |
| Provider identification | gen_ai.provider.name, gen_ai.system |
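To make the mapping concrete, the sketch below turns one model call into a flat attribute map using only attribute names from the table above. The shape of the `call` object is hypothetical, and this is not the toolkit's own code. A map like this could then be attached to a span, for example with `span.setAttributes` from the OpenTelemetry JavaScript API.

```javascript
// Builds a flat attribute map using only attribute names from the table above.
// The shape of `call` is a hypothetical example, not a toolkit type.
function toGenAiAttributes(call) {
  const attrs = {
    'gen_ai.conversation.id': call.sessionId,
    'gen_ai.provider.name': call.provider,
    'gen_ai.response.model': call.responseModel,
    'gen_ai.request.temperature': call.temperature,
    'gen_ai.request.max_tokens': call.maxTokens,
    'gen_ai.usage.input_tokens': call.inputTokens,
    'gen_ai.usage.output_tokens': call.outputTokens,
    'gen_ai.response.finish_reasons': call.finishReasons,
  };
  // Drop unset values so missing data is not recorded as undefined.
  return Object.fromEntries(
    Object.entries(attrs).filter(([, value]) => value !== undefined && value !== null)
  );
}

// Example call with made-up values; maxTokens is deliberately left unset.
const exampleAttributes = toGenAiAttributes({
  sessionId: 'example-session',
  provider: 'example-provider',
  responseModel: 'example-model-1',
  temperature: 0.2,
  inputTokens: 883,
  outputTokens: 185,
  finishReasons: ['stop'],
});
```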
observability-toolkit Configuration
// Recommended settings for EU AI Act compliance
{
RETENTION_DAYS: 184, // 6+ months per Article 19 (six calendar months span up to 184 days; 180 falls short)
LOG_LEVEL: 'info', // Capture operational events
TRACE_CONTENT: true, // Enable input/output logging
SESSION_TRACKING: true, // Track conversation sessions
}
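A settings object like the one above can be checked before deployment. The validator below is a hypothetical helper, not part of observability-toolkit. Its default minimum of 184 days is an assumption chosen to cover the longest six-calendar-month span.

```javascript
// Hypothetical validator for the settings object shown above.
// Returns a list of human-readable problems; an empty list means the checks passed.
function checkSettings(settings, minRetentionDays = 184) {
  const problems = [];
  if (!Number.isInteger(settings.RETENTION_DAYS) || settings.RETENTION_DAYS < minRetentionDays) {
    problems.push(`RETENTION_DAYS should be an integer >= ${minRetentionDays}`);
  }
  if (settings.TRACE_CONTENT !== true) {
    problems.push('TRACE_CONTENT should be true to log inputs and outputs');
  }
  if (settings.SESSION_TRACKING !== true) {
    problems.push('SESSION_TRACKING should be true to record sessions');
  }
  return problems;
}

// Example: a 180-day retention setting is flagged, since it is shorter than six months.
const settingsProblems = checkSettings({
  RETENTION_DAYS: 180,
  LOG_LEVEL: 'info',
  TRACE_CONTENT: true,
  SESSION_TRACKING: true,
});
```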
Compliance Checklist
- Enable automatic event logging for all AI system interactions
- Capture session start/end timestamps
- Log model version and configuration per request
- Record input data and corresponding outputs
- Track human verification events (if applicable) - see 1.8.6/BACKLOG.md
- Implement 6+ month log retention (RETENTION_DAYS config)
- Maintain technical documentation and keep it updated
- Enable traceability via trace IDs and session IDs
Penalties
| Violation | Fine |
|---|---|
| Prohibited AI practices | Up to 35M EUR or 7% global turnover |
| High-risk AI non-compliance | Up to 15M EUR or 3% global turnover |
| Incorrect information to authorities | Up to 7.5M EUR or 1% global turnover |
| GPAI provider violations | Up to 15M EUR or 3% global turnover |
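Each row in the table pairs a fixed amount with a share of worldwide annual turnover. The sketch below computes the resulting ceiling, assuming the higher of the two applies, and for SMEs the lower of the two under Article 99(6). The key names are hypothetical, and the SME rule is deliberately not applied to the GPAI row in this sketch.

```javascript
// Caps from the table above: fixed amount in EUR and share of worldwide annual turnover.
const FINE_CAPS = {
  prohibitedPractices: { fixedEur: 35e6, turnoverShare: 0.07 },
  highRiskNonCompliance: { fixedEur: 15e6, turnoverShare: 0.03 },
  incorrectInformation: { fixedEur: 7.5e6, turnoverShare: 0.01 },
  gpaiProviderViolations: { fixedEur: 15e6, turnoverShare: 0.03 },
};

// Assumption: the higher of the two amounts is the ceiling; for SMEs the lower one
// applies (Article 99(6)). The SME rule is not applied to the GPAI row here.
function maxFineEur(violation, annualTurnoverEur, isSme = false) {
  const cap = FINE_CAPS[violation];
  if (!cap) {
    throw new Error(`Unknown violation type: ${violation}`);
  }
  const byTurnover = cap.turnoverShare * annualTurnoverEur;
  const useLower = isSme && violation !== 'gpaiProviderViolations';
  return useLower ? Math.min(cap.fixedEur, byTurnover) : Math.max(cap.fixedEur, byTurnover);
}

// With EUR 100M turnover, 7% is EUR 7M, so the EUR 35M fixed amount is the ceiling.
const smallTurnoverCeiling = maxFineEur('prohibitedPractices', 100e6);
```

These are upper bounds only; the actual fine in a given case is set by the competent authority.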
References
Article References
- Article 12: Record-Keeping
- Article 19: Automatically Generated Logs
- Article 53: GPAI Provider Obligations
- Article 55: Systemic Risk GPAI Obligations
Annex References
- Annex XI: GPAI Technical Documentation
- Annex XII: GPAI Transparency Information
- Annex XIII: Systemic Risk Criteria
Document History
| Version | Date | Changes |
|---|---|---|
| 1.0 | 2026-01-29 | Initial research compilation |
| 1.1 | 2026-01-29 | Added Appendix A (session telemetry) and Appendix B (toolkit compliance) |
| 1.2 | 2026-01-31 | Updated to v1.8.5; marked evaluation events complete; updated compliance checklist |
Appendix A: Session Telemetry Data
This appendix demonstrates telemetry data captured during the research session that produced this document, showing how observability-toolkit captures EU AI Act-relevant data.
Session Overview
| Attribute | Value |
|---|---|
| Session ID | a8a71f9f-58de-4733-b912-d677b14f1575 |
| Model | claude-opus-4-5-20251101 |
| Date | 2026-01-29 |
| Messages | 106 |
| Total Tokens | 85,385 |
| Context Utilization | 42.7% |
Token Breakdown
| Category | Tokens |
|---|---|
| System Prompt | 8,000 |
| System Tools | 15,000 |
| Messages | 62,385 |
| Cache Read | 85,123 |
| Cache Creation | 252 |
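The session totals can be cross-checked from the breakdown: the first three categories sum to the reported total, while cache reads and cache creation are tracked separately. The sketch below reproduces that arithmetic. The 200,000-token context window is an assumption, since it is not stated in the tables, though it is consistent with the reported 42.7%.

```javascript
// Categories from the Token Breakdown table that count toward the session total.
const breakdown = { systemPrompt: 8000, systemTools: 15000, messages: 62385 };

// Assumption: a 200,000-token context window (not stated in the tables).
const CONTEXT_WINDOW_TOKENS = 200000;

// Sums the categories and expresses the total as a percentage of the window,
// rounded to one decimal place.
function contextUtilization(parts, windowTokens) {
  const total = Object.values(parts).reduce((sum, n) => sum + n, 0);
  const percent = Number(((total / windowTokens) * 100).toFixed(1));
  return { total, percent };
}

const usage = contextUtilization(breakdown, CONTEXT_WINDOW_TOKENS);
// usage.total is 85385 and usage.percent is 42.7, matching the Session Overview table.
```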
Cost Tracking
| Metric | Value |
|---|---|
| Input Cost | $0.0001 |
| Output Cost | $0.0006 |
| Total Cost | $0.0007 |
Sample Traces Captured
The following traces were captured during this session, demonstrating automatic event logging per Article 12 requirements:
1. MCP Tool Invocations
Trace ID: 464192682aa7f9cc25a9fa92bb136768
Span: hook:mcp-pre-tool
Duration: 4.35ms
Attributes:
- mcp.server: observability-toolkit
- mcp.tool: obs_query_traces
- session.id: a8a71f9f-58de-4733-b912-d677b14f1575
- service.name: claude-code-hooks
2. Web Research Tool Usage
Trace ID: d856db220dcee13d71c861488e76b9e4
Span: hook:mcp-post-tool
Duration: 2.38ms
Attributes:
- mcp.server: webresearch
- mcp.tool: visit_page
- mcp.success: true
- session.id: a8a71f9f-58de-4733-b912-d677b14f1575
3. File Operations
Trace ID: 2711401030067a7d545db286379692a7
Span: hook:builtin-post-tool
Duration: 4.50ms
Attributes:
- builtin.tool: Write
- builtin.category: file
- builtin.success: true
- session.id: a8a71f9f-58de-4733-b912-d677b14f1575
4. Token Metrics Extraction
Trace ID: 917fa2b09b9a4e4062bb5ad07737771c
Span: hook:token-metrics-extraction
Duration: 17.07ms
Attributes:
- tokens.input: 883
- tokens.output: 185
- tokens.cache_read: 3,523,578
- tokens.model: claude-opus-4-5-20251101
Historical Session Data
| Date | Avg Tokens | Sessions |
|---|---|---|
| 2026-01-27 | 60,000 | 3 |
| 2026-01-28 | 65,000 | 2 |
| 2026-01-29 | 128,580 | 8 |
Appendix B: observability-toolkit EU AI Act Compliance Assessment
Compliance Matrix
| EU AI Act Requirement | Article | observability-toolkit Capability | Status |
|---|---|---|---|
| Automatic event logging | Art. 12(1) | Automatic trace/span recording via OTel | Supported |
| Session timestamps | Art. 12(3)(a) | session.id + span start/end times | Supported |
| Tool/database references | Art. 12(3)(b) | mcp.server, mcp.tool, gen_ai.tool.name | Supported |
| Input data logging | Art. 12(3)(c) | Content events, request parameters | Supported |
| Human verification tracking | Art. 12(3)(d) | Custom span attributes | Extensible |
| Log retention (6+ months) | Art. 19 | RETENTION_DAYS configuration | Configurable |
| Model identification | Annex XI | gen_ai.response.model, tokens.model | Supported |
| Provider identification | Annex XI | gen_ai.provider.name, gen_ai.system | Supported |
| Token usage tracking | Annex XI | tokens.input, tokens.output, gen_ai.usage.* | Supported |
| Cost estimation | Annex XI | Session cost breakdown | Supported |
Tool Capabilities Summary
Query Tools
| Tool | EU AI Act Use Case |
|---|---|
| obs_query_traces | Retrieve logged events for compliance audits |
| obs_query_logs | Search operational logs by severity/session |
| obs_query_metrics | Aggregate usage metrics with percentiles |
| obs_query_llm_events | Query LLM-specific events and token usage |
| obs_query_evaluations | Query quality evaluation events with aggregations |
| obs_context_stats | Session-level context and cost analysis |
Compliance Tools
| Tool | EU AI Act Use Case |
|---|---|
| obs_health_check | Verify telemetry system operational status |
| obs_get_trace_url | Generate shareable trace URLs for audits |
| obs_setup_claudeignore | Configure retention and exclusion policies |
OTel GenAI Semantic Conventions (v1.8.5)
observability-toolkit implements 10/10 OTel GenAI semantic convention attributes:
| Attribute | Implementation |
|---|---|
| gen_ai.operation.name | chat, embeddings, invoke_agent, execute_tool |
| gen_ai.provider.name | Fallback chain with gen_ai.system |
| gen_ai.conversation.id | Session correlation |
| gen_ai.response.model | Model version tracking |
| gen_ai.response.finish_reasons | Completion status |
| gen_ai.request.temperature | Request parameters |
| gen_ai.request.max_tokens | Request parameters |
| gen_ai.tool.name | Tool identification |
| gen_ai.tool.call.id | Tool invocation tracking |
| gen_ai.agent.id / gen_ai.agent.name | Agent identification |
Backend Support
| Backend | Traces | Metrics | Logs | Notes |
|---|---|---|---|---|
| Local JSONL | Yes | Yes | Yes | Default, file-based storage |
| SigNoz Cloud | Yes | Yes | Yes | OTLP export supported |
| Langfuse | Planned | Planned | N/A | Phase 4b roadmap |
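With the file-based JSONL backend, an audit query amounts to reading one JSON object per line and filtering. The sketch below shows that idea on an in-memory string. It assumes each record carries an `attributes` object with a `session.id` key, mirroring the sample traces in Appendix A; the toolkit's actual on-disk schema may differ.

```javascript
// Parses JSONL text (one JSON object per line) and keeps the records for one session.
// Assumption: each record has an `attributes` object containing a `session.id` key.
function filterBySession(jsonlText, sessionId) {
  return jsonlText
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // tolerate blank lines and a trailing newline
    .map((line) => JSON.parse(line))
    .filter((record) => record.attributes && record.attributes['session.id'] === sessionId);
}

// Example with two made-up records from different sessions.
const sampleJsonl = [
  JSON.stringify({ name: 'hook:mcp-pre-tool', attributes: { 'session.id': 'session-a' } }),
  JSON.stringify({ name: 'hook:mcp-post-tool', attributes: { 'session.id': 'session-b' } }),
  '',
].join('\n');

const sessionRecords = filterBySession(sampleJsonl, 'session-a');
```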
Gaps & Roadmap
| Gap | EU AI Act Relevance | Status |
|---|---|---|
| Evaluation events | Quality assurance | ✅ Implemented: obs_query_evaluations |
| Langfuse export | External audit tools | Planned: Phase 4b OTLP export utility |
| LLM-as-Judge hooks | Automated evaluation | Planned: Phase 4c webhook integration |
| Human verification spans | Art. 12(3)(d) | Extensible via custom span attributes |
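Since human verification is listed as extensible via custom span attributes, one way to fill that gap is to agree on a small attribute set for verification events. The sketch below is an illustration only: the `human_verification.*` names are hypothetical and are not defined by the OTel GenAI conventions or, as far as this document shows, by the toolkit.

```javascript
// Hypothetical custom attributes for a human verification event.
// The `human_verification.*` names are made up for this sketch.
function humanVerificationAttributes({ reviewerId, decision, reviewedAt, sessionId }) {
  const allowed = ['confirmed', 'rejected'];
  if (!reviewerId) {
    throw new Error('reviewerId is required');
  }
  if (!allowed.includes(decision)) {
    throw new Error(`decision must be one of: ${allowed.join(', ')}`);
  }
  if (!(reviewedAt instanceof Date)) {
    throw new Error('reviewedAt must be a Date');
  }
  return {
    'session.id': sessionId,
    'human_verification.reviewer.id': reviewerId,
    'human_verification.decision': decision,
    'human_verification.reviewed_at': reviewedAt.toISOString(),
  };
}

// Example with made-up identifiers.
const verificationAttributes = humanVerificationAttributes({
  reviewerId: 'reviewer-a',
  decision: 'confirmed',
  reviewedAt: new Date('2026-08-02T10:06:00Z'),
  sessionId: 'example-session',
});
```

Recording the reviewer as an identifier rather than a name keeps personal data in the logs to a minimum while still allowing the person to be traced.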
Recommended Configuration for EU AI Act Compliance
// Environment variables for EU AI Act compliance
{
// Retention (Article 19)
RETENTION_DAYS: 184, // Minimum 6 months (six calendar months span up to 184 days; 180 falls short)
// Telemetry paths
TELEMETRY_DIR: '~/.claude/telemetry',
// SigNoz integration (optional)
SIGNOZ_URL: 'https://ingest.us.signoz.cloud',
SIGNOZ_API_KEY: '<your-key>',
// Cache settings
CACHE_TTL_MS: 60000, // Query cache TTL
}
Conclusion
observability-toolkit v1.8.5 provides substantial coverage for EU AI Act observability requirements:
- Article 12 (Record-Keeping): Full support for automatic event logging, session tracking, and tool invocation recording
- Article 19 (Log Retention): Configurable retention with RETENTION_DAYS
- Annex XI (Technical Documentation): Model, provider, and usage metrics captured automatically
v1.8.5 Security Enhancements (65+ commits since v1.8.0):
- Circuit breaker for local backend resilience
- SSRF protection with IPv6 zone ID handling
- Rate limiter overflow prevention
- Cloud environment detection warnings
- ~100 negative security test cases
- 2083 total tests (up from ~1700)