What Do You Do?
A chronological collection of what I’m working on, projects I’ve built, and things I’m learning. Newest updates first.
Catching AI Lies in Translation: An OTEL-Powered Quality Loop
When an AI translator invented a backstory about the Netherlands, our telemetry caught it. Here’s how OpenTelemetry-driven evaluation loops keep machine tran...
Catching Lies in AI Translation
An AI translator made up a story about the Netherlands. Our system caught it. Here’s how we keep machine translations honest.
Pegando Mentiras na Tradução por IA
An AI invented a story about the Netherlands. Our system caught it on the spot. See how we keep machine translations honest.
Implementation Plan: Specialized Translation and Voice-Matching Agents
This document outlines the design and implementation of three specialized agents to address performance gaps identified in the Translation Session Post-Morte...
Implementation Plan: Translation Pipeline Robustness and Regression Testing
Post-mortem analysis of session d1d142a6 (February 12, 2026) revealed four critical bugs and several systemic robustness gaps in the translation pipeline. Th...
Implementation Plan: Self-Optimizing Agent Performance with Session Telemetry
A comprehensive implementation plan for building a self-optimizing agent performance system that uses session telemetry data to track, analyze, and continuou...
Auditing an AI’s Honesty: How We Catch Hallucinations Before They Become Liabilities
An auditor-friendly walkthrough of an LLM evaluation pipeline that detects when AI systems fabricate information — what it measures, how it works, and what t...
Why All Your Fears About AI Are Right
Your AI fears are valid; they’re just aimed at the wrong target. The real crisis isn’t replacement. It’s drowning in mediocrity we can’t measure.
LLM Observability Best Practices: A Comparative Analysis
Technical white paper examining LLM observability standards, OpenTelemetry GenAI semantic conventions, agent tracking methodologies, and quality measurement ...
Claude Code Observability Framework Guide
A comprehensive guide to the production-grade observability system for Claude Code hooks using OpenTelemetry, Langtrace, and SigNoz.
Claude Code Observability Framework
Production-grade observability system for Claude Code hooks using OpenTelemetry, Langtrace, and SigNoz - featuring distributed tracing, metrics, and correlat...
Git Activity Report: PersonalSite Repository
Comprehensive analysis of 403 commits across the PersonalSite repository - spanning from initial creation in May 2017 through January 2026, with 94% of activ...
Complete Activity Tracking System - Final Status
Installation and verification of a complete activity tracking ecosystem monitoring websites, code, Claude tools, directories, and git commits.
ActivityWatch Complete Tracking System Setup
Complete setup and configuration of ActivityWatch tracking system with multiple watchers for comprehensive productivity monitoring.
Git Activity Report: July 7 - November 16, 2025
Comprehensive analysis of 1,007 commits across 33 repositories - tracking 4+ months of development work including newly discovered MCP servers, client projec...
Current Projects & Active TODO List
What I’m building, learning, and working on right now - from MCP servers to schema optimization, and everything in between.
September Sprint: Jekyll, Schemas, and MCP Servers
A whirlwind week of fixing build bugs, improving schema.org implementation, and diving deep into MCP server development.