What Do You Do?

A chronological collection of what I’m working on, projects I’ve built, and things I’m learning. Newest updates first.

Catching AI Lies in Translation: An OTEL-Powered Quality Loop

February 17, 2026 7 minute read

When an AI translator invented a backstory about the Netherlands, our telemetry caught it. Here’s how OpenTelemetry-driven evaluation loops keep machine tran...

Catching Lies in AI Translation

February 17, 2026 6 minute read

An AI translator made up a story about the Netherlands. Our system caught it. Here’s how we keep machine translations honest.

Catching Lies in AI Translation (Portuguese: Pegando Mentiras na Tradução por IA)

February 17, 2026 6 minute read

An AI made up a story about the Netherlands. Our system caught it right away. Here’s how we keep machine translations honest.

Implementation Plan: Specialized Translation and Voice-Matching Agents

February 14, 2026

This document outlines the design and implementation of three specialized agents to address performance gaps identified in the Translation Session Post-Morte...

Implementation Plan: Translation Pipeline Robustness and Regression Testing

February 14, 2026

Post-mortem analysis of session d1d142a6 (February 12, 2026) revealed four critical bugs and several systemic robustness gaps in the translation pipeline. Th...

Implementation Plan: Self-Optimizing Agent Performance with Session Telemetry

February 14, 2026

A comprehensive implementation plan for building a self-optimizing agent performance system that uses session telemetry data to track, analyze, and continuou...

Auditing an AI’s Honesty: How We Catch Hallucinations Before They Become Liabilities

February 10, 2026

An auditor-friendly walkthrough of an LLM evaluation pipeline that detects when AI systems fabricate information — what it measures, how it works, and what t...

Why All Your Fears About AI Are Right

February 2, 2026 8 minute read

Your AI fears are valid—they’re just aimed at the wrong target. The real crisis isn’t replacement. It’s drowning in mediocrity we can’t measure.

LLM Observability Best Practices: A Comparative Analysis

January 29, 2026

Technical white paper examining LLM observability standards, OpenTelemetry GenAI semantic conventions, agent tracking methodologies, and quality measurement ...

Claude Code Observability Framework Guide

January 20, 2026

A comprehensive guide to the production-grade observability system for Claude Code hooks using OpenTelemetry, Langtrace, and SigNoz.

Claude Code Observability Framework

January 19, 2026

Production-grade observability system for Claude Code hooks using OpenTelemetry, Langtrace, and SigNoz - featuring distributed tracing, metrics, and correlat...

Git Activity Report: PersonalSite Repository

January 11, 2026

Comprehensive analysis of 403 commits across the PersonalSite repository - spanning from initial creation in May 2017 through January 2026, with 94% of activ...

Complete Activity Tracking System - Final Status

November 17, 2025

Installation and verification of a complete activity tracking ecosystem monitoring websites, code, Claude tools, directories, and git commits.

ActivityWatch Complete Tracking System Setup

November 17, 2025

Complete setup and configuration of the ActivityWatch tracking system, with multiple watchers for comprehensive productivity monitoring.

Git Activity Report: July 7 - November 16, 2025

November 16, 2025

Comprehensive analysis of 1,007 commits across 33 repositories - tracking 4+ months of development work including newly discovered MCP servers, client projec...

Current Projects & Active TODO List

November 16, 2025

What I’m building, learning, and working on right now - from MCP servers to schema optimization, and everything in between.

September Sprint: Jekyll, Schemas, and MCP Servers

September 7, 2025

A whirlwind week of fixing build bugs, improving schema.org implementation, and diving deep into MCP server development.