Essays
Notes from the practice.
Long-form writing on semantic infrastructure, retrieval, classification systems, and the engineering beneath operational AI.
What Four Hours of Focused Human-AI Engineering Actually Ships
May 30, 2026
On May 30th I published v0.1 of a Databricks integration pack for Swamp. By the end of the same evening it was at v0.13. Fifteen models, 100% A on every release, every model end-to-end smoke-tested. What shipped, what I learned, and why it matters if you have Databricks alongside anything else.
14 min read
Who Cares if Machines Understand?
May 16, 2026
Industry never required philosophical certainty before deploying complex systems. The AI deployment debate is no different. A case for treating intelligence as uncertainty compression.
4 min read
I Trained an 8B HTS Classifier at a Coffee Shop with an H100 for $43
May 9, 2026
A postmortem of fine-tuning an 8B model for U.S. tariff classification on a single rented H100. Sixteen hours, $43, and eight lessons earned the hard way.
14 min read
How Memory Actually Works on Databricks
May 2, 2026
Your Databricks cluster says 28 GB. Your Python code gets maybe 12. The JVM/Python split that explains why, and how to budget the rest.
7 min read
Storytelling Through Feature Engineering: Lossy Compression for Language Models
Apr 25, 2026
Feature engineering didn't die. It became lossy compression for language models, turning thousands of raw rows into ten metrics that carry the same story.
9 min read
Data as a Public Utility: The Problem We Pretend Doesn't Exist
Apr 18, 2026
Why public data deserves public infrastructure, and how patents on basic ML techniques applied to taxpayer-funded datasets are gatekeeping disguised as innovation.
17 min read
The Hidden Gaps in AI Deep Research: What Your Organization Needs to Know
Apr 11, 2026
Four kinds of blind spots in AI deep research outputs, and a working mental model for using these tools as starting points rather than finished products.
7 min read
Shazam and the Art of Classification: How to Solve the Impossible in 5 Seconds
Apr 4, 2026
How Shazam identifies one song out of 11 million in 5 seconds, by refusing to solve the actual problem.
4 min read
The Backward Index: How 1930s Lexicographers Built Vector Search with Index Cards
Mar 28, 2026
Long before vector databases, Merriam-Webster lexicographers built a 315,000-card index of words spelled backwards. The same insight powers modern AI: different organizational schemes reveal different patterns in the same data.
8 min read
The Architecture That Ate AI: How One Paper Changed Everything
Mar 21, 2026
How one 2017 paper on machine translation accidentally became the architecture beneath ChatGPT, Claude, Copilot, DALL-E, and almost every other AI system you use.
9 min read
Quantization: From Max Planck to Faster Vector Search
Mar 14, 2026
In 1900, Max Planck discovered that nature comes in discrete packets. 125 years later, the same idea lets vector databases, LLMs, and on-device AI work at scale.
6 min read
The Semantic Mesh: Why Runtimes Aren't Enough
Feb 28, 2026
Runtimes make knowledge work executable. But isolated execution doesn't compound. The semantic mesh is the layer that connects structured outputs into accumulated organizational intelligence.
13 min read
The Missing Runtime for Knowledge Work
Feb 21, 2026
Agentic AI thrives in software engineering because code already runs inside a repeatable execution environment. Most knowledge work doesn't. The ceiling isn't model intelligence; it's the absence of runtimes that can execute, validate, and provide feedback on structured work.
14 min read
Query Provenance Store: The Accountability Layer for AI Agent Decisions
Jan 17, 2026
When an agent makes a decision, you should be able to see exactly what it saw, and whether reality has changed since. Introducing QPS, a companion spec to TMS.
6 min read
Tabular Manifolds: The Cognitive Interface Layer Between Data and AI Agents
Jan 1, 2026
Dashboards were built for eyes. LLMs need something else: a structured, multi-resolution data format optimized for machine cognition. Enter Tabular Manifolds.
6 min read