The Semantic Mesh: Why Runtimes Aren't Enough

Runtimes make knowledge work executable. But isolated execution doesn't compound. The semantic mesh is the layer that connects structured outputs into accumulated organizational intelligence.

Feb 28, 2026 · 13 min read

In the first post in this series, I argued that the reason agentic AI works in software and stalls everywhere else isn't model intelligence. It's infrastructure. Software has a runtime. Most knowledge work doesn't. And until we build runtimes for other domains of work, agents will remain stuck in single-shot mode: generate, deliver, hope for the best.

I still believe that. But I've been sitting with an uncomfortable follow-up question.

Let's say we build the runtimes. Let's say procurement gets an execution environment with structured artifacts, validation rules, and feedback loops. Legal review gets one. Strategic planning gets one. Each domain of knowledge work becomes executable.

What then?

If every runtime operates in isolation, you've solved the single-task problem. An agent can execute a procurement analysis and get feedback on whether it's correct. That's a genuine improvement over the status quo. But you haven't solved the harder problem, which is: how does an organization actually learn over time?

Post 1 was about making work executable. This post is about making it cumulative.

What organizations know (and where it lives)

Here's something that's obvious once you see it but rarely stated explicitly: most organizational knowledge is relational, not atomic.

The valuable thing isn't any individual analysis, decision, or report. It's the relationships between them. Which analysis informed which decision. Which assumptions keep showing up across plans. Which vendor evaluations led to good outcomes and which ones didn't. Which forecasts turned out to be wrong, and what downstream work relied on them.

This connective tissue is the actual institutional knowledge of an organization. And today, almost without exception, it lives in people's heads.

"Sarah knows how we handled this last time." "Ask the team that did Project X." "I think we tried something like this in 2022, but I can't remember what happened." Every organization runs on these informal links. They're how context gets transferred, how precedent gets applied, how mistakes get avoided (sometimes).

When people leave, the graph walks out the door with them.

This is the real knowledge management problem. It was never about storage. We have more storage than we know what to do with. It was never about search. Search finds documents, not relationships. A search engine can tell you that a document about vendor evaluation exists. It can't tell you that the assumptions in that evaluation were later invalidated by a market shift that also affected three other workstreams.

The problem is that the connective tissue between work products has never been made explicit. It's never been structured. It's never been queryable. It exists as wetware, and wetware doesn't scale.

What changes when work becomes structured

This is where the runtime story from Post 1 connects to something bigger.

Once knowledge work runs through a domain runtime, you get something you've never had before: structured, typed, logged outputs with provenance. Each artifact has a known schema. It has known inputs. It has an execution history showing every step that produced it. It has validation results showing which checks it passed and which it didn't.

This is not a PDF. This is not a slide deck sitting in a shared drive. This is a data structure with metadata.
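To make "a data structure with metadata" concrete, here is a minimal Python sketch of what such an artifact might look like. The names (`Artifact`, `ValidationResult`, `declared_inputs`, the `vendor_evaluation/v1` schema label) are assumptions chosen for illustration, not the schema of any real runtime.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class ValidationResult:
    """Outcome of one validation check run by the (hypothetical) runtime."""
    check: str
    passed: bool


@dataclass(frozen=True)
class Artifact:
    """Hypothetical runtime output: a typed payload plus its provenance."""
    artifact_id: str
    schema: str                        # known schema, e.g. "vendor_evaluation/v1"
    fields: dict[str, Any]             # typed payload defined by that schema
    declared_inputs: tuple[str, ...]   # ids of the artifacts this one consumed
    execution_log: tuple[str, ...]     # ordered steps that produced it
    validations: tuple[ValidationResult, ...]
    produced_at: datetime

    def failed_checks(self) -> list[str]:
        """Names of the validation checks this artifact did not pass."""
        return [v.check for v in self.validations if not v.passed]


evaluation = Artifact(
    artifact_id="eval-001",
    schema="vendor_evaluation/v1",
    fields={"vendor": "Acme", "annual_cost_usd": 120_000},
    declared_inputs=("forecast-q3",),
    execution_log=("load_inputs", "score_vendor", "validate"),
    validations=(
        ValidationResult("required_fields_present", True),
        ValidationResult("cost_within_budget", False),
    ),
    produced_at=datetime(2026, 1, 15, tzinfo=timezone.utc),
)
print(evaluation.failed_checks())
```

The point of the sketch is that inputs, history, and validation status are fields you can query, rather than prose you have to read.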

And that changes everything about what's possible at the layer above.

You can't meaningfully correlate unstructured documents. You can do vector similarity, which tells you "these two documents use similar words," but that's a blunt instrument. It can't tell you that two analyses share the same cost assumptions, or that a strategic plan depends on a forecast that was produced by a different team using different data.

But you can correlate structured artifacts with precision. When artifacts have schemas, typed fields, execution logs, and declared inputs, you can identify shared entities, shared assumptions, structural patterns, and causal dependencies. Not through fuzzy matching. Through actual structural analysis.
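As a small illustration of structural comparison, the sketch below compares the declared assumptions of two artifacts field by field. It assumes both artifacts expose assumptions as a flat mapping with identical key names, which is a simplification; the keys and values are invented for the example.

```python
from typing import Any

Assumptions = dict[str, Any]


def compare_assumptions(
    a: Assumptions, b: Assumptions
) -> tuple[Assumptions, dict[str, tuple[Any, Any]]]:
    """Split the assumptions both artifacts declare into shared and conflicting.

    Only keys present in both mappings are compared; a key that appears in
    just one artifact is neither shared nor conflicting.
    """
    shared: Assumptions = {}
    conflicting: dict[str, tuple[Any, Any]] = {}
    for key in a.keys() & b.keys():
        if a[key] == b[key]:
            shared[key] = a[key]
        else:
            conflicting[key] = (a[key], b[key])
    return shared, conflicting


# Hypothetical assumption blocks from two different teams' artifacts.
procurement = {"market_growth_2026": 0.04, "usd_eur_rate": 1.08}
strategy = {"market_growth_2026": 0.09, "usd_eur_rate": 1.08, "headcount_cap": 120}

shared, conflicting = compare_assumptions(procurement, strategy)
print(shared)
print(conflicting)
```

Exact key matching like this is only possible because the fields are typed and named. It would miss assumptions that two teams word differently, which is a separate problem from the one shown here.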

The runtime doesn't just make work executable. It makes work connectable.

That's the precondition for everything that follows.

The semantic mesh

I've been calling this layer the semantic mesh, because I think it captures what it actually does: it creates a web of typed, meaningful connections between the structured outputs of knowledge work.

Here's the basic idea. You have a graph. The nodes are structured artifacts produced by domain runtimes: analyses, decisions, evaluations, plans, reports. Anything a runtime produces. The edges are typed relationships between those artifacts, and "typed" is doing a lot of work in that sentence. Not just "related to." Specifically how they're related.

The correlation types we're exploring fall into a few natural groups.

Referential correlations link artifacts that reference the same real-world entities. Two procurement evaluations that assess the same vendor. A strategic plan and a financial model that reference the same market segment. A legal review and a compliance audit that involve the same contractual clause. These are the most straightforward relationships to detect, because entity extraction from structured artifacts is a relatively well-understood problem.

Structural correlations link artifacts that follow similar patterns, even when they're about different things. Two vendor evaluations that have the same risk profile shape. Two strategic plans that share a dependency structure. These are harder to detect but often more useful, because they're how precedent gets identified. "You've never evaluated this specific vendor before, but the structure of this evaluation looks like three others, and here's what happened with those."

Temporal correlations capture the fact that artifacts produced in the same planning cycle, decision context, or time window are often related in ways that matter. The Q3 forecast, the Q3 hiring plan, and the Q3 budget allocation are all connected by the planning context that produced them, even if they were created by different teams using different runtimes.

Causal correlations are the most valuable and the hardest to establish. This analysis informed this decision. This decision led to this outcome. This forecast was an input to this plan, which drove these actions, which produced these results. Causal edges close the feedback loop at the organizational level. They're what let you ask: "when we made decisions like this in the past, what happened?"

The graph is append-only and versioned. Artifacts don't get deleted. Relationships accumulate. The mesh gets richer over time. This is a feature, not an implementation detail. You want the full history, because the history is where the patterns live.
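One way to picture typed edges is the sketch below. It is an illustrative toy, not the actual design: the `Relation`, `Edge`, and `Mesh` names are assumptions, and so is the choice to treat causal edges as directed while the other three types are read in both directions.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """The four correlation types described above."""
    REFERENTIAL = "referential"
    STRUCTURAL = "structural"
    TEMPORAL = "temporal"
    CAUSAL = "causal"


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: Relation


class Mesh:
    """Toy in-memory graph; edges are only ever appended, never removed."""

    def __init__(self) -> None:
        self._edges: list[Edge] = []

    def add_edge(self, edge: Edge) -> None:
        self._edges.append(edge)

    def related(self, artifact_id: str, relation: Relation) -> list[str]:
        """Artifacts linked to `artifact_id` by the given relation type.

        Assumption: causal edges are followed source -> target only; the
        other relation types are treated as symmetric.
        """
        found: list[str] = []
        for edge in self._edges:
            if edge.relation is not relation:
                continue
            if edge.source == artifact_id:
                found.append(edge.target)
            elif edge.target == artifact_id and relation is not Relation.CAUSAL:
                found.append(edge.source)
        return found


mesh = Mesh()
mesh.add_edge(Edge("eval-acme-2025", "eval-acme-2026", Relation.REFERENTIAL))
mesh.add_edge(Edge("forecast-q3", "hiring-plan-q3", Relation.TEMPORAL))
mesh.add_edge(Edge("forecast-q3", "budget-q3", Relation.CAUSAL))

print(mesh.related("forecast-q3", Relation.CAUSAL))
print(mesh.related("eval-acme-2026", Relation.REFERENTIAL))
```

Querying by relation type is what separates "how are these related" from a generic "related to" link.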

Why this isn't knowledge management

If you've been in enterprise technology for any amount of time, you're probably feeling a familiar skepticism right now. "This sounds like a knowledge graph." "Isn't this just Confluence with better metadata?" "We tried knowledge management in 2008 and it was a disaster."

Fair. Let me explain why I think this is genuinely different.

Knowledge management systems failed for one consistent reason across thirty years of attempts: they required humans to do the linking. Tag your documents. Categorize your work. Link related items. Fill in the metadata. Nobody does this, because the overhead is brutal and the payoff is distant. You're asking someone who just finished a complex analysis to spend thirty minutes tagging it for the benefit of a hypothetical future colleague. The incentives never worked. They still don't.

The semantic mesh is different for two reasons, and they're both structural.

First, the inputs are already structured. This is the key inheritance from the runtime layer. When a domain runtime produces an artifact, that artifact already has a schema, typed fields, execution history, and validation results. The mesh doesn't need anyone to add structure after the fact. The structure is a byproduct of execution. It comes for free.

Second, the correlation engine is probabilistic. It doesn't rely on exact entity matching or manually curated taxonomies. It uses the same LLM capabilities that power the runtime's front-end to identify relationships semantically. "These two artifacts share cost assumptions" isn't detected by keyword matching. It's detected by a model that understands what a cost assumption is and can recognize when two differently-worded artifacts rely on the same one.

This isn't "better tagging." It's emergent structure from the intersection of structured artifacts and probabilistic reasoning. The mesh builds itself as a side effect of work being done. No one has to maintain it. No one has to remember to update it. It grows because work is happening, and work produces artifacts, and artifacts have structure, and structure creates connections.

What the mesh actually enables

I want to be concrete about this, because it's easy for graph-talk to stay abstract. Here are the capabilities that a functioning semantic mesh would give an organization.

Precedent surfacing. An agent is producing a new strategic plan. The mesh identifies that this plan has a dependency structure similar to one from 2023 that failed at the third milestone because a key assumption about market timing was wrong. The agent surfaces this precedent before the plan is finalized, not as a vague "you might want to look at this" but with the specific structural parallels and the specific point of failure.

Assumption tracking. A cost forecast appears in seven different analyses across three departments this quarter. The mesh tracks this. When the forecast gets revised (or invalidated), the mesh can identify every downstream artifact that relied on it. Instead of someone discovering six months later that half the company was planning against a number that turned out to be wrong, the impact is visible as soon as the revision lands.

Cross-domain visibility. The procurement team is evaluating a vendor based on one set of market assumptions. The strategy team is building a plan based on a different set of assumptions about the same market. Neither team knows about the other's work. The mesh surfaces the contradiction because it can see both artifacts and identify the conflicting inputs.

Organizational learning at scale. Across fifty vendor evaluations over two years, what patterns correlate with successful outcomes? Not "what does one procurement analyst remember," but what does the actual data show? The mesh makes this query possible because the evaluations are structured artifacts with outcome linkage. You can run the analysis.

Institutional memory. You're about to make a decision that's structurally identical to one made eighteen months ago. The mesh shows you the full context: what was decided, what the reasoning was, what happened afterward. Not because someone remembered to document it in a wiki. Because the runtime produced the artifacts and the mesh connected them.

Each of these capabilities exists today only in the form of human memory and tribal knowledge. They work when the organization is small enough and the people stay long enough. They break as organizations scale, as people turn over, and as the volume of work exceeds what anyone can hold in their head.
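Assumption tracking is the easiest of these to sketch, because it reduces to a graph traversal over declared inputs. The example below is a minimal version under the assumption that every artifact lists the ids of the artifacts it consumed; the artifact names are made up.

```python
from collections import deque


def downstream(depends_on: dict[str, list[str]], revised: str) -> list[str]:
    """Every artifact that directly or transitively relied on `revised`.

    `depends_on` maps an artifact id to the ids of the inputs it declared.
    Results come back in breadth-first order, nearest dependents first.
    """
    # Invert the mapping: input id -> artifacts that consumed it.
    consumers: dict[str, list[str]] = {}
    for artifact, inputs in depends_on.items():
        for input_id in inputs:
            consumers.setdefault(input_id, []).append(artifact)

    seen = {revised}          # guards against cycles and duplicates
    affected: list[str] = []
    queue = deque([revised])
    while queue:
        current = queue.popleft()
        for consumer in consumers.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                affected.append(consumer)
                queue.append(consumer)
    return affected


# Hypothetical declared inputs for four artifacts.
depends_on = {
    "hiring-plan": ["cost-forecast"],
    "budget": ["cost-forecast"],
    "board-deck": ["budget", "hiring-plan"],
    "vendor-eval": ["market-report"],
}
print(downstream(depends_on, "cost-forecast"))
```

Here the board deck is flagged even though it never referenced the forecast directly, which is the case that human memory tends to miss.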

Version 0: how we're thinking about architecture

I want to share how we're thinking about building this, not because the architecture is settled, but because I think the design constraints are instructive.

The core principle is decoupling. The mesh is a separate layer that observes the outputs of domain runtimes. It doesn't require changes to the runtimes themselves. It doesn't impose requirements on how runtimes structure their artifacts beyond a minimal envelope: here's the artifact, here's its schema, here's its provenance, here's when it was produced.
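A minimal envelope check might look like the following sketch. The four key names mirror the list above, but the exact names, the dict-based message format, and the `ingest` function are assumptions for illustration. The artifact payload itself is deliberately left opaque.

```python
from typing import Any

# Assumed envelope keys: the artifact, its schema, its provenance, its timestamp.
REQUIRED_ENVELOPE_KEYS = frozenset({"artifact", "schema", "provenance", "produced_at"})


def missing_envelope_keys(message: dict[str, Any]) -> set[str]:
    """Envelope keys the message lacks; an empty set means it is acceptable."""
    return set(REQUIRED_ENVELOPE_KEYS - set(message))


def ingest(message: dict[str, Any], accepted: list[dict[str, Any]]) -> bool:
    """Accept any message with a complete envelope, whatever its payload."""
    if missing_envelope_keys(message):
        return False
    accepted.append(message)
    return True


accepted: list[dict[str, Any]] = []
complete = {
    "artifact": {"vendor": "Acme", "risk": "medium"},   # opaque to the mesh
    "schema": "vendor_evaluation/v1",
    "provenance": {"runtime": "procurement", "inputs": ["forecast-q3"]},
    "produced_at": "2026-01-15T09:30:00Z",
}
incomplete = {"artifact": {}, "schema": "legal_review/v2", "produced_at": "2026-01-16"}

print(ingest(complete, accepted), ingest(incomplete, accepted))
print(missing_envelope_keys(incomplete))
```

Because the check never looks inside `artifact`, a runtime can restructure its schema without the ingestion side changing.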

This matters because we don't want the mesh and the runtimes to evolve in lockstep. Runtimes will change. New ones will come online. Existing ones will restructure their schemas. The mesh needs to absorb all of this without breaking.

The correlation engine is modular by design. Different correlation strategies for different relationship types. Entity extraction for referential links. Structural comparison for pattern matching. Embedding similarity as a fallback when structure is sparse. Causal inference for the hardest and most valuable edges. Each strategy produces edges with confidence scores, because we're not pretending that automated correlation is always right. The mesh doesn't claim truth. It surfaces relationships with uncertainty, so humans and agents can reason over them.
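The modularity described here can be sketched as a set of pluggable strategy functions that each emit scored edges. Everything in this example is an assumption made for illustration: the dict-shaped artifacts, the strategy names, the Jaccard overlap used as a confidence score, and the fixed 0.5 for temporal matches. A real engine would presumably use model-based scoring in place of these stand-ins.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Iterable


@dataclass(frozen=True)
class ScoredEdge:
    source: str
    target: str
    relation: str
    confidence: float   # 0.0 to 1.0; a claim with uncertainty, not a fact


Artifact = dict   # hypothetical shape: {"id": ..., "entities": [...], "cycle": ...}
Strategy = Callable[[Artifact, Artifact], Iterable[ScoredEdge]]


def referential_strategy(a: Artifact, b: Artifact) -> Iterable[ScoredEdge]:
    """Link artifacts that mention the same entities (Jaccard as placeholder score)."""
    ents_a, ents_b = set(a["entities"]), set(b["entities"])
    shared = ents_a & ents_b
    if shared:
        yield ScoredEdge(a["id"], b["id"], "referential", len(shared) / len(ents_a | ents_b))


def temporal_strategy(a: Artifact, b: Artifact) -> Iterable[ScoredEdge]:
    """Link artifacts from the same planning cycle with a fixed, arbitrary score."""
    if a["cycle"] == b["cycle"]:
        yield ScoredEdge(a["id"], b["id"], "temporal", 0.5)


def correlate(
    artifacts: list[Artifact], strategies: list[Strategy], min_confidence: float = 0.0
) -> list[ScoredEdge]:
    """Run every strategy over every pair and keep edges above the threshold."""
    edges: list[ScoredEdge] = []
    for a, b in combinations(artifacts, 2):
        for strategy in strategies:
            edges.extend(e for e in strategy(a, b) if e.confidence >= min_confidence)
    return edges


artifacts = [
    {"id": "eval-1", "entities": ["Acme", "EMEA"], "cycle": "2025-Q3"},
    {"id": "plan-1", "entities": ["EMEA"], "cycle": "2025-Q3"},
    {"id": "audit-1", "entities": ["Globex"], "cycle": "2025-Q4"},
]
edges = correlate(artifacts, [referential_strategy, temporal_strategy])
for edge in edges:
    print(edge)
```

Adding a new relationship type means adding one more function to the list, and the threshold stays a tunable parameter rather than something baked into each strategy.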

The graph store is append-only and versioned. Every artifact and every edge has a timestamp. You can query the mesh as it existed at any point in time. This is essential for organizational learning, because you need to ask questions like "what did we know when we made this decision" rather than just "what do we know now."
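The "as it existed at any point in time" query can be illustrated with a small append-only log. This is a toy in-memory sketch, not a storage design; `Record`, `AppendOnlyStore`, and `as_of` are names invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass(frozen=True)
class Record:
    key: str
    version: int
    payload: dict[str, Any]
    recorded_at: datetime


class AppendOnlyStore:
    """Toy store: revisions are appended as new versions, never overwritten."""

    def __init__(self) -> None:
        self._log: list[Record] = []

    def append(self, key: str, payload: dict[str, Any], recorded_at: datetime) -> Record:
        version = 1 + sum(1 for r in self._log if r.key == key)
        record = Record(key, version, payload, recorded_at)
        self._log.append(record)
        return record

    def as_of(self, key: str, moment: datetime) -> Optional[Record]:
        """Latest version of `key` recorded at or before `moment`, if any."""
        candidates = [r for r in self._log if r.key == key and r.recorded_at <= moment]
        return max(candidates, key=lambda r: (r.recorded_at, r.version), default=None)


store = AppendOnlyStore()
store.append("cost-forecast", {"growth": 0.04}, datetime(2025, 1, 10))
store.append("cost-forecast", {"growth": 0.02}, datetime(2025, 6, 1))

# What did we know when a decision was made in March, versus what we know now?
then = store.as_of("cost-forecast", datetime(2025, 3, 1))
now = store.as_of("cost-forecast", datetime(2025, 7, 1))
print(then.payload, now.payload)
```

The revised forecast does not erase the earlier one, so a decision made in March can still be evaluated against the number that was actually available at the time.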

We're also thinking about bootstrap. The mesh doesn't need to start empty. Most organizations have years of existing knowledge work products. They're unstructured, but an LLM can extract approximate structure from them. Enough to build an initial graph that's useful even before purpose-built runtimes are producing natively structured artifacts. This matters for adoption, because "it gets useful after you've rebuilt all your processes" is a non-starter. "It gets useful the week you turn it on, and gets more useful as runtimes come online" is a different proposition entirely.

One thing I want to be direct about: we don't know what "related" means yet in the general case. We have strong intuitions about specific relationship types, and we have early results that validate some of them. But the right correlation primitives, the right confidence thresholds, and the right way to balance precision against recall in an organizational context are all open problems. The architecture has to let us figure that out empirically, which is why modularity isn't a nice-to-have. It's the central design constraint.

The uncomfortable question

There's something I'd be dishonest not to address.

Once the mesh gets dense enough, it starts surfacing things that organizations might not want surfaced. Contradictions between departments that were previously invisible. Assumptions that leadership presented as facts but that the data doesn't support. Decisions that were made despite evidence pointing the other way. Patterns of failure that got buried under narrative.

The mesh doesn't have politics. It just has structure.

This is, in the long run, a feature. Organizations that can see their own patterns clearly make better decisions than organizations that can't. But in the short run, it has real adoption implications. The technical architecture is honestly the easier part of this problem. The harder part is organizational willingness to make institutional knowledge explicit and queryable, including the parts that are uncomfortable.

I don't have a clean answer for this. I think the honest approach is to acknowledge it as a real constraint and design for it, rather than pretending it won't come up.

From execution to accumulation

Let me bring this back to the big picture.

Post 1 argued that the binding constraint on agentic AI isn't model intelligence. It's the absence of runtimes: execution environments that give agents the feedback loops they need to work reliably.

This post argues that runtimes alone aren't sufficient. A runtime solves the single-task problem: an agent can execute a piece of work and get feedback on whether it's correct. But if every execution is isolated, the organization isn't learning. It's just doing individual tasks faster.

The semantic mesh is the layer that turns isolated executions into accumulated knowledge. It connects the outputs of domain runtimes into a graph of typed relationships. It makes organizational knowledge explicit, structured, and queryable. It surfaces precedent, tracks assumptions, identifies contradictions, and closes feedback loops at the organizational level.

The runtime is the compiler and the execution environment. The mesh is the linker and the package manager. You need both.

Without the runtime, you can't build the mesh. There are no structured artifacts to correlate. Without the mesh, runtimes are isolated improvements. Better at individual tasks, but no compounding. No institutional learning. No organizational memory.

Together, they're something more than the sum of their parts. They're the beginning of an infrastructure stack for knowledge work that actually learns.

Not a chatbot. Not a search engine. Not a dashboard.

A system that executes work, connects it, and gets smarter over time.

That's what organizational intelligence could actually look like. And for most organizations, every layer of this stack still needs to be built.