
Before You Trust an Agent's Decision, Ask What It Resolved

Entity resolution is the missing layer in enterprise AI traceability
AI & Data Engineering · 7 min · April 28, 2026 · Duczer East Insights

The current generation of enterprise AI pitches is more sophisticated than the skeptics give it credit for. The orchestration vendors have done real work on making agent behavior inspectable.

What most of them have not done is push that inspectability upstream of the model. For all the work on reasoning traces and evaluator frameworks, the trace begins when the agent starts reasoning. It rarely covers the step that happens before — the resolution of noisy, multi-source signals into the specific entity the agent is reasoning about. That's the step where most enterprise AI systems quietly inherit the data quality problems they were built to transcend.

This is the entity resolution problem, and it deserves a seat at the architecture table next to the model, the orchestrator, and the vector store.

What entity resolution actually is, and why it isn't NER

A quick distinction, because the terms get conflated.

Named entity recognition (NER) is the natural language task: given a span of unstructured text, label it as a person, organization, or location. It tells you that something is an entity. It doesn't tell you which one.

Entity resolution (ER) is the question that comes next. Given multiple records — across a CRM, a compliance database, a third-party feed, a regulatory filing — which ones refer to the same real-world entity? Same customer across three product lines. Same supplier under four legal names. Same beneficial owner behind two shell companies. ER produces unique identifiers and, critically, the evidence that supports each merge or non-merge decision.
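To make that concrete, here is a minimal, hypothetical sketch of a pairwise resolver. The field names, sources, and matching rules are illustrative, not drawn from any particular ER product; the point is the shape of the output — a verdict plus the evidence behind it, not a bare similarity score.

```python
def normalize(value: str) -> str:
    """Normalize a field for comparison: lowercase, keep only letters and digits."""
    return "".join(ch for ch in value.lower() if ch.isalnum())

def resolve_pair(record_a: dict, record_b: dict) -> dict:
    """Compare two source records and return a merge decision with evidence.

    Illustrative rules: a matching tax ID is decisive on its own; otherwise
    a matching normalized name plus a matching country is enough to merge.
    """
    evidence = []
    for field in ("tax_id", "name", "country"):
        a, b = record_a.get(field), record_b.get(field)
        if a and b and normalize(a) == normalize(b):
            evidence.append(field)

    merged = "tax_id" in evidence or {"name", "country"} <= set(evidence)
    return {
        "merge": merged,
        "matched_fields": evidence,  # which features triggered the decision
        "sources": [record_a["source"], record_b["source"]],  # provenance
    }
```

Real implementations add probabilistic scoring, blocking, and transitive closure across many records, but the contract stays the same: every merge or non-merge decision carries its supporting evidence.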

The difference matters because the failure modes are different. NER errors produce confused text. ER errors produce confused actions. An agent that retrieves the wrong customer record and acts on it isn't hallucinating — it's executing correctly against the wrong identity. The trace will look clean. The outcome will be wrong.

The opacity problem ER is positioned to solve

The deeper concern with current AI orchestration patterns isn't that they're wrong. It's that when they're wrong, you can't tell why, and you can't defend the decision after the fact.

Decision-trace capture — the discipline of producing immutable, queryable records of how AI-driven decisions were made — has to start somewhere upstream of the model. It has to start at the data layer, because that's where the ground truth of "who is this about" gets established. If your agent decides to flag a transaction, deny a claim, route an escalation, or update a record, the first defensible question in any audit is: did the system correctly identify the entity in question? If you can't answer that with evidence, the rest of the trace is decorative.

This is what makes mature ER tooling architecturally interesting. The good implementations don't just produce identifiers. They produce:

Provenance for each merge: which source records contributed, which fields matched, which features triggered the decision.

Lineage that survives updates: when a record changes upstream, the resolution decision can be re-evaluated and the change propagated with a record of why.

Explainability at the merge level: not a black-box similarity score, but a configurable, inspectable rule set with evidence attached.

These are the same properties you want from your decision-trace layer at large. ER is, in effect, decision-trace capture for the identity question — and it's a domain where the patterns are more mature than in agent orchestration.
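Those three properties suggest a shape for the decision record itself. The sketch below is a hypothetical append-only log of merge decisions: each revision carries the evidence that produced it and a stated reason, so an upstream change produces a new version rather than overwriting history. All names and fields here are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MergeDecision:
    """One immutable resolution decision for an entity."""
    entity_id: str
    version: int
    record_ids: tuple     # which source records contributed (provenance)
    matched_fields: tuple # which features triggered the decision (explainability)
    reason: str           # why this version exists (lineage)

class ResolutionLog:
    """Append-only log: decisions are never edited, only superseded."""

    def __init__(self):
        self._log = []

    def record(self, entity_id, record_ids, matched_fields, reason):
        version = 1 + sum(1 for d in self._log if d.entity_id == entity_id)
        decision = MergeDecision(
            entity_id, version, tuple(record_ids), tuple(matched_fields), reason
        )
        self._log.append(decision)
        return decision

    def current(self, entity_id):
        """Latest decision — what downstream layers should reference."""
        return max(
            (d for d in self._log if d.entity_id == entity_id),
            key=lambda d: d.version,
        )

    def history(self, entity_id):
        """Full audit trail, oldest first."""
        return [d for d in self._log if d.entity_id == entity_id]
```

The `frozen=True` dataclass makes each decision immutable once written; the audit question "why was this merged, and when did that change?" becomes a read over `history`, not a forensic reconstruction.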

A two-tier pattern worth borrowing

There's an architectural pattern emerging in this space that's worth pulling out, because it generalizes beyond ER.

The pattern: separate your data graph from your knowledge graph. The data graph tier sits low, tracks provenance, and answers the high-resolution question of which records map to which entities, with what evidence. The knowledge graph tier sits above, carries the semantics — relationships, ontology, business meaning — and treats the resolved entities as stable references.

Why this matters: it lets you update the resolution layer without rewriting the semantic layer. New evidence arrives, a merge is revised, an entity splits — the data graph absorbs the change and the knowledge graph sees a clean, versioned reference. Provenance lives where it belongs, semantics live where they belong, and audit becomes tractable.
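A minimal sketch of the two tiers, under the same illustrative assumptions as above: the data graph owns record-to-entity mappings with their evidence, the knowledge graph stores relationships keyed only by stable entity IDs, and a revised merge touches the lower tier alone.

```python
class DataGraph:
    """Lower tier: which records resolve to which entities, with evidence."""

    def __init__(self):
        self.assignments = {}  # record_id -> (entity_id, evidence)

    def assign(self, record_id, entity_id, evidence):
        self.assignments[record_id] = (entity_id, evidence)

    def entity_of(self, record_id):
        return self.assignments[record_id][0]

class KnowledgeGraph:
    """Upper tier: semantics over stable entity IDs; never sees raw records."""

    def __init__(self):
        self.edges = set()  # (subject_entity, relation, object_entity)

    def relate(self, subj, relation, obj):
        self.edges.add((subj, relation, obj))

dg, kg = DataGraph(), KnowledgeGraph()
dg.assign("crm:42", "E1", {"matched": ["tax_id"]})
dg.assign("kyc:7", "E1", {"matched": ["tax_id"]})
kg.relate("E1", "supplies", "E9")

# New evidence splits kyc:7 into its own entity.
# Only the lower tier changes; the semantic edge on E1 is untouched.
dg.assign("kyc:7", "E2", {"matched": [], "reason": "tax ID corrected upstream"})
```

The separation is what keeps the audit tractable: a resolution revision is a data-graph event with its own record, not a rewrite of business semantics.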

The same shape applies to AI orchestration more broadly. Decision-trace capture should be a tier, not a feature bolted onto an agent's logging output. It should sit underneath the reasoning layer, immutable and queryable, with the reasoning layer treating it as a stable reference. The architectures that get this right will be the ones that can defend their AI decisions in five years. The ones that don't will be re-platforming.

What to ask your teams and your vendors

If you're evaluating where ER fits in your architecture — or evaluating an incumbent vendor's claims about AI-readiness — a few questions cut through the noise:

On the data layer: When two records are merged into one entity, can you produce the evidence trail in under a second? Can you re-evaluate that decision when new data arrives, and propagate the change with a record of why? If the answer is "we can run a job overnight," you don't have decision-trace infrastructure, you have batch reporting.

On the boundary: Where does entity resolution end and knowledge graph semantics begin? If your team can't draw that line on a whiteboard, the layers are tangled, and the audit story will be tangled with them.

On the agent layer: When an agent acts on an entity, is the resolution decision part of the trace, or is it assumed? If it's assumed, you have a defensibility gap that won't surface until your first regulatory inquiry or material incident.

On vendors: Incumbents will increasingly claim AI-readiness. The honest version of that claim has an answer to "show me how a merge decision is audited." The marketing version doesn't.
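The agent-layer question above has a simple structural answer. As a hedged sketch (the function and trace fields are hypothetical, not any framework's API): the resolution decision is captured in the trace at the moment of action, so "which entity, on what evidence" is answered before "what did the agent do."

```python
def act_on_entity(record_id: str, resolution_index: dict, action: str) -> dict:
    """Perform an agent action, embedding the resolution decision in the trace.

    resolution_index maps record_id -> (entity_id, evidence), e.g. the
    current view of an upstream data graph.
    """
    entity_id, evidence = resolution_index[record_id]
    return {
        "resolved_entity": entity_id,
        "resolution_evidence": evidence,  # captured, not assumed
        "action": action,
    }
```

If the lookup fails or the evidence is empty, that surfaces in the trace too — which is exactly the gap that otherwise stays invisible until an audit.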


The risk with the current wave of enterprise AI isn't that it fails. It's that it succeeds at producing decisions faster than organizations can defend them. Speed without traceability is a compounding liability — every confident action against a misresolved entity becomes a record you can't unwind.

Entity resolution isn't the whole answer. But it's the part of the answer that's furthest along, most underrated, and most directly connected to the question CIOs will be asked when AI decisions get challenged: can you show your work?

Build the layer. Make it auditable. Treat it as infrastructure, not a feature. The organizations that do will find the rest of the AI stack a lot easier to govern.

Building decision-trace infrastructure into your AI stack?

The Duczer East team architects entity resolution layers, knowledge graphs, and auditable orchestration systems for enterprises where AI decisions carry regulatory and operational weight.
