Trace-Indexed GraphRAG: Precedent Retrieval over the Decision Category

1. What document RAG misses for architectural work

Retrieval-augmented generation, as it is usually configured, indexes the prose of a document corpus and returns chunks topically similar to a query. The configuration is well-suited to question-answering over textual sources, and recent work has extended it to community-summarisation over knowledge graphs [@edge2024graph]. It is poorly suited to the work an enterprise architect does when consulting prior decisions.

The architect’s question is rarely “find me a document about pricing”. It is “find me a decision that touched the pricing capability under an elasticity mechanism with a v3-style product schema”. The two questions look superficially similar. They are not. The first is topical; the second is compositional. A document-level index can answer the first because topical similarity is a property of the document. It cannot answer the second because compositional structure is a property of the decision, and the decision is not the document.

The preceding article in this series argued that the right primitive for capturing the decision’s compositional structure is a decision trace: a typed walk through the multi-layer graph $\mathcal{G} = (G_C, G_M, G_K, \mathcal{B})$, with binding edges $\mathcal{B}$ that span the layers and a type signature recording which layer each visited vertex belongs to. Traces compose. They admit a provenance-equivalence relation, and the equivalence classes form a category whose morphisms are the architecturally-meaningful identity claims about decisions. This article takes the category as given and asks how to index it.

2. The trace category as an index

The trace category has two features that the document corpus lacks. First, its objects (decision contexts) are typed and structured, so equality of objects is decidable rather than approximate. Second, its morphisms (provenance-equivalence classes of traces) are decomposable: a trace can be tested for sub-trace containment, matched on a prefix, or compared by type signature, and each of those operations is a query primitive in its own right.
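The three primitives named above can be sketched on a plain in-memory representation. This is a minimal illustration, not the series' implementation: the `Step` and `Trace` classes, the vertex identifiers, and the layer tags "C", "M", and "K" are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    vertex: str  # hypothetical identifier of the visited vertex
    layer: str   # layer tag; the names "C", "M", "K" are assumed


@dataclass(frozen=True)
class Trace:
    steps: Tuple[Step, ...]

    def signature(self) -> Tuple[str, ...]:
        # The type signature is the sequence of layer tags along the walk.
        return tuple(s.layer for s in self.steps)

    def has_prefix(self, other: "Trace") -> bool:
        # True when `other` is an initial segment of this trace.
        return self.steps[:len(other.steps)] == other.steps

    def contains(self, other: "Trace") -> bool:
        # True when `other` occurs as a contiguous sub-walk of this trace.
        n, m = len(self.steps), len(other.steps)
        return any(self.steps[i:i + m] == other.steps for i in range(n - m + 1))
```

Because the dataclasses are frozen, steps compare by value, which is what makes the equality checks exact rather than approximate.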

These two features together make the category queryable in a way that the underlying documents are not. The category does not need to be embedded into a vector space to be searched, although embeddings remain useful for the natural-language content of individual nodes. The category itself is the index.

3. Four retrieval patterns

The retrieval problems that arise in architectural work fall into four patterns. Each is a different query on the trace category; each is recoverable from a small number of property-graph operations.

Pattern A — Exact context match. Given a starting context $c$, return all traces whose source context equals $c$. This is the cheap, narrow query. It answers “have we decided in this exact situation before?” and reduces to an equality predicate on the starting vertex and its incident binding edges. In practice this is the least common query: enterprise decisions rarely repeat under identical contexts, which is partly why pure case-based reasoning has had limited traction in EA tooling.
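A minimal sketch of Pattern A as a hash lookup, assuming a context can be keyed by its starting vertex together with the set of its incident binding edges. The `ContextIndex` class, the edge representation as tuples, and the example identifiers are hypothetical.

```python
from collections import defaultdict


class ContextIndex:
    """Maps a starting context to the ids of traces that begin there."""

    def __init__(self):
        self._by_context = defaultdict(list)

    @staticmethod
    def key(start_vertex, binding_edges):
        # Assumption: the order of incident binding edges does not matter.
        return (start_vertex, frozenset(binding_edges))

    def add(self, trace_id, start_vertex, binding_edges):
        self._by_context[self.key(start_vertex, binding_edges)].append(trace_id)

    def exact_match(self, start_vertex, binding_edges):
        # Pure equality on the key: no ranking and no approximation.
        return list(self._by_context.get(self.key(start_vertex, binding_edges), []))


idx = ContextIndex()
idx.add("d1", "pricing", [("pricing", "M_e")])
idx.add("d2", "pricing", [("pricing", "M_e"), ("pricing", "schema_v3")])
hits = idx.exact_match("pricing", [("pricing", "M_e")])
```

Adding a single binding edge to the context changes the key, which illustrates why this query so rarely returns anything in practice.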

Pattern B — Sub-trace containment. Given a partial trace $\sigma$, return all traces $\tau$ such that $\sigma$ is a contiguous sub-walk of $\tau$. This answers “have we made a decision that included this binding step?” — for example, “have we made a decision that bound the pricing capability to the elasticity mechanism $M_e$?”. Sub-trace containment is the workhorse query of the four. It captures the architect’s most common precedent question and is recoverable from a path-pattern query on the property graph.
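Outside the graph database, the containment test itself is a sliding-window comparison. The sketch below assumes each trace is stored as a tuple of vertex identifiers; the corpus contents and the function names are illustrative only.

```python
def is_contiguous_subwalk(sigma, tau):
    """True when sigma occurs in tau as an unbroken run of steps."""
    sigma, tau = tuple(sigma), tuple(tau)
    m = len(sigma)
    return any(tau[i:i + m] == sigma for i in range(len(tau) - m + 1))


def find_containing(sigma, corpus):
    """Return ids of all traces in corpus (id -> walk) that contain sigma."""
    return [tid for tid, tau in corpus.items() if is_contiguous_subwalk(sigma, tau)]


corpus = {
    "d1": ("pricing", "M_e", "schema_v3"),
    "d2": ("pricing", "M_x", "M_e"),
    "d3": ("billing", "pricing", "M_e"),
}
hits = find_containing(("pricing", "M_e"), corpus)
```

Here `d2` is excluded even though it visits both vertices, because the two steps are not adjacent in its walk.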

Pattern C — Type-signature matching. Given a sequence of layer-tags $(\ell_0, \ldots, \ell_k)$ — say, context → binding → causal → binding → knowledge — return all traces whose type signature matches the sequence (or matches it modulo a tolerance). This answers a strictly architectural question: “have we made a decision that followed this shape of compositional reasoning?” It is the only one of the four patterns that has no straightforward analogue in document-RAG; the shape of reasoning is invisible at the document level.
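The text leaves "modulo a tolerance" open. One possible reading, assumed here, is a bound on the edit distance between tag sequences. The sketch below uses a standard Levenshtein computation; the signature store and its contents are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two tag sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or keep
        prev = cur
    return prev[-1]


def match_signature(query, signatures, tolerance=0):
    """Return ids of traces whose signature is within `tolerance` edits of query."""
    return [tid for tid, sig in signatures.items()
            if edit_distance(query, sig) <= tolerance]


signatures = {
    "d1": ("context", "binding", "causal", "binding", "knowledge"),
    "d2": ("context", "binding", "causal", "knowledge"),
    "d3": ("context", "knowledge"),
}
query = ("context", "binding", "causal", "binding", "knowledge")
exact = match_signature(query, signatures)
loose = match_signature(query, signatures, tolerance=1)
```

With tolerance 0 the query is a plain equality test on signatures; raising the tolerance admits traces that skip or add a step.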

Pattern D — Structural similarity. Given a query trace $\tau_q$, return traces ranked by similarity to $\tau_q$ under a structural metric — typically a hybrid of (a) Jaccard overlap on the vertex set, (b) longest-common-subsequence on the type signature, and (c) cosine similarity on embeddings of the natural-language content of each node. This is the most permissive query and the one most likely to surface unexpected precedent. It is also the only one of the four patterns in which embeddings carry significant weight, and even there the embeddings are over node content, not over the trace itself.
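A sketch of one such hybrid score follows. Several choices are assumptions made for illustration: equal weights on the three components, normalising the longest common subsequence by the longer signature, and aggregating node-content similarity as the mean best-match cosine over the query's nodes. The dictionary layout of a trace is likewise hypothetical.

```python
import math


def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def lcs_len(a, b):
    # Classic dynamic programme for longest-common-subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    nu, nv = math.sqrt(sum(x * x for x in u)), math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def content_similarity(q_vecs, t_vecs):
    # Mean over query nodes of the best cosine against any candidate node.
    if not q_vecs or not t_vecs:
        return 0.0
    return sum(max(cosine(q, t) for t in t_vecs) for q in q_vecs) / len(q_vecs)


def similarity(q, t, weights=(1 / 3, 1 / 3, 1 / 3)):
    wj, wl, wc = weights  # assumed equal weighting
    longest = max(len(q["signature"]), len(t["signature"]), 1)
    return (wj * jaccard(q["vertices"], t["vertices"])
            + wl * lcs_len(q["signature"], t["signature"]) / longest
            + wc * content_similarity(q["node_vecs"], t["node_vecs"]))
```

Note that the vectors are per node; the trace as a whole is never embedded, so the structural components remain exact.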

4. Implementation: property graph plus selective embedding

The trace category projects naturally to a labelled property graph. Each visited vertex in a trace becomes a Step node whose properties include layer, position_in_trace, and a foreign key to the underlying vertex in $G_C$, $G_M$, or $G_K$. Each transition becomes a NEXT relationship between consecutive Step nodes, with the layer-or-binding tag stored as an edge property. Each whole trace is a Trace node with a STARTS_AT relationship to its first Step and a HAS_STEP relationship to all of its steps.
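The projection can be sketched without committing to a particular database by emitting plain node and relationship records. The record layout, the step identifier scheme, and the property names `vertex_id` and `tag` are assumptions; only the labels and relationship types come from the description above.

```python
def project_trace(trace_id, steps, transition_tags):
    """Project one trace to (nodes, relationships).

    steps: list of (vertex_id, layer) pairs in walk order.
    transition_tags: one layer-or-binding tag per consecutive pair of steps.
    """
    if len(transition_tags) != max(len(steps) - 1, 0):
        raise ValueError("need exactly one tag per transition")
    nodes = [{"label": "Trace", "id": trace_id}]
    rels = []
    for pos, (vertex_id, layer) in enumerate(steps):
        step_id = f"{trace_id}:{pos}"  # hypothetical id scheme
        nodes.append({"label": "Step", "id": step_id, "layer": layer,
                      "position_in_trace": pos, "vertex_id": vertex_id})
        rels.append((trace_id, "HAS_STEP", step_id, {}))
        if pos == 0:
            rels.append((trace_id, "STARTS_AT", step_id, {}))
        else:
            prev_id = f"{trace_id}:{pos - 1}"
            rels.append((prev_id, "NEXT", step_id, {"tag": transition_tags[pos - 1]}))
    return nodes, rels


nodes, rels = project_trace(
    "d1",
    [("pricing", "C"), ("M_e", "M"), ("schema_v3", "K")],
    ["binding", "binding"],
)
```

The records could then be loaded with whatever bulk-import mechanism the chosen graph store provides.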

This encoding makes each of the four retrieval patterns a small Cypher query. Pattern A is an equality match on STARTS_AT.binding_signature. Pattern B is a path pattern of the form MATCH (t:Trace)-[:HAS_STEP]->(s1)-[:NEXT]->(s2), extended by one NEXT hop per transition in $\sigma$, with predicates on every matched step; a variable-length NEXT*1.. between s1 and s2 would also match non-contiguous steps and so over-approximates containment. Pattern C is the same chained path pattern with predicates on each step’s layer, taken in order. Pattern D combines pattern B with a final ORDER BY that ranks the matched traces by a similarity score computed in the query.
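As an illustration of how the Pattern B query text might be produced for a sub-trace of any length, the sketch below generates Cypher with one NEXT hop per transition, so that the match stays contiguous. The property name `vertex_id` and the parameter names `$v0`, `$v1`, and so on are assumptions; the generated text has not been checked against any particular database version.

```python
def containment_query(k):
    """Return Cypher text that matches k consecutive Step nodes (k >= 1)."""
    if k < 1:
        raise ValueError("sub-trace must have at least one step")
    chain = "-[:NEXT]->".join(f"(s{i}:Step)" for i in range(k))
    where = " AND ".join(f"s{i}.vertex_id = $v{i}" for i in range(k))
    return (f"MATCH (t:Trace)-[:HAS_STEP]->{chain}\n"
            f"WHERE {where}\n"
            "RETURN DISTINCT t")


def containment_params(sigma):
    """Bind each vertex id of the partial trace to its query parameter."""
    return {f"v{i}": v for i, v in enumerate(sigma)}


sigma = ("pricing", "M_e")
query = containment_query(len(sigma))
params = containment_params(sigma)
```

Passing the vertex identifiers as parameters rather than interpolating them keeps the query text cacheable and avoids quoting problems.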

Embeddings enter the picture only where the corresponding question is genuinely about meaning rather than structure. The natural-language abstract on a Concept node, the prose rationale on a Step annotation — these admit semantic comparison and are appropriate sites for dense vectors. The trace structure itself does not. Embedding a typed walk into $\mathbb{R}^d$ as a single vector throws away exactly the compositional information the index was designed to preserve. The right pattern is structural retrieval first, semantic re-ranking second: pattern B or C narrows the candidate set, then optional embedding-similarity on the abstracts of the visited concepts re-ranks within the narrowed set.
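The two-stage pattern can be sketched as a filter followed by a sort. The sketch assumes each stored trace carries its walk as a tuple of vertex identifiers and a single precomputed vector pooled from the abstracts of its visited concepts; the field names `walk` and `abstract_vec` and the pooling itself are hypothetical.

```python
import math


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    nu, nv = math.sqrt(sum(x * x for x in u)), math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def retrieve(sigma, query_vec, corpus, top_k=5):
    """Structural filter (contiguous containment), then semantic re-rank."""
    sigma = tuple(sigma)
    m = len(sigma)
    # Stage 1: keep only traces that contain sigma as a contiguous sub-walk.
    candidates = [
        tid for tid, t in corpus.items()
        if any(t["walk"][i:i + m] == sigma for i in range(len(t["walk"]) - m + 1))
    ]
    # Stage 2: order the survivors by embedding similarity of their abstracts.
    candidates.sort(key=lambda tid: cosine(query_vec, corpus[tid]["abstract_vec"]),
                    reverse=True)
    return candidates[:top_k]


corpus = {
    "d1": {"walk": ("pricing", "M_e", "schema_v3"), "abstract_vec": [0.6, 0.8]},
    "d2": {"walk": ("billing", "pricing", "M_e"), "abstract_vec": [1.0, 0.0]},
    "d3": {"walk": ("pricing", "M_x"), "abstract_vec": [1.0, 0.0]},
}
ranked = retrieve(("pricing", "M_e"), [1.0, 0.0], corpus)
```

In this example `d3` has the closest abstract but never reaches the ranking stage, because it fails the structural filter.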

5. What this gives the architect, the auditor, and the architecture

Three operational consequences follow.

For the architect, true precedent retrieval becomes a one-step operation rather than a research project. The category is the index; the query is structural; the result is a small set of prior traces whose compositional shape matches the present situation and which can be inspected directly. The architect spends less time looking for precedent and more time deciding whether the precedent applies.

For the auditor, the trace category supplies explanations of a kind that flat reasoning logs cannot. A regulator asking “why was this decision made?” receives not a paragraph but a trace — a typed walk grounded in the architecture’s own graph layers, replayable under the $\mathrm{do}(\cdot)$ operator [@pearl2009causality] on its causal portion, and citing concrete entities in the knowledge layer at every step. The explanation is the trace; the trace is structurally what the decision was.

For the architecture itself, the trace corpus becomes the empirical material against which the layered graphs can be revised. A binding edge that recurs in many high-confidence traces but is absent from $\mathcal{B}$ is a candidate addition. A sub-trace shape that frequently appears with poor downstream outcomes is a candidate anti-pattern. The architecture learns its own structure from the decisions it serves, in a sense that is mechanical rather than aspirational. This is the line of work the next article in the series takes up.
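The first of these revisions, proposing binding edges, amounts to a support count over the trace corpus. The sketch below assumes each trace records a confidence score and the cross-layer transitions it actually took; the field names and both thresholds are illustrative, not values from the series.

```python
from collections import Counter


def candidate_bindings(traces, known_bindings, min_confidence=0.8, min_support=2):
    """Cross-layer edges used by enough high-confidence traces but missing from B."""
    counts = Counter()
    for t in traces:
        if t["confidence"] >= min_confidence:
            counts.update(set(t["cross_layer_edges"]))  # count once per trace
    missing = [e for e, n in counts.items()
               if n >= min_support and e not in known_bindings]
    # Most frequently recurring candidates first, ties broken by edge name.
    return sorted(missing, key=lambda e: (-counts[e], e))


known = {("pricing", "M_e")}
traces = [
    {"confidence": 0.9, "cross_layer_edges": [("pricing", "M_e"), ("pricing", "M_x")]},
    {"confidence": 0.95, "cross_layer_edges": [("pricing", "M_x"), ("billing", "M_e")]},
    {"confidence": 0.4, "cross_layer_edges": [("billing", "M_e")]},
]
proposed = candidate_bindings(traces, known)
```

The output is a list of candidates for review, not an automatic change to $\mathcal{B}$.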

Embedding-only RAG remains useful for the document corpus that sits around the architecture — the policy texts, the regulatory filings, the published papers. It is not the right primitive for the architecture itself. The category is.