Decision Traces as Multi-Layer Provenance Objects

1. The compositional question

When an enterprise architect or an agent makes a business decision (to launch a product variant, to deprecate a capability, to route a customer segment through a new channel), the decision is rarely the conclusion of a single inference. It is assembled from heterogeneous material: facts about the world (what entities exist, what attributes they carry), claims about mechanism (if we change $X$, $Y$ will follow), and judgements about scope (what part of the business this concerns, at what level of abstraction).

The standard documentary apparatus of enterprise architecture flattens this assembly. Architecture Decision Records record the conclusion, the rationale, and a few alternatives, but they do not carry a structured trace of how the decision composes from its sources. As a result, two activities that ought to be cheap are expensive: replaying a decision under counterfactual assumptions, and retrieving prior decisions that share a given decision’s compositional structure rather than merely its topic.

The thesis advanced here is that the right primitive for capturing this assembly is not a free-text rationale but a decision trace: a typed graph walk that traverses the context, causal, and knowledge graphs of the architecture, picking up the local material at each step that the decision actually rests on. The trace is a first-class provenance object in the sense of W3C PROV [@moreau2013prov], but typed by the layer it traverses — and that typing is what makes the trace useful.

2. Three layers, three vocabularies

The site’s broader model posits three coexisting graph layers, each with its own vertex and edge semantics:

The context graph $G_C$ records what is related to what in the business at a given moment — capabilities, value streams, customer segments, channels, processes. Its edges are largely associational.
The causal graph $G_M$ records mechanism: directed edges that admit interventionist semantics in Pearl’s sense [@pearl2009causality]. A walk in $G_M$ is a candidate causal pathway.
The knowledge graph $G_K$ records ontological commitments: schema-style entity definitions, taxonomic relations, named-entity assertions, and citations [@hogan2021knowledge]. It is where the decision’s vocabulary is grounded.

A decision that touches only one layer is rare. A pricing recommendation, for instance, sits in $G_C$ (which segment, which capability owns the decision), depends on $G_M$ (the price-elasticity mechanism, the substitution structure), and is grounded in $G_K$ (what counts as a “product variant”, which catalogue is canonical). Capturing the decision requires capturing the trace that visits all three.

3. The decision trace as a typed walk

Let $\mathcal{G} = (G_C, G_M, G_K, \mathcal{B})$ denote the multi-layer graph, where $\mathcal{B}$ is the set of binding edges between layers. A binding asserts identity-of-reference: that the capability node pricing in $G_C$ is the same business object whose elasticity is modelled by the mechanism node $M_e$ in $G_M$, and whose canonical schema is the entity schema:Product in $G_K$.
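One way to picture $\mathcal{G}$ is as three edge sets plus a set of cross-layer bindings. The following is a minimal sketch, not a proposed schema: the class names `Node` and `MultiLayerGraph`, the layer tags `"C"`, `"M"`, `"K"`, and the method names are all assumptions introduced for illustration. Only direct bindings are checked here; treating identity-of-reference as transitive would need an extra closure step.

```python
from dataclasses import dataclass, field

# Hypothetical layer tags: "C" = context, "M" = causal, "K" = knowledge.
LAYERS = ("C", "M", "K")


@dataclass(frozen=True)
class Node:
    layer: str  # one of LAYERS
    name: str   # identifier local to that layer


@dataclass
class MultiLayerGraph:
    # Intra-layer edges, keyed by layer tag.
    edges: dict = field(default_factory=lambda: {layer: set() for layer in LAYERS})
    # Binding edges: unordered pairs of nodes from two different layers.
    bindings: set = field(default_factory=set)

    def add_edge(self, src: Node, dst: Node) -> None:
        if src.layer != dst.layer:
            raise ValueError("an intra-layer edge must stay within one layer")
        self.edges[src.layer].add((src, dst))

    def bind(self, a: Node, b: Node) -> None:
        if a.layer == b.layer:
            raise ValueError("a binding must connect two different layers")
        self.bindings.add(frozenset((a, b)))

    def directly_bound(self, a: Node, b: Node) -> bool:
        # True only for an explicitly asserted binding (no transitive closure).
        return frozenset((a, b)) in self.bindings


# The pricing example from the text.
pricing = Node("C", "pricing")
elasticity = Node("M", "M_e")
product = Node("K", "schema:Product")

g = MultiLayerGraph()
g.bind(pricing, elasticity)
g.bind(pricing, product)
```

Storing a binding as an unordered pair reflects that identity-of-reference has no direction, whereas intra-layer edges stay directed.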

A decision trace $\tau$ is then a sequence

$$\tau = (n_0, e_0, n_1, e_1, \ldots, e_{k-1}, n_k)$$

where each $n_i$ is a vertex of $\mathcal{G}$ and each $e_i$ is either an intra-layer edge of one of $G_C, G_M, G_K$, or a binding edge in $\mathcal{B}$. The trace carries a type signature — the sequence of layer-tags $(\ell_0, \ell_1, \ldots, \ell_k)$ recording which layer each $n_i$ belongs to. Two traces with the same vertices but different type signatures are different traces: the layer the architect was operating in when they touched a node is part of what the decision rests on.
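The role of the type signature can be illustrated with a small sketch. It assumes, for illustration only, that a bound business object keeps the same identifier in every layer, so that a trace is a sequence of (vertex, layer) visits; the names `Visit`, `Trace`, and the example vertex `margin` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Visit:
    vertex: str  # identifier of the business object being touched
    layer: str   # layer tag: "C" (context), "M" (causal), "K" (knowledge)


@dataclass(frozen=True)
class Trace:
    visits: Tuple[Visit, ...]

    @property
    def vertices(self) -> Tuple[str, ...]:
        return tuple(v.vertex for v in self.visits)

    @property
    def signature(self) -> Tuple[str, ...]:
        # The sequence of layer tags (l_0, ..., l_k).
        return tuple(v.layer for v in self.visits)

    def binding_steps(self):
        # Consecutive visits in different layers are steps along a binding edge.
        pairs = zip(self.visits, self.visits[1:])
        return [(a, b) for a, b in pairs if a.layer != b.layer]


# Same vertices, different layers: one walk goes through the mechanism,
# the other through the schema.
via_mechanism = Trace((Visit("pricing", "C"), Visit("pricing", "M"), Visit("margin", "M")))
via_schema = Trace((Visit("pricing", "C"), Visit("pricing", "K"), Visit("margin", "K")))

print(via_mechanism.vertices == via_schema.vertices)  # same vertices
print(via_mechanism == via_schema)                    # different traces
```

Because the layer tag is part of each visit, equality on traces already distinguishes the two walks without any extra comparison logic.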

This formalism is deliberately spare. It is closer to PROV’s notion of an activity-and-derivation graph than to a process-mining trace through a single process model [@vanderaalst2016process]. The closeness to PROV is intentional, and so is the one departure from it: PROV’s wasDerivedFrom relation is layer-agnostic and therefore cannot distinguish a context dependency from a causal one. The layer-tagged trace makes that distinction without inventing much new vocabulary, since prov:Activity can be subtyped by layer in a small extension ontology.

4. Trace equivalence and the algebra of decisions

Defining the trace as a typed walk admits a natural equivalence relation, and the equivalence is where the theory earns its keep. Two traces $\tau$ and $\tau'$ may be considered:

Outcome-equivalent, if they terminate at the same decision node;
Provenance-equivalent, if they additionally traverse the same set of binding edges (i.e. they rest on the same cross-layer identifications);
Structurally identical, if they coincide as sequences.

Outcome equivalence is the coarsest and the least useful: it collapses the space of decisions back to its conclusions. Structural identity is the finest and rarely interesting in practice — two architects will not, in general, walk a multi-layer graph in identical order even when they make the same decision. Provenance equivalence is the productive middle. It captures the intuition that two decisions are the same decision precisely when they rest on the same compositional commitments, even when they walked between those commitments in a different order.
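The three relations can be written as predicates over a simple edge-list encoding of a trace. This is a sketch under assumptions: the `Edge` tuple, the string node identifiers, and the decision node `C:launch` are invented for the example, and binding edges are compared as unordered pairs so that the direction of traversal does not matter.

```python
from typing import NamedTuple, Tuple


class Edge(NamedTuple):
    src: str
    dst: str
    binding: bool  # True for a cross-layer binding edge


Trace = Tuple[Edge, ...]


def terminal(trace: Trace) -> str:
    return trace[-1].dst


def bindings(trace: Trace) -> frozenset:
    # Set of binding edges traversed, ignoring order and direction.
    return frozenset(frozenset((e.src, e.dst)) for e in trace if e.binding)


def outcome_equivalent(a: Trace, b: Trace) -> bool:
    return terminal(a) == terminal(b)


def provenance_equivalent(a: Trace, b: Trace) -> bool:
    return outcome_equivalent(a, b) and bindings(a) == bindings(b)


def structurally_identical(a: Trace, b: Trace) -> bool:
    return a == b


# Two walks that rest on the same two bindings, visited in a different order.
t1 = (
    Edge("M:M_e", "C:pricing", True),
    Edge("C:pricing", "K:schema:Product", True),
    Edge("K:schema:Product", "C:pricing", True),
    Edge("C:pricing", "C:launch", False),
)
t2 = (
    Edge("K:schema:Product", "C:pricing", True),
    Edge("C:pricing", "M:M_e", True),
    Edge("M:M_e", "C:pricing", True),
    Edge("C:pricing", "C:launch", False),
)
# A walk that reaches the same decision without any cross-layer commitment.
t3 = (Edge("C:pricing", "C:launch", False),)
```

Here `t1` and `t2` are provenance-equivalent but not structurally identical, while `t3` is only outcome-equivalent to both.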

The provenance-equivalence classes inherit a partial order from sub-trace containment, and they compose under a sequencing operation $\tau_1 \cdot \tau_2$, defined whenever $\tau_2$ starts at the terminal vertex of $\tau_1$. The resulting structure is a trace algebra: a small category whose objects are decision contexts and whose morphisms are provenance-equivalence classes of traces between them, with the zero-length trace at each context serving as the identity. The architect’s question “have we made this decision before?” becomes, formally, “is there a morphism in the trace category whose source matches the present context?”
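The composition can be sketched at the level of equivalence classes. The sketch assumes that a class is identified by its source context, its target context, and its set of binding edges; including the source is an assumption added here so that composition is well defined. The names `TraceClass`, `compose`, and `identity`, and the example contexts, are hypothetical.

```python
from typing import FrozenSet, NamedTuple


class TraceClass(NamedTuple):
    source: str                # decision context the traces start from
    target: str                # decision context (or node) they end at
    bindings: FrozenSet[str]   # labels of the binding edges they rest on


def compose(first: TraceClass, second: TraceClass) -> TraceClass:
    # tau_1 . tau_2 is defined only when tau_2 starts where tau_1 ends.
    if first.target != second.source:
        raise ValueError("second trace must start at the end of the first")
    return TraceClass(first.source, second.target, first.bindings | second.bindings)


def identity(context: str) -> TraceClass:
    # The zero-length trace at a context: no steps, no bindings.
    return TraceClass(context, context, frozenset())


segment = TraceClass("segment-review", "pricing-review", frozenset({"segment~schema"}))
pricing = TraceClass("pricing-review", "variant-launch", frozenset({"pricing~M_e"}))
rollout = TraceClass("variant-launch", "channel-routing", frozenset({"channel~schema"}))

whole = compose(compose(segment, pricing), rollout)
```

Because set union is associative and the empty set is its unit, the category laws hold for this encoding by construction.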

5. Operational consequences

Three consequences of this framing matter in practice and motivate building the trace data structure into the site’s content model and the GraphRAG examples it carries.

Replay under intervention. Because the causal portion of a trace lives in $G_M$, the trace can be replayed under a $\mathrm{do}(\cdot)$ intervention by deleting the edges into the intervened node in $G_M$ (Pearl’s graph surgery, which yields the mutilated graph) and recomputing forward. The context and knowledge portions of the trace are unaffected by interventions that target mechanism alone; this is precisely the locality property the layered model exists to provide.
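A toy replay makes the surgery concrete. The sketch below assumes a linear mechanism in which each node's value is its baseline plus a weighted sum of its parents; that functional form, the node names, and the weights are illustrative assumptions, not part of the model described above. It uses `graphlib.TopologicalSorter` from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter


def evaluate(edges, baseline):
    """Compute node values forward through a linear causal DAG.

    edges:    {(parent, child): weight}
    baseline: {node: exogenous value}; must contain every node.
    """
    parents = {node: {} for node in baseline}
    for (parent, child), weight in edges.items():
        parents[child][parent] = weight
    # TopologicalSorter expects {node: predecessors}; parents come out first.
    order = TopologicalSorter({n: set(ps) for n, ps in parents.items()}).static_order()
    values = {}
    for node in order:
        values[node] = baseline[node] + sum(w * values[p] for p, w in parents[node].items())
    return values


def do(edges, baseline, node, value):
    """Graph surgery: cut every edge into `node` and pin its value."""
    cut = {(p, c): w for (p, c), w in edges.items() if c != node}
    pinned = {**baseline, node: value}
    return cut, pinned


# Hypothetical mechanism: cost raises price, price lowers demand.
edges = {("cost", "price"): 0.5, ("price", "demand"): -2.0}
baseline = {"cost": 10.0, "price": 5.0, "demand": 100.0}

observed = evaluate(edges, baseline)
replayed = evaluate(*do(edges, baseline, "price", 12.0))

print(observed["demand"])  # 80.0
print(replayed["demand"])  # 76.0
```

Only the edge set and baseline of the mechanism layer are touched by `do`; nothing about the context or knowledge layers enters the computation, which is the locality property in miniature.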

Precedent retrieval. GraphRAG over a corpus of decision traces — rather than over a corpus of decision documents — produces retrieval that respects compositional structure. A query such as “decisions that touched the pricing capability under the elasticity mechanism with the v3 product schema” has an answer in the trace category that no document-level retrieval can construct.
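The structural part of such a query can be sketched as a filter over stored traces. This is not a GraphRAG pipeline, only the compositional predicate that one would sit in front of or alongside retrieval; the corpus, the trace identifiers, and the node labels are made up for the example.

```python
# Each stored trace is reduced to the set of (layer, node) pairs it visited.
# Identifiers such as "ADR-017" are hypothetical.
corpus = {
    "ADR-017": {("C", "pricing"), ("M", "elasticity"), ("K", "product-schema-v3")},
    "ADR-021": {("C", "pricing"), ("M", "elasticity"), ("K", "product-schema-v2")},
    "ADR-034": {("C", "pricing"), ("K", "product-schema-v3")},
    "ADR-040": {("C", "channel"), ("M", "elasticity"), ("K", "product-schema-v3")},
}


def retrieve(corpus, required):
    """Return ids of traces that visited every required (layer, node) pair."""
    return sorted(tid for tid, visits in corpus.items() if required <= visits)


query = {("C", "pricing"), ("M", "elasticity"), ("K", "product-schema-v3")}
print(retrieve(corpus, query))  # ['ADR-017']
```

A topic-level search for "pricing" would return three of the four records; the structural filter returns only the one that rests on all three commitments.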

Architecture as a learning system. When traces are captured at runtime — when an agent or a process records the walk it actually performed — the aggregate of traces is empirical material against which the layered graphs themselves can be revised. Edges that appear in many traces but are absent from the architect’s $G_C$ are candidate omissions. Bindings that frequently underwrite mutually inconsistent traces are candidate misalignments. The architecture becomes self-correcting in a sense that is more than rhetorical: it learns the structure of the decisions it is actually called upon to support.
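The first of these signals, edges that recur in traces but are missing from the declared context graph, reduces to a counting exercise. The sketch below assumes traces are recorded as lists of (source, target) edges; the function name `candidate_omissions`, the `min_support` threshold, and the sample data are illustrative assumptions.

```python
from collections import Counter


def candidate_omissions(traces, declared_edges, min_support=2):
    """Edges seen in at least `min_support` traces but absent from the declared graph."""
    # Count each edge once per trace, so a loop inside one trace does not inflate support.
    support = Counter(edge for trace in traces for edge in set(trace))
    return sorted(
        edge for edge, count in support.items()
        if count >= min_support and edge not in declared_edges
    )


# Declared context graph edges (hypothetical).
declared = {("pricing", "segment"), ("segment", "channel")}

# Walks recorded at runtime (hypothetical).
recorded = [
    [("pricing", "segment"), ("pricing", "loyalty")],
    [("pricing", "loyalty"), ("segment", "channel")],
    [("pricing", "loyalty"), ("pricing", "loyalty"), ("channel", "returns")],
]

print(candidate_omissions(recorded, declared))  # [('pricing', 'loyalty')]
```

The threshold keeps one-off walks such as `("channel", "returns")` from being proposed as revisions; where to set it is a judgement call that depends on how many traces are captured.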

The hard work ahead, taken up in subsequent articles, is twofold. First: a workable schema for the trace itself — the minimum data required to make the trace category navigable and queryable in a typical graph database. Second: a calculus of trace refinement — what it means for a trace to be elaborated as the architecture’s graphs themselves grow. Both questions reduce, after appropriate setup, to questions about functors between graph layers; both will recur.