Self-knowledge. Institutional knowledge. Environmental awareness. What comes after pipeline maturity.
Part 2 of the 9 Stages framework
Stages 7–9: an expanding sphere of awareness — self, organisation, environment
Stages 1–6 describe sequential, testable pipeline maturity. Each stage is a concrete capability you either have or you don’t. The transitions are sharp and the benchmarks are real.
Stages 7–9 are different. They do not describe a sequence of capabilities layered one atop the other. They describe an expanding sphere of awareness that develops concurrently — three concentric rings of knowledge, each ring encompassing the ones inside it.
The boundaries between these stages blur in practice. Steward agents — the thread that runs through all three — appear at Stage 7 in limited form and mature throughout 8 and 9 as the available knowledge deepens.
Stage 7: Knowledge accumulation · Current-state descriptor · Steward agents
Stage 7 is where the pipeline stops treating each production run as isolated. It develops inner awareness: it knows what it built, what happened during production, what failed, what was remediated, and what the current state of the system is.
Every pipeline run writes structured data to a persistent store: specs, deficiency records, severity classifications, resolution paths, stall events, model performance metrics, scores per phase, tickets, review artefacts, test outcomes, and remediation history.
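One way to picture the per-run record is as a small typed structure appended to a durable log. The sketch below is illustrative only: the names RunRecord, DeficiencyRecord, append_run and load_runs, the field names, and the choice of a JSON Lines file as the persistent store are all assumptions, not something the framework prescribes.

```python
import json
import tempfile
from dataclasses import dataclass, field, asdict
from pathlib import Path


@dataclass
class DeficiencyRecord:
    # Hypothetical shape: one deficiency found during a run.
    summary: str
    severity: str          # assumed scale, e.g. "low", "medium", "high"
    resolution_path: str   # how the deficiency was remediated


@dataclass
class RunRecord:
    # Hypothetical shape: what one pipeline run contributes to the store.
    run_id: str
    spec_id: str
    phase_scores: dict[str, float] = field(default_factory=dict)
    deficiencies: list[DeficiencyRecord] = field(default_factory=list)
    stall_events: list[str] = field(default_factory=list)


def append_run(store: Path, record: RunRecord) -> None:
    """Append one run as a single JSON line; earlier runs are never rewritten."""
    with store.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(record)) + "\n")


def load_runs(store: Path) -> list[dict]:
    """Read back every run recorded so far, oldest first."""
    with store.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

An append-only log is used here because it matches the idea that each run adds to what is known without overwriting earlier runs; a real store would also need the classification and versioning described in the surrounding text.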
“Every pipeline run adds to what the system knows. Nothing is forgotten.”
Without governed knowledge, all of this is scattered across JSON state files, review artefacts in directories, and git trailers on commits. Stage 7 is where that changes. The knowledge is classified, maintained, versioned, and made available to agents and pipeline phases that need it.
The pipeline accumulates governed knowledge across every production run
All stored knowledge must be explicitly categorised. Without classification, retrieval quality collapses and the knowledge base becomes a dump rather than a resource.
Business goals, product purpose, domain requirements. Why this system exists.
Regulatory rules, security policies, organisational policies. What rules govern it.
Architecture, invariants, interfaces, ADRs with rationale and expiration conditions.
The current-state system descriptor. What it actually is right now — not what was planned.
Test results, incidents, deficiency records, failure patterns, remediation history.
Pipeline profiles, production rules, workflow definitions. How it is built and operated.
Concurrency behaviour, failure modes, transaction boundaries, cache semantics.
Schemas, migrations, lifecycle constraints, storage ownership. How data evolves.
Trust boundaries, threat models, authentication semantics, dependency risk.
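The classification requirement can be sketched as a store that refuses any entry without an explicit category. The nine enum labels below are hypothetical names chosen to match the nine descriptions above in order; the KnowledgeBase class and its methods are likewise assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    # Hypothetical labels, one per category described above, in order.
    INTENT = "intent"
    GOVERNANCE = "governance"
    DESIGN = "design"
    REALITY = "reality"
    EVIDENCE = "evidence"
    PRODUCTION = "production"
    RUNTIME = "runtime"
    DATA = "data"
    SECURITY = "security"


@dataclass(frozen=True)
class KnowledgeItem:
    category: Category
    text: str


class KnowledgeBase:
    def __init__(self) -> None:
        self._items: list[KnowledgeItem] = []

    def add(self, category: Category, text: str) -> KnowledgeItem:
        # Reject anything that has not been explicitly classified.
        if not isinstance(category, Category):
            raise ValueError("every item needs an explicit Category")
        item = KnowledgeItem(category, text)
        self._items.append(item)
        return item

    def by_category(self, category: Category) -> list[KnowledgeItem]:
        # Retrieval is scoped to one category at a time.
        return [i for i in self._items if i.category is category]
```

Making the category a required, typed argument means an unclassified item fails at write time rather than degrading retrieval later.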
Implementation always discovers things original documents did not know. The Reality layer captures what was actually built, not what was planned.
Intent: Why the system exists. Business goals, product purpose, domain requirements.
Design: How the system was planned. Architecture, interfaces, invariants, acceptance criteria.
Reality: What the system actually is right now. Components, runtime behaviour, deployment topology, invariants.
The current-state system descriptor
The pipeline must maintain a continuously updated, as-built description of what the system actually is right now. This descriptor is separate from PRDs (intent) and specs (planned design): it describes reality.
The descriptor requires its own lifecycle: evidence gathering, synthesis, validation, versioning. If maintained too manually, it goes stale. If too automatically generated, it becomes polished fiction. The hard problem is not generating a description — it is keeping it truthful.
Every statement must carry an explicit epistemic status: intended, implemented, observed, inferred, validated, disputed, deprecated, or provisional. Many facts are only true in specific contexts — behind a feature flag, after a migration, in one tenant but not another.
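A descriptor statement with an explicit epistemic status and an optional context might look like the sketch below. The eight status values come from the text; the Statement class, the context-as-dictionary representation, and the choice of which statuses count as trusted are assumptions for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    INTENDED = "intended"
    IMPLEMENTED = "implemented"
    OBSERVED = "observed"
    INFERRED = "inferred"
    VALIDATED = "validated"
    DISPUTED = "disputed"
    DEPRECATED = "deprecated"
    PROVISIONAL = "provisional"


@dataclass
class Statement:
    text: str
    status: Status
    # Conditions under which the statement holds; empty means "everywhere".
    context: dict[str, str] = field(default_factory=dict)

    def applies_in(self, environment: dict[str, str]) -> bool:
        # Every condition must match the environment being asked about.
        return all(environment.get(k) == v for k, v in self.context.items())


def statements_for(statements, environment,
                   trusted=(Status.OBSERVED, Status.VALIDATED)):
    """Keep statements that hold in this environment and have a trusted status.

    Which statuses count as trusted is an assumption of this sketch.
    """
    return [s for s in statements
            if s.status in trusted and s.applies_in(environment)]
```

For example, a statement observed only behind a feature flag is returned when that flag is on in the queried environment and filtered out otherwise, while a merely intended statement is never returned under the default trusted set.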
A system with mature Stage 7 capabilities can take everything learned from a messy development history — every deficiency, edge case, security finding, performance issue, behavioural contract, architectural lesson — synthesise it into a new specification set, and produce a clean codebase built to target quality with all hard-won knowledge baked in from day one.
The specs are not re-derived from scratch. They inherit the full accumulated context: anti-requirements from every failed edge case, invariants from every incident, security constraints from every finding.
The organisation’s investment is not in the code. It is in the spec history, the deficiency history, the organisational context, and the behavioural contracts. The code is regenerated. Nothing learned is lost.
Governance constraints are not restrictions to be relaxed. They are a constitutive feature of governed autonomy.
Stage 8: Organisational knowledge · Regulatory reasoning · Institutional correctness
Self-knowledge from Stage 7 is now mature. Stage 8 adds a new knowledge domain: understanding the institution the system serves and the regulatory landscape around it.
When generating a new artefact, the pipeline applies the organisation’s coding conventions automatically, applies security rules without being told, and uses approved infrastructure patterns without the spec needing to enumerate them. Institutional correctness becomes a property of the output, not a checklist to verify after the fact.
“A hospital in Toronto implies PHIPA. The system knows that without being told.”
Specifications become shorter and more focused because all boilerplate institutional requirements are inferred rather than stated. Specifiers describe intent and behaviour — the pipeline applies institutional and regulatory constraints as a function of knowing who it is working for.
Artefacts are no longer generic — they are institutionally correct
A worked example of Stage 8 inference
A hospital in Toronto implies PHIPA privacy obligations, Ontario healthcare regulations, provincial reporting requirements, healthcare audit logging, healthcare data standards, and hospital-specific operational policies. A specification for a patient intake workflow does not need to enumerate the privacy requirements — those are part of what it means to build that kind of system for that kind of organisation.
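The inference in this example can be sketched as a rule table that maps attributes of the organisation to the constraints they imply. Everything here is hypothetical: the OrgProfile fields, the RULES table, and the split of which constraint follows from which attribute are illustrative assumptions, not legal guidance. The constraint names themselves are taken from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrgProfile:
    sector: str
    region: str


# Hypothetical rule table: (required profile attributes, implied constraints).
RULES = [
    ({"sector": "healthcare"},
     ["healthcare audit logging", "healthcare data standards"]),
    ({"sector": "healthcare", "region": "Ontario"},
     ["PHIPA privacy obligations", "Ontario healthcare regulations",
      "provincial reporting requirements"]),
]


def implied_constraints(profile: OrgProfile) -> list[str]:
    """Collect every constraint whose conditions all match the profile."""
    found: list[str] = []
    for conditions, constraints in RULES:
        if all(getattr(profile, k) == v for k, v in conditions.items()):
            found.extend(c for c in constraints if c not in found)
    return found
```

With this shape, a specification for a patient intake workflow only needs to name the organisation; the privacy and reporting constraints are looked up rather than restated in each spec.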
Stage 8 is not about having access to more documents. Most organisations have captured far more institutional knowledge than they realise. The problem has three parts: finding that knowledge, synthesising it, and applying it consistently.
The organisation does not know what it knows. The information is captured but scattered across dozens of systems and formats. A security policy exists somewhere. An infrastructure standard exists somewhere else. The knowledge is there but effectively invisible at the moment it’s needed.
Even when information is found, it must be combined, reconciled, and interpreted. A security policy says one thing. An infrastructure standard says something different. A regulatory requirement adds a third constraint. A human expert synthesises these — but that synthesis happens in the expert’s head, is not recorded, and must be repeated every time.
Even when knowledge is discovered and synthesised, applying it consistently across all artefact production is a process problem. A developer building a new service may or may not check the security policy wiki. Stage 8 builds the machinery to find knowledge, combine it coherently, and apply it reliably without human intervention at each step.
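The synthesis step, where several sources constrain the same thing, can be sketched as a reconciliation function that records which source decided the outcome, so the reasoning no longer lives only in an expert's head. The "strictest bound wins" policy, the numeric representation, and all names and values below are assumptions for illustration; real policies often conflict in ways a simple numeric comparison cannot settle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    source: str       # where the rule came from
    parameter: str    # what it constrains
    limit: float      # the numeric bound
    kind: str         # "max" (upper bound) or "min" (lower bound)


@dataclass(frozen=True)
class Resolved:
    limit: float
    decided_by: str               # source whose bound was chosen
    considered: tuple[str, ...]   # every source that was consulted


def reconcile(constraints: list[Constraint]) -> dict[tuple[str, str], Resolved]:
    """Group by (parameter, kind) and keep the strictest bound in each group."""
    grouped: dict[tuple[str, str], list[Constraint]] = {}
    for c in constraints:
        grouped.setdefault((c.parameter, c.kind), []).append(c)

    resolved = {}
    for key, group in grouped.items():
        # Assumed policy: lowest upper bound, or highest lower bound.
        pick = min if key[1] == "max" else max
        winner = pick(group, key=lambda c: c.limit)
        resolved[key] = Resolved(winner.limit, winner.source,
                                 tuple(c.source for c in group))
    return resolved
```

Because each result carries its provenance, the same reconciliation does not have to be repeated by a person every time an artefact is produced, and a reviewer can see which policy drove each value.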
Every gate applies organisational and regulatory constraints automatically
Stage 9: Environmental monitoring · Goal-driven origination · Steward coordination
Stage 9 adds forward-looking, outward-facing awareness. The system monitors technology evolution, ecosystem shifts, user behaviour, dependency health, and external change.
Most significantly: the system proactively proposes work rather than waiting for specifications to be submitted. Steward agents reach full maturity, operating with comprehensive autonomy within governance constraints.
“The system doesn’t wait to be asked. It monitors, anticipates, proposes.”
Steward agents monitor continuously across four signal domains
Steward agents at Stage 9 continuously monitor four signal domains, each with distinct character and distinct implications for what work should be proposed.
Performance degradation, crashes, runtime exceptions, error logs, failing validators, infrastructure warnings, resource usage spikes, latency increases, service outages. At Stage 9 these are evaluated with full institutional and environmental context — a latency increase may have regulatory implications.
Outdated documentation, diagrams inconsistent with code, tutorials no longer working, dataset drift, cross-artefact inconsistencies, stale descriptor sections. Frequently missed in systems without dedicated stewardship — the system keeps running while documentation slowly diverges from reality.
Regulatory updates, dependency upgrades and security advisories, API changes from external systems, infrastructure changes, new technologies, industry standard updates, organisational policy changes, branding updates. Stage 9 stewards propose artefact updates before the environment forces them.
Bug reports, feature requests, user suggestions, support tickets, survey responses, internal feedback, forum discussions, and conversational AI interactions. Unlike the other three domains, human signals are unstructured, sometimes contradictory, and require aggregation and pattern detection before they can drive meaningful work.
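The difference between human signals and the other three domains can be sketched as a routing step: structured signals are passed through individually, while human signals are aggregated and only surface once a pattern appears. The domain labels OPERATIONAL, DRIFT and ENVIRONMENT are hypothetical names for the first three descriptions above, and the count-based threshold is an assumed stand-in for real pattern detection.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Domain(Enum):
    OPERATIONAL = "operational"   # hypothetical label
    DRIFT = "drift"               # hypothetical label
    ENVIRONMENT = "environment"   # hypothetical label
    HUMAN = "human"


@dataclass(frozen=True)
class Signal:
    domain: Domain
    topic: str


def actionable_topics(signals: list[Signal], human_threshold: int = 3) -> list[str]:
    """Return topics worth proposing work for, in first-seen order."""
    topics: list[str] = []
    # Non-human signals are structured, so each one counts on its own.
    for s in signals:
        if s.domain is not Domain.HUMAN and s.topic not in topics:
            topics.append(s.topic)
    # Human signals are aggregated and only count once they repeat enough.
    human_counts = Counter(s.topic for s in signals if s.domain is Domain.HUMAN)
    for topic, count in human_counts.items():
        if count >= human_threshold and topic not in topics:
            topics.append(topic)
    return topics
```

A single feature request is ignored under the default threshold, while the same request arriving three times becomes a candidate for proposed work.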
The most significant capability introduced at Stage 9 is not signal monitoring but goal-driven origination: a qualitative shift in how work is originated.
Every proposed piece of work passes through the governance layer — impact evaluation, budget assessment, approval thresholds, institutional compliance checks — before entering production. The system does not execute on goals unilaterally. It generates the map of what execution would require and submits that map for human review.
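A governance gate of this kind can be sketched as a function that either rejects a proposal with reasons or queues it for a named human approver, and never approves on its own. The Proposal and Decision shapes, the three-level impact scale, and the APPROVERS mapping are hypothetical and chosen only to illustrate the checks listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    title: str
    impact: str            # assumed scale: "low", "medium", or "high"
    estimated_cost: float
    compliant: bool        # result of an institutional compliance check


@dataclass
class Decision:
    status: str            # "rejected" or "awaiting_human_review"
    reasons: list[str] = field(default_factory=list)
    approver: str = ""


# Hypothetical approval thresholds: higher impact needs a more senior approver.
APPROVERS = {
    "low": "team lead",
    "medium": "engineering manager",
    "high": "governance board",
}


def govern(p: Proposal, remaining_budget: float) -> Decision:
    reasons = []
    if not p.compliant:
        reasons.append("fails institutional compliance check")
    if p.estimated_cost > remaining_budget:
        reasons.append("exceeds remaining budget")
    if p.impact not in APPROVERS:
        reasons.append("unknown impact level")
    if reasons:
        return Decision("rejected", reasons)
    # The gate never approves by itself; it only routes to a human.
    return Decision("awaiting_human_review", approver=APPROVERS[p.impact])
```

The absence of an "approved" status is deliberate: the best outcome the gate can produce is a request for human review.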
Once multiple steward agents are active — interacting through shared knowledge and dependency graphs — the system begins exhibiting properties that no individual component was designed to produce.
The system mirrors what Minsky described as a society of mind: intelligence emerging from the interaction of specialists, none of which is intelligent in isolation. No single steward understands the whole system, and there is no central controller, yet collectively the stewards produce system-level behaviour that is adaptive and self-correcting. That behaviour is emergent, not centrally directed.
Each steward acts locally — monitoring its artefact, proposing changes within its scope, responding to its own signals. The aggregate effect is system-level maintenance, improvement, and evolution that no one designed at the global level. Feedback loops drive adaptation: steward actions change the system, changed state produces new signals, other stewards respond.
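The feedback loop can be illustrated with a toy simulation in which each steward sees only its own artefact and its direct dependencies, yet an upstream change still propagates through the whole chain. The Steward class, the version-number model of staleness, and the artefact names are assumptions made for the sketch; real stewards would propose changes through governance rather than apply them directly.

```python
from dataclasses import dataclass


@dataclass
class Steward:
    artefact: str
    depends_on: list[str]

    def act(self, versions: dict[str, int]) -> bool:
        """Look only at this artefact and its direct dependencies."""
        newest = max((versions[d] for d in self.depends_on), default=0)
        if versions[self.artefact] < newest:
            # A local fix that changes shared state and so signals others.
            versions[self.artefact] = newest
            return True
        return False


def run_until_quiet(stewards: list[Steward], versions: dict[str, int],
                    max_rounds: int = 10) -> int:
    """Run rounds until no steward acts.

    Returns the number of the first quiet round, or max_rounds if the
    system has not settled by then.
    """
    for round_no in range(1, max_rounds + 1):
        acted = [s.act(versions) for s in stewards]
        if not any(acted):
            return round_no
    return max_rounds
```

No steward in this sketch knows about the full dependency chain. The tutorial is brought up to date only because the documentation steward acted first and thereby changed what the tutorial steward observes in the next round.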
The system likely has a phase transition around knowledge density. Below a certain density, the knowledge base is too sparse to support useful steward reasoning. Above that threshold, stewards begin producing genuinely useful proposals. Understanding where that transition lies and how to cross it efficiently is a practical priority for organisations building toward these capabilities.
Governance design is the most consequential activity in the entire system. Not pipeline design, not steward agent design, not knowledge engineering. How the governance layer is constructed determines what the system tends toward. The system self-organises toward whatever configuration the constraints make stable.
The framework described in Stages 7–9 is coherent and directionally correct, but several of its constitutive problems remain genuinely unsolved.
Stages 7–9 require building and maintaining a knowledge ontology, promotion rules, truth tracking, contradiction handling, formal reasoning infrastructure, descriptor maintenance pipelines, cross-artefact consistency checking, and self-reference control boundaries. Each is a serious engineering problem on its own. The overhead is only justified at scale.
The current-state system descriptor is the anchor for all reasoning above Stage 6. If it drifts into fiction — describing what the system was designed to be rather than what it actually is — the entire knowledge and reasoning layer degrades. Keeping the descriptor truthful without it becoming stale documentation or LLM-generated sludge is the single hardest unsolved problem in the framework.
Deciding what generalises from a local incident to institutional knowledge is one of the hardest judgment calls in engineering. The same edge case can be a quirk of one deployment configuration or a fundamental constraint that applies everywhere. Getting promotion rules right may require the rules themselves to be learned rather than specified.
Earlier higher-level production models (CASE tools, model-driven engineering, software factories) often failed for recognisable reasons: the cost of maintaining the models exceeded the productivity gains, and the models drifted from reality. The strongest answer to why this attempt is different: AI generation capability has changed the economics of artefact production so dramatically that the knowledge governance layer, always the right idea but previously too expensive, is now viable.
The trajectory described here is not a prediction that all organisations will reach Stage 9. It is a map of what becomes possible as AI generation capabilities mature and organisations invest in the knowledge infrastructure to exploit them.