
AI Agent Factory: building a governed software factory control plane

I’ve got a project on my hands.


I’m building what I can only describe as a “software factory control plane”. Not another chatbot. Not a demo that looks good until it touches real delivery. A proper system that accepts a request, turns it into an executable plan, dispatches jobs to workers, and keeps the whole thing durable and auditable.


It’s genuinely fun to build, because it forces all the right questions early: what counts as state, what counts as an artifact, what requires approval, and what happens when things crash mid-flight.


What problem am I solving?


Most “AI for engineering” approaches are execution-first. Someone prompts an agent, it runs tools, it changes stuff, and you get a result. Sometimes it’s brilliant. Sometimes it’s chaos. Almost always it’s hard to answer basic questions afterwards:

  • What exactly ran?
  • What changed?
  • Who approved it?
  • Can we reproduce it?
  • Can we recover if it dies halfway through?


If the work has any blast radius, that’s not good enough. So the purpose here is to make AI-assisted delivery feel like a platform capability, not a gamble.


The shape of the system


At the core is a control plane that models work as a lifecycle:

  1. Sessions and requests: A user creates a session, then submits a request (a task). That request becomes the root record for everything that follows.
  2. Planning as a first-class output: A planning worker takes a request and writes a plan artifact. Not vibes. Not chat logs. A persisted artifact you can retrieve later.
  3. Dispatch with real semantics: Jobs don’t get “fired off”. They are queued and processed with proper lifecycle semantics: claim, renew, complete, fail.


Dispatch uses Azure Service Bus queues (not Event Hubs). It’s a job system, not a telemetry firehose.

  4. SQL-first durability: SQL is the source of truth for sessions, requests, jobs, artifacts, approvals, and installations. If the service restarts, state is retained. That’s the bar.
  5. Governance is not a bolt-on: Anything that touches real systems (repo changes, pipelines, infra) is brokered behind policy and approvals. The control plane is the place where you encode the guardrails.
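
To make the claim, renew, complete, fail semantics concrete, here is a minimal in-memory sketch in Python. It is not the Service Bus integration and not the project's actual code: `Job`, `JobQueue`, `lease_seconds` and `max_attempts` are hypothetical names chosen for illustration. The idea it shows is a time-limited lease, so that a job held by a crashed worker becomes claimable again once the lease lapses.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class JobStatus(Enum):
    QUEUED = "queued"
    CLAIMED = "claimed"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class Job:
    job_id: str
    status: JobStatus = JobStatus.QUEUED
    owner: Optional[str] = None
    lease_expires_at: float = 0.0
    attempts: int = 0


class JobQueue:
    """Hypothetical in-memory stand-in for a queue with lease semantics."""

    def __init__(self, lease_seconds: float = 30.0, max_attempts: int = 3):
        self.lease_seconds = lease_seconds
        self.max_attempts = max_attempts
        self.jobs: Dict[str, Job] = {}

    def enqueue(self, job_id: str) -> Job:
        job = Job(job_id)
        self.jobs[job_id] = job
        return job

    def claim(self, worker: str, now: float) -> Optional[Job]:
        """Hand out a queued job, or one whose previous lease has expired."""
        for job in self.jobs.values():
            expired = (job.status is JobStatus.CLAIMED
                       and job.lease_expires_at <= now)
            if expired and job.attempts >= self.max_attempts:
                # Out of retries: park the job instead of handing it out again.
                job.status, job.owner = JobStatus.FAILED, None
                continue
            if job.status is JobStatus.QUEUED or expired:
                job.status = JobStatus.CLAIMED
                job.owner = worker
                job.lease_expires_at = now + self.lease_seconds
                job.attempts += 1
                return job
        return None

    def _owned(self, job_id: str, worker: str, now: float) -> Job:
        """Return the job only if this worker still holds a live lease on it."""
        job = self.jobs[job_id]
        if (job.status is not JobStatus.CLAIMED or job.owner != worker
                or job.lease_expires_at <= now):
            raise PermissionError(f"{worker} does not hold a live lease on {job_id}")
        return job

    def renew(self, job_id: str, worker: str, now: float) -> None:
        self._owned(job_id, worker, now).lease_expires_at = now + self.lease_seconds

    def complete(self, job_id: str, worker: str, now: float) -> None:
        self._owned(job_id, worker, now).status = JobStatus.COMPLETED

    def fail(self, job_id: str, worker: str, now: float) -> None:
        """Requeue for another attempt, or mark as failed when retries run out."""
        job = self._owned(job_id, worker, now)
        job.owner = None
        job.status = (JobStatus.QUEUED if job.attempts < self.max_attempts
                      else JobStatus.FAILED)
```

Passing `now` explicitly keeps the sketch deterministic. The point to notice is that a worker whose lease has lapsed can no longer complete the job, which is what prevents two workers from both believing they own the same piece of work.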


Performance posture (because I refuse to ship novels as JSON)


One of my hard rules: I’m not building a system that shuffles kilobytes and kilobytes of nested JSON around as the default behaviour.


The API should be lean:

  • return IDs, status, hashes, sizes, and references
  • store large outputs as encrypted artifacts
  • use deltas/patches where it makes sense, especially for repeat outputs and diffs


Fast by default, and predictable under load.
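
As a rough illustration of that posture, the sketch below stores a large output once and returns only a small reference (ID, hash, size, URI) in the API response. Everything here is an assumption for illustration: `ArtifactRef`, `ArtifactStore`, `job_response` and the `artifact://` URI scheme are hypothetical, and encryption at rest is deliberately left out to keep the example short.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class ArtifactRef:
    artifact_id: str
    sha256: str
    size_bytes: int
    uri: str


class ArtifactStore:
    """Hypothetical content-addressed store. Encryption is omitted in this sketch."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, payload: bytes) -> ArtifactRef:
        digest = hashlib.sha256(payload).hexdigest()
        self._blobs[digest] = payload  # identical payloads are stored once
        return ArtifactRef(
            artifact_id=digest[:16],
            sha256=digest,
            size_bytes=len(payload),
            uri=f"artifact://{digest}",
        )

    def get(self, ref: ArtifactRef) -> bytes:
        return self._blobs[ref.sha256]


def job_response(job_id: str, status: str, ref: ArtifactRef) -> str:
    """Build the lean API payload: identifiers and a reference, never the body."""
    return json.dumps({"job_id": job_id, "status": status, "artifact": asdict(ref)})
```

With this shape the response stays a few hundred bytes regardless of how large the plan or diff is, and the hash lets a client verify what it later fetches.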


What I’m taking inspiration from


I’m not building this in a vacuum. There’s a growing body of open-source work that’s pushing “agentic engineering” in the right direction, and I’m borrowing ideas where they fit, then hardening them for the reality of governed environments.


A few repos that have influenced the thinking:

  • Daniel Miessler’s “Personal AI Infrastructure (PAI)”: Great framing for treating agent workflows like a real system you install, evolve, and operate.
  • anomalyco’s “opencode”: Solid examples of repo-integrated agent execution patterns and “agent changes code” workflows.
  • Azure “AI Landing Zones”: Useful reference for what “enterprise-ready” infrastructure expectations look like when you’re building anything AI-adjacent on Azure.
  • Microsoft’s “Deploy Your AI Application In Production”: A good reminder that production posture is its own discipline: identity, ops, deployment hygiene, not just “it works on my laptop”.


The important bit: I’m not trying to clone any of these. This is a different product with a different centre of gravity. I’m taking patterns, not copying architectures.


Is this basically “enterprise PAI / OpenCode”?


Conceptually, yes. It’s in that family: agentic workflows that can operate on real engineering systems.


The difference is posture. This is governance-first:

  • durable state machine
  • queue-correct job execution semantics
  • approvals and audit trail
  • controlled execution behind policy


That’s the “enterprise version” bit. Same ambition, different constraints.
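
One way to picture “controlled execution behind policy” is a broker that refuses guarded actions until an approval is on record, and logs every decision. This is a sketch under stated assumptions, not the project's design: `Broker`, `ApprovalRequired`, `GUARDED_ACTIONS` and the action names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical action kinds that policy treats as having real blast radius.
GUARDED_ACTIONS = {"repo_change", "pipeline_run", "infra_apply"}


class ApprovalRequired(Exception):
    """Raised when a guarded action is attempted without a recorded approval."""


@dataclass
class Broker:
    approvals: Set[Tuple[str, str]] = field(default_factory=set)
    audit_log: List[Dict[str, str]] = field(default_factory=list)

    def _audit(self, event: str, request_id: str, action: str, actor: str) -> None:
        self.audit_log.append(
            {"event": event, "request_id": request_id, "action": action, "actor": actor}
        )

    def approve(self, request_id: str, action: str, approver: str) -> None:
        self.approvals.add((request_id, action))
        self._audit("approved", request_id, action, approver)

    def execute(self, request_id: str, action: str, run: Callable[[], str]) -> str:
        """Run the action only if policy allows it; record the outcome either way."""
        if action in GUARDED_ACTIONS and (request_id, action) not in self.approvals:
            self._audit("blocked", request_id, action, "broker")
            raise ApprovalRequired(f"{action} on {request_id} needs approval")
        result = run()
        self._audit("executed", request_id, action, "broker")
        return result
```

The audit log here is an in-memory list for brevity. In a governance-first system it would live in durable storage alongside the approvals, so that “who approved it?” can still be answered after a restart.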


What I’m shipping first


I’m focusing on a vertical slice that proves the whole loop works end-to-end:

  • create session/request
  • create an initial job
  • dispatch to a Service Bus queue
  • worker claims/renews/completes
  • plan artifact is retrievable
  • install plan/apply/status is durable
  • restart the service and nothing mysteriously disappears


Once that’s solid, scaling the worker plane and expanding capabilities becomes a straightforward evolution rather than a rewrite.
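
The “restart and nothing disappears” check can be sketched with SQLite from the Python standard library standing in for the real SQL store. The table layout, column names and status values below are assumptions made for the example, not the project's actual schema.

```python
import os
import sqlite3
import tempfile
from typing import Optional

# Hypothetical, heavily simplified schema for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS requests (
    request_id TEXT PRIMARY KEY,
    session_id TEXT NOT NULL,
    status     TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS jobs (
    job_id     TEXT PRIMARY KEY,
    request_id TEXT NOT NULL REFERENCES requests(request_id),
    status     TEXT NOT NULL
);
"""


def open_db(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def submit_request(conn: sqlite3.Connection, session_id: str,
                   request_id: str, job_id: str) -> None:
    """Persist the request and its initial job in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT INTO requests VALUES (?, ?, 'accepted')",
                     (request_id, session_id))
        conn.execute("INSERT INTO jobs VALUES (?, ?, 'queued')",
                     (job_id, request_id))


def job_status(conn: sqlite3.Connection, job_id: str) -> Optional[str]:
    row = conn.execute("SELECT status FROM jobs WHERE job_id = ?",
                       (job_id,)).fetchone()
    return row[0] if row else None


def status_after_restart() -> Optional[str]:
    """Write state, drop the connection, reopen, and read the state back."""
    path = os.path.join(tempfile.mkdtemp(), "control_plane.db")
    conn = open_db(path)
    submit_request(conn, "sess-1", "req-1", "job-1")
    conn.close()          # simulate the service going away
    conn = open_db(path)  # simulate the service coming back
    try:
        return job_status(conn, "job-1")
    finally:
        conn.close()
```

Writing the request and its first job in one transaction means a crash cannot leave a request without a job, or a job without a request.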


If this goes where I think it goes, it becomes a platform primitive: turning “do this engineering task” into a durable, policy-boxed workflow that you can audit, replay, and trust.


And yes, I think it’s cool.
