
Engineering doctrine that doesn't go stale

Engineering standards go stale in one of two ways. They stay too abstract and nobody knows how to apply them. Or they hard-code specific tools and become obsolete the next time the stack changes.

I've been building a reference library that tries to avoid both failure modes by keeping principles and tooling in separate layers that evolve at different rates.

The core split

Principles describe outcomes, constraints, and trade-offs. Tooling shows one possible way to satisfy them with concrete products. Swapping tooling should never require rewriting principles.

A principle like 'validate before packaging, verify after deployment' is true regardless of whether you're on GitHub Actions, Azure DevOps, Jenkins, or a custom deploy script. The order of operations is the principle. The specific jobs and YAML are tooling.
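One way that ordering might look on one particular platform (GitHub Actions here purely as an illustration — the job names and make targets are hypothetical, and the same shape ports to Jenkins or Azure DevOps):

```yaml
# Hypothetical workflow: the job ORDER is the principle;
# every name in this file is swappable tooling.
name: release
on:
  push:
    tags: ['v*']

jobs:
  validate:              # validate BEFORE packaging
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make lint test contracts   # hypothetical one-command gate

  package:
    needs: validate      # packaging cannot start until validation passes
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make package

  deploy:
    needs: package
    runs-on: ubuntu-latest
    steps:
      - run: make deploy

  verify:                # verify AFTER deployment
    needs: deploy
    runs-on: ubuntu-latest
    steps:
      - run: make verify   # health checks against the running environment
```

Swapping platforms rewrites every line of this file, but none of the dependency chain it encodes.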

A principle like 'use GitHub Actions with these specific action versions' is not a principle — it's tooling pretending to be a principle. Teams on different platforms can't adopt it, and it needs updating every time the toolchain moves.

Principles vs tooling — how to tell them apart
| Principle | Tooling equivalent (don't write this) |
| --- | --- |
| Validate before packaging, verify after deployment | Use the deploy-prod GitHub Action with wait-for-health set to 120s |
| Contracts must be validated in CI | Run scripts/validate_contracts.py in the quality.yml workflow |
| Build evidence must be traceable to a commit | Pin all GitHub Actions to their full SHA digest |
| Caches are accelerators, not dependencies | Configure cache: keys in GitHub Actions to use the lockfile hash |

The three-layer structure

Repository structure
| Layer | Path | Changes how often |
| --- | --- | --- |
| Timeless principles | doctrine/principles/ | Rarely — only when we learn something structural |
| Illustrative tooling | doctrine/tooling/ | When the stack, platform, or team context changes |
| Estate supplements | doctrine/tooling/estates/ | Per org, per cloud, per team — optional overrides |

Estate supplements are the key to making this usable in practice. An org running entirely on Azure has different tooling examples than one running on AWS. Neither invalidates the principles. The estate supplement captures the org-specific mapping without polluting the canonical principle files.
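As a sketch of what such a supplement could contain — the filename, org name, and principle keys below are all hypothetical, not part of the actual library:

```yaml
# Hypothetical estate supplement: doctrine/tooling/estates/azure.yml
# Maps canonical principles to this org's concrete tooling without
# touching the principle files themselves.
estate: contoso-azure            # hypothetical org identifier
principles:
  validate-before-packaging:
    tooling: Azure Pipelines 'Validate' stage gating the 'Package' stage
  build-evidence-traceable-to-commit:
    tooling: Azure Artifacts with provenance metadata keyed by commit SHA
  verify-after-deployment:
    tooling: App Service health probe plus a post-deploy smoke stage
```

Each entry points back at a principle by name, so an Azure team and an AWS team can diff their supplements against the same canonical files.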

Build surfaces — making the implicit explicit

One of the more useful concepts in the doctrine is 'named build surfaces'. Every repository owns a set of surfaces; the problem isn't missing surfaces — it's hidden ones.

Build surfaces every repo should name explicitly
| Surface | What it is |
| --- | --- |
| Local developer entrypoint | The one command a new contributor runs to build and test locally |
| Quality gate | The CI job that must pass before merge — lint, tests, contracts, deny |
| Release surface | How the artefact is built and packaged — reproducible, from a tagged commit |
| Deploy surface | How the artefact reaches a running environment — promotion, not rebuild |
| Verification surface | How you know the deployed thing is healthy — not just that the deploy succeeded |
| Execution surface | Scheduled scans, queued automation, recurring runbooks — first-class, not hidden in deploy pipelines |

The principle is: define the surfaces you own, not the ones you wish you had. Hidden surfaces are the problem. A repo with no declared verification surface usually has no verification — or has it buried in an oncall runbook that nobody reads.
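One way to make surface ownership explicit is a small declaration file per repo — the filename and every path below are hypothetical, a sketch of the idea rather than a format the library prescribes:

```yaml
# Hypothetical per-repo declaration (e.g. surfaces.yml): each owned
# surface names its entrypoint, so an absent key is a visible gap
# rather than a silent one.
surfaces:
  local: make dev                  # one command for a new contributor
  quality-gate: ci/quality.yml     # must pass before merge
  release: ci/release.yml          # reproducible build from a tagged commit
  deploy: ci/deploy.yml            # promotes the built artefact, never rebuilds
  verification: make verify        # checks the deployed thing is healthy
  execution: ci/scheduled.yml      # recurring scans and runbooks, first-class
```

A repo that can't fill in the verification line has just discovered its hidden surface.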

The adoption playbook

The playbook is dependency-aware, not dogmatic. Skip phases that are already healthy.

Suggested adoption order
| Phase | Focus | Why first |
| --- | --- | --- |
| 1 | Quality gate — one command that fails on fmt/lint/tests for main | Creates safety to change process; reproducible failures |
| 2 | Trunk-oriented integration — short-lived branches, PR review, green main | Reduces drift and batch risk; smaller PRs merge faster |
| 3 | Contracts at boundaries — API or event schemas validated in CI | Stops tribal JSON; catches contract breaks before production |
| 4 | Observability baseline — correlated logs/traces for main paths | Makes incidents diagnosable without archaeology |
| 5 | Reliability habits — incident severity, blameless reviews, error budgets | Ties delivery cadence to actual risk |

The adoption order exists because dependencies are real. You can't have useful observability if you don't have contract-validated events to observe. You can't have meaningful reliability metrics if your quality gate doesn't catch regressions before they ship.
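Phase 1 can be as small as a single CI job wrapping a single command — sketched below on GitHub Actions, with the `make check` target being a hypothetical wrapper for fmt, lint, and tests:

```yaml
# Hypothetical phase-1 quality gate: one job, one command,
# fails the merge on any fmt/lint/test regression.
name: quality
on:
  pull_request:
    branches: [main]

jobs:
  gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make check   # hypothetical target: fmt + lint + tests
```

Because the gate is one command, a contributor can run the same entrypoint locally, which is what makes CI failures reproducible at the desk.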

What it's for

The library is designed to be forked or referenced by teams. Take the principles wholesale — they're platform-agnostic. Replace the tooling examples with your stack. Add an estate supplement for your org's specific constraints. Hand new leaders the one-pager (minimum-viable-doctrine template) before the full tree.

The goal is doctrine that stays useful as the stack evolves, rather than becoming a historical artefact that everyone ignores because it still references Jenkins pipelines from 2019.
