White Papers · WP-001 · v1.0 · April 15, 2026 · 28 PAGES

Platform Architecture Reference

The end-to-end technical architecture of the Writ platform — control plane, data plane, runtime, supply chain, and deployment profiles.

AUDIENCE

Platform engineers · Security architects · Cloud operators

ABSTRACT

A full technical reference covering every tier of the Writ platform: the OpenAPI control plane, the inference runtime (vLLM, Triton, KServe), the data plane (PostgreSQL + pgvector, OpenSearch, Valkey, MinIO, NATS), the supply-chain lane (Tekton, Chains, Sigstore, Harbor), and the three deployment profiles (dev, edge, cluster). Intended for the reader who needs to evaluate, install, or operate the platform.

Architecture · Kubernetes · Reference

This paper is the canonical architecture reference for the Writ platform. It documents each tier, names the upstream components, enumerates the substitutions we made to avoid license-trapped components, and describes how the three deployment profiles map onto real hardware.

§ PLATFORM MAP

CLIENT

App / SDK / CLI

Applications you build. Your mobile app, your web app, your Python or TypeScript service. They import the Writ toolkit and call the platform over the network.

CLIENT

AI Coding Tool

Claude Desktop, Cursor, Windsurf, and other tools that speak the Model Context Protocol. A developer runs one command and their editor can use every Writ capability.

EDGE

Edge Gateway

The front door. Terminates encrypted traffic, enforces rate limits, and passes validated requests to the control plane. Negotiates hybrid post-quantum key exchange for TLS (X25519 + ML-KEM-768).

CONTROL PLANE

API Gateway

The single OpenAPI 3.1 surface at /v1/*. Every capability — chat, prediction, search, speech, agents — lives here. One auth model, one error model, one audit schema, one rate-limit policy.

CONTROL PLANE

MCP Director

The tool plane. Exposes every Writ capability as a Model Context Protocol tool. One endpoint; namespaced tools; OAuth 2.1 scopes; everything audited.

IDENTITY

Keycloak

OIDC identity broker. Handles CAC / PIV / enterprise SSO, federation to external IdPs, and token issuance.

IDENTITY

SPIRE

Workload identity. Every service in the cluster gets a SPIFFE identity issued by SPIRE; internal mTLS is verified against it.

IDENTITY

OpenBao

Secrets and key management. HSM-backed. Replaces HashiCorp Vault under a permissive license.

POLICY

OPA

Policy decisions. For every request, OPA evaluates tenant, classification, purpose, and release markings against the rules your admins wrote.

AI RUNTIME

vLLM

Generative model server. Runs large language models with efficient batching, streaming, and multi-tenant isolation.

AI RUNTIME

KServe

Inference server for predictive, classical-ML, and computer-vision models. Scales to zero when idle; spins up when work arrives.

AI RUNTIME

Triton

High-throughput inference for vision and multimodal models, with TensorRT-LLM backends for NVIDIA GPUs.

AI RUNTIME

LangGraph

Agent runtime. Implements plan / act / reflect loops, tool calling, and multi-step workflows with persistent checkpoints.

DATA PLANE

PostgreSQL

The main database. Stores conversations, agent checkpoints, user records, and vector embeddings for AI search (via the pgvector extension).

DATA PLANE

OpenSearch

Full-text and hybrid search. Documents with classification tags; policy-enforced retrieval.

DATA PLANE

MinIO

Object storage. Model weights, training data, evidence bundles. Works offline, synchronizes when a link opens.

DATA PLANE

NATS JetStream

Messaging and event streaming. Feeds audit records, model-request queues, and cross-service events.

SUPPLY CHAIN

Sigstore

Signing and transparency. Every container image, every release, every audit bundle is signed with post-quantum-safe keys and logged to a public record.

OBSERVABILITY

Observability

Metrics, logs, and traces — the operations view of the platform. Dashboards your team already knows how to read. Every service reports; nothing is black-boxed.

§ PLATFORM STACK — LAYERS

CLIENT

Client & Tools

SDKs · Web apps · CLI · MCP clients (Claude Desktop, Cursor, …)

EDGE

Edge

TLS with hybrid X25519+ML-KEM-768 · rate-limit · DDoS protection · Envoy

CONTROL

Control Plane (/v1/* OpenAPI)

One contract for every capability · audit emission · error model · idempotency

IDENTITY

Identity & Policy

Keycloak OIDC · SPIFFE workload identity · OpenBao secrets · OPA policy · Kyverno

RUNTIME

AI Runtime

vLLM · Triton · KServe · Ray · LangGraph

DATA

Data Plane

PostgreSQL + pgvector · OpenSearch · Valkey · MinIO · NATS JetStream · Trino + Iceberg

SUPPLY

Supply Chain & Signing

Tekton + Chains · Sigstore (cosign v3 + ML-DSA-65) · Harbor + Iron Bank · Syft · Grype

INFRA

Infrastructure

RKE2 Kubernetes (FIPS) · Istio 1.24 · Cilium · Velero · Prometheus + Grafana + Loki + Tempo + Falco + NeuVector

EVERY LAYER SIGNED · EVERY LAYER EVIDENCE-EMITTING · EVERY LAYER SWAPPABLE

1. Control plane

The control plane is a single OpenAPI 3.1 surface exposed at /v1/*. In Phase 2 (AI MVP) it is a Python / FastAPI implementation called the Unified Control Plane API; migration to a Go / Rust v1 is pinned to Phase 5 (Scale) per ADR-0010.

1.1 Unified endpoints

Every paradigm is addressable through a predictable endpoint naming convention, /v1/{paradigm}/{verb}; single-verb paradigms such as /v1/predict and /v1/rank omit the verb segment:

/v1/chat/completions — generative, OpenAI-shape compatible
/v1/predict — tabular / predictive
/v1/vision/detect — computer vision (detection, segmentation, classification)
/v1/rag/query · /v1/rag/graph — retrieval-augmented generation, including GraphRAG
/v1/speech/asr · /v1/speech/tts — automatic speech recognition and text-to-speech
/v1/multimodal/infer — vision-language inference
/v1/timeseries/forecast · /v1/timeseries/anomaly — forecasting and anomaly detection
/v1/rank — recommender and learn-to-rank
/v1/rl/act — reinforcement learning and RLHF-style preference feedback
/v1/sim/run — simulation and digital-twin
/v1/agents/run — agentic workflows via LangGraph

Every call uses one auth model (OIDC via Keycloak), one audit schema (compatible with NIST SP 800-53 control AU-12), one rate-limit policy, and one error model (RFC 7807 problem details).
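As an illustration of the shared error model, the sketch below checks a request path against the endpoint list above and builds an RFC 7807 problem-details object for an unknown path. The helper names (`problem_details`, `check_endpoint`) and the choice of a 404 with `about:blank` as the problem type are assumptions made for this example; the paper does not specify the platform's actual error payloads.

```python
from typing import Optional

# Endpoint list copied from section 1.1 of this paper.
KNOWN_ENDPOINTS = frozenset({
    "/v1/chat/completions",
    "/v1/predict",
    "/v1/vision/detect",
    "/v1/rag/query",
    "/v1/rag/graph",
    "/v1/speech/asr",
    "/v1/speech/tts",
    "/v1/multimodal/infer",
    "/v1/timeseries/forecast",
    "/v1/timeseries/anomaly",
    "/v1/rank",
    "/v1/rl/act",
    "/v1/sim/run",
    "/v1/agents/run",
})


def problem_details(status: int, title: str, detail: str, instance: str) -> dict:
    """Build an RFC 7807 object (normally served as application/problem+json)."""
    return {
        "type": "about:blank",  # RFC 7807 default when no specific type URI is defined
        "title": title,
        "status": status,
        "detail": detail,
        "instance": instance,
    }


def check_endpoint(path: str) -> Optional[dict]:
    """Return None for a known endpoint, else a 404 problem-details object."""
    if path in KNOWN_ENDPOINTS:
        return None
    return problem_details(
        404, "Not Found", f"No capability is registered at {path}", path
    )
```

A caller would serialize the returned dictionary with `json.dumps` and send it with the matching HTTP status code.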

§ REQUEST LIFECYCLE — 11 STEPS

Step	Stage	Component	Detail
1	Client sends request	SDK / MCP client	HTTPS POST /v1/{paradigm}/{verb}
2	Edge terminates TLS	Envoy + oqs-provider	Hybrid X25519 + ML-KEM-768 · classical fallback monitored
3	Identity verified	Keycloak OIDC · SPIFFE	Token signature, expiry, scopes, trust domain
4	Policy evaluated	OPA · OpenFGA	Tenant, classification, purpose, release markings
5	Audit event opened	evidence-collector	AU-12 record initialized · request ID assigned
6	Routed to paradigm	Control plane	Predictive → KServe · generative → vLLM · vision → Triton · etc.
7	Enrichment (optional)	pgvector · OpenSearch	RAG retrieval · feature store lookup · graph hop
8	Inference runs	AI runtime	Streaming via SSE or gRPC where supported
9	Response attested	cosign · Rekor	Model hash, CBOM, SSP controls, SLSA provenance
10	Audit finalized	evidence-collector	AU-12 closed · stored signed · WORM retention
11	Response returned	Edge	TLS-encrypted response · attestation attached
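The ordering of the eleven stages above can be sketched as a simple pipeline that walks the stages in sequence and stops at the first stage whose check fails. This is a deliberately simplified model: the `run_lifecycle` function and its `checks` argument are hypothetical, and a real implementation would presumably still finalize the audit record when a later stage fails.

```python
from typing import Callable, Dict, List

# Stage names taken from the request lifecycle in this paper, in order.
STAGES = (
    "client sends request",
    "edge terminates TLS",
    "identity verified",
    "policy evaluated",
    "audit event opened",
    "routed to paradigm",
    "enrichment (optional)",
    "inference runs",
    "response attested",
    "audit finalized",
    "response returned",
)


def run_lifecycle(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Walk the stages in order and return the ones that completed.

    `checks` maps a stage name to a zero-argument callable returning True
    when the stage succeeds. Stages without an entry are treated as passing.
    """
    completed: List[str] = []
    for stage in STAGES:
        check = checks.get(stage, lambda: True)
        if not check():
            break  # a failed stage short-circuits everything after it
        completed.append(stage)
    return completed
```

For example, `run_lifecycle({"policy evaluated": lambda: False})` returns only the first three stages, reflecting that a policy denial happens before any routing or inference.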

1.2 MCP Director

A single Model Context Protocol endpoint fronts every Writ API as namespaced tools — writ.identity.*, writ.data.*, writ.embed.*, writ.fhe.*, writ.llm.*, writ.agents.*, writ.k8s.*, writ.obs.*, writ.compliance.*, writ.federation.*, writ.deploy.*. OAuth 2.1 scopes map 1:1 onto namespaces. Every tool invocation is OPA-gated, mTLS-wrapped with hybrid X25519+ML-KEM-768, and audited.
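A minimal sketch of the 1:1 mapping between tool namespaces and OAuth 2.1 scopes follows. The namespace list comes from the paragraph above; the scope spelling (`writ.<namespace>`) and the example tool names such as `writ.llm.chat` are assumptions for illustration, since the paper does not give the literal scope strings.

```python
from typing import Set

# Namespaces listed for the MCP Director in this paper.
NAMESPACES = frozenset({
    "identity", "data", "embed", "fhe", "llm", "agents",
    "k8s", "obs", "compliance", "federation", "deploy",
})


def required_scope(tool_name: str) -> str:
    """Map a namespaced tool (e.g. 'writ.llm.chat') to the scope for its namespace."""
    parts = tool_name.split(".")
    if len(parts) < 3 or parts[0] != "writ" or parts[1] not in NAMESPACES:
        raise ValueError(f"not a recognized Writ tool name: {tool_name!r}")
    # Assumed scope format: one scope per namespace, named after it.
    return f"writ.{parts[1]}"


def is_authorized(tool_name: str, granted_scopes: Set[str]) -> bool:
    """True when the token's scopes include the scope for the tool's namespace."""
    return required_scope(tool_name) in granted_scopes
```

In the platform itself this check would sit alongside the OPA decision rather than replace it.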

2. Identity and policy

Keycloak — OIDC broker. CAC / PIV certificate authentication. Device-flow for headless environments. External IdP trust via federation endpoints.
OpenBao (MPL 2.0) — secrets and key management, HSM-backed. Replaces HashiCorp Vault (BSL).
Kyverno + OPA — admission-chain ClusterPolicies and data-path policy. Policy-as-code.
SPIFFE / SPIRE — workload identity issuance per enclave. Each workload gets a SPIFFE ID scoped to its trust domain.
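To make "a SPIFFE ID scoped to its trust domain" concrete, the sketch below parses a SPIFFE ID of the standard form `spiffe://<trust-domain>/<path>` and checks it against an expected trust domain. The trust-domain and path values in the usage note are made up; the paper does not state the platform's naming scheme.

```python
from typing import Tuple
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id: str) -> Tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, workload_path)."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    # SPIFFE IDs carry no query string and no fragment.
    if parts.query or parts.fragment:
        raise ValueError(f"SPIFFE ID must not have query or fragment: {spiffe_id!r}")
    return parts.netloc, parts.path


def in_trust_domain(spiffe_id: str, trust_domain: str) -> bool:
    """True when the ID belongs to the expected trust domain."""
    domain, _ = parse_spiffe_id(spiffe_id)
    return domain == trust_domain
```

For instance, `parse_spiffe_id("spiffe://enclave-a.example/ns/ai/sa/vllm")` yields the trust domain `enclave-a.example` and the path `/ns/ai/sa/vllm`.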

3. Inference runtime

vLLM + Triton — LLM serving. Paged attention, continuous batching, tensor parallelism, speculative decoding. TensorRT-LLM backend for NVIDIA GPUs.
KServe — InferenceService abstraction for predictive, computer vision, and classical ML workloads. Scales to zero when idle.
Ray — distributed compute for batch, training, and tuning.
LangGraph — agent runtime. Plan / act / reflect primitives. Checkpointing to PostgreSQL.

The runtime is hardware-abstracted: x86, ARM, and Apple Silicon are all first-class for the dev and edge profiles. GPU classes supported include NVIDIA A100 / H100 / L40S, AMD MI250 / MI300, and Apple Silicon via MLX for development.
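The routing from paradigm to runtime can be sketched as a lookup on the second path segment. Only the pairings this paper states explicitly are included (generative to vLLM, predictive to KServe, vision to Triton, agents to LangGraph); the `route` function is hypothetical and raises for paradigms whose backend the paper does not name.

```python
# Pairings stated in this paper; other paradigms are intentionally left out.
ROUTES = {
    "chat": "vLLM",         # generative
    "predict": "KServe",    # tabular / predictive
    "vision": "Triton",     # computer vision
    "agents": "LangGraph",  # agentic workflows
}


def route(path: str) -> str:
    """Return the runtime for a /v1/{paradigm}/... path."""
    segments = path.strip("/").split("/")
    if len(segments) < 2 or segments[0] != "v1":
        raise ValueError(f"not a /v1/* path: {path!r}")
    paradigm = segments[1]
    if paradigm not in ROUTES:
        raise LookupError(f"no runtime documented for paradigm {paradigm!r}")
    return ROUTES[paradigm]
```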

4. Data plane

Every data-plane component is OSI-approved OSS. The substitutions below avoid SSPL, BSL, ELv2, and RSALv2 licensing traps; the Kafka row is the exception, a substitution made for operational weight rather than licensing:

Original	License	Writ uses	Replacement license · governance
MongoDB	SSPL	PostgreSQL + pgvector	PostgreSQL License
Elasticsearch	ELv2 / SSPL	OpenSearch	Apache 2.0
Redis 7.4+	RSALv2 / SSPLv1	Valkey	BSD-3 · Linux Foundation
CockroachDB	BSL	PostgreSQL + Patroni	PostgreSQL License · MIT
HashiCorp Vault	BSL	OpenBao	MPL 2.0 · Linux Foundation
HashiCorp Terraform	BSL	OpenTofu	MPL 2.0 · Linux Foundation
Kafka (heavy)	Apache 2.0	NATS JetStream	Apache 2.0 · CNCF
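A license gate built on the table above might look like the following sketch. It flags any component whose license string contains one of the four trapped licenses and suggests the substitute from the table. The license labels are the shorthand used in this paper, not SPDX identifiers, and `flag_trapped` is a hypothetical helper, not part of the platform.

```python
from typing import Dict, Optional

# Licenses this paper treats as traps.
TRAPPED_LICENSES = frozenset({"SSPL", "BSL", "ELv2", "RSALv2"})

# Substitutions copied from the table above (license-driven rows only).
SUBSTITUTIONS = {
    "MongoDB": "PostgreSQL + pgvector",
    "Elasticsearch": "OpenSearch",
    "Redis 7.4+": "Valkey",
    "CockroachDB": "PostgreSQL + Patroni",
    "HashiCorp Vault": "OpenBao",
    "HashiCorp Terraform": "OpenTofu",
}


def flag_trapped(components: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each trapped component to its suggested substitute.

    `components` maps a component name to its license string, where dual
    licenses are separated by '/', e.g. 'ELv2 / SSPL'. Components with no
    known substitute map to None.
    """
    flagged: Dict[str, Optional[str]] = {}
    for name, license_text in components.items():
        licenses = {part.strip() for part in license_text.split("/")}
        if licenses & TRAPPED_LICENSES:
            flagged[name] = SUBSTITUTIONS.get(name)
    return flagged
```

A check like this could run in CI against the SBOM, although the paper does not say whether the platform enforces the policy that way.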

Object storage is MinIO. Analytics and cross-enclave federated queries run on Trino with Iceberg catalogs.

5. Supply chain

The supply-chain lane is SLSA Level 3 compliant end-to-end:

Tekton Pipelines + Triggers + Chains — pipeline engine, Git-triggered, SLSA provenance emitted per PipelineRun.
Sigstore — Rekor transparency log, Fulcio certificate authority, cosign for signing. We run cosign v3 with ML-DSA-65 keys.
Syft + Grype + VEX — SBOM generation (CycloneDX 1.6), vulnerability scanning, vulnerability exceptions documented and cosign-attested.
Harbor + Iron Bank — container registry with an Iron Bank mirror. All base images are STIG-hardened Iron Bank images.
Argo Workflows — long-running build jobs and AI training pipelines where Tekton’s ephemeral model is wrong.

Every artifact ships with:

A cosign signature using ML-DSA-65 (post-quantum)
An SLSA v1.0 L3 provenance attestation logged to Rekor
A CycloneDX 1.6 SBOM
A CBOM (Crypto Bill of Materials, CycloneDX 1.6 crypto profile)
An OSCAL evidence snapshot
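The five-item checklist above lends itself to a simple completeness gate. In this sketch the attachment keys (`cosign-signature`, `slsa-provenance`, and so on) are labels invented for the example; the paper does not define how attachments are named in the registry.

```python
from typing import Iterable, List

# One key per item in the "every artifact ships with" list (names are illustrative).
REQUIRED_ATTACHMENTS = (
    "cosign-signature",  # ML-DSA-65 signature
    "slsa-provenance",   # SLSA v1.0 L3 attestation logged to Rekor
    "sbom",              # CycloneDX 1.6 SBOM
    "cbom",              # CycloneDX 1.6 crypto profile
    "oscal-evidence",    # OSCAL evidence snapshot
)


def missing_attachments(present: Iterable[str]) -> List[str]:
    """Return the required attachments that an artifact lacks, in checklist order."""
    present_set = set(present)
    return [key for key in REQUIRED_ATTACHMENTS if key not in present_set]
```

An empty result means the artifact carries everything on the list; anything else would be a reason to block promotion.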

6. Service mesh and network

Cilium — eBPF data path, L7 network policy, ClusterMesh for cross-enclave federation.
Istio 1.24 (ambient mode) — mTLS mesh. Each pod gets a SPIFFE-issued identity. mTLS peer verification is mandatory; defaults deny.
Envoy + oqs-provider — north-south ingress with hybrid X25519+ML-KEM-768 TLS, ML-DSA-65 signing on ingress certificates.

Network policy is default-deny. Every allowed flow is declared in GitOps-managed NetworkPolicy and CiliumNetworkPolicy resources, reviewed in the same PR as the application that needs them.
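For reference, a default-deny posture in standard Kubernetes is usually expressed as a NetworkPolicy that selects every pod in a namespace and lists both policy types with no allow rules. The sketch below builds such a manifest as a Python dictionary; the policy name and the `writ-system` namespace are placeholders, and the platform's actual CiliumNetworkPolicy resources are not shown in this paper.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that denies all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector matches every pod in the namespace
            # Listing both types with no ingress/egress rules denies all traffic.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(default_deny_policy("writ-system"), indent=2))
```

Each allowed flow is then added as a separate, narrower policy, which is what makes the per-PR review described above practical.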

7. Observability

Prometheus + Grafana — metrics, SLOs, burn-rate alerts.
Loki + Vector — log aggregation.
Jaeger + Tempo — distributed tracing.
OpenSearch Security — SIEM-compatible audit store.

All telemetry is tagged with the SPIFFE identity of the emitting workload, the tenant, and the classification level.
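One way to enforce that tagging rule at the emitter is to refuse any record that lacks the three labels. The label keys and the `tag_telemetry` helper below are assumptions for illustration; the paper does not define the wire format.

```python
# The three tags this paper requires on all telemetry (key names are illustrative).
REQUIRED_LABELS = ("spiffe_id", "tenant", "classification")


def tag_telemetry(record: dict, **labels: str) -> dict:
    """Return a copy of `record` with the required labels attached.

    Raises ValueError when any required label is missing or empty, so that
    untagged telemetry is never emitted.
    """
    missing = [key for key in REQUIRED_LABELS if not labels.get(key)]
    if missing:
        raise ValueError(f"missing telemetry labels: {', '.join(missing)}")
    return {**record, "labels": {key: labels[key] for key in REQUIRED_LABELS}}
```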

8. Runtime security

Falco — syscall-level runtime detection.
NeuVector — runtime workload firewall, CIS benchmark automation, L7 DLP.
Kyverno (runtime policy) — continuous admission checks for drift.

9. Backup and recovery

Velero — cluster-state backup.
CNPG (CloudNativePG) — PostgreSQL backup schedules with WAL archival to MinIO.

10. Deployment profiles

Three profiles, one Helm chart (charts/writ). The profile selects which subsystems deploy and at what replica counts.

10.1 `dev`

Single-binary. Works offline with synthetic responses. k3d or kind. CPU inference. 2 vCPU / 8 GB RAM minimum. Day-one smoke-test, no GPUs required. Seed data and reference apps included.

10.2 `edge`

Single-node. K3s + Big Bang Lite. 8–16 vCPU. 1–2 GPUs optional (an L4 or L40 is plenty). x86, ARM, or Apple Silicon. Air-gap ready. Full /v1/* gateway surface, smaller model fleet, smaller retention.

10.3 `cluster`

Full RKE2 with FIPS mode + Big Bang. 24+ cores per node. 4–16 GPUs. Iron Bank images only. IL4 / IL5 target. Multi-tenant. Cross-enclave federation via MCP peering.
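The hardware floors of the three profiles can be summarized in a small data structure. The sketch below records the stated minimums (2 vCPU for dev, 8 for edge, 24 cores per node and 4 GPUs for cluster) and picks the largest profile a single host satisfies. The selection function is a hypothetical sizing aid: it ignores RAM, which is stated only for dev, and the multi-node nature of cluster.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    name: str
    min_vcpus: int   # lower bound stated in this paper
    min_gpus: int    # 0 where GPUs are optional or unused
    kubernetes: str  # distribution named in this paper


# Ordered from smallest to largest.
PROFILES = (
    Profile("dev", 2, 0, "k3d or kind"),
    Profile("edge", 8, 0, "K3s + Big Bang Lite"),
    Profile("cluster", 24, 4, "RKE2 (FIPS) + Big Bang"),
)


def largest_fitting_profile(vcpus: int, gpus: int) -> Optional[str]:
    """Return the largest profile whose minimums the host meets, or None."""
    fitting = [
        p for p in PROFILES if vcpus >= p.min_vcpus and gpus >= p.min_gpus
    ]
    return fitting[-1].name if fitting else None
```

A 16-vCPU host with one GPU, for example, resolves to `edge`, while a 32-core node without GPUs also resolves to `edge` because it misses the GPU floor for `cluster`.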

10.4 Profile progression

A typical customer installs dev in an afternoon, advances to edge in the first week, and spends 6–12 weeks bringing up cluster with their accreditor in parallel with mission-frontend development.

11. What ships per release

Every release produces, in addition to the container images and Helm charts:

Signed offline bundle (OCI image tar, cosign ML-DSA attested)
Full OSCAL evidence pack (SSP, component definitions, continuous control snapshots)
CycloneDX 1.6 SBOM and CBOM
SLSA v1.0 L3 provenance attestation
Release notes and upgrade runbook
Iron Bank mirror update

12. References

ADR-0010 — Control plane v1 migration to Go / Rust pinned to Phase 5 Scale
SSDF narrative — /docs/02-components/supply-chain-narrative.md (in customer bundle)
Federation architecture — see Federation Architecture white paper (WP-004)
PQC posture — see Post-Quantum Cryptography white paper (WP-002)
Accreditation pathway — see Accreditation Dossier white paper (WP-003)
