// WRIT // SOVEREIGN AI BACKEND · // NIST 800-53 REV 5 // OSCAL COMPONENT DEFINITIONS · // HYBRID X25519+ML-KEM-768 TLS · // 100% APACHE / MIT / BSD / MPL · // CNSA 2.0 ALIGNED · // ONE OPENAPI CONTRACT · // IL4 / IL5 TARGET · // AIR-GAP READY
White Papers · WP-001 · v1.0 · April 15, 2026 · 28 PAGES

Platform Architecture Reference

The end-to-end technical architecture of the Writ platform — control plane, data plane, runtime, supply chain, and deployment profiles.

AUDIENCE

Platform engineers · Security architects · Cloud operators

ABSTRACT

A full technical reference covering every tier of the Writ platform: the OpenAPI control plane, the inference runtime (vLLM, Triton, KServe), the data plane (PostgreSQL + pgvector, OpenSearch, Valkey, MinIO, NATS), the supply-chain lane (Tekton, Chains, Sigstore, Harbor), and the three deployment profiles (dev, edge, cluster). Intended for the reader who needs to evaluate, install, or operate the platform.

Architecture · Kubernetes · Reference

This paper is the canonical architecture reference for the Writ platform. It documents each tier, names the upstream components, enumerates the substitutions we made to avoid license-trapped components, and describes how the three deployment profiles map onto real hardware.

§ PLATFORM MAP
Writ platform architecture graph (tier · node: implementation)

  • CLIENT · App / SDK / CLI: Writ SDKs (Python · TS · Go · Rust)
  • CLIENT · AI Coding Tool: Model Context Protocol (MCP)
  • EDGE · Edge Gateway: Envoy + oqs-provider
  • CONTROL PLANE · API Gateway: FastAPI · OpenAPI 3.1 at /v1/*
  • CONTROL PLANE · MCP Director: Custom · MCP spec-compliant
  • IDENTITY · Keycloak: Keycloak (Apache 2.0)
  • IDENTITY · SPIRE: SPIRE / SPIFFE (CNCF)
  • IDENTITY · OpenBao: OpenBao (MPL 2.0 · Linux Foundation)
  • POLICY · OPA: Open Policy Agent (CNCF)
  • AI RUNTIME · vLLM: vLLM (Apache 2.0)
  • AI RUNTIME · KServe: KServe (CNCF)
  • AI RUNTIME · Triton: NVIDIA Triton (BSD-3)
  • AI RUNTIME · LangGraph: LangGraph (MIT)
  • DATA PLANE · PostgreSQL: PostgreSQL + pgvector extension
  • DATA PLANE · OpenSearch: OpenSearch (Apache 2.0)
  • DATA PLANE · MinIO: MinIO (AGPLv3 · OSI)
  • DATA PLANE · NATS JetStream: NATS (CNCF)
  • SUPPLY CHAIN · Sigstore: Sigstore / cosign v3 (Apache 2.0)
  • OBSERVABILITY · Observability: Prometheus · Grafana · Loki · Tempo

CLIENT
App / SDK / CLI

Applications you build. Your mobile app, your web app, your Python or TypeScript service. They import the Writ toolkit and call the platform over the network.

CLIENT
AI Coding Tool

Claude Desktop, Cursor, Windsurf, and other tools that speak the Model Context Protocol. A developer runs one command and their editor can use every Writ capability.

EDGE
Edge Gateway

The front door. Terminates encrypted traffic, enforces rate limits, and passes validated requests to the control plane. Negotiates hybrid post-quantum TLS key exchange (X25519 + ML-KEM-768).

CONTROL PLANE
API Gateway

The single OpenAPI 3.1 surface at /v1/*. Every capability — chat, prediction, search, speech, agents — lives here. One auth model, one error model, one audit schema, one rate-limit policy.

CONTROL PLANE
MCP Director

The tool plane. Exposes every Writ capability as a Model Context Protocol tool. One endpoint; namespaced tools; OAuth 2.1 scopes; everything audited.

IDENTITY
Keycloak

OIDC identity broker. Handles CAC / PIV / enterprise SSO, federation to external IdPs, and token issuance.

IDENTITY
SPIRE

Workload identity. Every service in the cluster gets a SPIFFE identity issued by SPIRE; internal mTLS is verified against it.

IDENTITY
OpenBao

Secrets and key management. HSM-backed. Replaces HashiCorp Vault (BSL) with an MPL 2.0-licensed fork.

POLICY
OPA

Policy decisions. For every request, OPA evaluates tenant, classification, purpose, and release markings against the rules your admins wrote.

AI RUNTIME
vLLM

Generative model server. Runs large language models with efficient batching, streaming, and multi-tenant isolation.

AI RUNTIME
KServe

Inference server for predictive, classical-ML, and computer-vision models. Scales to zero when idle; spins up when work arrives.

AI RUNTIME
Triton

High-throughput inference for vision and multimodal models, with TensorRT-LLM backends for NVIDIA GPUs.

AI RUNTIME
LangGraph

Agent runtime. Implements plan / act / reflect loops, tool calling, and multi-step workflows with persistent checkpoints.

DATA PLANE
PostgreSQL

The main database. Stores conversations, agent checkpoints, user records, and vector embeddings for AI search (via the pgvector extension).

DATA PLANE
OpenSearch

Full-text and hybrid search. Documents with classification tags; policy-enforced retrieval.

DATA PLANE
MinIO

Object storage. Model weights, training data, evidence bundles. Works offline, synchronizes when a link opens.

DATA PLANE
NATS JetStream

Messaging and event streaming. Feeds audit records, model-request queues, and cross-service events.

SUPPLY CHAIN
Sigstore

Signing and transparency. Every container image, every release, and every audit bundle is signed with post-quantum keys (ML-DSA-65) and recorded in a Rekor transparency log.

OBSERVABILITY
Observability

Metrics, logs, and traces — the operations view of the platform. Dashboards your team already knows how to read. Every service reports; nothing is black-boxed.

§ PLATFORM STACK — LAYERS
  1. CLIENT · Client & Tools: SDKs · Web apps · CLI · MCP clients (Claude Desktop, Cursor, …)
  2. EDGE · Edge: TLS with hybrid X25519+ML-KEM-768 · rate-limit · DDoS protection · Envoy
  3. CONTROL · Control Plane (/v1/* OpenAPI): One contract for every capability · audit emission · error model · idempotency
  4. IDENTITY · Identity & Policy: Keycloak OIDC · SPIFFE workload identity · OpenBao secrets · OPA policy · Kyverno
  5. RUNTIME · AI Runtime: vLLM · Triton · KServe · Ray · LangGraph
  6. DATA · Data Plane: PostgreSQL + pgvector · OpenSearch · Valkey · MinIO · NATS JetStream · Trino + Iceberg
  7. SUPPLY · Supply Chain & Signing: Tekton + Chains · Sigstore (cosign v3 + ML-DSA-65) · Harbor + Iron Bank · Syft · Grype
  8. INFRA · Infrastructure: RKE2 Kubernetes (FIPS) · Istio 1.24 · Cilium · Velero · Prometheus + Grafana + Loki + Tempo + Falco + NeuVector

EVERY LAYER SIGNED · EVERY LAYER EVIDENCE-EMITTING · EVERY LAYER SWAPPABLE

1. Control plane

The control plane is a single OpenAPI 3.1 surface exposed at /v1/*. In Phase 2 (AI MVP) it is a Python / FastAPI implementation called the Unified Control Plane API; migration to a Go / Rust v1 is pinned to Phase 5 (Scale) per ADR-0010.

1.1 Unified endpoints

Every paradigm is addressable through a predictable naming convention, /v1/{paradigm}/{verb}; single-endpoint paradigms such as /v1/predict and /v1/rank omit the verb segment:

  • /v1/chat/completions — generative, OpenAI-shape compatible
  • /v1/predict — tabular / predictive
  • /v1/vision/detect — computer vision (detection, segmentation, classification)
  • /v1/rag/query · /v1/rag/graph — retrieval-augmented generation, including GraphRAG
  • /v1/speech/asr · /v1/speech/tts — automatic speech recognition and text-to-speech
  • /v1/multimodal/infer — vision-language inference
  • /v1/timeseries/forecast · /v1/timeseries/anomaly — forecasting and anomaly detection
  • /v1/rank — recommender and learn-to-rank
  • /v1/rl/act — reinforcement learning and RLHF-style preference feedback
  • /v1/sim/run — simulation and digital-twin
  • /v1/agents/run — agentic workflows via LangGraph

Every call uses one auth model (OIDC via Keycloak), one audit schema (compatible with NIST SP 800-53 control AU-12), one rate-limit policy, and one error model (RFC 7807 problem details, since obsoleted by RFC 9457).
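
As an illustration of the shared error model, the sketch below builds an RFC 7807 problem-details body. The five member names and the `application/problem+json` media type come from the RFC; the helper name `problem_details` and the sample values are assumptions, not the Writ implementation.

```python
import json

# Media type defined by RFC 7807 for problem-details responses.
PROBLEM_CONTENT_TYPE = "application/problem+json"


def problem_details(status, title, detail, instance, type_uri="about:blank"):
    """Build an RFC 7807 problem-details body as a plain dict.

    `type_uri` defaults to "about:blank", which the RFC defines as
    "no extra semantics beyond the HTTP status code".
    """
    return {
        "type": type_uri,
        "title": title,
        "status": status,
        "detail": detail,
        "instance": instance,
    }


# Hypothetical rate-limit error for a chat request.
body = problem_details(
    status=429,
    title="Too Many Requests",
    detail="Rate limit exceeded for this tenant.",
    instance="/v1/chat/completions",
)
print(PROBLEM_CONTENT_TYPE)
print(json.dumps(body, indent=2))
```

Because every endpoint would return this one shape, a client needs only one error parser regardless of which paradigm it calls.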

§ REQUEST LIFECYCLE — 11 STEPS
  1. Client sends request [SDK / MCP client]: HTTPS POST /v1/{paradigm}/{verb}
  2. Edge terminates TLS [Envoy + oqs-provider]: Hybrid X25519 + ML-KEM-768 · classical fallback monitored
  3. Identity verified [Keycloak OIDC · SPIFFE]: Token signature, expiry, scopes, trust domain
  4. Policy evaluated [OPA · OpenFGA]: Tenant, classification, purpose, release markings
  5. Audit event opened [evidence-collector]: AU-12 record initialized · request ID assigned
  6. Routed to paradigm [Control plane]: Predictive → KServe · generative → vLLM · vision → Triton · etc.
  7. Enrichment (optional) [pgvector · OpenSearch]: RAG retrieval · feature store lookup · graph hop
  8. Inference runs [AI runtime]: Streaming via SSE or gRPC where supported
  9. Response attested [cosign · Rekor]: Model hash, CBOM, SSP controls, SLSA provenance
  10. Audit finalized [evidence-collector]: AU-12 closed · stored signed · WORM retention
  11. Response returned [Edge]: TLS-encrypted response · attestation attached
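
The routing step (06) of the lifecycle can be sketched as a lookup from the paradigm segment of the path to a runtime. Only the three mappings named in that step are encoded; the function names and the choice to return `None` for unmapped paradigms are assumptions made for illustration.

```python
# Paradigm -> runtime, taken from lifecycle step 06. The source lists the
# remaining paradigms only as "etc.", so they are left unmapped here.
RUNTIME_BY_PARADIGM = {
    "predict": "KServe",
    "chat": "vLLM",
    "vision": "Triton",
}


def paradigm_of(path):
    """Extract the paradigm segment from a /v1/{paradigm}/{verb} path."""
    parts = path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "v1":
        raise ValueError(f"not a /v1/* path: {path!r}")
    return parts[1]


def route(path):
    """Return the runtime for a path, or None when the mapping is unknown."""
    return RUNTIME_BY_PARADIGM.get(paradigm_of(path))


for p in ("/v1/chat/completions", "/v1/predict", "/v1/vision/detect"):
    print(p, "->", route(p))
```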

1.2 MCP Director

A single Model Context Protocol endpoint fronts every Writ API as namespaced tools — writ.identity.*, writ.data.*, writ.embed.*, writ.fhe.*, writ.llm.*, writ.agents.*, writ.k8s.*, writ.obs.*, writ.compliance.*, writ.federation.*, writ.deploy.*. OAuth 2.1 scopes map 1:1 onto namespaces. Every tool invocation is OPA-gated, mTLS-wrapped with hybrid X25519+ML-KEM-768, and audited.
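
A minimal sketch of the 1:1 scope-to-namespace mapping, assuming the scope string is simply the namespace name (for example `writ.llm`). That scope format and the tool names used below are hypothetical; the namespace list is the one given above.

```python
# Namespaces listed in the text; scope strings are assumed to equal them.
NAMESPACES = {
    "writ.identity", "writ.data", "writ.embed", "writ.fhe", "writ.llm",
    "writ.agents", "writ.k8s", "writ.obs", "writ.compliance",
    "writ.federation", "writ.deploy",
}


def namespace_of(tool):
    """Map a tool name such as 'writ.llm.complete' to its namespace."""
    parts = tool.split(".")
    if len(parts) < 3:
        raise ValueError(f"tool name has no namespace prefix: {tool!r}")
    namespace = ".".join(parts[:2])
    if namespace not in NAMESPACES:
        raise ValueError(f"unknown namespace: {namespace!r}")
    return namespace


def is_in_scope(tool, granted_scopes):
    """True when the token carries the scope matching the tool's namespace."""
    return namespace_of(tool) in set(granted_scopes)


print(is_in_scope("writ.llm.complete", ["writ.llm", "writ.data"]))
print(is_in_scope("writ.k8s.apply", ["writ.llm", "writ.data"]))
```

A scope check of this kind would run before the OPA decision, so a token without the namespace scope never reaches policy evaluation.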

2. Identity and policy

  • Keycloak — OIDC broker. CAC / PIV certificate authentication. Device-flow for headless environments. External IdP trust via federation endpoints.
  • OpenBao (MPL 2.0) — secrets and key management, HSM-backed. Replaces HashiCorp Vault (BSL).
  • Kyverno + OPA — admission-chain ClusterPolicies and data-path policy. Policy-as-code.
  • SPIFFE / SPIRE — workload identity issuance per enclave. Each workload gets a SPIFFE ID scoped to its trust domain.
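
To make the trust-domain scoping concrete, the sketch below parses a SPIFFE ID (`spiffe://<trust-domain>/<path>`) and checks its trust domain. The URI layout follows the SPIFFE specification; the example trust domain and workload path are hypothetical.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id):
    """Split a SPIFFE ID into (trust_domain, workload_path)."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    if parts.query or parts.fragment:
        raise ValueError("SPIFFE IDs carry no query or fragment")
    return parts.netloc, parts.path


def in_trust_domain(spiffe_id, expected_domain):
    """True when the ID belongs to the expected trust domain."""
    domain, _ = parse_spiffe_id(spiffe_id)
    return domain == expected_domain


# Hypothetical workload identity inside a hypothetical enclave.
sid = "spiffe://enclave-a.example/ns/writ/sa/api-gateway"
print(parse_spiffe_id(sid))
print(in_trust_domain(sid, "enclave-a.example"))
```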

3. Inference runtime

  • vLLM + Triton — LLM serving. Paged attention, continuous batching, tensor parallelism, speculative decoding. TensorRT-LLM backend for NVIDIA GPUs.
  • KServe — InferenceService abstraction for predictive, computer vision, and classical ML workloads. Scales to zero when idle.
  • Ray — distributed compute for batch, training, and tuning.
  • LangGraph — agent runtime. Plan / act / reflect primitives. Checkpointing to PostgreSQL.

The runtime is hardware-abstracted: x86, ARM, and Apple Silicon are all first-class for the dev and edge profiles. GPU classes supported include NVIDIA A100 / H100 / L40S, AMD MI250 / MI300, and Apple Silicon via MLX for development.
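
For the KServe item above, a scale-to-zero InferenceService might look like the manifest this sketch generates. The `serving.kserve.io/v1beta1` field layout is KServe's own; the model name, model format, and storage URI are placeholders, and this is not a manifest taken from the Writ chart.

```python
import json


def inference_service(name, model_format, storage_uri, min_replicas=0):
    """Return a KServe v1beta1 InferenceService manifest as a dict."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                # minReplicas of 0 allows the predictor to scale to zero.
                "minReplicas": min_replicas,
                "model": {
                    "modelFormat": {"name": model_format},
                    "storageUri": storage_uri,
                },
            }
        },
    }


manifest = inference_service(
    name="churn-model",                  # hypothetical model name
    model_format="sklearn",
    storage_uri="s3://models/churn/v1",  # hypothetical bucket path
)
print(json.dumps(manifest, indent=2))
```

The JSON output can be saved to a file and applied with kubectl, which accepts JSON manifests as well as YAML.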

4. Data plane

Every data-plane component is OSI-approved OSS. The substitutions below avoid SSPL, BSL, ELv2, and RSALv2 licensing traps:

| Original (trapped) | Original license | Writ uses | Replacement license · governance |
|---|---|---|---|
| MongoDB | SSPL | PostgreSQL + pgvector | PostgreSQL License |
| Elasticsearch | ELv2 / SSPL | OpenSearch | Apache 2.0 |
| Redis 7.4+ | RSALv2 | Valkey | BSD-3 · Linux Foundation |
| CockroachDB | BSL | PostgreSQL + Patroni | PostgreSQL License (PostgreSQL) · MIT (Patroni) |
| HashiCorp Vault | BSL | OpenBao | MPL 2.0 · Linux Foundation |
| HashiCorp Terraform | BSL | OpenTofu | MPL 2.0 · Linux Foundation |
| Kafka (heavy) | Apache 2.0 | NATS JetStream | Apache 2.0 · CNCF |

Object storage is MinIO. Analytics and cross-enclave federation query is Trino with Iceberg catalogs.
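
The license screen behind these substitutions can be sketched as a deny-list check. The license labels are the ones used in this section rather than SPDX identifiers, and the function name and component inventory are illustrative assumptions.

```python
# License families this section treats as traps.
TRAPPED = {"SSPL", "BSL", "ELv2", "RSALv2"}


def trapped_licenses(license_field):
    """Return the trapped licenses named in a field such as 'ELv2 / SSPL'."""
    names = {part.strip() for part in license_field.split("/")}
    return sorted(names & TRAPPED)


# Illustrative inventory: three originals and three replacements.
COMPONENTS = {
    "MongoDB": "SSPL",
    "Elasticsearch": "ELv2 / SSPL",
    "Redis 7.4+": "RSALv2",
    "OpenSearch": "Apache 2.0",
    "Valkey": "BSD-3",
    "OpenBao": "MPL 2.0",
}

flagged = {}
for component, license_field in COMPONENTS.items():
    hits = trapped_licenses(license_field)
    if hits:
        flagged[component] = hits

print(flagged)
```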

5. Supply chain

The supply-chain lane is SLSA Level 3 compliant end-to-end:

  • Tekton Pipelines + Triggers + Chains — pipeline engine, Git-triggered, SLSA provenance emitted per PipelineRun.
  • Sigstore — Rekor transparency log, Fulcio certificate authority, cosign for signing. We run cosign v3 with ML-DSA-65 keys.
  • Syft + Grype + VEX — SBOM generation (CycloneDX 1.6), vulnerability scanning, and VEX (Vulnerability Exploitability eXchange) statements that document non-exploitable findings and are cosign-attested.
  • Harbor + Iron Bank — container registry with an Iron Bank mirror. All base images are STIG-hardened Iron Bank images.
  • Argo Workflows — long-running build jobs and AI training pipelines, where Tekton’s ephemeral execution model is a poor fit.

Every artifact ships with:

  • A cosign signature using ML-DSA-65 (post-quantum)
  • A SLSA v1.0 L3 provenance attestation logged to Rekor
  • A CycloneDX 1.6 SBOM
  • A CBOM (Crypto Bill of Materials, CycloneDX 1.6 crypto profile)
  • An OSCAL evidence snapshot
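
A release gate could enforce the list above with a simple completeness check. The attachment keys and the artifact record below are hypothetical; this only illustrates the "every artifact ships with" rule and performs no signature verification.

```python
# One key per required attachment, in the order listed above.
REQUIRED_ATTACHMENTS = (
    "cosign-signature",   # ML-DSA-65 signature
    "slsa-provenance",    # SLSA v1.0 L3 attestation logged to Rekor
    "sbom",               # CycloneDX 1.6 SBOM
    "cbom",               # CycloneDX 1.6 crypto profile
    "oscal-evidence",     # OSCAL evidence snapshot
)


def missing_attachments(present):
    """Return the required attachments absent from `present`, in list order."""
    return [name for name in REQUIRED_ATTACHMENTS if name not in present]


# Hypothetical artifact that lacks its CBOM and OSCAL snapshot.
artifact = {"cosign-signature", "slsa-provenance", "sbom"}
print(missing_attachments(artifact))
```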

6. Service mesh and network

  • Cilium — eBPF data path, L7 network policy, ClusterMesh for cross-enclave federation.
  • Istio 1.24 (ambient mode) — mTLS mesh. Each pod gets a SPIRE-issued SPIFFE identity. mTLS peer verification is mandatory, and authorization defaults to deny.
  • Envoy + oqs-provider — north-south ingress with hybrid X25519+ML-KEM-768 TLS, ML-DSA-65 signing on ingress certificates.

Network policy is default-deny. Every allowed flow is declared in GitOps-managed NetworkPolicy and CiliumNetworkPolicy resources, reviewed in the same PR as the application that needs them.
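
The default-deny baseline corresponds to a standard Kubernetes NetworkPolicy with an empty pod selector and no allow rules. The sketch below generates one per namespace; the policy name and namespace are placeholders, and the actual Writ policies are not reproduced here.

```python
import json


def default_deny(namespace):
    """Return a NetworkPolicy that denies all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            # Listing both types with no rules blocks all traffic.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


policy = default_deny("writ-control-plane")  # hypothetical namespace
print(json.dumps(policy, indent=2))
```

Each allowed flow is then an additional NetworkPolicy or CiliumNetworkPolicy layered on top of this baseline.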

7. Observability

  • Prometheus + Grafana — metrics, SLOs, burn-rate alerts.
  • Loki + Vector — log aggregation.
  • Jaeger + Tempo — distributed tracing.
  • OpenSearch Security — SIEM-compatible audit store.

All telemetry is tagged with the SPIFFE identity of the emitting workload, the tenant, and the classification level.
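
A sketch of that tagging rule, assuming three flat fields named `spiffe_id`, `tenant`, and `classification`. The field names and sample values are hypothetical; the platform's real telemetry schema is not given in this paper.

```python
import json

# Hypothetical field names for the three mandatory tags.
REQUIRED_TAGS = ("spiffe_id", "tenant", "classification")


def tag_event(event, spiffe_id, tenant, classification):
    """Return a copy of `event` carrying the three mandatory tags."""
    tagged = dict(event)
    tagged.update(
        spiffe_id=spiffe_id, tenant=tenant, classification=classification
    )
    return tagged


def untagged_fields(event):
    """List mandatory tags that are missing or empty, in declaration order."""
    return [name for name in REQUIRED_TAGS if not event.get(name)]


raw = {"msg": "inference complete", "latency_ms": 182}
tagged = tag_event(
    raw,
    spiffe_id="spiffe://enclave-a.example/ns/writ/sa/vllm",
    tenant="tenant-a",
    classification="CUI",
)
print(json.dumps(tagged))
print(untagged_fields(raw))
```

A collector could drop or quarantine any record for which `untagged_fields` is non-empty.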

8. Runtime security

  • Falco — syscall-level runtime detection.
  • NeuVector — runtime workload firewall, CIS benchmark automation, L7 DLP.
  • Kyverno (runtime policy) — continuous admission checks for drift.

9. Backup and recovery

  • Velero — cluster-state backup.
  • CNPG (CloudNativePG) — PostgreSQL backup schedules with WAL archival to MinIO.

10. Deployment profiles

Three profiles, one Helm chart (charts/writ). The profile selects which subsystems deploy and at what replica counts.

10.1 dev

Single-binary. Works offline with synthetic responses. k3d or kind. CPU inference. 2 vCPU / 8 GB RAM minimum. Day-one smoke-test, no GPUs required. Seed data and reference apps included.

10.2 edge

Single-node. K3s + Big Bang Lite. 8–16 vCPU. 1–2 GPUs optional (an L4 or L40 is plenty). x86, ARM, or Apple Silicon. Air-gap ready. Full /v1/* gateway surface, smaller model fleet, smaller retention.

10.3 cluster

Full RKE2 with FIPS mode + Big Bang. 24+ cores per node. 4–16 GPUs. Iron Bank images only. IL4 / IL5 target. Multi-tenant. Cross-enclave federation via MCP peering.
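
The sizing guidance in 10.1 to 10.3 can be summarized as a small lookup. The thresholds are the minimums quoted above; treating cluster as a per-node check and picking the largest profile that fits are simplifying assumptions, as are all names in the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    min_cpus: int     # minimum vCPUs / cores quoted in the text
    min_gpus: int     # 0 where GPUs are optional or not required
    kubernetes: str


# Ordered from largest to smallest so the first match is the largest fit.
PROFILES = (
    Profile("cluster", 24, 4, "RKE2 (FIPS) + Big Bang"),
    Profile("edge", 8, 0, "K3s + Big Bang Lite"),
    Profile("dev", 2, 0, "k3d or kind"),
)


def largest_profile(cpus, gpus):
    """Return the name of the largest profile the hardware satisfies, or None."""
    for profile in PROFILES:
        if cpus >= profile.min_cpus and gpus >= profile.min_gpus:
            return profile.name
    return None


print(largest_profile(2, 0))
print(largest_profile(16, 1))
print(largest_profile(32, 8))
```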

10.4 Profile progression

A typical customer installs dev in an afternoon, advances to edge in the first week, and spends 6–12 weeks bringing up cluster with their accreditor in parallel with mission-frontend development.

11. What ships per release

Every release produces, in addition to the container images and Helm charts:

  1. Signed offline bundle (OCI image tar, cosign ML-DSA attested)
  2. Full OSCAL evidence pack (SSP, component definitions, continuous control snapshots)
  3. CycloneDX 1.6 SBOM and CBOM
  4. SLSA v1.0 L3 provenance attestation
  5. Release notes and upgrade runbook
  6. Iron Bank mirror update

12. References