Runtime Verification

Tests and static checks (pytest + ruff + ty) show that the code is internally consistent. They do not prove the system runs. A green suite routinely ships:

Templates that render broken HTML (a multi-line {# … #} comment, which Django's template language emits as literal text; undefined context variables).
URL patterns that resolve in tests but 302/404 in the web container because middleware redirects first.
Workers that crash on startup because an env var or queue name changed.
Docker images built from a stale base because nobody rebuilt after the Dockerfile changed.
Streaming endpoints that buffer instead of streaming token-by-token, and uploads whose presigned URLs never actually reach MinIO.

So for any change that affects what runs in the container — UI, views, URL routes, settings, middleware, management commands, the worker, Dockerfiles, docker-compose, env vars, dependencies, infrastructure — you have to drive the actual running surface, not just run the unit tests.
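
As a rough illustration of what driving the running surface can look like, the sketch below fetches a page from the live server without following redirects, then checks the status and the body. It is not the procedure from the skill. The base URL `http://localhost:8000`, the `/dashboard/` path, and the names `fetch` and `render_problems` are assumptions made for this example.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Surface 3xx responses instead of silently following them."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch(url, timeout=5.0):
    """Return (status, body) from the running server; redirects are not followed."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 3xx/4xx/5xx land here; a refused connection raises URLError instead.
        return err.code, err.read().decode("utf-8", errors="replace")


def render_problems(status, body, must_contain):
    """List what is wrong with a rendered page; an empty list means it looks fine."""
    problems = []
    if status != 200:
        problems.append(f"expected 200, got {status}")
    if "{#" in body or "#}" in body:
        problems.append("template comment leaked into rendered HTML")
    if must_contain not in body:
        problems.append(f"missing expected text: {must_contain!r}")
    return problems


if __name__ == "__main__":
    # Hypothetical URL and marker text; substitute a real route and visible string.
    status, body = fetch("http://localhost:8000/dashboard/")
    for problem in render_problems(status, body, must_contain="Dashboard"):
        print("FAIL:", problem)
```

Disabling redirect handling is deliberate: a middleware 302 then shows up as a failure rather than as a 200 from the login page. Undefined context variables usually render as empty strings, so the `must_contain` assertion, not the comment check, is what would catch them.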

How to do it

The runnable, per-surface procedure lives in the runtime_validation skill (.claude/skills/runtime_validation/) so there's a single source of truth for the commands. It covers:

A "what changed → which check" decision table — match your change to the minimum set of checks.
Bringing the stack up clean and rebuilding when image inputs change.
Real-HTTP checks with curl/httpx — why curl against :8000 (not Django's in-process Client()) is the right tool for streaming, uploads, CSRF, and "is the server even up", and where Client() is still fine.
UI / view / template render checks — fetch the endpoint and assert on the body the user actually sees.
Agent / chat SSE — confirm tokens stream incrementally and the right tools fired.
File upload — drive the real presigned-URL → MinIO → confirm → worker → ready pipeline.
Heavier flows (live ingestion traffic, management commands, Terraform), documented in references/pipeline-and-infra.md inside the skill.
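
To make the SSE item above concrete, here is a minimal sketch of one way to tell streaming from buffering: timestamp each event as it arrives and check that the arrivals are spread out rather than landing in one burst. The endpoint `/chat/stream/`, the 0.05 s threshold, and the function names are assumptions for illustration. A real chat endpoint will probably need a POST body and authentication, and checking which tools fired depends on an event format this page does not specify.

```python
import time
import urllib.request


def read_sse_events(lines, clock=time.monotonic):
    """Yield (arrival_time, data) for each SSE event in an iterable of byte lines."""
    data = []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # SSE strips a single leading space
            data.append(value)
        elif line == "" and data:
            # A blank line terminates the event; stamp it on arrival.
            yield clock(), "\n".join(data)
            data = []
        # Comment lines (": ...") and other fields are ignored in this sketch.


def arrived_incrementally(times, min_spread=0.05):
    """True if events were spread over at least min_spread seconds."""
    return len(times) >= 2 and (times[-1] - times[0]) >= min_spread


if __name__ == "__main__":
    # Hypothetical endpoint; adjust method, headers, and body to the real API.
    req = urllib.request.Request(
        "http://localhost:8000/chat/stream/",
        headers={"Accept": "text/event-stream"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        events = list(read_sse_events(resp))
    times = [t for t, _ in events]
    print("events:", len(events), "incremental:", arrived_incrementally(times))
```

A buffering proxy or view tends to deliver every event within a few milliseconds of each other, so the spread collapses toward zero. The threshold is a guess and should be tuned to how fast the model actually emits tokens.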

In Claude Code the skill triggers automatically for container-affecting changes; you can also invoke it explicitly with /runtime_validation.