Sentry Setup¶

Operator playbook for getting Sentry receiving errors, traces, and logs for a CampusCore tenant. End state: each tenant's ECS containers ship telemetry to Sentry; deploy failures and runtime errors page the on-call channel in Slack within seconds.

For the architecture (what goes to Sentry vs CloudWatch vs the in-app agent admin), see Observability Stack.

1. Project structure: one project per tenant¶

Always create one Sentry project per tenant (e.g., campuscore-vsu, campuscore-howard). Do not try to share a single Sentry project across tenants — each tenant's data stays isolated, alert rules are scoped to one university, and per-tenant Slack channels stay clean.

Multiple deployment environments share the same tenant project. A tenant may have several GitHub Environments — vsu-troy-pilot, vsu-staging, vsu-prod — all of which deploy from this repo. They all point to the same tenant Sentry project (campuscore-vsu). The Sentry environment tag (auto-populated from the GitHub Environment name; see Step 5) is what differentiates them inside that one project.

GitHub Environment           Sentry project       Sentry environment tag
─────────────────────        ────────────────     ───────────────────────
vsu-troy-pilot           ┐
vsu-staging              ┼─► campuscore-vsu  ─►   vsu-troy-pilot / vsu-staging / vsu-prod
vsu-prod                 ┘                        (whichever fired the deploy)

howard-staging           ┐
howard-prod              ┼─► campuscore-howard ─► howard-staging / howard-prod
                         ┘

So a new tenant means creating one new Sentry project and setting the SENTRY_DSN secret in each of the tenant's GitHub Environments (every environment gets the same DSN value). A new environment for an existing tenant reuses the tenant's existing DSN: just add it as the SENTRY_DSN secret in the new GitHub Environment.
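
The mapping above can be sketched in a few lines of Python. This is only an illustration: it assumes the tenant slug is everything before the first hyphen of the GitHub Environment name (true for the examples here, but not a documented rule), and `sentry_project_for` is a hypothetical helper, not something in the repo.

```python
def sentry_project_for(github_env: str, prefix: str = "campuscore") -> str:
    """Map a GitHub Environment name to its tenant's Sentry project name."""
    # Assumption: the tenant slug is the part before the first hyphen,
    # e.g. "vsu-troy-pilot" -> "vsu".
    tenant, _, _ = github_env.partition("-")
    if not tenant:
        raise ValueError(f"cannot derive a tenant from {github_env!r}")
    return f"{prefix}-{tenant}"


for env in ("vsu-troy-pilot", "vsu-staging", "vsu-prod", "howard-staging", "howard-prod"):
    # The Sentry environment tag is the GitHub Environment name itself.
    print(f"{env:16} project={sentry_project_for(env):18} environment={env}")
```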

2. Create the Sentry project¶

In sentry.io:

Settings → Projects → Create Project
Platform: Python · Django
Name: campuscore-<tenant> (no env suffix — environments live inside the project). E.g., campuscore-vsu.
Team: assign to the CampusCore engineering team (create the team if needed)
Create

Sentry generates an initial DSN for the project. Don't bother with the auto-generated install instructions — our SDK setup lives in campus_core/observability/sentry_setup.py and is already wired.

3. Get the DSN¶

Settings → Projects → <your project> → Client Keys (DSN) → copy the public DSN. It looks like:

https://abc123def456@o7891234.ingest.us.sentry.io/9876543

This is the value you'll paste into the SENTRY_DSN GitHub secret. The DSN is a public-write key — Sentry only allows event ingestion with it, not read access — but treat it as sensitive anyway to avoid quota abuse if someone scrapes it.
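
If you want to sanity-check a DSN before pasting it, the sketch below splits it into its parts using only the standard library. `parse_dsn` is a hypothetical helper written for this page; it relies on the usual DSN shape of scheme, public key, ingest host, and numeric project ID.

```python
from urllib.parse import urlsplit


def parse_dsn(dsn: str) -> dict:
    """Split a Sentry DSN into scheme, public key, host, and project ID."""
    parts = urlsplit(dsn.strip())
    # The project ID is the last path segment.
    project_id = parts.path.rsplit("/", 1)[-1]
    if not (parts.scheme and parts.username and parts.hostname and project_id):
        raise ValueError("value does not look like a Sentry DSN")
    return {
        "scheme": parts.scheme,
        "public_key": parts.username,
        "host": parts.hostname,
        "project_id": project_id,
    }


print(parse_dsn("https://abc123def456@o7891234.ingest.us.sentry.io/9876543"))
```

A value that fails this check (for example a truncated copy missing the project ID) is worth catching before it ends up in a secret you cannot read back.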

4. Wire the DSN into the tenant's GitHub Environment¶

Use the gh CLI from the CampusCore repo:

gh secret set SENTRY_DSN \
  --env <tenant-env> \
  --repo CampusCoreAI/campuscore \
  --body 'https://abc123def456@o7891234.ingest.us.sentry.io/9876543'

Replace <tenant-env> with the GitHub Environment name (e.g., vsu-troy-pilot). The quotes around --body aren't strictly needed for a Sentry DSN (no shell metacharacters) but they're safer if the value ever changes shape.

Confirm it landed (lists names only — secret values are never readable):

gh api repos/CampusCoreAI/campuscore/environments/<tenant-env>/secrets \
  --jq '.secrets[] | .name' | grep SENTRY_DSN

If you don't have permission to list secret names, verify via the GitHub UI: https://github.com/CampusCoreAI/campuscore/settings/environments/<tenant-env> → Environment secrets section.

5. (Optional) override sample rates per environment¶

You don't need to set SENTRY_ENVIRONMENT. The Terraform layer falls back to var.environment (the GitHub Environment name) whenever the variable is empty, so each deploy is automatically tagged with environment=<github-env-name> in Sentry. Setting it explicitly is only useful if you want a label different from the GitHub Environment name — which is almost never. Skip it.

The other three vars also have workflow-level defaults documented in GitHub Environment Variables. Set them only when you want a different value than the default:

# Traces sample rate — default 0.2 (20%). Bump to 1.0 in staging for full visibility.
gh variable set SENTRY_TRACES_SAMPLE_RATE --env <tenant-env> --repo CampusCoreAI/campuscore --body '0.2'

# Logs to Sentry's Logs product — default true.
gh variable set SENTRY_ENABLE_LOGS --env <tenant-env> --repo CampusCoreAI/campuscore --body 'true'

# CPU profiling sample rate — default 0.0 (off). Costs PAYG; leave off unless investigating perf.
gh variable set SENTRY_PROFILES_SAMPLE_RATE --env <tenant-env> --repo CampusCoreAI/campuscore --body '0.0'

Common override pattern: a staging env with SENTRY_TRACES_SAMPLE_RATE=1.0 for full visibility while debugging, and prod left at the default 0.2 to keep quota in check.
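
To make the defaults concrete, here is a minimal sketch of how these variables could be turned into SDK init arguments. It is not the contents of sentry_setup.py: `sentry_init_kwargs` and `_as_bool` are hypothetical names, and the environment fallback (which this page says lives in Terraform) is mirrored in Python purely for illustration.

```python
import os
from typing import Any, Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def sentry_init_kwargs(env: Mapping[str, str], fallback_environment: str) -> dict[str, Any]:
    """Resolve Sentry settings, using the documented defaults when a variable is unset or empty."""
    return {
        "dsn": env.get("SENTRY_DSN", ""),
        # Empty SENTRY_ENVIRONMENT falls back to the GitHub Environment name.
        "environment": env.get("SENTRY_ENVIRONMENT") or fallback_environment,
        "traces_sample_rate": float(env.get("SENTRY_TRACES_SAMPLE_RATE") or 0.2),
        "profiles_sample_rate": float(env.get("SENTRY_PROFILES_SAMPLE_RATE") or 0.0),
        "enable_logs": _as_bool(env.get("SENTRY_ENABLE_LOGS") or "true"),
    }


if __name__ == "__main__":
    kwargs = sentry_init_kwargs(os.environ, fallback_environment="vsu-staging")
    print(kwargs)
    # With sentry-sdk installed, these could be passed as sentry_sdk.init(**kwargs).
```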

6. Install Sentry org integrations (once per Sentry org)¶

These are org-level operations — do them once, they apply to every project in the org.

Slack¶

Settings → Integrations → Slack → Install
Authorize the CampusCore Slack workspace
Create the receiving channels (if they don't exist):
#cc-errors — all new error issues across tenants
#cc-agent-failures — agent-loop failures (max_iterations, agent exceptions)
Add Slack alert rules inside the tenant project (since each tenant has its own project, rules are already scoped to one university):
Project → Alerts → Create alert → Issue alert
Trigger: A new issue is created
Conditions: Environment contains prod (so staging/pilot noise doesn't page on-call) — adjust per tenant's env naming
Action: Send a notification to Slack → #cc-errors
Save
Optionally add a second rule with condition: Tag surface equals agent → notification to #cc-agent-failures

If you want different alert routing per deployed environment (e.g., staging issues go to a quieter channel), add additional rules with the appropriate Environment condition. The Sentry environment tag is auto-populated from the GitHub Environment name (Step 5), so it's always the discriminator.
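
Before saving the rule, it can help to check which environment names the "Environment contains prod" condition would actually match. The sketch below models that condition as a plain substring test, which is an assumption about how Sentry evaluates "contains"; `matches_prod_rule` is a hypothetical helper.

```python
def matches_prod_rule(environment: str) -> bool:
    """Model the 'Environment contains prod' condition as a substring test."""
    return "prod" in environment


for env in ("vsu-troy-pilot", "vsu-staging", "vsu-prod", "howard-staging", "howard-prod"):
    target = "#cc-errors" if matches_prod_rule(env) else "no alert"
    print(f"{env:16} -> {target}")
```

Note that a substring match also catches names such as vsu-preprod, so choose environment names with that in mind.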

GitHub¶

Settings → Integrations → GitHub → Install
Authorize on the CampusCoreAI org
Select the campuscore repo
Once installed, Settings → Code Mappings — link the Sentry project to the repo so commit metadata flows through (suspect commits, "regressed in PR #X" attribution)

7. Set a PAYG spend cap¶

Settings → Subscription → Spend Caps — set the on-demand budget to $20/month initially. Sentry stops accepting events past the cap rather than charging more, which is what you want during the pilot when volume is unknown. Raise after two weeks of observed volume.

8. Verify the deploy ships the DSN¶

Trigger a deploy. In the workflow log for the deploy-app job, find the "Verify Sentry config landed in TF_VAR_sentry_dsn" step. Expected output:

::notice::SENTRY_DSN secret is present (length=95, prefix=https://...)

If you see this instead:

::warning::SENTRY_DSN secret resolved to empty for env '<tenant-env>'. The ECS task definition will ship with SENTRY_DSN unset.

…then the secret didn't land — see Troubleshooting below.
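
For reference, the logic of that check amounts to something like the following. This is a Python sketch of the behavior implied by the two messages above, not the actual workflow step (which may well be a shell script); `dsn_check_message` is a hypothetical name.

```python
def dsn_check_message(dsn: str, github_env: str) -> str:
    """Return the workflow annotation for a resolved SENTRY_DSN value."""
    if not dsn:
        return (
            f"::warning::SENTRY_DSN secret resolved to empty for env '{github_env}'. "
            "The ECS task definition will ship with SENTRY_DSN unset."
        )
    # Report only the length and scheme prefix so the secret itself never hits the log.
    return f"::notice::SENTRY_DSN secret is present (length={len(dsn)}, prefix={dsn[:8]}...)"


print(dsn_check_message("", "vsu-troy-pilot"))
print(dsn_check_message("https://abc123def456@o7891234.ingest.us.sentry.io/9876543", "vsu-troy-pilot"))
```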

9. Verify the running container sees the DSN¶

After the deploy completes (~3–5 minutes including the ECS rolling update), run this CloudWatch Logs Insights query against the /ecs/campuscore-<tenant> log group:

fields @timestamp, level, message, logger
| filter logger like /sentry_setup|tracer|instrumentation/
| sort @timestamp desc
| limit 20

You should see three lines on every new task startup:

Sentry initialized: env=<tenant-env> release=<sha> traces_sample_rate=0.20 logs=True instrumenter=otel
OTel tracing initialized — Postgres exporter + Sentry bridge (OTel is single trace source)
OTel auto-instrumentation enabled for: psycopg, httpx, logging

If you see Sentry disabled: SENTRY_DSN is empty, the env var didn't reach the container — see Troubleshooting.
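
If you are checking several tenants, the boot line is easy to parse mechanically. The sketch below assumes the line keeps the space-separated key=value layout shown above; `parse_boot_line` is a hypothetical helper.

```python
import re
from typing import Optional

BOOT_LINE = re.compile(r"^Sentry initialized: (?P<fields>.+)$")


def parse_boot_line(message: str) -> Optional[dict]:
    """Return the key=value fields of a 'Sentry initialized' line, or None for any other line."""
    match = BOOT_LINE.match(message.strip())
    if match is None:
        return None
    # Skip any token without '=' rather than failing on it.
    return dict(pair.split("=", 1) for pair in match["fields"].split() if "=" in pair)


line = "Sentry initialized: env=vsu-troy-pilot release=abc123 traces_sample_rate=0.20 logs=True instrumenter=otel"
print(parse_boot_line(line))
print(parse_boot_line("Sentry disabled: SENTRY_DSN is empty"))
```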

10. End-to-end smoke test¶

Trigger a real event from the running app and confirm it reaches Sentry:

TASK_ARN=$(aws ecs list-tasks --cluster campuscore-<tenant> \
  --service-name campuscore-<tenant> --query 'taskArns[0]' --output text)

aws ecs execute-command \
  --cluster campuscore-<tenant> \
  --task "$TASK_ARN" \
  --container campuscore-web \
  --interactive \
  --command "python -c 'import logging, sentry_sdk; logging.getLogger(\"smoke\").error(\"PROD_SENTRY_SMOKE_TEST\"); sentry_sdk.flush(timeout=5); print(\"flushed\")'"

In Sentry within ~30 seconds:
- Issues tab → filter by environment:<tenant-env> → see PROD_SENTRY_SMOKE_TEST
- Logs tab → same filter → see the same log entry (separate from the issue)

Both should carry the release SHA from the most recent deploy.

Troubleshooting¶

"Sentry disabled: SENTRY_DSN is empty" in CloudWatch but `gh` says the secret is set¶

The deploy job didn't see the secret. Two known causes:

The secret is scoped to the wrong place. Repo-level and env-level secrets are different. The workflow reads ${{ secrets.SENTRY_DSN }} from the environment tied to the job (needs.resolve-env.outputs.client). If you set the secret at repo level instead of env level, it won't be available. Re-set with --env <tenant-env>.
The secret was set after the most recent deploy. Sentry env vars are baked into the ECS task definition at apply time. Trigger a new deploy (git commit --allow-empty -m "redeploy" + push) so a new task definition revision is created and ECS rolls fresh tasks.

"Verify Sentry config landed" notice fires but Sentry shows no events¶

Two known causes:

You're filtered to the wrong environment in Sentry. The UI defaults to environment:All or environment:production depending on org config. Set the filter to <tenant-env> (e.g., vsu-troy-pilot) explicitly. Same applies to the Logs tab — it has its own environment filter.
The Sentry project doesn't have the Logs product enabled. Project → Settings → Features → ensure "Logs" is on. On older org plans this requires opt-in.

Logs show in Issues but not in the Logs tab¶

logger.error(...) produces an Issue via EventHandler. Sentry Logs is a separate product fed by SentryLogsHandler, gated on enable_logs=True at SDK init time. Our SDK sets this via the top-level kwarg (sentry-sdk ≥ 2.36). If you see _experiments={"enable_logs": True} anywhere in code, it's the old API and was silently ignored on the current SDK version — see commit history of sentry_setup.py.

The boot log says `Sentry initialized` but trace_id is empty in CloudWatch JSON logs¶

The ASGI-level OTel middleware (OpenTelemetryMiddleware from opentelemetry-instrumentation-asgi) wraps the app in campus_core/asgi.py. If it's not active, request-scoped logs won't have a trace_id. Verify the boot log line:

OTel auto-instrumentation enabled for: psycopg, httpx, logging

If you see enabled for: with no items, the [opentelemetry] extra on sentry-sdk or the opentelemetry-instrumentation-asgi package didn't install. Check pyproject.toml against the deployed image.
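
To see how widespread the problem is, you can export a batch of JSON log lines and count the records with no trace_id. A rough sketch, assuming one JSON object per line with trace_id and message fields; `records_missing_trace_id` is a hypothetical helper. Startup logs legitimately have no trace_id, so only request-scoped records are a concern.

```python
import json


def records_missing_trace_id(lines):
    """Return parsed JSON log records whose trace_id is absent or empty."""
    missing = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore non-JSON lines such as raw tracebacks
        if isinstance(record, dict) and not record.get("trace_id"):
            missing.append(record)
    return missing


sample = [
    '{"message": "GET /health", "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736"}',
    '{"message": "GET /courses", "trace_id": ""}',
    "not json at all",
]
for record in records_missing_trace_id(sample):
    print("no trace_id:", record["message"])
```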

Out-of-quota messages from Sentry¶

The PAYG cap stopped event ingestion. Either:
- Bump the cap in Sentry → Settings → Subscription → Spend Caps
- Reduce SENTRY_TRACES_SAMPLE_RATE (set the GitHub variable to 0.05 for 5% sampling)
- Set SENTRY_ENABLE_LOGS=false to stop the Logs product ingest while keeping errors + traces
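
As a rough guide to what a lower sample rate buys you, the expected number of sampled traces scales linearly with the rate. The request volume below is made up for illustration, and `expected_sampled_traces` is a hypothetical helper; actual counts vary because sampling is random.

```python
def expected_sampled_traces(requests: int, sample_rate: float) -> int:
    """Expected number of traces kept for a given request count and sample rate."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    return round(requests * sample_rate)


monthly_requests = 1_000_000  # hypothetical volume, not a measured figure
for rate in (1.0, 0.2, 0.05):
    print(f"rate={rate:<4} expected traces={expected_sampled_traces(monthly_requests, rate)}")
```

Dropping from the default 0.2 to 0.05 cuts expected trace volume to a quarter.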

Things to avoid:

METRICS_LOG_FALLBACK_ENABLED=true in any cloud env. This is a local-debug-only switch on the MetricsService (see campus_core/observability/metrics.py). When true, every metric flush emits a structured log per metric; Sentry's LoggingIntegration picks them up and floods the Logs quota with low-signal records. Leave unset/false in cloud.
Sharing one Sentry project across tenants. Tempting for the first one or two universities, but it makes per-tenant Slack routing awkward, blocks ever granting a tenant read-only access to their own data, and lets one noisy tenant clutter every other tenant's view. One project per tenant is the model (see Step 1).
Setting SENTRY_ENVIRONMENT explicitly. Terraform falls back to the GitHub Environment name. Setting it manually creates two sources of truth that drift the first time someone renames an env.
Adding the getsentry/action-release@v1 step at first. It marks each deploy as a Sentry release for "regressed since release X" attribution. Worth adding once issues start accumulating (~2 weeks in), but defer initially to keep the deploy workflow simple.