04 — Post-Deployment

Scope: Per-client. Configure the application, set up integrations, and verify everything works.

Part A: With the Client

These steps are done together with the client or require their input.

4a. Complete Setup Wizard

Time: ~10 minutes

Navigate to the deployed URL (e.g., https://ai.howard.edu)
Log in with the superuser account
You'll be redirected to the Setup Wizard (since is_configured=False)
Fill in:
University name
University abbreviation
Assistant name
Logo URL (optional)
Welcome message (optional)
Brand colors (optional)
Submit — the workspace is now configured and accessible to users

4b. Configure SSO

Time: ~15 minutes

SSO is configured via the Django admin panel — no redeployment needed.

Gather from University IT

Item	Description
IdP Entity ID	Identity Provider entity identifier
IdP Metadata URL	Metadata endpoint (optional)
SSO URL	Single Sign-On endpoint
SLO URL	Single Logout endpoint
X509 Certificate	IdP signing certificate (PEM)
Attribute mappings	email, firstName, lastName
Login button name	e.g., "Howard SSO"

Create SSOProvider in Django Admin

Log in to https://{domain}/admin/ with the superuser account
Go to Authentication > SSO Providers > Add
Fill in:
Name: Display name for the login button (e.g., "Howard SSO")
Protocol: SAML 2.0 (or OIDC)
Client ID: Organisation slug (e.g., howard-sso)
IdP Entity ID, SSO URL, SLO URL: From university IT
X.509 Certificate: Paste the PEM cert (encrypted at rest)
Attribute Mapping: {"email": "mail", "first_name": "givenName", "last_name": "sn"}
Check Is enabled
Save — the SSO provider is immediately active (no deploy needed)
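A malformed Attribute Mapping is easy to miss in the admin form, so it can help to check the JSON before pasting it. The sketch below is a hypothetical pre-flight helper, not part of the CampusCore codebase; it assumes the three application-side keys shown in the example mapping above (email, first_name, last_name) are the ones the application requires.

```python
import json

# Application-side field names, taken from the example mapping in this guide.
# Treat this tuple as an assumption; the real required set lives in the app.
REQUIRED_FIELDS = ("email", "first_name", "last_name")


def check_attribute_mapping(raw):
    """Parse the mapping JSON and fail loudly on missing or blank entries."""
    mapping = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(mapping, dict):
        raise ValueError("attribute mapping must be a JSON object")
    missing = []
    for field_name in REQUIRED_FIELDS:
        value = mapping.get(field_name)
        # Each key must map to a non-empty IdP attribute name.
        if not isinstance(value, str) or not value.strip():
            missing.append(field_name)
    if missing:
        raise ValueError("mapping is missing IdP attributes for: " + ", ".join(missing))
    return mapping
```

Calling check_attribute_mapping with the example mapping from this step returns the parsed dict; a mapping without last_name raises ValueError naming the missing key.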

Give SP Metadata to University IT

After saving, the admin panel shows read-only SP URLs:
SP Entity ID: https://{domain}/accounts/saml/{client_id}/metadata/
ACS URL: https://{domain}/accounts/saml/{client_id}/acs/

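University IT configures their IdP with these values and releases the email, first-name, and last-name attributes. The attribute names the IdP actually sends must match the right-hand values in the Attribute Mapping (mail, givenName, and sn in the example above; adjust the mapping if the IdP sends email, firstName, and lastName instead).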
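If you want to send the SP values to University IT before opening the admin panel, they can be derived from the URL patterns above. This is a minimal sketch; the function name is hypothetical, and the admin panel remains the authoritative source for the values.

```python
def sp_urls(domain, client_id):
    """Build the SP Entity ID and ACS URL from the patterns shown in this guide."""
    host = domain.strip().rstrip("/")
    slug = client_id.strip()
    if not host or not slug:
        raise ValueError("domain and client_id are both required")
    base = "https://" + host + "/accounts/saml/" + slug
    return {
        "sp_entity_id": base + "/metadata/",
        "acs_url": base + "/acs/",
    }
```

For example, sp_urls("ai.howard.edu", "howard-sso") yields the two URLs for the Howard example used throughout this page. Compare them against the read-only fields in the admin panel before handing them over.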

4c. Set Up Connectors

Connectors (e.g., Google Drive) are configured post-deployment via the Django admin panel.

Google Drive: See Google Drive Setup for the full guide.

Additional connector guides will be added to references/ as integrations are built.

4d. Upload Initial Knowledge Base

Log in as admin
Go to Settings > Knowledge Admin
Create knowledge folders for different content categories
Upload documents (PDF, DOCX, etc.) to each folder
Documents are automatically processed and indexed

Part B: Internal

These steps are done by us without client involvement.

4e. Configure Scraper (if needed)

Time: ~30 minutes

If the university needs web content scraped:

Create a scraper config file at campuscore_app/scrapers/configs/{client}.py
Define the scraping targets (URLs, selectors, schedules)
Commit, push, and deploy the change
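To make "URLs, selectors, schedules" concrete, the sketch below shows one possible shape for such a config file. Everything in it is hypothetical: the ScrapeTarget class, its field names, the schedule values, and the example.edu URLs are illustrations only. The real schema is whatever the scraper framework in campuscore_app/scrapers/ expects, so copy an existing client config rather than this one.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class ScrapeTarget:
    """Hypothetical description of one page (or section) to scrape."""

    url: str
    selector: str  # CSS selector that isolates the main content
    schedule: str = "daily"  # hypothetical schedule label

    def __post_init__(self):
        parsed = urlparse(self.url)
        # Reject relative or scheme-less URLs early, before a deploy.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("target URL must be absolute: %r" % self.url)
        if not self.selector.strip():
            raise ValueError("selector must not be empty")


# Placeholder targets; replace with the university's real pages.
TARGETS = [
    ScrapeTarget(url="https://www.example.edu/admissions/", selector="main"),
    ScrapeTarget(url="https://www.example.edu/registrar/", selector="article", schedule="weekly"),
]
```

Validating targets at import time means a typo in a URL fails at deploy time instead of during the first scrape.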

4f. Run First Scrape

After the scraper config is deployed:

# Find the running task
TASK_ARN=$(aws ecs list-tasks --cluster campuscore-howard \
  --service-name campuscore-howard --query 'taskArns[0]' --output text)

# Run the scraper
aws ecs execute-command \
  --cluster campuscore-howard \
  --task $TASK_ARN \
  --container campuscore-web \
  --interactive \
  --command "python manage.py scrape_webpages"
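The same two commands recur in later steps with a different tenant or command string, so it can be convenient to build them programmatically. The sketch below only assembles the AWS CLI argument lists; it assumes, as in the Howard example, that the cluster and service are both named campuscore-<tenant> and the container is campuscore-web. The function names are hypothetical.

```python
def list_tasks_argv(tenant):
    """Argument list for looking up the first task ARN of the tenant's service."""
    name = "campuscore-" + tenant  # assumed naming, per the Howard example
    return [
        "aws", "ecs", "list-tasks",
        "--cluster", name,
        "--service-name", name,
        "--query", "taskArns[0]",
        "--output", "text",
    ]


def parse_task_arn(cli_output):
    """Return the ARN, or raise if the CLI reported no task."""
    arn = cli_output.strip()
    # With --output text, the AWS CLI prints "None" when the query result is null.
    if not arn or arn == "None":
        raise RuntimeError("no running task found; check the ECS service first")
    return arn


def exec_argv(tenant, task_arn, command, container="campuscore-web"):
    """Argument list for running one command inside the task via ECS Exec."""
    return [
        "aws", "ecs", "execute-command",
        "--cluster", "campuscore-" + tenant,
        "--task", task_arn,
        "--container", container,
        "--interactive",
        "--command", command,
    ]
```

Pass either list to subprocess.run (with capture_output=True and text=True for the lookup) from a shell that already has AWS credentials for the tenant account. Guarding against the literal "None" avoids passing a bogus task ID to execute-command when the service has no running task.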

4g. Verify Sentry is Receiving Events

Time: ~5 minutes

If the tenant has a SENTRY_DSN set in its GitHub Environment (see Sentry Setup for the full operator playbook), confirm telemetry is flowing before declaring the deploy done.

Check the deploy workflow log. Find the deploy-app job → Verify Sentry config landed in TF_VAR_sentry_dsn step. Expected:

::notice::SENTRY_DSN secret is present (length=…, prefix=https://...)

If you see ::warning::SENTRY_DSN secret resolved to empty, stop and fix the secret before continuing — the ECS task will boot with Sentry disabled.

Check the running container's boot log in CloudWatch (/ecs/campuscore-<tenant>). Filter to the most recent task start. You should see three lines on startup:

Sentry initialized: env=<tenant-env> release=<sha> traces_sample_rate=0.20 logs=True instrumenter=otel
OTel tracing initialized — Postgres exporter + Sentry bridge (OTel is single trace source)
OTel auto-instrumentation enabled for: psycopg, httpx, logging

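Fire a smoke event from inside the running task to confirm the round trip (reuse $TASK_ARN from Step 4f, or re-run the list-tasks lookup there if it is not set in this shell):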

aws ecs execute-command \
  --cluster campuscore-<tenant> \
  --task $TASK_ARN \
  --container campuscore-web \
  --interactive \
  --command "python -c 'import logging, sentry_sdk; logging.getLogger(\"smoke\").error(\"DEPLOY_VERIFY_SMOKE\"); sentry_sdk.flush(timeout=5); print(\"flushed\")'"

Confirm in Sentry: open the tenant's Sentry project, set the environment filter to <tenant-env>, and within ~30 seconds you should see:
Issues tab: a DEPLOY_VERIFY_SMOKE event
Logs tab: the same log record (separate product, same event)

If anything is missing, see the Troubleshooting section of Sentry Setup.
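The boot-log check above can be scripted once the most recent task's log lines have been exported from CloudWatch as text. This is a minimal sketch that looks for the stable prefix of each of the three expected lines; the marker strings are taken from the sample output in this step, while the function name and the idea of matching on prefixes are assumptions.

```python
# Stable prefixes of the three startup lines shown in this step.
EXPECTED_BOOT_MARKERS = (
    "Sentry initialized:",
    "OTel tracing initialized",
    "OTel auto-instrumentation enabled for:",
)


def missing_boot_markers(log_text):
    """Return the expected startup markers that do not appear in the log text."""
    lines = log_text.splitlines()
    return [
        marker
        for marker in EXPECTED_BOOT_MARKERS
        if not any(marker in line for line in lines)
    ]
```

An empty list means all three lines were found. Any returned marker tells you which part of the telemetry stack did not initialize, which narrows down where to look in Sentry Setup.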

4h. Verify Slack workflow_runs is Receiving Events

Time: ~5 minutes

If the tenant has SLACK_CHANNEL_WORKFLOW_RUNS set in its GitHub Environment (see Slack Setup for the full operator playbook including bot install + channel creation + invite), confirm the bot is posting before declaring the deploy done.

Check the deploy workflow log. Find the deploy-app job → Export Terraform variables step. Expected:

echo "TF_VAR_slack_bot_token=***"
echo "TF_VAR_slack_channel_workflow_runs=C0XXXXXXXX"

Expect =*** (masked, non-empty) on the bot token and =C0XXXXXXXX (visible, non-empty) on the channel ID. If either is blank, fix the GitHub Environment secret or variable before continuing; otherwise the ECS task will run with Slack disabled.

Confirm the deployed task definition carries the env vars (length check only; never print the token):

aws ecs describe-task-definition \
  --task-definition campuscore-<tenant-env> \
  --query 'taskDefinition.containerDefinitions[0].environment[?name==`SLACK_BOT_TOKEN` || name==`SLACK_CHANNEL_WORKFLOW_RUNS`].{name: name, has_value: length(value) > `0`}' \
  --output table \
  --profile <tenant>

Both rows should show has_value: True.

Trigger a real Slack event from the dashboard:
Open https://<tenant-domain>/admin/observability/vector/
Switch to the Maintenance tab
Click Run check now
Confirm in Slack within ~10 seconds:
Open #{client}_workflow_runs (e.g., #vsu_pilot_workflow_runs)

Expected pair of messages from @campuscoreplatform (the CampusCorePlatform app's bot user):

▶ Index health check starting
✓ Index health check — all metrics ok   (or ⚠ / ✗ depending on state)

If the messages don't arrive, see the Troubleshooting section of Slack Setup. The most common causes are the channel value being set as a secret instead of a variable, and the bot not being invited to a private channel.
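The task-definition check in this step can also be done offline against the JSON form of the same describe-task-definition call (using --output json instead of --output table). The sketch below reports only whether each variable has a value and never echoes the value itself. It additionally looks in the container's secrets list, on the assumption that the bot token might be injected as an ECS secret instead of a plain environment entry; the function name is hypothetical.

```python
import json

WANTED = ("SLACK_BOT_TOKEN", "SLACK_CHANNEL_WORKFLOW_RUNS")


def slack_vars_present(task_def_json):
    """Map each Slack variable name to True or False without exposing values."""
    doc = json.loads(task_def_json)
    container = doc["taskDefinition"]["containerDefinitions"][0]
    present = {name: False for name in WANTED}
    # Plain environment entries carry "name" and "value".
    for entry in container.get("environment", []):
        if entry.get("name") in present and entry.get("value"):
            present[entry["name"]] = True
    # ECS secrets carry "name" and "valueFrom" (an ARN), never the raw value.
    for entry in container.get("secrets", []):
        if entry.get("name") in present and entry.get("valueFrom"):
            present[entry["name"]] = True
    return present
```

Both entries should be True before you trigger the dashboard check. Like the CLI query, this inspects only the first container definition.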

4i. (Optional) Enable the Auto-Rebuild Schedule

Time: ~2 minutes

The EventBridge auto-rebuild schedule runs daily and conditionally triggers REINDEX when an index-health metric trips. It's off by default so you can observe ~2 weeks of IndexMaintenanceLog data before trusting the thresholds to fire automatically. See Vector Index Observability for the metrics + thresholds.

When you're ready to enable for this tenant:

gh variable set ENABLE_INDEX_MAINTENANCE_SCHEDULE \
  --env <tenant-env> \
  --repo CampusCoreAI/campuscore \
  --body 'true'

Re-deploy. After the deploy completes, verify the schedule exists:

aws scheduler get-schedule \
  --name "campuscore-<tenant-env>-auto-rebuild" \
  --profile <tenant>

The schedule fires at 06:00 UTC daily by default. Override with INDEX_MAINTENANCE_SCHEDULE_CRON if a different cron expression makes sense for the tenant's traffic pattern.
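For reference, EventBridge cron expressions have six fields (minutes, hours, day-of-month, month, day-of-week, year), and one of the two day fields must be a question mark. The sketch below builds a daily expression for a given UTC time. Whether INDEX_MAINTENANCE_SCHEDULE_CRON expects the full cron(...) wrapper or only the bare fields is an assumption here; check the Terraform variable before setting it.

```python
def daily_utc_cron(hour, minute=0):
    """Build an EventBridge cron expression that fires once a day at hour:minute UTC."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Fields: minutes hours day-of-month month day-of-week year
    return "cron(%d %d * * ? *)" % (minute, hour)
```

daily_utc_cron(6) reproduces the documented 06:00 UTC default, and daily_utc_cron(9, 30) would move the run to 09:30 UTC for a tenant whose quiet period falls later.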

4j. Delete Onboarding Superuser

Important: Never grant is_superuser to an institution admin. CampusCore reserves the superuser flag exclusively for CampusCore staff. The superuser distinction is what protects CampusCore-controlled feature flags (the is_available field on FeatureState) from being flipped by the institution. Institution IT admins must always be provisioned as is_staff=True, is_superuser=False.

The bootstrap superuser created by ensure_superuser is for CampusCore use during onboarding. Never share its credentials with the institution.

Once SSO is confirmed working and a university admin has been granted staff access (is_staff=True, NOT superuser) via their SSO account or via Django admin, delete the onboarding superuser:

aws ecs execute-command \
  --cluster campuscore-howard \
  --task $TASK_ARN \
  --container campuscore-web \
  --interactive \
  --command "python manage.py shell -c \"from django.contrib.auth import get_user_model; get_user_model().objects.filter(username='admin').delete()\""

If CampusCore staff need ongoing platform-admin access for support, create a dedicated CampusCore-owned superuser account before deleting the bootstrap one.
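Before closing out this step, it can be worth auditing that no institution account ended up with the superuser flag. The sketch below is a hypothetical helper that works on plain dicts, such as the rows returned by get_user_model().objects.values("username", "is_staff", "is_superuser") in a Django shell; the list of CampusCore-owned usernames is something you supply, and the assumption that accounts are keyed by username is ours.

```python
def superuser_violations(users, campuscore_usernames):
    """Return usernames that hold is_superuser but are not CampusCore-owned."""
    allowed = set(campuscore_usernames)
    return sorted(
        user["username"]
        for user in users
        if user.get("is_superuser") and user["username"] not in allowed
    )
```

An empty result means every remaining superuser is a CampusCore-owned account. Any username it returns should be downgraded to is_staff=True, is_superuser=False.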

Part C: Verification

Verification Checklist

Time: ~15 minutes

Troubleshooting

Deployment fails at "Configure AWS (admin) with OIDC"

Verify ADMIN_AWS_ROLE_ARN repo-level secret is correct
Check that the admin role's OIDC trust policy allows the correct repo

Deployment fails at "Configure AWS (client) for ECR"

Verify AWS_ROLE_ARN variable is correct in the GitHub Environment
Check that the client deploy role trusts our admin role ARN

ECS tasks keep crashing

Check CloudWatch Logs: /ecs/campuscore-{client}
Common issues: missing environment variables, database connection failure
Verify all secrets are set in the GitHub Environment

Setup wizard doesn't appear

Check that the init_app_config management command ran (it is part of the entrypoint)
Verify the AppConfig has is_configured=False

SSO not working

Check the SSO Provider in Django Admin (/admin/campuscore_auth/ssoprovider/) — ensure it's enabled
Verify the ACS URL configured in the IdP matches https://{domain}/accounts/saml/{client_id}/acs/
Check that the X.509 certificate is valid PEM format
Review Django logs for SAML assertion errors
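A quick way to rule out copy-paste damage to the certificate is to check its PEM shape before digging into SAML logs. The sketch below only verifies the BEGIN/END markers and that the body is valid base64; it does not parse the certificate or check expiry, and the function name is hypothetical.

```python
import base64
import binascii

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def pem_shape_problems(pem_text):
    """Return a list of formatting problems; an empty list means the shape looks right."""
    problems = []
    text = pem_text.strip()
    if not text.startswith(PEM_HEADER):
        problems.append("missing BEGIN CERTIFICATE header")
    if not text.endswith(PEM_FOOTER):
        problems.append("missing END CERTIFICATE footer")
    if problems:
        return problems
    # Join the wrapped base64 lines between the markers into one string.
    body = "".join(text[len(PEM_HEADER):-len(PEM_FOOTER)].split())
    if not body:
        return ["certificate body is empty"]
    try:
        base64.b64decode(body, validate=True)
    except (binascii.Error, ValueError):
        problems.append("certificate body is not valid base64")
    return problems
```

A certificate pasted without its header and footer lines, which is a common result of copying from IdP metadata XML, is reported as two problems. A clean shape check does not prove the certificate is the one the IdP signs with, so still compare it against the IdP metadata.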

Database migration errors

ECS entrypoint runs python manage.py migrate automatically
Check CloudWatch Logs for migration output
If stuck, use ECS Exec to run migrations manually

Sentry not receiving events after a successful deploy

Run through the boot-log check in Step 4g above. If you see the line "Sentry disabled: SENTRY_DSN is empty", the env var didn't reach the container.
Most common cause: the SENTRY_DSN secret is set at repo-level instead of environment-level. Re-set with gh secret set SENTRY_DSN --env <tenant-env> ....
See Sentry Setup → Troubleshooting for the full diagnosis tree.