Slack Setup¶
Operator playbook for getting Slack notifications wired for a CampusCore tenant. End state: the per-client workflow_runs channel receives start/complete/fail messages from the index health check, HNSW rebuild, and scrape pipelines — so engineers can spot a stuck workflow without opening AWS.
For the architecture (what posts to Slack, when, and how slack_utils is wired into call sites), see Observability Stack and the per-workflow doc Vector Index Observability.
The Slack app already exists. The CampusCorePlatform app is installed in the CampusCore Slack workspace with the right scopes. This playbook takes that as a given: there is no per-tenant or per-deploy step for "create the app." If you're starting a workspace from scratch, see the App Setup Reference at the bottom.
1. Structure: one workspace, one bot, per-client channels¶
One Slack app installed in the CampusCore Slack workspace. The app is CampusCorePlatform; the bot user shows up in Slack as @campuscoreplatform. It's the same bot identity regardless of which tenant fires a notification. We don't run one bot per tenant — that would multiply admin work without buying isolation we actually want.
One channel per tenant. Each GitHub Environment (vsu-troy-pilot, howard-prod, …) posts to its own dedicated channel. That keeps cross-tenant traffic from cluttering any one client's feed, and lets you point per-tenant alert rules at different Slack rooms later if needed.
Slack workspace: CampusCore
├── App: CampusCorePlatform (single bot, one token at the repo level)
│ └── bot user: @campuscoreplatform
├── #vsu_pilot_workflow_runs ◄── vsu-troy-pilot GitHub Environment
├── #howard_pilot_workflow_runs ◄── howard-pilot GitHub Environment
└── #cc-errors ◄── Sentry-driven, separate concern
So: a new tenant = create one new channel, invite the bot, set one variable. The bot token is set once at the repo level and never per-tenant unless you need credential isolation (see Step 3 for that override).
2. Bot token storage (already done — referenced for verification)¶
The bot token (the xoxb-... value from CampusCorePlatform → OAuth & Permissions → Bot User OAuth Token) is stored as a GitHub repository-level secret named SLACK_BOT_TOKEN. This is a one-time setup that's already in place.
Verify it's present (lists names only; secret values are never readable):
gh secret list --repo CampusCoreAI/campuscore
That one secret is available to every deploy job for every environment. The workflow line that reads it (echo "TF_VAR_slack_bot_token=${{ secrets.SLACK_BOT_TOKEN }}" in .github/workflows/deploy-aws.yml) resolves to the same value for every tenant.
If you ever need to rotate the token (compromised, app reinstalled, …):
- In api.slack.com → CampusCorePlatform → OAuth & Permissions → Reinstall to Workspace to get a fresh token.
- Update the repo-level secret:
gh secret set SLACK_BOT_TOKEN \
--repo CampusCoreAI/campuscore \
--body 'xoxb-<new-token>'
- Re-deploy each tenant. The old token stops working as soon as Slack issues the new one; re-deploying re-injects the new value into the ECS task definitions.
3. (Optional) override the bot token per environment¶
If a client requires their own isolated bot identity for compliance — separate audit trail, ability to revoke without affecting other tenants — set an environment-level secret of the same name. Env-level wins over repo-level.
gh secret set SLACK_BOT_TOKEN \
--env <tenant-env> \
--repo CampusCoreAI/campuscore \
--body 'xoxb-<tenant-specific>'
This is rare. Most deployments share the one repo-level bot.
4. Per-client setup¶
These five steps run once per GitHub Environment. Together they form the checklist for onboarding a new tenant:
4a. Create the channel¶
In Slack:
- Channel name: {client}_workflow_runs (e.g., vsu_pilot_workflow_runs, howard_workflow_runs)
- Visibility:
  - Private if only the engineering team reads it (default; keeps workflow noise out of client-facing channels)
  - Public if the client team also wants visibility into deploys and index health
- Description: "CampusCore workflow notifications for {client}: index health checks, HNSW rebuilds, scrape runs."
4b. Invite the bot¶
In the channel:
/invite @campuscoreplatform
Required for private channels. Public channels technically work without an explicit invite thanks to the chat:write.public scope, but inviting anyway makes channel membership auditable and matches what Slack admins expect to see.
If you skip this step on a private channel, the deploy will succeed, the env vars will land in the ECS task, the running task will think Slack is configured — but every chat.postMessage will return channel_not_found and silently no-op.
4c. Copy the channel ID¶
The Slack API requires the channel ID (C0XXXXXXXX), not the #name. Two ways to grab it:
- Slack desktop: click the channel name → "About" pane → bottom shows Channel ID: C0XXXXXXXX. Copy.
- Slack URL: open the channel, look at slack.com/.../archives/C0XXXXXXXX. The trailing C0XXXXXXXX segment is the ID.
⚠ Do not paste the #channel-name. The variable resolves to text, the SDK calls chat.postMessage(channel="#workflow-runs"), and Slack returns channel_not_found — even when a channel with that exact name exists. Use the ID.
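A small pre-flight check can catch a pasted #channel-name before it reaches the GitHub variable. The sketch below is a heuristic only: the helper name looks_like_channel_id is hypothetical, and the pattern (a leading C or G followed by uppercase letters and digits) is an assumption about the usual shape of channel IDs, not an official Slack rule.

```python
import re

# Assumed shape: "C" (channels) or "G" (older private groups) followed by
# at least eight uppercase letters or digits. Heuristic, not a Slack guarantee.
_CHANNEL_ID_RE = re.compile(r"[CG][A-Z0-9]{8,}")


def looks_like_channel_id(value: str) -> bool:
    """Return True when value has the shape of a Slack channel ID."""
    return _CHANNEL_ID_RE.fullmatch(value.strip()) is not None


# Example: an ID-shaped value passes, a channel name does not.
for candidate in ("C0123ABCDE", "#workflow-runs"):
    print(candidate, looks_like_channel_id(candidate))
```

A shape check cannot tell whether the channel exists or whether the bot is a member of it; it only rejects values that are obviously names.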
4d. Set the channel ID as a GitHub VARIABLE¶
The channel ID is not sensitive (it's an internal workspace identifier — knowing the ID alone gets you nothing without the bot token). Store it as a variable, not a secret.
gh variable set SLACK_CHANNEL_WORKFLOW_RUNS \
--env <tenant-env> \
--repo CampusCoreAI/campuscore \
--body 'C0XXXXXXXX'
⚠ Common mistake: gh secret set SLACK_CHANNEL_WORKFLOW_RUNS …. The deploy workflow reads ${{ vars.SLACK_CHANNEL_WORKFLOW_RUNS }} — a value in the secrets namespace will resolve to empty even though it exists. The two namespaces don't fall back to each other.
If you accidentally set it as a secret:
gh secret delete SLACK_CHANNEL_WORKFLOW_RUNS --env <tenant-env> --repo CampusCoreAI/campuscore
gh variable set SLACK_CHANNEL_WORKFLOW_RUNS --env <tenant-env> --repo CampusCoreAI/campuscore --body 'C0XXXXXXXX'
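The namespace behaviour described above can be pictured with a toy model. This is only an illustration of the lookup rules stated in this playbook (environment level wins over repo level, a missing name resolves to empty, and vars never falls back to secrets); resolve_expression and the dict layout are hypothetical and are not how GitHub implements it.

```python
def resolve_expression(namespace, name, env_level, repo_level):
    """Toy model of resolving ${{ vars.NAME }} or ${{ secrets.NAME }}.

    env_level and repo_level each map a namespace ("vars" or "secrets")
    to a dict of names and values. Only the requested namespace is
    searched, environment level first, then repo level.
    """
    for scope in (env_level, repo_level):
        value = scope.get(namespace, {}).get(name)
        if value is not None:
            return value
    return ""  # a missing name resolves to an empty string


# The common mistake: the channel ID was stored as a secret.
env_level = {"secrets": {"SLACK_CHANNEL_WORKFLOW_RUNS": "C0123ABCDE"}, "vars": {}}
repo_level = {"secrets": {"SLACK_BOT_TOKEN": "xoxb-placeholder"}, "vars": {}}

# The workflow reads the vars namespace, so the lookup comes back empty.
print(repr(resolve_expression("vars", "SLACK_CHANNEL_WORKFLOW_RUNS", env_level, repo_level)))
```

Moving the value into the vars namespace at the environment level, as the two commands above do, is what makes the lookup return the ID.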
4e. Trigger a redeploy¶
Push to the deploy branch, or run the workflow manually. The next ECS task definition revision will have SLACK_BOT_TOKEN and SLACK_CHANNEL_WORKFLOW_RUNS populated. Within ~3 minutes the running tasks roll over to the new revision and Slack posts start firing.
5. Verify¶
After the redeploy completes:
5a. Confirm the deploy actually carried the values¶
In the workflow log for the deploy-app job, find the "Export Terraform variables" step. Expected:
echo "TF_VAR_slack_bot_token=***" ◄── non-empty (masked)
echo "TF_VAR_slack_channel_workflow_runs=C0XXXXXXXX" ◄── non-empty (channel IDs are not sensitive)
If the bot-token line is =*** and the channel-id line is blank, you missed step 4d. If both are blank, the repo-level SLACK_BOT_TOKEN secret is also missing — verify it exists with gh secret list --repo CampusCoreAI/campuscore.
5b. Confirm the running container sees the env vars¶
aws ecs describe-task-definition \
--task-definition campuscore-<tenant-env> \
--query 'taskDefinition.containerDefinitions[0].environment[?name==`SLACK_BOT_TOKEN` || name==`SLACK_CHANNEL_WORKFLOW_RUNS`].{name: name, has_value: length(value) > `0`}' \
--output table \
--profile <tenant>
Both rows should show has_value: True. The values themselves stay hidden — we only ever check lengths.
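If you would rather script this check, the same presence test can be run in Python against the JSON that aws ecs describe-task-definition returns with --output json and no --query. The sketch below assumes, like the query above, that the app is the first container definition; env_var_presence is a hypothetical helper name and the sample document is made up for illustration.

```python
def env_var_presence(describe_output, names):
    """Map each name to True when the first container has a non-empty value for it."""
    container = describe_output["taskDefinition"]["containerDefinitions"][0]
    values = {
        item["name"]: item.get("value", "")
        for item in container.get("environment", [])
    }
    # Only lengths are inspected; the values themselves are never returned.
    return {name: len(values.get(name, "")) > 0 for name in names}


# Made-up sample in the shape of the describe-task-definition JSON output.
sample = {
    "taskDefinition": {
        "containerDefinitions": [
            {
                "environment": [
                    {"name": "SLACK_BOT_TOKEN", "value": "xoxb-placeholder"},
                    {"name": "SLACK_CHANNEL_WORKFLOW_RUNS", "value": ""},
                ]
            }
        ]
    }
}

print(env_var_presence(sample, ["SLACK_BOT_TOKEN", "SLACK_CHANNEL_WORKFLOW_RUNS"]))
```

In the sample, the channel variable is present but empty, which is the state you would expect after missing step 4d.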
5c. Smoke-test from the dashboard¶
- Open https://<tenant-domain>/admin/observability/vector/
- Switch to the Maintenance tab
- Click Run check now
- Within ~10 seconds, #{client}_workflow_runs should show two messages for the health check: one when it starts and one when it completes.
If both messages appear, Slack is wired end-to-end.
Troubleshooting¶
Log line Slack not configured (SLACK_BOT_TOKEN empty); skipping notification: … in CloudWatch¶
The diagnostic logging in campus_core/shared_utils/slack_utils.py emits a more detailed line right before this one:
Slack token resolution failed: settings.SLACK_BOT_TOKEN attribute <STATE>,
settings value length <N>, os.environ['SLACK_BOT_TOKEN'] length <N>
Decode:
| Diagnostic | Meaning | Fix |
|---|---|---|
| attribute MISSING | The deployed image is older than commit 6448fd7 (the one that added the settings.py line). Stale image. | Push a fresh deploy. |
| attribute PRESENT, settings 0, os.environ 0 | ECS env var actually isn't set on the running container. Either the deploy didn't carry the secret (workflow-side problem) or the running task is on an older task-def revision than you think. | Step 5a + 5b above; if both pass, the running tasks haven't rolled yet, so wait or force a new deployment. |
| attribute PRESENT, settings 0, os.environ N | The OS has it but Django settings lost it. Most likely cause: .env file got loaded with overwrite=True somewhere, blanking the OS value. | Inspect campus_core/settings.py around the env.read_env(...) call. |
| attribute PRESENT, settings N, … | This path shouldn't fire, since the token is non-empty. If you still see it, you're reading logs from before the redeploy. | Check log timestamps; trigger a fresh request and re-read. |
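When you are triaging many log lines, the decode table can be applied mechanically. The sketch below assumes the diagnostic line follows the format shown above, with MISSING or PRESENT as the attribute state; decode_diagnostic is a hypothetical helper and is not part of slack_utils.

```python
import re

# Matches the three fields of the diagnostic line; \s* tolerates the line
# being wrapped after the first comma.
_DIAG_RE = re.compile(
    r"attribute (?P<state>\w+),\s*"
    r"settings value length (?P<settings>\d+),\s*"
    r"os\.environ\['SLACK_BOT_TOKEN'\] length (?P<environ>\d+)"
)


def decode_diagnostic(line: str) -> str:
    """Return the fix from the decode table for one diagnostic log line."""
    match = _DIAG_RE.search(line)
    if match is None:
        return "unrecognized diagnostic line"
    settings_len = int(match["settings"])
    environ_len = int(match["environ"])
    if match["state"] == "MISSING":
        return "stale image: push a fresh deploy"
    if settings_len > 0:
        return "token is non-empty: check log timestamps and re-read"
    if environ_len == 0:
        return "env var not set on the container: recheck steps 5a and 5b"
    return "settings lost the OS value: inspect env.read_env(...) in settings.py"


line = (
    "Slack token resolution failed: settings.SLACK_BOT_TOKEN attribute PRESENT, "
    "settings value length 0, os.environ['SLACK_BOT_TOKEN'] length 57"
)
print(decode_diagnostic(line))
```

The branch order mirrors the table rows: attribute state first, then the settings length, then the os.environ length.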
Log line Channel key 'workflow_runs' not mapped in NOTIFICATION_CHANNELS; skipping notification: …¶
The bot token is fine, but settings.NOTIFICATION_CHANNELS["workflow_runs"] is empty. Caused by step 4d going wrong:
- The variable is set as a secret instead of a variable (the most common mistake). Verify with gh variable list --env <tenant-env>.
- A different env var name was used. The exact name the workflow expects is SLACK_CHANNEL_WORKFLOW_RUNS.
Posts succeed via the bot but the channel doesn't see them¶
You're posting to a channel ID, but the bot isn't a member of that channel and the channel is private — chat:write.public doesn't apply to private channels. Slack returns channel_not_found because, from the bot's perspective, the private channel doesn't exist.
Fix: run /invite @campuscoreplatform in the channel.
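To make this failure visible instead of silent, a call site could inspect the response and log a hint. The sketch below is not the slack_utils implementation. It assumes an injected post_message callable (hypothetical) that returns Slack's Web API JSON response as a dict with an ok flag and, on failure, an error string.

```python
import logging

logger = logging.getLogger("slack_sketch")


def post_and_report(post_message, channel_id, text):
    """Post via the injected callable; return (ok, hint) rather than failing silently."""
    response = post_message(channel=channel_id, text=text)
    if response.get("ok"):
        return True, ""
    error = response.get("error", "unknown_error")
    if error == "channel_not_found":
        # For a private channel this usually means the bot was never invited.
        hint = f"bot cannot see {channel_id}: run /invite @campuscoreplatform in the channel"
    else:
        hint = f"Slack returned {error}"
    logger.error("Slack post failed: %s", hint)
    return False, hint


# Stand-in for a real client call, used here only to exercise the error path.
def fake_post_message(channel, text):
    return {"ok": False, "error": "channel_not_found"}


print(post_and_report(fake_post_message, "C0123ABCDE", "index health check started"))
```

Injecting the callable keeps the sketch free of any Slack client dependency; a client library that raises exceptions on failure would need an adapter that turns them into this dict shape.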
Posts succeed but show as a bare username instead of the bot's display name¶
The CampusCorePlatform app already has a display name configured, so this shouldn't happen for our workspace. If you do see it (e.g., after reinstalling the app from scratch): api.slack.com/apps → CampusCorePlatform → App Home → Edit display info — give the bot a name and avatar. Cosmetic only; doesn't affect functionality.
users_conversations returns an empty list even though the bot was invited¶
Slack's users_conversations API only returns channels with members the OAuth user can see. If the bot was invited but the workspace owner restricted the bot's discovery, it can post to a channel it doesn't appear in via this API. Not a bug — just an artifact of the API's scoping. Posts will still work.
Slack channel ID changed (e.g., archived and recreated)¶
Channel IDs are stable for the lifetime of a channel: archive + unarchive keeps the same ID, but delete + recreate produces a new one. If you ever recreate a channel, set the new ID with the gh variable set command from step 4d, then redeploy. The old ID stops resolving.
What we deliberately don't recommend¶
- Sharing one Slack channel across tenants. Tempting for the first one or two universities, but it makes routing per-tenant alerts impossible later: you can't filter the channel feed by tenant, since the message body is the only discriminator. One channel per GitHub Environment is the model.
- Setting SLACK_CHANNEL_WORKFLOW_RUNS as a secret. Channel IDs aren't credentials. Putting them in the secrets namespace also breaks the workflow (which reads them as vars.). Use variables.
- Using #channel-name instead of C0XXXXXXXX. The Slack API requires the ID. Names look stable but they aren't (channels can be renamed); IDs are.
- Creating a separate Slack app per tenant. One workspace, one CampusCorePlatform app, one token. Per-tenant isolation comes from the channel boundary, not the bot identity. The only exception is the compliance override in Step 3.
- Embedding the bot token in a .env file committed to the repo. .gitignore excludes .env, but a single git add -f .env is enough to leak it. Always inject via the GitHub Environment, never via a local file in the .env_sample shape.
Adding more notification channels later¶
Today the only channel we use is workflow_runs. If you later add (say) index_alerts for trigger-only notifications:
- Add an entry to settings.NOTIFICATION_CHANNELS (one line)
- Add the corresponding TF_VAR_* echo in .github/workflows/deploy-aws.yml (deploy-app job)
- Add a variable "slack_channel_index_alerts" declaration in infrastructure/app/variables.tf
- Add the new env var in infrastructure/app/ecs.tf shared_env
- Per tenant: create the channel, invite the bot, copy the ID, then run gh variable set SLACK_CHANNEL_INDEX_ALERTS --env <tenant-env> --body 'C0YYY'
The settings dict is the single source of truth — any call site that wants to post somewhere new just calls notify_workflow_event(channel_key="index_alerts", …) and resolution flows through settings.NOTIFICATION_CHANNELS.
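As a rough picture of that flow, the sketch below shows one way a key-to-channel mapping could be built from environment variables and resolved at a call site. The actual contents of settings.py are not shown in this playbook, so build_notification_channels and resolve_channel are hypothetical names and the dict layout is an assumption.

```python
import os
from typing import Mapping, Optional


def build_notification_channels(environ: Mapping[str, str] = os.environ) -> dict:
    """Assumed shape of NOTIFICATION_CHANNELS: channel key mapped to channel ID."""
    return {
        "workflow_runs": environ.get("SLACK_CHANNEL_WORKFLOW_RUNS", ""),
        # The one line a new channel would add under this assumption:
        "index_alerts": environ.get("SLACK_CHANNEL_INDEX_ALERTS", ""),
    }


def resolve_channel(channels: Mapping[str, str], channel_key: str) -> Optional[str]:
    """Return the channel ID for a key, or None when the key is unmapped or empty."""
    return channels.get(channel_key) or None


# Example with only the existing variable set for this tenant.
channels = build_notification_channels({"SLACK_CHANNEL_WORKFLOW_RUNS": "C0123ABCDE"})
print(resolve_channel(channels, "workflow_runs"))
print(resolve_channel(channels, "index_alerts"))
```

Under this shape, a tenant that has not yet set SLACK_CHANNEL_INDEX_ALERTS resolves to None for that key, which corresponds to the "not mapped" skip described in Troubleshooting.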
Appendix: App setup reference¶
The CampusCorePlatform Slack app is already installed and shouldn't need to be recreated. This appendix exists only as a reference for "if we ever start a fresh workspace" — read in case of total workspace rebuild, never as a routine setup step.
In api.slack.com/apps:
- Create New App → From scratch
- App name: CampusCorePlatform (the display name shown in channel members)
- Workspace: CampusCore
- OAuth & Permissions → Scopes → Bot Token Scopes, add:
  - chat:write: required, post messages to channels the bot is a member of
  - chat:write.public: post to public channels without being invited (convenience; we still recommend explicit invites for audit)
  - channels:read: list public channels (used by diagnostic scripts)
  - groups:read: list private channels the bot is a member of (used by diagnostic scripts)
- App Home → Edit the bot display name and (optionally) avatar
- Install to Workspace → Authorize
- Copy the Bot User OAuth Token from OAuth & Permissions. Format: xoxb-<numbers>-<numbers>-<random>. Store as the repo-level SLACK_BOT_TOKEN secret per Step 2 above.
No user-token scopes are needed. The app never acts as a real user.
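Before storing a freshly copied token, a quick shape check can catch a truncated paste or a token of the wrong type. The sketch below encodes only the xoxb-<numbers>-<numbers>-<random> format described in this appendix; looks_like_bot_token is a hypothetical helper, and treating the random part as alphanumeric is an assumption.

```python
import re

# Assumed shape, taken from the format in this appendix:
# xoxb-<numbers>-<numbers>-<random>, with <random> treated as alphanumeric.
_BOT_TOKEN_RE = re.compile(r"xoxb-\d+-\d+-[A-Za-z0-9]+")


def looks_like_bot_token(value: str) -> bool:
    """Return True when value has the shape of a bot token (xoxb- prefix)."""
    return _BOT_TOKEN_RE.fullmatch(value.strip()) is not None


# Placeholder values only; never paste a real token into a script or log.
print(looks_like_bot_token("xoxb-123-456-abcDEF"))
print(looks_like_bot_token("xoxp-123-456-abcDEF"))
```

A passing check says nothing about whether the token is valid or still active; it only guards against obvious copy errors.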