Agent & Retrieval System¶
The agentic RAG pipeline is the core of CampusCore's chat experience. A ReAct-style reasoning agent decides what tools to call, observes results, and generates a grounded response. This document covers the agent architecture, retrieval tools, and attachment handling.
Architecture Overview¶
User message
│
▼
ChatService.generate_response()
│
├── 1. Get/create conversation
├── 2. Link file attachments
├── 3. Save user message
├── 4. Build system prompt (AppConfig + guardrails)
└── 5. agent.run()
│
├── Fast-path check (greetings, thanks → skip agent loop)
├── Knowledge routing (auto-discover relevant folders)
├── Build attachment context (files + folders → system prompt)
├── Wire tools with user/conversation context
└── ReAct loop:
├── LLM call → tool calls or text response
├── Execute tools → observe results
├── Token management (prevent context overflow)
└── Repeat until done or max iterations
Key Files¶
| File | Role |
|---|---|
services/chat_service.py |
Orchestrates chat flow, bridges views and agent |
services/agent/agent_core.py |
ReAct reasoning loop, tool wiring, message building |
services/agent/tools/search_attachments.py |
Search/retrieve user-provided files and folders |
services/agent/tools/search_knowledge.py |
Search the institutional knowledge base |
services/agent/tools/search_knowledge_folder.py |
Direct folder search (used when auto-routing is inactive) |
services/agent/tools/utilities.py |
GetCurrentDateTool |
services/agent/tools/connector_tools.py |
ConnectorActionTool (external integrations) |
services/agent/tools/base.py |
BaseTool, ToolRegistry, create_default_registry() |
services/agent/middleware/ |
Token management, fast-path detection, knowledge routing |
services/agent/events.py |
Streaming event types for real-time UI |
prompts/agent_system_prompt.py |
Agent system prompt template |
prompts/prompt_builder.py |
Reads AppConfig, formats the template |
Two Retrieval Tools¶
The agent has two separate retrieval tools that map to two fundamentally different concerns:
search_attachments — User-Provided Context¶
Handles everything the user explicitly brought into the conversation: uploaded files and selected knowledge folders.
Parameters:
| Param | Type | Description |
|---|---|---|
query |
str |
Search query (required) |
retrieve_all |
bool |
When True, return all content in document order instead of semantic search. For summaries/overviews. Default False. |
scope |
str \| None |
Optional filename or folder name to target. Omit for all attachments. |
num_results |
int |
Max results for search mode (1-20, default 10). |
Two modes:
- Search mode (
retrieve_all=False): Hybrid semantic + full-text search within attachment scope. UsesRetrievalService.retrieve()with conversation and/or folder scoping, then reranks results. - Retrieve-all mode (
retrieve_all=True): Direct ORM query returning all chunks ordered by document grouping + position. No semantic ranking. Capped at 50 chunks (~25k tokens) with a truncation note if exceeded.
Scoping logic:
- No scope → search/retrieve across ALL attachments (files + folders)
- scope = filename → filter to that file's chunks
- scope = folder name → filter to that folder's chunks
Builder methods (called by agent_core.py before each run):
- with_user(user) — access control
- with_conversation_id(id) — scope to conversation's file attachments
- with_folder_ids(ids, names) — scope to user-selected folders
- with_attachment_filenames(filenames) — for scope name matching
search_knowledge — Institutional Knowledge Base¶
Searches the broader knowledge pool (website data, public/personal knowledge folders). No attachment awareness.
Parameters:
| Param | Type | Description |
|---|---|---|
query |
str |
Search query (required) |
num_results |
int |
Max results (default 10) |
Two execution paths:
- With auto-routed folders: KnowledgeRoutingMiddleware identifies relevant folders via embedding similarity → parallel folder search + broad search → rerank combined results.
- Broad search: No folder scoping, searches the full index.
Builder methods:
- with_user(user) — access control
- with_routed_folders(folder_ids, folder_names) — auto-routing context
- with_query_embedding(embedding) — pre-computed embedding for efficiency
When the agent uses which tool¶
| Scenario | Tool | Notes |
|---|---|---|
| No attachments, normal question | search_knowledge |
Auto-routing scopes to relevant folders |
| File attached, "summarize this" | search_attachments(retrieve_all=True) |
No scope → all attachments |
| File attached, specific question | search_attachments(query="...") |
Falls back to search_knowledge if insufficient |
| Folder attached, "give me a summary" | search_attachments(retrieve_all=True) |
No scope → all attachments |
| "What's in report.pdf?" | search_attachments(scope="report.pdf") |
Scoped to specific file |
| Attachment doesn't have answer | search_knowledge |
Agent calls as second tool |
Attachment Handling¶
File Attachments¶
Files are conversation-scoped (not per-message). The ConversationAttachment model tracks each file with a processing status pipeline: uploading → extracting → indexing → ready.
The uploaded_with_message FK records which message a file arrived with (display only, not a scope boundary). Once a file is ready, it's available for all subsequent messages in that conversation.
Folder Attachments¶
Folder selections are ephemeral — sent as query parameters per message, not persisted. The user selects which KnowledgeFolder IDs to scope to for each message.
Recency Context¶
When files are attached with the current message, the system prompt labels them distinctly:
## User Attachments
### Files
- report.pdf **<-- attached with this message**
- syllabus.docx
### Folders
- Financial Aid (id: 42)
This helps the agent distinguish "this file" (the newly attached one) from earlier attachments. The recency label only applies to deictic references ("this file", "the file I attached") — generic requests like "give me a summary" should cover all attachments without scoping.
Data flow for recency:
1. Frontend sends attachment_ids (IDs of files uploaded with this message) as a query param
2. ChatAttachmentService.get_filenames_by_ids() resolves IDs → display names
3. chat_service.py passes both attachment_filenames (all) and new_attachment_filenames (this message) to agent.run()
4. _build_attachment_context() labels new files and adds a recency hint
Document Indexing for Attachments¶
File attachments use lightweight metadata (filename as title, first 200 chars as summary) instead of LLM-generated metadata. This speeds up processing from ~12-18s to ~5s per file.
Attachment chunks are isolated via metadata filters:
- metadata.source_type = "conversation_attachment"
- metadata.conversation_id — scopes to the conversation
- metadata.user_id — scopes to the user
Knowledge Routing¶
KnowledgeRoutingMiddleware auto-discovers relevant knowledge folders for each query using embedding similarity. This runs on every non-fast-path query and wires matched folders into search_knowledge via with_routed_folders().
When auto-routing finds relevant folders, search_knowledge_folder is removed from the tool list (the folders are already wired into search_knowledge). When no folders match, search_knowledge falls back to broad index search.
System Prompt Design¶
The system prompt uses a two-message pattern for caching efficiency:
- Static system message — the full agent prompt template (cached across requests, ~5k tokens saved on cache hits)
- Dynamic system message — per-request context:
- Attachment context (files, folders, recency labels)
- Knowledge routing context (matched folders)
- Today's date
Streaming Events¶
The agent emits typed events during execution for real-time UI updates:
| Event | Purpose |
|---|---|
ThinkingEvent |
Agent reasoning step indicator |
ToolCallEvent |
Tool invocation with name and arguments |
ToolResultEvent |
Tool result (success/failure) |
TextEvent |
Response text (delta or full) |
DoneEvent |
Agent completed with summary |
ErrorEvent |
Error with recovery info |
Events are streamed as SSE-formatted strings through chat_service.py to the frontend, where agent-stream.js processes them.