Skip to content

Agent & Retrieval System

The agentic RAG pipeline is the core of CampusCore's chat experience. A ReAct-style reasoning agent decides what tools to call, observes results, and generates a grounded response. This document covers the agent architecture, retrieval tools, and attachment handling.

Architecture Overview

User message
ChatService.generate_response()
    ├── 1. Get/create conversation
    ├── 2. Link file attachments
    ├── 3. Save user message
    ├── 4. Build system prompt (AppConfig + guardrails)
    └── 5. agent.run()
            ├── Fast-path check (greetings, thanks → skip agent loop)
            ├── Knowledge routing (auto-discover relevant folders)
            ├── Build attachment context (files + folders → system prompt)
            ├── Wire tools with user/conversation context
            └── ReAct loop:
                 ├── LLM call → tool calls or text response
                 ├── Execute tools → observe results
                 ├── Token management (prevent context overflow)
                 └── Repeat until done or max iterations

Key Files

File Role
services/chat_service.py Orchestrates chat flow, bridges views and agent
services/agent/agent_core.py ReAct reasoning loop, tool wiring, message building
services/agent/tools/search_attachments.py Search/retrieve user-provided files and folders
services/agent/tools/search_knowledge.py Search the institutional knowledge base
services/agent/tools/search_knowledge_folder.py Direct folder search (used when auto-routing is inactive)
services/agent/tools/utilities.py GetCurrentDateTool
services/agent/tools/connector_tools.py ConnectorActionTool (external integrations)
services/agent/tools/base.py BaseTool, ToolRegistry, create_default_registry()
services/agent/middleware/ Token management, fast-path detection, knowledge routing
services/agent/events.py Streaming event types for real-time UI
prompts/agent_system_prompt.py Agent system prompt template
prompts/prompt_builder.py Reads AppConfig, formats the template

Two Retrieval Tools

The agent has two separate retrieval tools that map to two fundamentally different concerns:

search_attachments — User-Provided Context

Handles everything the user explicitly brought into the conversation: uploaded files and selected knowledge folders.

Parameters:

Param Type Description
query str Search query (required)
retrieve_all bool When True, return all content in document order instead of semantic search. For summaries/overviews. Default False.
scope str \| None Optional filename or folder name to target. Omit for all attachments.
num_results int Max results for search mode (1-20, default 10).

Two modes:

  • Search mode (retrieve_all=False): Hybrid semantic + full-text search within attachment scope. Uses RetrievalService.retrieve() with conversation and/or folder scoping, then reranks results.
  • Retrieve-all mode (retrieve_all=True): Direct ORM query returning all chunks ordered by document grouping + position. No semantic ranking. Capped at 50 chunks (~25k tokens) with a truncation note if exceeded.

Scoping logic: - No scope → search/retrieve across ALL attachments (files + folders) - scope = filename → filter to that file's chunks - scope = folder name → filter to that folder's chunks

Builder methods (called by agent_core.py before each run): - with_user(user) — access control - with_conversation_id(id) — scope to conversation's file attachments - with_folder_ids(ids, names) — scope to user-selected folders - with_attachment_filenames(filenames) — for scope name matching

search_knowledge — Institutional Knowledge Base

Searches the broader knowledge pool (website data, public/personal knowledge folders). No attachment awareness.

Parameters:

Param Type Description
query str Search query (required)
num_results int Max results (default 10)

Two execution paths: - With auto-routed folders: KnowledgeRoutingMiddleware identifies relevant folders via embedding similarity → parallel folder search + broad search → rerank combined results. - Broad search: No folder scoping, searches the full index.

Builder methods: - with_user(user) — access control - with_routed_folders(folder_ids, folder_names) — auto-routing context - with_query_embedding(embedding) — pre-computed embedding for efficiency

When the agent uses which tool

Scenario Tool Notes
No attachments, normal question search_knowledge Auto-routing scopes to relevant folders
File attached, "summarize this" search_attachments(retrieve_all=True) No scope → all attachments
File attached, specific question search_attachments(query="...") Falls back to search_knowledge if insufficient
Folder attached, "give me a summary" search_attachments(retrieve_all=True) No scope → all attachments
"What's in report.pdf?" search_attachments(scope="report.pdf") Scoped to specific file
Attachment doesn't have answer search_knowledge Agent calls as second tool

Attachment Handling

File Attachments

Files are conversation-scoped (not per-message). The ConversationAttachment model tracks each file with a processing status pipeline: uploading → extracting → indexing → ready.

The uploaded_with_message FK records which message a file arrived with (display only, not a scope boundary). Once a file is ready, it's available for all subsequent messages in that conversation.

Folder Attachments

Folder selections are ephemeral — sent as query parameters per message, not persisted. The user selects which KnowledgeFolder IDs to scope to for each message.

Recency Context

When files are attached with the current message, the system prompt labels them distinctly:

## User Attachments

### Files
- report.pdf **<-- attached with this message**
- syllabus.docx

### Folders
- Financial Aid (id: 42)

This helps the agent distinguish "this file" (the newly attached one) from earlier attachments. The recency label only applies to deictic references ("this file", "the file I attached") — generic requests like "give me a summary" should cover all attachments without scoping.

Data flow for recency: 1. Frontend sends attachment_ids (IDs of files uploaded with this message) as a query param 2. ChatAttachmentService.get_filenames_by_ids() resolves IDs → display names 3. chat_service.py passes both attachment_filenames (all) and new_attachment_filenames (this message) to agent.run() 4. _build_attachment_context() labels new files and adds a recency hint

Document Indexing for Attachments

File attachments use lightweight metadata (filename as title, first 200 chars as summary) instead of LLM-generated metadata. This speeds up processing from ~12-18s to ~5s per file.

Attachment chunks are isolated via metadata filters: - metadata.source_type = "conversation_attachment" - metadata.conversation_id — scopes to the conversation - metadata.user_id — scopes to the user

Knowledge Routing

KnowledgeRoutingMiddleware auto-discovers relevant knowledge folders for each query using embedding similarity. This runs on every non-fast-path query and wires matched folders into search_knowledge via with_routed_folders().

When auto-routing finds relevant folders, search_knowledge_folder is removed from the tool list (the folders are already wired into search_knowledge). When no folders match, search_knowledge falls back to broad index search.

System Prompt Design

The system prompt uses a two-message pattern for caching efficiency:

  1. Static system message — the full agent prompt template (cached across requests, ~5k tokens saved on cache hits)
  2. Dynamic system message — per-request context:
  3. Attachment context (files, folders, recency labels)
  4. Knowledge routing context (matched folders)
  5. Today's date

Streaming Events

The agent emits typed events during execution for real-time UI updates:

Event Purpose
ThinkingEvent Agent reasoning step indicator
ToolCallEvent Tool invocation with name and arguments
ToolResultEvent Tool result (success/failure)
TextEvent Response text (delta or full)
DoneEvent Agent completed with summary
ErrorEvent Error with recovery info

Events are streamed as SSE-formatted strings through chat_service.py to the frontend, where agent-stream.js processes them.