diff --git a/MultiAgentQnA/.gitignore b/MultiAgentQnA/.gitignore new file mode 100644 index 0000000000..0dffe08884 --- /dev/null +++ b/MultiAgentQnA/.gitignore @@ -0,0 +1,64 @@ +# Python +__pycache__/ +*.py[cod] +*$py.class +*.so +.Python +env/ +venv/ +ENV/ +build/ +develop-eggs/ +dist/ +downloads/ +eggs/ +.eggs/ +lib/ +lib64/ +parts/ +sdist/ +var/ +wheels/ +*.egg-info/ +.installed.cfg +*.egg +*.log + +# Node +node_modules/ +npm-debug.log* +yarn-debug.log* +yarn-error.log* +.pnpm-debug.log* +package-lock.json +dist/ +.cache/ + +# Environment +.env +.env.local +.env.*.local + +# IDEs +.vscode/ +.idea/ +*.swp +*.swo +*~ + +# OS +.DS_Store +Thumbs.db + +# Build +build/ +*.so + +# Testing +.coverage +.pytest_cache/ +htmlcov/ + +# Application specific +rag_index/ + diff --git a/MultiAgentQnA/QUICKSTART.md b/MultiAgentQnA/QUICKSTART.md new file mode 100644 index 0000000000..4270bce82c --- /dev/null +++ b/MultiAgentQnA/QUICKSTART.md @@ -0,0 +1,201 @@ +# Quick Start Guide + +Get up and running with the Multi-Agent Q&A application in 5 minutes! + +## Prerequisites + +- Docker and Docker Compose installed +- Valid enterprise-inference API credentials + +## Step-by-Step Setup + +### 1. Configure Environment Variables + +Create the API environment file: + +```bash +cd multiagent-qna +cp api/env.example api/.env +``` + +Edit `api/.env` with your credentials: + +```env +BASE_URL=https://your-enterprise-inference-url.com +KEYCLOAK_CLIENT_ID=your_client_id +KEYCLOAK_CLIENT_SECRET=your_client_secret +EMBEDDING_MODEL_ENDPOINT=bge-base-en-v1.5 +INFERENCE_MODEL_ENDPOINT=Llama-3.1-8B-Instruct +EMBEDDING_MODEL_NAME=bge-base-en-v1.5 +INFERENCE_MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct +``` + +### 2. Start with Docker Compose + +```bash +cd multiagent-qna +docker-compose up --build +``` + +Wait for both services to start: +- Backend API on http://localhost:5001 +- Frontend UI on http://localhost:3000 + +### 3. Access the Application + +Open your browser and navigate to: + +``` +http://localhost:3000 +``` + +You should see the Multi-Agent Q&A interface! + +## Alternative: Local Development + +### Backend + +```bash +cd multiagent-qna/api + +# Create virtual environment (recommended) +python -m venv venv +source venv/bin/activate # On Windows: venv\Scripts\activate + +# Install dependencies +pip install -r requirements.txt + +# Run the server +uvicorn server:app --reload --host 0.0.0.0 --port 5001 +``` + +### Frontend + +```bash +cd multiagent-qna/ui + +# Install dependencies +npm install + +# Run development server +npm run dev +``` + +## Testing the Application + +### Test the Chat Interface + +1. Navigate to the Chat page +2. Type a question, for example: + - **Code**: "How do I create a Python function?" + - **RAG**: "Find information about machine learning" + - **General**: "What is the weather like?" + +3. The system will automatically route your question to the appropriate agent + +### Test the Settings + +1. Click on "Settings" in the header +2. Modify agent configurations: + - Change roles, goals, or backstories + - Adjust max iterations + - Toggle verbose mode +3. Click "Save Configuration" +4. 
Test with new questions + +## Verify Everything Works + +### Check API Health + +```bash +curl http://localhost:5001/health +``` + +Expected response: +```json +{ + "status": "healthy", + "api_configured": true +} +``` + +### Test Chat Endpoint + +```bash +curl -X POST http://localhost:5001/chat \ + -H "Content-Type: application/json" \ + -d '{"message": "Hello!"}' +``` + +Expected response: +```json +{ + "response": "Response from agent...", + "agent": "normal_agent" +} +``` + +## Common First-Time Issues + +### Port Already in Use + +**Error**: `Address already in use` + +**Solution**: +```bash +# Find and kill process on port 5001 +lsof -ti:5001 | xargs kill -9 + +# Find and kill process on port 3000 +lsof -ti:3000 | xargs kill -9 + +# Or change ports in docker-compose.yml +``` + +### Cannot Connect to Enterprise API + +**Error**: `Authentication failed` + +**Solution**: +1. Double-check your `.env` credentials +2. Verify BASE_URL is correct +3. Ensure network access to enterprise-inference API + +### UI Shows "Failed to get response" + +**Solution**: +1. Check backend logs: `docker logs multiagent-qna-backend` +2. Verify API is running: `curl http://localhost:5001/health` +3. Check browser console for errors + +## Next Steps + +- Read the [README.md](README.md) for detailed documentation +- Check [TROUBLESHOOTING.md](TROUBLESHOOTING.md) for more help +- Customize agent configurations in the Settings page +- Integrate with your own knowledge bases or APIs + +## Architecture Overview + +``` +User Query + ↓ +Orchestration Agent (routes to appropriate specialist) + ↓ +┌─────────────────────────────────────────┐ +│ │ +├─ Code Agent ─── For programming Q&A │ +├─ RAG Agent ──── For document retrieval │ +└─ Normal Agent ── For general questions │ +``` + +The system automatically detects query type and routes to the best agent! + +## Need Help? + +- Check logs: `docker logs multiagent-qna-backend` +- Review [TROUBLESHOOTING.md](TROUBLESHOOTING.md) +- Verify environment variables are set correctly + +Happy chatting! 🚀 + diff --git a/MultiAgentQnA/README.md b/MultiAgentQnA/README.md new file mode 100644 index 0000000000..7445577dae --- /dev/null +++ b/MultiAgentQnA/README.md @@ -0,0 +1,300 @@ +## Multi-Agent Q&A Application + +A sophisticated multi-agent Q&A application featuring intelligent task delegation to specialized agents with enterprise inference integration. + +## Table of Contents + +- [Project Overview](#project-overview) +- [Features](#features) +- [Architecture](#architecture) +- [Prerequisites](#prerequisites) +- [Quick Start Deployment](#quick-start-deployment) +- [User Interface](#user-interface) +- [Troubleshooting](#troubleshooting) +- [Additional Info](#additional-info) + +--- +## Project Overview + +The multiagent-qna project is a sophisticated Question & Answer application built on a multi-agent architecture. Its core function is to receive a user's query, intelligently determine the nature of the question, and delegate it to the most suitable specialized agent for generating a high-quality response. The system is designed for enterprise environments, integrating with enterprise-grade inference APIs for its language model interactions. 
+ +--- + +## Features + +- **Multi-Agent Architecture**: Orchestration agent that intelligently routes queries to specialized agents +- **Specialized Agents**: + - **Code Agent**: Handles code-related questions and programming queries + - **RAG Agent**: Retrieves and answers questions from documents + - **Normal Agent**: Handles general questions and conversations +- **Modern UI**: ChatGPT-like interface with settings management +- **Enterprise Integration**: Uses enterprise-inference API for LLM interactions +- **Configurable**: Easily configure agent roles, goals, and behavior via UI + +--- + +## Architecture + +Below is the multi-agent system architecture showing how user queries are intelligently routed to specialized agents. The orchestration layer analyzes incoming queries using keyword detection and delegates them to the appropriate agent (Code, RAG, or Normal) for processing, ensuring each query is handled by the most qualified specialist. + +```mermaid +graph TD + A[User Interface] -->|Query| B[FastAPI Backend] + B --> C[Orchestration Logic] + C -->|Keyword Analysis| D{Agent Router} + + D -->|Code Keywords| E[Code Agent] + D -->|RAG Keywords| F[RAG Agent] + D -->|Default| G[Normal Agent] + + F -->|Search| H[FAISS Vector Store] + H -->|Context| F + + E -->|Prompt + Role| I[LLM API] + F -->|Prompt + Role + Context| I + G -->|Prompt + Role| I + + I -->|Response| B + B -->|Answer| A + + J[PDF Upload] -->|Documents| H + + style C fill:#e1f5ff + style D fill:#fff4e1 + style E fill:#ffe1e1 + style F fill:#e1ffe1 + style G fill:#f0e1ff + style I fill:#ffebe1 +``` + +The application consists of: +1. **Orchestration Agent**: Analyzes user queries and delegates to appropriate specialized agents +2. **Specialized Agents**: Each handles a specific domain (code, RAG, general) +3. **API Layer**: FastAPI backend with enterprise-inference integration +4. **UI**: React-based chat interface with settings management + +**Service Components:** + +1. **React Web UI (Port 3000)** - Provides ChatGPT-like interface with settings management for configuring agent roles, goals, and behavior + +2. **FastAPI Backend (Port 5001)** - Orchestrates multi-agent system, analyzes queries using keyword detection, delegates to specialized agents (Code, RAG, Normal), and manages FAISS vector store for document retrieval + +**Typical Flow:** + +1. User submits a query through the chat interface. +2. The FastAPI backend receives the query and passes it to the orchestration logic. +3. The orchestration logic analyzes the query using keyword detection to determine intent. +4. The agent router delegates the query to the appropriate specialized agent: + - **Code-related queries** (keywords: code, function, debug, etc.) → Code Agent + - **Document-based queries** (keywords: document, PDF, search, etc.) → RAG Agent + - **General queries** → Normal Agent +5. If RAG Agent is selected: + - Searches the FAISS vector store for relevant document context + - Retrieves matching chunks to augment the prompt +6. The selected agent constructs a specialized prompt with its role, goal, and context. +7. The agent calls the enterprise LLM API with the pre-configured token and specialized prompt. +8. The LLM generates a response tailored to the agent's expertise. +9. The response is returned to the user via the UI with agent attribution showing which specialist handled the query. 
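The routing in step 4 is plain keyword matching (see `determine_agent_type()` in `api/services/agents.py`). A condensed sketch of the idea, with the keyword lists trimmed for brevity:

```python
# Condensed illustration of the backend's keyword router
# (the full keyword lists live in api/services/agents.py).
CODE_KEYWORDS = {"code", "function", "debug", "python", "algorithm"}
RAG_KEYWORDS = {"document", "pdf", "search", "retrieve", "according to"}

def route(query: str) -> str:
    """Return the agent type that should handle the query."""
    q = query.lower()
    if any(kw in q for kw in CODE_KEYWORDS):
        return "code"    # -> Code Agent
    if any(kw in q for kw in RAG_KEYWORDS):
        return "rag"     # -> RAG Agent (FAISS retrieval + LLM)
    return "normal"      # -> Normal Agent (general questions)
```

Queries that match no keyword fall through to the Normal Agent, so a routing miss degrades gracefully to a general-purpose answer.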
+ +--- + +## Prerequisites + +### System Requirements + +Before you begin, ensure you have the following installed: + +- **Docker and Docker Compose** +- **Enterprise inference endpoint access** (token-based authentication) + +### Required API Configuration + +**For Inference Service:** + +This application supports multiple inference deployment patterns: + +- **GenAI Gateway**: Provide your GenAI Gateway URL and API key +- **APISIX Gateway**: Provide your APISIX Gateway URL and authentication token + +Configuration requirements: +- INFERENCE_API_ENDPOINT: URL to your inference service (GenAI Gateway, APISIX Gateway, etc.) +- INFERENCE_API_TOKEN: Authentication token/API key for your chosen service + +### Verify Docker Installation + +```bash +# Check Docker version +docker --version + +# Check Docker Compose version +docker compose version + +# Verify Docker is running +docker ps +``` +--- + +## Quick Start Deployment + +### Clone the Repository + +```bash +git clone https://github.com/opea-project/GenAIExamples.git +cd GenAIExamples/MultiAgentQnA +``` + +### Set up the Environment + +This application requires **two `.env` files** for proper configuration: + +1. **Root `.env` file** (for Docker Compose variables) +2. **`api/.env` file** (for backend application configuration) + +#### Step 1: Create Root `.env` File + +```bash +# From the MultiAgentQnA directory +cat > .env << EOF +# Docker Compose Configuration +LOCAL_URL_ENDPOINT=not-needed +EOF +``` + +**Note:** If using a local domain (e.g., `inference.example.com` mapped to localhost), replace `not-needed` with your domain name (without `https://`). + +#### Step 2: Create `api/.env` File + +You can either copy from the example file: + +```bash +cp api/.env.example api/.env +``` + +Then edit `api/.env` with your actual credentials, **OR** create it directly: + +```bash +cat > api/.env << EOF +# Inference API Configuration +# INFERENCE_API_ENDPOINT: URL to your inference service (without /v1 suffix) +# - For GenAI Gateway: https://genai-gateway.example.com +# - For APISIX Gateway: https://apisix-gateway.example.com/inference +INFERENCE_API_ENDPOINT=https://your-actual-api-endpoint.com +INFERENCE_API_TOKEN=your-actual-token-here + +# Model Configuration +# IMPORTANT: Use the full model names as they appear in your inference service +# Check available models: curl https://your-api-endpoint.com/v1/models -H "Authorization: Bearer your-token" +EMBEDDING_MODEL_NAME=BAAI/bge-base-en-v1.5 +INFERENCE_MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct + +# Local URL Endpoint (for Docker) +LOCAL_URL_ENDPOINT=not-needed +EOF +``` + +**Important Configuration Notes:** + +- **INFERENCE_API_ENDPOINT**: Your actual inference service URL (replace `https://your-actual-api-endpoint.com`) +- **INFERENCE_API_TOKEN**: Your actual pre-generated authentication token +- **EMBEDDING_MODEL_NAME** and **INFERENCE_MODEL_NAME**: Use the exact model names from your inference service + - To check available models: `curl https://your-api-endpoint.com/v1/models -H "Authorization: Bearer your-token"` +- **LOCAL_URL_ENDPOINT**: Only needed if using local domain mapping + +**Note**: The docker-compose.yml file automatically loads environment variables from both `.env` (root) and `./api/.env` (backend) files. 
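If you prefer Python over `curl` for checking which model IDs your endpoint exposes, here is a minimal sketch; it assumes the service provides the OpenAI-compatible `/v1/models` route, which is the same API surface the backend itself calls:

```python
# Minimal sketch: list the model IDs served by your inference endpoint.
# Fill in the same endpoint/token values you put in api/.env.
import os
import requests

endpoint = os.environ["INFERENCE_API_ENDPOINT"]   # e.g. https://your-actual-api-endpoint.com
token = os.environ["INFERENCE_API_TOKEN"]

resp = requests.get(
    f"{endpoint}/v1/models",
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
resp.raise_for_status()
for model in resp.json().get("data", []):
    # Use these exact IDs for EMBEDDING_MODEL_NAME / INFERENCE_MODEL_NAME
    print(model["id"])
```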
+ +### Running the Application + +Start both API and UI services together with Docker Compose: + +```bash +# From the MultiAgentQnA directory +docker compose up --build + +# Or run in detached mode (background) +docker compose up -d --build +``` + +The API will be available at: `http://localhost:5001` +The UI will be available at: `http://localhost:3000` + +**View logs**: + +```bash +# All services +docker compose logs -f + +# Backend only +docker compose logs -f backend + +# Frontend only +docker compose logs -f frontend +``` + +**Verify the services are running**: + +```bash +# Check API health +curl http://localhost:5001/health + +# Check if containers are running +docker compose ps +``` +--- + +## User Interface + +**Using the Application** + +Make sure you are at the `http://localhost:3000` URL + +You will be directed to the main page which has each feature + + + +### Chat Interface + +1. Navigate to the chat interface +2. Type your question in the input box +3. The orchestration agent will analyze your query and route it to the appropriate specialized agent +4. View the response with agent attribution + +### Settings Page + +Configure agents: +- **Orchestration Agent**: Role, goal, and backstory +- **Code Agent**: Code-specific configuration +- **RAG Agent**: Document retrieval settings +- **Normal Agent**: General conversation settings + +**UI Configuration** + +When running with Docker Compose, the UI automatically connects to the backend API. The frontend is available at `http://localhost:3000` and the API at `http://localhost:5001`. + +For production deployments, you may want to configure a reverse proxy or update the API URL in the frontend configuration. + +### Stopping the Application + + +```bash +docker compose down +``` + +--- + +## Troubleshooting + +For comprehensive troubleshooting guidance, common issues, and solutions, refer to: + +[TROUBLESHOOTING.md](TROUBLESHOOTING.md) + +--- + +## Additional Info + +The following models have been validated with MultiAgentQnA: + +| Model | Hardware | +|-------|----------| +| **meta-llama/Llama-3.1-8B-Instruct** | Gaudi | +| **Qwen/Qwen3-4B-Instruct-2507** | Xeon | +| **BAAI/bge-base-en-v1.5** (embeddings) | Gaudi | diff --git a/MultiAgentQnA/TROUBLESHOOTING.md b/MultiAgentQnA/TROUBLESHOOTING.md new file mode 100644 index 0000000000..2d6500321d --- /dev/null +++ b/MultiAgentQnA/TROUBLESHOOTING.md @@ -0,0 +1,289 @@ +# Troubleshooting Guide + +This document addresses common issues you may encounter when running the Multi-Agent Q&A application. + +## Table of Contents + +1. [Environment Setup](#environment-setup) +2. [API Issues](#api-issues) +3. [UI Issues](#ui-issues) +4. [Agent Issues](#agent-issues) +5. [Docker Issues](#docker-issues) + +--- + +## Environment Setup + +### Issue: Missing environment variables + +**Symptoms:** +- API fails to start +- Authentication errors +- "Configuration error" messages + +**Solution:** +1. Ensure you have created `api/.env` file based on `api/env.example` +2. Verify all required variables are set: + ```bash + BASE_URL=https://your-api-url.com + KEYCLOAK_CLIENT_ID=your_client_id + KEYCLOAK_CLIENT_SECRET=your_client_secret + ``` +3. 
Check that the `.env` file is in the `api/` directory + +### Issue: Python dependencies not installing + +**Symptoms:** +- Import errors +- "Module not found" errors + +**Solution:** +```bash +cd api +pip install -r requirements.txt +``` + +--- + +## API Issues + +### Issue: Authentication fails + +**Symptoms:** +- "Authentication failed" errors +- 401 Unauthorized responses + +**Solution:** +1. Verify your `KEYCLOAK_CLIENT_SECRET` is correct +2. Check that `BASE_URL` points to the correct endpoint +3. Ensure network connectivity to the enterprise-inference API +4. Verify SSL certificate issues by checking if `verify=False` is needed in api_client.py + +### Issue: API returns empty responses + +**Symptoms:** +- No content in chat responses +- "Unexpected response structure" errors + +**Solution:** +1. Check the model endpoints in `.env`: + ```bash + EMBEDDING_MODEL_ENDPOINT=bge-base-en-v1.5 + INFERENCE_MODEL_ENDPOINT=Llama-3.1-8B-Instruct + ``` +2. Verify the inference API is returning valid responses +3. Check API logs for detailed error messages + +### Issue: API is slow to respond + +**Symptoms:** +- Long wait times for responses +- Timeout errors + +**Solution:** +1. Reduce `max_tokens` in agent configurations +2. Set `max_iter` to a lower value (e.g., 5-10 instead of 15) +3. Check network latency to enterprise-inference API + +--- + +## UI Issues + +### Issue: Cannot connect to API + +**Symptoms:** +- "Failed to get response" errors +- Network errors in browser console + +**Solution:** +1. Verify backend is running on port 5001: + ```bash + curl http://localhost:5001/health + ``` +2. Check Vite proxy configuration in `ui/vite.config.js` +3. Ensure CORS is properly configured in `api/server.py` + +### Issue: Page doesn't load + +**Symptoms:** +- Blank page +- JavaScript errors in console + +**Solution:** +1. Install dependencies: + ```bash + cd ui + npm install + ``` +2. Clear browser cache and reload +3. Check for console errors in browser DevTools + +--- + +## Agent Issues + +### Issue: Wrong agent is selected + +**Symptoms:** +- Code questions routed to general agent +- RAG questions not using retrieval + +**Solution:** +1. The routing is based on keyword detection in `agents.py` +2. Review `determine_agent_type()` function +3. Add custom keywords to improve routing accuracy + +### Issue: Agent configuration not saved + +**Symptoms:** +- Settings revert after saving +- Default configuration persists + +**Solution:** +1. Check browser console for API errors +2. Verify `/config` POST endpoint is working: + ```bash + curl -X POST http://localhost:5001/config \ + -H "Content-Type: application/json" \ + -d '{"orchestration": {"role": "Test"}}' + ``` + +--- + +## Docker Issues + +### Issue: Containers won't start + +**Symptoms:** +- `docker-compose up` fails +- Port already in use errors + +**Solution:** +1. Stop existing containers: + ```bash + docker-compose down + ``` +2. Check for port conflicts: + ```bash + # Check if ports are in use + lsof -i :5001 # API + lsof -i :3000 # UI + ``` +3. Rebuild containers: + ```bash + docker-compose up --build + ``` + +### Issue: Environment variables not loading + +**Symptoms:** +- Configuration defaults being used +- Authentication errors in containers + +**Solution:** +1. Ensure `.env` file exists in `api/` directory +2. Check docker-compose.yml references the env_file correctly: + ```yaml + env_file: + - ./api/.env + ``` +3. 
Restart containers after creating `.env`: + ```bash + docker-compose restart backend + ``` + +### Issue: Volume mounting not working + +**Symptoms:** +- Code changes not reflecting in containers +- Hot reload not working + +**Solution:** +1. Verify volume mounts in docker-compose.yml: + ```yaml + volumes: + - ./api:/app + ``` +2. Check file permissions +3. Restart containers after configuration changes + +--- + +## General Debugging Tips + +### Check Logs + +**Backend:** +```bash +# Docker +docker logs multiagent-qna-backend + +# Local +tail -f api/logs/app.log +``` + +**Frontend:** +```bash +# Docker +docker logs multiagent-qna-frontend + +# Local - check browser DevTools console +``` + +### Test API Endpoints + +```bash +# Health check +curl http://localhost:5001/health + +# Configuration +curl http://localhost:5001/config + +# Chat +curl -X POST http://localhost:5001/chat \ + -H "Content-Type: application/json" \ + -d '{"message": "Hello!"}' +``` + +### Common Python Debugging + +```python +# Add verbose logging +import logging +logging.basicConfig(level=logging.DEBUG) + +# Check agent configuration +from services.agents import get_code_agent +agent = get_code_agent() +print(agent.role, agent.goal) +``` + +--- + +## Performance Optimization + +1. **Reduce Token Usage:** + - Lower `max_tokens` in API calls + - Reduce `max_iter` for agents + - Use caching where possible + +2. **Optimize Agent Routing:** + - Fine-tune keyword detection + - Consider using more sophisticated classification + +3. **Batch Processing:** + - Process multiple queries in parallel + - Use connection pooling for API calls + +--- + +## Still Having Issues? + +1. Check the [README.md](README.md) for setup instructions +2. Review error logs in detail +3. Verify all prerequisites are met +4. Ensure network connectivity to enterprise-inference API +5. Test API endpoints independently + +For additional help, please check the project documentation or contact support. 
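If you want a single script that exercises the main endpoints in one pass, the following sketch (it only uses the routes documented above and assumes the default port 5001) can quickly tell you whether the problem is in the backend or in the UI:

```python
# Sketch of an end-to-end smoke test against the backend.
import requests

BASE = "http://localhost:5001"

health = requests.get(f"{BASE}/health", timeout=10).json()
print("health:", health)                 # expect {"status": "healthy", "api_configured": true}

cfg = requests.get(f"{BASE}/config", timeout=10).json()
print("agents configured:", list(cfg["config"].keys()))

chat = requests.post(f"{BASE}/chat", json={"message": "Hello!"}, timeout=120).json()
print("routed to:", chat["agent"])       # e.g. normal_agent
print("response:", chat["response"][:200])
```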
+ diff --git a/MultiAgentQnA/api/.env.example b/MultiAgentQnA/api/.env.example new file mode 100644 index 0000000000..b632f68128 --- /dev/null +++ b/MultiAgentQnA/api/.env.example @@ -0,0 +1,22 @@ +# Inference API Configuration +# INFERENCE_API_ENDPOINT: URL to your inference service (without /v1 suffix) +# - For GenAI Gateway: https://genai-gateway.example.com +# - For APISIX Gateway: https://apisix-gateway.example.com/inference +# +# INFERENCE_API_TOKEN: Authentication token/API key for the inference service +# - For GenAI Gateway: Your GenAI Gateway API key +# - For APISIX Gateway: Your APISIX authentication token +INFERENCE_API_ENDPOINT=https://api.example.com +INFERENCE_API_TOKEN=your-pre-generated-token-here + +# Model Configuration +# IMPORTANT: Use the full model names as they appear in your inference service +# Check available models: curl https://your-api-endpoint.com/v1/models -H "Authorization: Bearer your-token" +EMBEDDING_MODEL_NAME=BAAI/bge-base-en-v1.5 +INFERENCE_MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct + +# Local URL Endpoint (only needed for non-public domains) +# If using a local domain like inference.example.com mapped to localhost: +# Set this to: inference.example.com (domain without https://) +# If using a public domain, set any placeholder value like: not-needed +LOCAL_URL_ENDPOINT=not-needed diff --git a/MultiAgentQnA/api/Dockerfile b/MultiAgentQnA/api/Dockerfile new file mode 100644 index 0000000000..e957a3ba75 --- /dev/null +++ b/MultiAgentQnA/api/Dockerfile @@ -0,0 +1,20 @@ +FROM python:3.10-slim + +# Set the working directory in the container +WORKDIR /app + +# Copy requirements file +COPY requirements.txt . + +# Install Python dependencies +RUN pip install --no-cache-dir -r requirements.txt + +# Copy the rest of the application files into the container +COPY . . 
+ +# Expose the port the service runs on +EXPOSE 5001 + +# Command to run the application +CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "5001", "--reload"] + diff --git a/MultiAgentQnA/api/__init__.py b/MultiAgentQnA/api/__init__.py new file mode 100644 index 0000000000..25f9058b60 --- /dev/null +++ b/MultiAgentQnA/api/__init__.py @@ -0,0 +1,4 @@ +""" +Multi-Agent Q&A API +""" + diff --git a/MultiAgentQnA/api/config.py b/MultiAgentQnA/api/config.py new file mode 100644 index 0000000000..8d48da0b7a --- /dev/null +++ b/MultiAgentQnA/api/config.py @@ -0,0 +1,36 @@ +""" +Configuration settings for Multi-Agent Q&A API +""" + +import os +from dotenv import load_dotenv + +# Load environment variables from .env file +load_dotenv() + +# Inference API Configuration +# Supports multiple inference deployment patterns: +# - GenAI Gateway: Provide your GenAI Gateway URL and API key +# - APISIX Gateway: Provide your APISIX Gateway URL and authentication token +INFERENCE_API_ENDPOINT = os.getenv("INFERENCE_API_ENDPOINT", "https://api.example.com") +INFERENCE_API_TOKEN = os.getenv("INFERENCE_API_TOKEN") + +# Model Configuration +EMBEDDING_MODEL_NAME = os.getenv("EMBEDDING_MODEL_NAME", "bge-base-en-v1.5") +INFERENCE_MODEL_NAME = os.getenv("INFERENCE_MODEL_NAME", "meta-llama/Llama-3.1-8B-Instruct") + +# Validate required configuration +if not INFERENCE_API_ENDPOINT or not INFERENCE_API_TOKEN: + raise ValueError("INFERENCE_API_ENDPOINT and INFERENCE_API_TOKEN must be set in environment variables") + +# Application Settings +APP_TITLE = "Multi-Agent Q&A" +APP_DESCRIPTION = "A multi-agent Q&A system using CrewAI" +APP_VERSION = "1.0.0" + +# CORS Settings +CORS_ALLOW_ORIGINS = ["*"] # Update with specific origins in production +CORS_ALLOW_CREDENTIALS = True +CORS_ALLOW_METHODS = ["*"] +CORS_ALLOW_HEADERS = ["*"] + diff --git a/MultiAgentQnA/api/env.example b/MultiAgentQnA/api/env.example new file mode 100644 index 0000000000..c3630cfacf --- /dev/null +++ b/MultiAgentQnA/api/env.example @@ -0,0 +1,15 @@ +# Enterprise Inference API Configuration +BASE_URL=https://api.example.com +KEYCLOAK_REALM=master +KEYCLOAK_CLIENT_ID=api +KEYCLOAK_CLIENT_SECRET=your_secret_here + +# Model Endpoints +EMBEDDING_MODEL_ENDPOINT=bge-base-en-v1.5 +INFERENCE_MODEL_ENDPOINT=Llama-3.1-8B-Instruct +EMBEDDING_MODEL_NAME=bge-base-en-v1.5 +INFERENCE_MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct + +# Optional: OpenAI API Key (alternative to enterprise-inference) +# OPENAI_API_KEY=sk-... 
+ diff --git a/MultiAgentQnA/api/models.py b/MultiAgentQnA/api/models.py new file mode 100644 index 0000000000..94a7f3e69c --- /dev/null +++ b/MultiAgentQnA/api/models.py @@ -0,0 +1,70 @@ +""" +Pydantic models for request/response validation +""" + +from pydantic import BaseModel, Field +from typing import Optional, Dict, Any + + +class ChatMessage(BaseModel): + """Individual chat message""" + role: str = Field(..., description="Message role (user or assistant)") + content: str = Field(..., description="Message content") + agent: Optional[str] = Field(None, description="Agent that generated the response") + + +class ChatRequest(BaseModel): + """Request model for chat messages""" + message: str = Field(..., min_length=1, description="User message") + agent_config: Optional[Dict[str, Any]] = Field(None, description="Optional agent configuration") + + class Config: + json_schema_extra = { + "example": { + "message": "How do I implement a recursive function in Python?", + "agent_config": None + } + } + + +class ChatResponse(BaseModel): + """Response model for chat messages""" + response: str = Field(..., description="Agent response") + agent: str = Field(..., description="Agent that generated the response") + + class Config: + json_schema_extra = { + "example": { + "response": "A recursive function calls itself...", + "agent": "code_agent" + } + } + + +class AgentConfig(BaseModel): + """Configuration for a single agent""" + role: str = Field(..., description="Agent role") + goal: str = Field(..., description="Agent goal") + backstory: str = Field(..., description="Agent backstory") + max_iter: Optional[int] = Field(15, description="Maximum iterations") + verbose: Optional[bool] = Field(True, description="Verbose output") + + +class AgentConfigs(BaseModel): + """Configuration for all agents""" + orchestration: Optional[AgentConfig] = None + code: Optional[AgentConfig] = None + rag: Optional[AgentConfig] = None + normal: Optional[AgentConfig] = None + + +class HealthResponse(BaseModel): + """Response model for health check""" + status: str = Field(..., description="Health status") + api_configured: bool = Field(..., description="Whether API is configured") + + +class ConfigResponse(BaseModel): + """Response model for configuration""" + config: Dict[str, Any] = Field(..., description="Current agent configuration") + diff --git a/MultiAgentQnA/api/rag_index/documents.pkl b/MultiAgentQnA/api/rag_index/documents.pkl new file mode 100644 index 0000000000..f3cb9736fe Binary files /dev/null and b/MultiAgentQnA/api/rag_index/documents.pkl differ diff --git a/MultiAgentQnA/api/rag_index/index.faiss b/MultiAgentQnA/api/rag_index/index.faiss new file mode 100644 index 0000000000..93684dd23d Binary files /dev/null and b/MultiAgentQnA/api/rag_index/index.faiss differ diff --git a/MultiAgentQnA/api/requirements.txt b/MultiAgentQnA/api/requirements.txt new file mode 100644 index 0000000000..a040c46117 --- /dev/null +++ b/MultiAgentQnA/api/requirements.txt @@ -0,0 +1,14 @@ +fastapi>=0.109.0 +uvicorn[standard]>=0.27.0 +python-dotenv>=1.0.0 +openai>=1.10.0 +python-multipart>=0.0.6 +pydantic>=2.5.0 +pydantic-settings>=2.1.0 +cryptography>=3.1.0 +httpx>=0.24.0 +requests>=2.31.0 +pypdf>=3.17.0 +faiss-cpu>=1.7.4 +numpy>=1.24.0 + diff --git a/MultiAgentQnA/api/server.py b/MultiAgentQnA/api/server.py new file mode 100644 index 0000000000..2a7fc0db2c --- /dev/null +++ b/MultiAgentQnA/api/server.py @@ -0,0 +1,290 @@ +""" +FastAPI server with routes for Multi-Agent Q&A API +""" + +import logging +import os +import 
tempfile +from contextlib import asynccontextmanager +from fastapi import FastAPI, HTTPException, status, UploadFile, File +from fastapi.middleware.cors import CORSMiddleware + +import config +from models import ( + ChatRequest, ChatResponse, HealthResponse, + AgentConfigs, ConfigResponse +) +from services import process_query, update_agent_configs +from services.rag_service import get_rag_service + +# Configure logging +logging.basicConfig( + level=logging.INFO, + format='%(asctime)s - %(name)s - %(levelname)s - %(message)s' +) +logger = logging.getLogger(__name__) + +# In-memory log store for agent activity +agent_activity_logs = [] + + +@asynccontextmanager +async def lifespan(app: FastAPI): + """Lifespan context manager for FastAPI app""" + # Startup + logger.info("Initializing Multi-Agent Q&A API") + + yield + + # Shutdown + logger.info("Shutting down Multi-Agent Q&A API") + + +# Initialize FastAPI app +app = FastAPI( + title=config.APP_TITLE, + description=config.APP_DESCRIPTION, + version=config.APP_VERSION, + lifespan=lifespan +) + +# Add CORS middleware +app.add_middleware( + CORSMiddleware, + allow_origins=config.CORS_ALLOW_ORIGINS, + allow_credentials=config.CORS_ALLOW_CREDENTIALS, + allow_methods=config.CORS_ALLOW_METHODS, + allow_headers=config.CORS_ALLOW_HEADERS, +) + + +# ==================== Routes ==================== + +@app.get("/") +def root(): + """Health check endpoint""" + return { + "message": "Multi-Agent Q&A API is running", + "version": config.APP_VERSION, + "status": "healthy" + } + + +@app.get("/health", response_model=HealthResponse) +def health_check(): + """Detailed health check""" + api_configured = bool(config.INFERENCE_API_TOKEN) + return HealthResponse( + status="healthy", + api_configured=api_configured + ) + + +@app.post("/chat", response_model=ChatResponse) +def chat_endpoint(request: ChatRequest): + """ + Process a chat message using the multi-agent system + + - **message**: User message/query + - **agent_config**: Optional agent configuration override + """ + if not request.message or not request.message.strip(): + raise HTTPException( + status_code=status.HTTP_400_BAD_REQUEST, + detail="Message cannot be empty" + ) + + try: + # Process the query with the appropriate agent + response, agent = process_query( + query=request.message, + agent_config=request.agent_config + ) + + return ChatResponse( + response=response, + agent=agent + ) + + except Exception as e: + logger.error(f"Error processing query: {str(e)}", exc_info=True) + raise HTTPException( + status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, + detail=f"Error processing query: {str(e)}" + ) + + +@app.get("/config", response_model=ConfigResponse) +def get_config(): + """Get current agent configuration""" + from services.agents import ( + DEFAULT_ORCHESTRATION_CONFIG, + DEFAULT_CODE_CONFIG, + DEFAULT_RAG_CONFIG, + DEFAULT_NORMAL_CONFIG + ) + + return ConfigResponse( + config={ + "orchestration": DEFAULT_ORCHESTRATION_CONFIG, + "code": DEFAULT_CODE_CONFIG, + "rag": DEFAULT_RAG_CONFIG, + "normal": DEFAULT_NORMAL_CONFIG + } + ) + + +@app.post("/config") +def update_config(configs: AgentConfigs): + """Update agent configurations""" + try: + config_dict = {} + + if configs.orchestration: + config_dict["orchestration"] = { + "role": configs.orchestration.role, + "goal": configs.orchestration.goal, + "backstory": configs.orchestration.backstory, + "max_iter": configs.orchestration.max_iter, + "verbose": configs.orchestration.verbose + } + + if configs.code: + config_dict["code"] = { + "role": 
configs.code.role, + "goal": configs.code.goal, + "backstory": configs.code.backstory, + "max_iter": configs.code.max_iter, + "verbose": configs.code.verbose + } + + if configs.rag: + config_dict["rag"] = { + "role": configs.rag.role, + "goal": configs.rag.goal, + "backstory": configs.rag.backstory, + "max_iter": configs.rag.max_iter, + "verbose": configs.rag.verbose + } + + if configs.normal: + config_dict["normal"] = { + "role": configs.normal.role, + "goal": configs.normal.goal, + "backstory": configs.normal.backstory, + "max_iter": configs.normal.max_iter, + "verbose": configs.normal.verbose + } + + update_agent_configs(config_dict) + + return {"message": "Configuration updated successfully", "status": "success"} + + except Exception as e: + logger.error(f"Error updating configuration: {str(e)}") + raise HTTPException( + status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, + detail=f"Error updating configuration: {str(e)}" + ) + + +@app.get("/logs") +def get_logs(): + """Get agent activity logs""" + from services.agents import activity_logs + + # Return last 100 logs + return { + "logs": activity_logs[-100:], + "total": len(activity_logs) + } + + +@app.post("/rag/upload") +def upload_pdf(file: UploadFile = File(...)): + """ + Upload a PDF file and build RAG index + + - **file**: PDF file to upload and index + """ + if not file.filename.endswith('.pdf'): + raise HTTPException( + status_code=status.HTTP_400_BAD_REQUEST, + detail="Only PDF files are supported" + ) + + try: + # Save uploaded file temporarily + with tempfile.NamedTemporaryFile(delete=False, suffix='.pdf') as tmp_file: + content = file.file.read() + tmp_file.write(content) + tmp_path = tmp_file.name + + try: + # Process PDF + rag_service = get_rag_service() + chunks = rag_service.process_pdf(tmp_path) + + # Build index + rag_service.build_index(chunks) + + logger.info(f"Successfully indexed {len(chunks)} chunks from {file.filename}") + + return { + "message": f"Successfully uploaded and indexed PDF: {file.filename}", + "filename": file.filename, + "chunks": len(chunks), + "status": "success" + } + + finally: + # Clean up temporary file + if os.path.exists(tmp_path): + os.unlink(tmp_path) + + except Exception as e: + logger.error(f"Error uploading PDF: {str(e)}") + raise HTTPException( + status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, + detail=f"Error processing PDF: {str(e)}" + ) + + +@app.get("/rag/status") +def get_rag_status(): + """Get RAG index status""" + try: + rag_service = get_rag_service() + status_info = rag_service.get_status() + return status_info + except Exception as e: + logger.error(f"Error getting RAG status: {str(e)}") + raise HTTPException( + status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, + detail=f"Error getting status: {str(e)}" + ) + + +@app.delete("/rag/index") +def delete_rag_index(): + """Delete the RAG index""" + try: + rag_service = get_rag_service() + deleted = rag_service.delete_index() + return { + "message": "RAG index deleted successfully" if deleted else "No index to delete", + "status": "success" + } + except Exception as e: + logger.error(f"Error deleting RAG index: {str(e)}") + raise HTTPException( + status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, + detail=f"Error deleting index: {str(e)}" + ) + + +# Entry point for running with uvicorn +if __name__ == "__main__": + import uvicorn + uvicorn.run(app, host="0.0.0.0", port=5001) + diff --git a/MultiAgentQnA/api/services/__init__.py b/MultiAgentQnA/api/services/__init__.py new file mode 100644 index 0000000000..9d574a1f0f --- 
/dev/null +++ b/MultiAgentQnA/api/services/__init__.py @@ -0,0 +1,25 @@ +""" +Services module for Multi-Agent Q&A +""" + +from .api_client import get_api_client, APIClient +from .agents import ( + get_orchestration_agent, + get_code_agent, + get_rag_agent, + get_normal_agent, + update_agent_configs, + process_query +) + +__all__ = [ + "get_api_client", + "APIClient", + "get_orchestration_agent", + "get_code_agent", + "get_rag_agent", + "get_normal_agent", + "update_agent_configs", + "process_query" +] + diff --git a/MultiAgentQnA/api/services/agents.py b/MultiAgentQnA/api/services/agents.py new file mode 100644 index 0000000000..f8e756c072 --- /dev/null +++ b/MultiAgentQnA/api/services/agents.py @@ -0,0 +1,303 @@ +""" +Multi-Agent Q&A System +Simplified agent implementation without CrewAI dependency +""" + +import logging +from typing import Dict, Any, Optional +from datetime import datetime +from services.api_client import get_api_client +from services.rag_service import get_rag_service + +logger = logging.getLogger(__name__) + +# Activity logs for agent interactions +activity_logs = [] + + +def add_activity_log(message: str, log_type: str = "info"): + """Add an activity log entry""" + from datetime import datetime + activity_logs.append({ + "timestamp": datetime.now().isoformat(), + "type": log_type, + "message": message + }) + # Keep only last 500 logs + if len(activity_logs) > 500: + activity_logs.pop(0) + +# Default configurations +DEFAULT_ORCHESTRATION_CONFIG = { + "role": "Orchestration Coordinator", + "goal": "Analyze user queries and delegate them to the most appropriate specialized agent", + "backstory": "You are an expert coordinator who understands different types of questions. You excel at categorizing queries and routing them to the right specialist: code questions to developers, document questions to researchers, and general questions to assistants.", + "max_tokens": 500, + "temperature": 0.7 +} + +DEFAULT_CODE_CONFIG = { + "role": "Senior Software Developer", + "goal": "Answer coding questions with accurate, practical, and well-explained solutions", + "backstory": "You are an experienced software engineer with expertise in multiple programming languages. You provide clear, working code examples and best practices.", + "max_tokens": 500, + "temperature": 0.5 +} + +DEFAULT_RAG_CONFIG = { + "role": "Research Assistant", + "goal": "Retrieve information from documents and provide accurate answers", + "backstory": "You are a skilled researcher who excels at finding relevant information from knowledge bases and synthesizing comprehensive answers.", + "max_tokens": 800, + "temperature": 0.7 +} + +DEFAULT_NORMAL_CONFIG = { + "role": "Helpful Assistant", + "goal": "Provide clear, accurate, and helpful answers to general questions", + "backstory": "You are a knowledgeable assistant who loves helping people with their questions. 
You provide thoughtful and informative responses.", + "max_tokens": 500, + "temperature": 0.7 +} + + +def get_orchestration_agent(config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]: + """Get orchestration agent configuration""" + return config if config else DEFAULT_ORCHESTRATION_CONFIG + + +def get_code_agent(config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]: + """Get code agent configuration""" + return config if config else DEFAULT_CODE_CONFIG + + +def get_rag_agent(config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]: + """Get RAG agent configuration""" + return config if config else DEFAULT_RAG_CONFIG + + +def get_normal_agent(config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]: + """Get normal agent configuration""" + return config if config else DEFAULT_NORMAL_CONFIG + + +def determine_agent_type(query: str, verbose: bool = True) -> tuple[str, str]: + """ + Determine which agent should handle the query + + Args: + query: User query + verbose: Whether to log the decision + + Returns: + Tuple of (agent_type, reasoning) + """ + query_lower = query.lower() + + # Check for code-related keywords + code_keywords = ['code', 'programming', 'function', 'variable', 'debug', 'error', + 'python', 'javascript', 'java', 'c++', 'git', 'repo', 'repository', + 'algorithm', 'data structure', 'api', 'syntax', 'compile', 'test'] + if any(keyword in query_lower for keyword in code_keywords): + matched_keywords = [kw for kw in code_keywords if kw in query_lower] + reasoning = f"Code keywords detected: {', '.join(matched_keywords[:3])}" + if verbose: + logger.info(f"🔍 ORCHESTRATION: {reasoning} → Routing to Code Agent") + add_activity_log(f"🔍 {reasoning} → Routing to Code Agent", "info") + return 'code', reasoning + + # Check for document/retrieval-related keywords + rag_keywords = ['document', 'file', 'pdf', 'retrieve', 'search', 'find in', + 'according to', 'read', 'extract', 'index'] + if any(keyword in query_lower for keyword in rag_keywords): + matched_keywords = [kw for kw in rag_keywords if kw in query_lower] + reasoning = f"RAG keywords detected: {', '.join(matched_keywords[:3])}" + if verbose: + logger.info(f"🔍 ORCHESTRATION: {reasoning} → Routing to RAG Agent") + add_activity_log(f"🔍 {reasoning} → Routing to RAG Agent", "info") + return 'rag', reasoning + + # Default to normal agent + reasoning = "No specialized keywords found, using general agent" + if verbose: + logger.info(f"🔍 ORCHESTRATION: {reasoning} → Routing to General Agent") + add_activity_log(f"🔍 {reasoning} → Routing to General Agent", "info") + return 'normal', reasoning + + +def process_query(query: str, agent_config: Optional[Dict[str, Any]] = None, verbose: bool = True) -> tuple[str, str]: + """ + Process a query using the appropriate agent with full logging + + Args: + query: User query + agent_config: Optional agent configuration + verbose: Whether to log agent interactions + + Returns: + Tuple of (response, agent_name) + """ + # Step 1: Orchestration - Determine which agent to use + if verbose: + logger.info("=" * 80) + logger.info("🎯 ORCHESTRATION: Analyzing user query") + logger.info(f"📝 Query: {query}") + logger.info("=" * 80) + add_activity_log("🎯 ORCHESTRATION: Analyzing user query", "info") + add_activity_log(f"📝 Query: {query}", "info") + + agent_type, reasoning = determine_agent_type(query, verbose=verbose) + + try: + # Get agent configurations if provided + code_config = None + rag_config = None + normal_config = None + + if agent_config: + code_config = agent_config.get("code") + 
rag_config = agent_config.get("rag") + normal_config = agent_config.get("normal") + + # Get the appropriate agent config + if agent_type == 'code': + agent_config_data = get_code_agent(code_config) + agent_name = 'code_agent' + elif agent_type == 'rag': + agent_config_data = get_rag_agent(rag_config) + agent_name = 'rag_agent' + else: + agent_config_data = get_normal_agent(normal_config) + agent_name = 'normal_agent' + + # Step 2: Agent configuration + if verbose: + logger.info("") + logger.info(f"🤖 AGENT SELECTED: {agent_name}") + logger.info(f" Role: {agent_config_data.get('role', 'Assistant')}") + logger.info(f" Goal: {agent_config_data.get('goal', 'Help the user')}") + logger.info("") + add_activity_log(f"🤖 AGENT SELECTED: {agent_name}", "info") + add_activity_log(f" Role: {agent_config_data.get('role', 'Assistant')}", "info") + + # Build the prompt with agent context + role = agent_config_data.get("role", "Assistant") + goal = agent_config_data.get("goal", "Help the user") + backstory = agent_config_data.get("backstory", "You are a helpful assistant") + + # Step 3: Agent processing + if verbose: + logger.info("💭 AGENT THINKING: Processing query with agent-specific context...") + add_activity_log("💭 AGENT THINKING: Processing query with agent-specific context...", "info") + + api_client = get_api_client() + + # Handle RAG agent differently - search documents first + if agent_type == 'rag': + try: + rag_service = get_rag_service() + if verbose: + add_activity_log("🔍 RAG: Searching document index...", "info") + + # Search for relevant documents + results = rag_service.search(query, k=3) + + if results: + # Build context from retrieved documents + context_parts = [] + for i, result in enumerate(results, 1): + doc_text = result['document']['text'] + similarity = result['similarity'] + context_parts.append(f"Document {i} (similarity: {similarity:.2f}):\n{doc_text}") + + context = "\n\n".join(context_parts) + + if verbose: + add_activity_log(f"📄 RAG: Found {len(results)} relevant documents", "info") + logger.info(f"RAG retrieval: Found {len(results)} relevant documents") + + system_prompt = f"""You are a {role}. + +Your goal: {goal} + +{backstory} + +Use the following retrieved documents to answer the question. If the documents don't contain relevant information, say so. + +Retrieved Documents: +{context} + +Now answer the following question based on the documents above:""" + else: + if verbose: + add_activity_log("⚠️ RAG: No documents found in index", "warning") + + system_prompt = f"""You are a {role}. + +Your goal: {goal} + +{backstory} + +Note: No documents are currently indexed. You cannot answer questions about documents until documents are uploaded. + +Now answer the following question:""" + except Exception as e: + logger.error(f"Error in RAG retrieval: {str(e)}") + if verbose: + add_activity_log(f"❌ RAG Error: {str(e)}", "error") + + # Fall back to normal prompt + system_prompt = f"""You are a {role}. + +Your goal: {goal} + +{backstory} + +Now answer the following question:""" + else: + # For non-RAG agents, use standard prompt + system_prompt = f"""You are a {role}. 
+ +Your goal: {goal} + +{backstory} + +Now answer the following question:""" + + messages = [ + {"role": "system", "content": system_prompt}, + {"role": "user", "content": query} + ] + + response = api_client.chat_complete( + messages=messages, + max_tokens=agent_config_data.get("max_tokens", 500), + temperature=agent_config_data.get("temperature", 0.7) + ) + + if verbose: + logger.info(f"✅ AGENT RESPONSE: Generated {len(str(response))} characters") + logger.info("=" * 80) + logger.info("") + add_activity_log(f"✅ AGENT RESPONSE: Generated {len(str(response))} characters", "success") + + return str(response), agent_name + + except Exception as e: + error_msg = f"❌ ERROR: {str(e)}" + logger.error(error_msg, exc_info=True) + add_activity_log(error_msg, "error") + raise + + +def update_agent_configs(configs: Dict[str, Any]) -> None: + """ + Update all agent configurations (stored as defaults for future queries) + + Args: + configs: Dictionary containing agent configurations + """ + # In this simplified version, configs are used at query time + # This function exists for API compatibility + logger.info("Agent configurations updated") + diff --git a/MultiAgentQnA/api/services/api_client.py b/MultiAgentQnA/api/services/api_client.py new file mode 100644 index 0000000000..833afdafbb --- /dev/null +++ b/MultiAgentQnA/api/services/api_client.py @@ -0,0 +1,185 @@ +""" +API Client for authentication and API calls +Similar to rag-chatbot implementation +""" + +import logging +import requests +import httpx +from typing import Optional +import config + +logger = logging.getLogger(__name__) + + +class APIClient: + """ + Client for handling API calls with token-based authentication + """ + + def __init__(self): + self.base_url = config.INFERENCE_API_ENDPOINT + self.token = config.INFERENCE_API_TOKEN + self.http_client = httpx.Client(verify=False) + logger.info(f"✓ API Client initialized with endpoint: {self.base_url}") + + def get_embedding_client(self): + """ + Get OpenAI-style client for embeddings + Uses bge-base-en-v1.5 model + """ + from openai import OpenAI + + return OpenAI( + api_key=self.token, + base_url=f"{self.base_url}/v1", + http_client=self.http_client + ) + + def get_inference_client(self): + """ + Get OpenAI-style client for inference/completions + Uses Llama-3.1-8B-Instruct model + """ + from openai import OpenAI + + return OpenAI( + api_key=self.token, + base_url=f"{self.base_url}/v1", + http_client=self.http_client + ) + + def embed_text(self, text: str) -> list: + """ + Get embedding for text + Uses the bge-base-en-v1.5 embedding model + + Args: + text: Text to embed + + Returns: + List of embedding values + """ + try: + client = self.get_embedding_client() + # Call the embeddings endpoint + response = client.embeddings.create( + model=config.EMBEDDING_MODEL_NAME, + input=text + ) + return response.data[0].embedding + except Exception as e: + logger.error(f"Error generating embedding: {str(e)}") + raise + + def embed_texts(self, texts: list) -> list: + """ + Get embeddings for multiple texts + Batches requests to avoid exceeding API limits (max batch size: 32) + + Args: + texts: List of texts to embed + + Returns: + List of embedding vectors + """ + try: + BATCH_SIZE = 32 # Maximum allowed batch size + all_embeddings = [] + client = self.get_embedding_client() + + # Process in batches of 32 + for i in range(0, len(texts), BATCH_SIZE): + batch = texts[i:i + BATCH_SIZE] + logger.info(f"Processing embedding batch {i//BATCH_SIZE + 1}/{(len(texts) + BATCH_SIZE - 1)//BATCH_SIZE} 
({len(batch)} texts)") + + response = client.embeddings.create( + model=config.EMBEDDING_MODEL_NAME, + input=batch + ) + batch_embeddings = [data.embedding for data in response.data] + all_embeddings.extend(batch_embeddings) + + return all_embeddings + except Exception as e: + logger.error(f"Error generating embeddings: {str(e)}") + raise + + def chat_complete(self, messages: list, max_tokens: int = 500, temperature: float = 0.7) -> str: + """ + Get chat completion from the inference model + + Args: + messages: List of message dicts with 'role' and 'content' + max_tokens: Maximum tokens to generate + temperature: Temperature for generation + + Returns: + Generated text + """ + try: + client = self.get_inference_client() + # Convert messages to a prompt for the completions endpoint + # (since Llama models use completions, not chat.completions) + prompt = "" + for msg in messages: + role = msg.get('role', 'user') + content = msg.get('content', '') + if role == 'system': + prompt += f"System: {content}\n\n" + elif role == 'user': + prompt += f"User: {content}\n\n" + elif role == 'assistant': + prompt += f"Assistant: {content}\n\n" + prompt += "Assistant:" + + logger.info(f"Calling inference with prompt length: {len(prompt)}") + + response = client.completions.create( + model=config.INFERENCE_MODEL_NAME, + prompt=prompt, + max_tokens=max_tokens, + temperature=temperature + ) + + # Handle response structure + if hasattr(response, 'choices') and len(response.choices) > 0: + choice = response.choices[0] + if hasattr(choice, 'text'): + return choice.text + elif hasattr(choice, 'message') and hasattr(choice.message, 'content'): + return choice.message.content + else: + logger.error(f"Unexpected response structure: {type(choice)}, {choice}") + return str(choice) + else: + logger.error(f"Unexpected response: {type(response)}, {response}") + return "" + except Exception as e: + logger.error(f"Error generating chat completion: {str(e)}", exc_info=True) + raise + + def __del__(self): + """ + Cleanup: close httpx client + """ + if self.http_client: + self.http_client.close() + + +# Global API client instance +_api_client: Optional[APIClient] = None + + +def get_api_client() -> APIClient: + """ + Get or create the global API client instance + + Returns: + APIClient instance + """ + global _api_client + if _api_client is None: + _api_client = APIClient() + return _api_client + diff --git a/MultiAgentQnA/api/services/rag_service.py b/MultiAgentQnA/api/services/rag_service.py new file mode 100644 index 0000000000..874b7d1439 --- /dev/null +++ b/MultiAgentQnA/api/services/rag_service.py @@ -0,0 +1,293 @@ +""" +RAG Service for PDF processing and vector store management +Handles PDF parsing, text chunking, and FAISS vector operations +""" + +import os +import logging +import shutil +from typing import List, Optional, Dict, Any +import numpy as np +import faiss +from pypdf import PdfReader +import config +from services.api_client import get_api_client + +logger = logging.getLogger(__name__) + +# Constants +VECTOR_STORE_DIR = "./rag_index" +CHUNK_SIZE = 1000 # Characters per chunk +CHUNK_OVERLAP = 200 # Overlap between chunks + + +class RAGService: + """ + Service for RAG operations: PDF processing, embedding, and retrieval + """ + + def __init__(self): + self.api_client = get_api_client() + self.vector_store_path = VECTOR_STORE_DIR + self.index: Optional[faiss.Index] = None + self.documents: List[Dict[str, Any]] = [] + self._ensure_directory() + + def _ensure_directory(self): + """Ensure the vector store 
directory exists""" + os.makedirs(self.vector_store_path, exist_ok=True) + + def process_pdf(self, pdf_path: str) -> List[Dict[str, Any]]: + """ + Process a PDF file and extract text chunks + + Args: + pdf_path: Path to the PDF file + + Returns: + List of document chunks with metadata + """ + try: + logger.info(f"Processing PDF: {pdf_path}") + reader = PdfReader(pdf_path) + + # Extract text from all pages + full_text = "" + for i, page in enumerate(reader.pages): + page_text = page.extract_text() + if page_text: + full_text += f"\n--- Page {i+1} ---\n{page_text}\n" + + if not full_text.strip(): + raise ValueError("No text extracted from PDF") + + # Chunk the text + chunks = self._chunk_text(full_text, metadata={"source": pdf_path}) + + logger.info(f"Extracted {len(chunks)} chunks from PDF") + return chunks + + except Exception as e: + logger.error(f"Error processing PDF: {str(e)}") + raise + + def _chunk_text(self, text: str, metadata: Dict[str, Any]) -> List[Dict[str, Any]]: + """ + Split text into overlapping chunks + + Args: + text: Text to chunk + metadata: Metadata to attach to chunks + + Returns: + List of chunk dictionaries + """ + chunks = [] + + # Split by paragraphs first + paragraphs = text.split('\n\n') + current_chunk = "" + current_size = 0 + + for para in paragraphs: + para_size = len(para) + + # If adding this paragraph would exceed chunk size, save current chunk + if current_size + para_size > CHUNK_SIZE and current_chunk: + chunks.append({ + "text": current_chunk, + "metadata": {**metadata, "chunk_index": len(chunks)} + }) + + # Start new chunk with overlap + overlap_text = current_chunk[-CHUNK_OVERLAP:] if len(current_chunk) > CHUNK_OVERLAP else current_chunk + current_chunk = overlap_text + " " + para + current_size = len(current_chunk) + else: + current_chunk += "\n\n" + para if current_chunk else para + current_size = len(current_chunk) + + # Add the last chunk + if current_chunk: + chunks.append({ + "text": current_chunk, + "metadata": {**metadata, "chunk_index": len(chunks)} + }) + + return chunks + + def build_index(self, chunks: List[Dict[str, Any]]) -> None: + """ + Create embeddings and build FAISS index + + Args: + chunks: List of document chunks + """ + try: + logger.info(f"Building FAISS index with {len(chunks)} chunks") + + # Extract texts + texts = [chunk["text"] for chunk in chunks] + + # Get embeddings from API + embeddings = self.api_client.embed_texts(texts) + + # Convert to numpy array + embedding_dim = len(embeddings[0]) + embeddings_array = np.array(embeddings, dtype=np.float32) + + # Create FAISS index (using L2 distance) + self.index = faiss.IndexFlatL2(embedding_dim) + self.index.add(embeddings_array) + + # Store documents + self.documents = chunks + + # Save to disk + self._save_index() + + logger.info(f"Index built successfully with {self.index.ntotal} vectors") + + except Exception as e: + logger.error(f"Error building index: {str(e)}") + raise + + def _save_index(self): + """Save FAISS index and documents to disk""" + try: + # Save FAISS index + faiss.write_index(self.index, os.path.join(self.vector_store_path, "index.faiss")) + + # Save documents as a simple format (JSON would be better but keeping it simple) + import pickle + with open(os.path.join(self.vector_store_path, "documents.pkl"), "wb") as f: + pickle.dump(self.documents, f) + + logger.info(f"Index saved to {self.vector_store_path}") + + except Exception as e: + logger.error(f"Error saving index: {str(e)}") + raise + + def load_index(self) -> bool: + """ + Load FAISS index from disk 
+ + Returns: + True if loaded successfully, False otherwise + """ + try: + faiss_path = os.path.join(self.vector_store_path, "index.faiss") + docs_path = os.path.join(self.vector_store_path, "documents.pkl") + + if not os.path.exists(faiss_path) or not os.path.exists(docs_path): + logger.warning("No existing index found") + return False + + # Load FAISS index + self.index = faiss.read_index(faiss_path) + + # Load documents + import pickle + with open(docs_path, "rb") as f: + self.documents = pickle.load(f) + + logger.info(f"Loaded index with {self.index.ntotal} vectors") + return True + + except Exception as e: + logger.error(f"Error loading index: {str(e)}") + return False + + def search(self, query: str, k: int = 3) -> List[Dict[str, Any]]: + """ + Search for similar documents + + Args: + query: Search query + k: Number of results to return + + Returns: + List of matching documents with similarity scores + """ + try: + if not self.index or self.index.ntotal == 0: + logger.warning("No index loaded or index is empty") + return [] + + # Get query embedding + query_embedding = self.api_client.embed_text(query) + query_vector = np.array([query_embedding], dtype=np.float32) + + # Search + distances, indices = self.index.search(query_vector, k) + + # Retrieve documents + results = [] + for i, idx in enumerate(indices[0]): + if idx < len(self.documents): + results.append({ + "document": self.documents[idx], + "score": float(distances[0][i]), + "similarity": 1.0 / (1.0 + float(distances[0][i])) # Convert distance to similarity + }) + + return results + + except Exception as e: + logger.error(f"Error searching: {str(e)}") + raise + + def delete_index(self) -> bool: + """ + Delete the vector store + + Returns: + True if deleted successfully + """ + try: + if os.path.exists(self.vector_store_path): + shutil.rmtree(self.vector_store_path) + logger.info("Vector store deleted") + self.index = None + self.documents = [] + return True + return False + + except Exception as e: + logger.error(f"Error deleting vector store: {str(e)}") + raise + + def get_status(self) -> Dict[str, Any]: + """ + Get the status of the vector store + + Returns: + Dictionary with status information + """ + return { + "index_exists": self.index is not None, + "num_documents": self.index.ntotal if self.index else 0, + "num_vectors": self.index.ntotal if self.index else 0, + "path": self.vector_store_path + } + + +# Global RAG service instance +_rag_service: Optional[RAGService] = None + + +def get_rag_service() -> RAGService: + """ + Get or create the global RAG service instance + + Returns: + RAGService instance + """ + global _rag_service + if _rag_service is None: + _rag_service = RAGService() + # Try to load existing index + _rag_service.load_index() + return _rag_service + diff --git a/MultiAgentQnA/docker-compose.yml b/MultiAgentQnA/docker-compose.yml new file mode 100644 index 0000000000..7a6d0455f9 --- /dev/null +++ b/MultiAgentQnA/docker-compose.yml @@ -0,0 +1,40 @@ +services: + # Backend API (Python) + backend: + build: + context: ./api + dockerfile: Dockerfile + container_name: multiagent-qna-backend + ports: + - "5001:5001" + env_file: + - ./api/.env + volumes: + - ./api:/app + networks: + - app_network + extra_hosts: + - "${LOCAL_URL_ENDPOINT}:host-gateway" + restart: unless-stopped + + # Frontend UI (React) + frontend: + build: + context: ./ui + dockerfile: Dockerfile + container_name: multiagent-qna-frontend + ports: + - "3000:3000" + depends_on: + - backend + networks: + - app_network + restart: unless-stopped + 
+################################## +# 🔗 Shared Network +################################## +networks: + app_network: + driver: bridge + diff --git a/MultiAgentQnA/images/ui.png b/MultiAgentQnA/images/ui.png new file mode 100644 index 0000000000..fa9a77802a Binary files /dev/null and b/MultiAgentQnA/images/ui.png differ diff --git a/MultiAgentQnA/ui/Dockerfile b/MultiAgentQnA/ui/Dockerfile new file mode 100644 index 0000000000..4a92b94094 --- /dev/null +++ b/MultiAgentQnA/ui/Dockerfile @@ -0,0 +1,20 @@ +FROM node:18 + +# Set the working directory +WORKDIR /app + +# Copy package.json +COPY package.json ./ + +# Install dependencies (this will create package-lock.json if missing) +RUN npm install + +# Copy the rest of the application files +COPY . . + +# Expose the port the app runs on +EXPOSE 3000 + +# Command to run the application +CMD ["npm", "run", "dev"] + diff --git a/MultiAgentQnA/ui/index.html b/MultiAgentQnA/ui/index.html new file mode 100644 index 0000000000..7f3995c6dd --- /dev/null +++ b/MultiAgentQnA/ui/index.html @@ -0,0 +1,14 @@ + + +
[UI sources truncated: the diff also adds `MultiAgentQnA/ui/index.html` and the React pages (Chat, Settings), but their HTML/JSX markup was lost in extraction. Only scattered user-facing strings survive, e.g. "Ask anything - code questions, document queries, or general questions", "Code Agent - Programming, algorithms, debugging", "Press Enter to send • Upload PDFs for document questions • The AI will route your question to the best specialist", "Intelligent question answering with specialized agents", and "Configure the behavior and expertise of each agent in the multi-agent system. Changes will be applied to new conversations."]