# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It's a full-stack application with:

- Python backend (Flask-based API server)
- React/TypeScript frontend (built with UmiJS)
- Microservices architecture with Docker deployment
- Multiple data stores (MySQL, Elasticsearch/Infinity, Redis, MinIO)

## Architecture

### Backend (`/api/`)

- **Main Server**: `api/ragflow_server.py` - Flask application entry point
- **Apps**: Modular Flask blueprints in `api/apps/` for different functionalities:
  - `kb_app.py` - Knowledge base management
  - `dialog_app.py` - Chat/conversation handling
  - `document_app.py` - Document processing
  - `canvas_app.py` - Agent workflow canvas
  - `file_app.py` - File upload/management
- **Services**: Business logic in `api/db/services/`
- **Models**: Database models in `api/db/db_models.py`

### Core Processing (`/rag/`)

- **Document Processing**: `deepdoc/` - PDF parsing, OCR, layout analysis
- **LLM Integration**: `rag/llm/` - Model abstractions for chat, embedding, reranking
- **RAG Pipeline**: `rag/flow/` - Chunking, parsing, tokenization
- **Graph RAG**: `graphrag/` - Knowledge graph construction and querying

### Agent System (`/agent/`)

- **Components**: Modular workflow components (LLM, retrieval, categorize, etc.)
- **Templates**: Pre-built agent workflows in `agent/templates/`
- **Tools**: External API integrations (Tavily, Wikipedia, SQL execution, etc.)

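
As a quick orientation aid, the layout described above can be sanity-checked against a local checkout. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` and the `EXPECTED_PATHS` list are illustrative assumptions, and the list contains only paths named in the Architecture section.

```python
from pathlib import Path

# Paths named in the Architecture section; nothing beyond that is assumed.
EXPECTED_PATHS = [
    "api/ragflow_server.py",
    "api/apps",
    "api/db/services",
    "api/db/db_models.py",
    "rag/llm",
    "rag/flow",
    "deepdoc",
    "graphrag",
    "agent/templates",
]


def missing_paths(repo_root):
    """Return the documented paths that do not exist under repo_root.

    Hypothetical helper for illustration; it only checks for existence.
    """
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing from this checkout:", ", ".join(missing))
    else:
        print("All documented paths are present.")
```

Run from the repository root, it reports any documented path that is absent, which may indicate that the checkout is incomplete or that this file has drifted from the code.
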
### Frontend (`/web/`)

- React/TypeScript with UmiJS framework
- Ant Design + shadcn/ui components
- State management with Zustand
- Tailwind CSS for styling

## Common Development Commands

### Backend Development

```bash
# Install Python dependencies
uv sync --python 3.10 --all-extras
uv run download_deps.py
pre-commit install

# Start dependent services
docker compose -f docker/docker-compose-base.yml up -d

# Run backend (requires services to be running)
source .venv/bin/activate
export PYTHONPATH=$(pwd)
bash docker/launch_backend_service.sh

# Run tests
uv run pytest

# Linting
ruff check
ruff format
```

### Frontend Development

```bash
cd web
npm install
npm run dev        # Development server
npm run build      # Production build
npm run lint       # ESLint
npm run test       # Jest tests
```

### Docker Development

```bash
# Full stack with Docker
cd docker
docker compose -f docker-compose.yml up -d

# Check server status
docker logs -f ragflow-server

# Rebuild images (run from the repository root, not from docker/)
docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly .
```

## Key Configuration Files

- `docker/.env` - Environment variables for Docker deployment
- `docker/service_conf.yaml.template` - Backend service configuration
- `pyproject.toml` - Python dependencies and project configuration
- `web/package.json` - Frontend dependencies and scripts

## Testing

- **Python**: pytest with markers (p1/p2/p3 priority levels)
- **Frontend**: Jest with React Testing Library
- **API Tests**: HTTP API and SDK tests in `test/` and `sdk/python/test/`

## Database Engines

RAGFlow supports switching between Elasticsearch (default) and Infinity:

- Set `DOC_ENGINE=infinity` in `docker/.env` to use Infinity
- Requires container restart: `docker compose down -v && docker compose up -d`. Note that `-v` also removes the containers' volumes, so existing data is cleared.

## Development Environment Requirements

- Python 3.10-3.12
- Node.js >=18.20.4
- Docker & Docker Compose
- uv package manager
- 16GB+ RAM, 50GB+ disk space
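
The version constraints above can be checked before setting up the project. This is a minimal sketch using only the standard library; the function names (`parse_version`, `python_ok`, `node_ok`, `missing_tools`) are hypothetical helpers introduced for illustration, and the tool list assumes the executables are named `docker`, `uv`, and `node` on `PATH`. It does not check RAM or disk space.

```python
import re
import shutil
import sys

MIN_NODE = (18, 20, 4)              # from "Node.js >=18.20.4"
PYTHON_RANGE = ((3, 10), (3, 12))   # from "Python 3.10-3.12"
REQUIRED_TOOLS = ["docker", "uv", "node"]  # assumed executable names


def parse_version(text):
    """Turn a string such as 'v18.20.4' into a tuple of up to three ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def python_ok(version_info=sys.version_info):
    """True if the major.minor version falls inside the documented range."""
    low, high = PYTHON_RANGE
    return low <= tuple(version_info[:2]) <= high


def node_ok(version_text):
    """True if a `node --version` style string meets the documented minimum."""
    return parse_version(version_text) >= MIN_NODE


def missing_tools(which=shutil.which):
    """Return the required executables that cannot be found on PATH."""
    return [tool for tool in REQUIRED_TOOLS if which(tool) is None]


if __name__ == "__main__":
    print("Python version in range:", python_ok())
    print("Tools missing from PATH:", missing_tools() or "none")
```

To check Node.js, pass the output of `node --version` to `node_ok`. Docker Compose is not checked separately here because it is typically invoked as a `docker` subcommand rather than as its own executable.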