mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-01-29 22:56:36 +08:00
## Summary

Implement a flexible sandbox provider system supporting both self-managed (Docker) and SaaS (Aliyun Code Interpreter) backends for secure code execution in agent workflows.

**Key Changes:**

- ✅ Aliyun Code Interpreter provider using official `agentrun-sdk>=0.0.16`
- ✅ Self-managed provider with gVisor (runsc) security
- ✅ Arguments parameter support for dynamic code execution
- ✅ Database-only configuration (removed fallback logic)
- ✅ Configuration scripts for quick setup

Issue #12479

## Features

### 🔌 Provider Abstraction Layer

**1. Self-Managed Provider** (`agent/sandbox/providers/self_managed.py`)

- Wraps existing executor_manager HTTP API
- gVisor (runsc) for secure container isolation
- Configurable pool size, timeout, retry logic
- Languages: Python, Node.js, JavaScript
- ⚠️ **Requires**: gVisor installation, Docker, base images

**2. Aliyun Code Interpreter** (`agent/sandbox/providers/aliyun_codeinterpreter.py`)

- SaaS integration using official agentrun-sdk
- Serverless microVM execution with auto-authentication
- Hard timeout: 30 seconds max
- Credentials: `AGENTRUN_ACCESS_KEY_ID`, `AGENTRUN_ACCESS_KEY_SECRET`, `AGENTRUN_ACCOUNT_ID`, `AGENTRUN_REGION`
- Automatically wraps code to call `main()` function

**3. E2B Provider** (`agent/sandbox/providers/e2b.py`)

- Placeholder for future integration

### ⚙️ Configuration System

- `conf/system_settings.json`: Default provider = `aliyun_codeinterpreter`
- `agent/sandbox/client.py`: Enforces database-only configuration
- Admin UI: `/admin/sandbox-settings`
- Configuration validation via `validate_config()` method
- Health checks for all providers

### 🎯 Key Capabilities

**Arguments Parameter Support:**

All providers support passing arguments to `main()` function:

```python
# User code
def main(name: str, count: int) -> dict:
    return {"message": f"Hello {name}!" * count}

# Executed with: arguments={"name": "World", "count": 3}
# Result: {"message": "Hello World!Hello World!Hello World!"}
```

**Self-Describing Providers:** Each provider implements `get_config_schema()` returning form configuration for Admin UI

**Error Handling:** Structured `ExecutionResult` with stdout, stderr, exit_code, execution_time

## Configuration Scripts

Two scripts for quick Aliyun sandbox setup:

**Shell Script (requires jq):**

```bash
source scripts/configure_aliyun_sandbox.sh
```

**Python Script (interactive):**

```bash
python3 scripts/configure_aliyun_sandbox.py
```

## Testing

```bash
# Unit tests
uv run pytest agent/sandbox/tests/test_providers.py -v

# Aliyun provider tests
uv run pytest agent/sandbox/tests/test_aliyun_codeinterpreter.py -v

# Integration tests (requires credentials)
uv run pytest agent/sandbox/tests/test_aliyun_codeinterpreter_integration.py -v

# Quick SDK validation
python3 agent/sandbox/tests/verify_sdk.py
```

**Test Coverage:**

- 30 unit tests for provider abstraction
- Provider-specific tests for Aliyun
- Integration tests with real API
- Security tests for executor_manager

## Documentation

- `docs/develop/sandbox_spec.md` - Complete architecture specification
- `agent/sandbox/tests/MIGRATION_GUIDE.md` - Migration from legacy sandbox
- `agent/sandbox/tests/QUICKSTART.md` - Quick start guide
- `agent/sandbox/tests/README.md` - Testing documentation

## Breaking Changes

⚠️ **Migration Required:**

1. **Directory Move**: `sandbox/` → `agent/sandbox/`
   - Update imports: `from sandbox.` → `from agent.sandbox.`
2. **Mandatory Configuration**:
   - SystemSettings must have `sandbox.provider_type` configured
   - Removed fallback default values
   - Configuration must exist in database (from `conf/system_settings.json`)
3. **Aliyun Credentials**:
   - Requires `AGENTRUN_*` environment variables (not `ALIYUN_*`)
   - `AGENTRUN_ACCOUNT_ID` is now required (Aliyun primary account ID)
4. **Self-Managed Provider**:
   - gVisor (runsc) must be installed for security
   - Install: `go install gvisor.dev/gvisor/runsc@latest`

## Database Schema Changes

```python
# SystemSettings.value: CharField → TextField
api/db/db_models.py: Changed for unlimited config length

# SystemSettingsService.get_by_name(): Fixed query precision
api/db/services/system_settings_service.py: startswith → exact match
```

## Files Changed

### Backend (Python)

- `agent/sandbox/providers/base.py` - SandboxProvider ABC interface
- `agent/sandbox/providers/manager.py` - ProviderManager
- `agent/sandbox/providers/self_managed.py` - Self-managed provider
- `agent/sandbox/providers/aliyun_codeinterpreter.py` - Aliyun provider
- `agent/sandbox/providers/e2b.py` - E2B provider (placeholder)
- `agent/sandbox/client.py` - Unified client (enforces DB-only config)
- `agent/tools/code_exec.py` - Updated to use provider system
- `admin/server/services.py` - SandboxMgr with registry & validation
- `admin/server/routes.py` - 5 sandbox API endpoints
- `conf/system_settings.json` - Default: aliyun_codeinterpreter
- `api/db/db_models.py` - TextField for SystemSettings.value
- `api/db/services/system_settings_service.py` - Exact match query

### Frontend (TypeScript/React)

- `web/src/pages/admin/sandbox-settings.tsx` - Settings UI
- `web/src/services/admin-service.ts` - Sandbox service functions
- `web/src/services/admin.service.d.ts` - Type definitions
- `web/src/utils/api.ts` - Sandbox API endpoints

### Documentation

- `docs/develop/sandbox_spec.md` - Architecture spec
- `agent/sandbox/tests/MIGRATION_GUIDE.md` - Migration guide
- `agent/sandbox/tests/QUICKSTART.md` - Quick start
- `agent/sandbox/tests/README.md` - Testing guide

### Configuration Scripts

- `scripts/configure_aliyun_sandbox.sh` - Shell script (jq)
- `scripts/configure_aliyun_sandbox.py` - Python script

### Tests

- `agent/sandbox/tests/test_providers.py` - 30 unit tests
- `agent/sandbox/tests/test_aliyun_codeinterpreter.py` - Provider tests
- `agent/sandbox/tests/test_aliyun_codeinterpreter_integration.py` - Integration tests
- `agent/sandbox/tests/verify_sdk.py` - SDK validation

## Architecture

```
Admin UI → Admin API → SandboxMgr → ProviderManager → [SelfManaged|Aliyun|E2B]
                           ↓
                     SystemSettings
```

## Usage

### 1. Configure Provider

**Via Admin UI:**

1. Navigate to `/admin/sandbox-settings`
2. Select provider (Aliyun Code Interpreter / Self-Managed)
3. Fill in configuration
4. Click "Test Connection" to verify
5. Click "Save" to apply

**Via Configuration Scripts:**

```bash
# Aliyun provider
export AGENTRUN_ACCESS_KEY_ID="xxx"
export AGENTRUN_ACCESS_KEY_SECRET="yyy"
export AGENTRUN_ACCOUNT_ID="zzz"
export AGENTRUN_REGION="cn-shanghai"
source scripts/configure_aliyun_sandbox.sh
```

### 2. Restart Service

```bash
cd docker
docker compose restart ragflow-server
```

### 3. Execute Code in Agent

```python
from agent.sandbox.client import execute_code

result = execute_code(
    code='def main(name: str) -> dict: return {"message": f"Hello {name}!"}',
    language="python",
    timeout=30,
    arguments={"name": "World"}
)

print(result.stdout)  # {"message": "Hello World!"}
```

## Troubleshooting

### "Container pool is busy" (Self-Managed)

- **Cause**: Pool exhausted (default: 1 container in `.env`)
- **Fix**: Increase `SANDBOX_EXECUTOR_MANAGER_POOL_SIZE` to 5+

### "Sandbox provider type not configured"

- **Cause**: Database missing configuration
- **Fix**: Run config script or set via Admin UI

### "gVisor not found"

- **Cause**: runsc not installed
- **Fix**: `go install gvisor.dev/gvisor/runsc@latest && sudo cp ~/go/bin/runsc /usr/local/bin/`

### Aliyun authentication errors

- **Cause**: Wrong environment variable names
- **Fix**: Use `AGENTRUN_*` prefix (not `ALIYUN_*`)

## Checklist

- [x] All tests passing (30 unit tests + integration tests)
- [x] Documentation updated (spec, migration guide, quickstart)
- [x] Type definitions added (TypeScript)
- [x] Admin UI implemented
- [x] Configuration validation
- [x] Health checks implemented
- [x] Error handling with structured results
- [x] Breaking changes documented
- [x] Configuration scripts created
- [x] gVisor requirements documented

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

348 lines
9.2 KiB
Markdown
# RAGFlow Sandbox

A secure, pluggable code execution backend for RAGFlow and beyond.

## 🔧 Features

- ✅ **Seamless RAGFlow Integration**: Out-of-the-box compatibility with the `code` component.
- 🔐 **High Security**: Leverages [gVisor](https://gvisor.dev/) for syscall-level sandboxing.
- 🔧 **Customizable Sandboxing**: Easily modify `seccomp` settings as needed.
- 🧩 **Pluggable Runtime Support**: Easily extend to support any programming language.
- ⚙️ **Developer Friendly**: Get started with a single command using the `Makefile`.

## 🏗 Architecture

<p align="center">
  <img src="asserts/code_executor_manager.svg" width="520" alt="Architecture Diagram">
</p>

## 🚀 Quick Start

### 📋 Prerequisites

#### Required

- Linux distro compatible with gVisor
- [gVisor](https://gvisor.dev/docs/user_guide/install/)
- Docker >= `25.0` (API 1.44+). The executor manager now bundles Docker CLI `29.1.0` to match newer daemons.
- Docker Compose >= `v2.26.1`, the same requirement as [RAGFlow](https://github.com/infiniflow/ragflow)
- [uv](https://docs.astral.sh/uv/) as package and project manager

#### Optional (Recommended)

- [GNU Make](https://www.gnu.org/software/make/) for simplified CLI management

---

> ⚠️ **New Docker CLI requirement**
>
> If you see `client version 1.43 is too old. Minimum supported API version is 1.44`, pull the latest `infiniflow/sandbox-executor-manager:latest` (rebuilt with Docker CLI `29.1.0`) or rebuild it in `./sandbox/executor_manager`. Older images shipped Docker 24.x, which cannot talk to newer Docker daemons.

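
The version error in the note above comes down to a numeric comparison of API versions. The following is a minimal sketch of that comparison, not part of the sandbox itself; the helper names are hypothetical, and `local_client_api_version()` assumes the `docker` CLI is on `PATH` and supports the `{{.Client.APIVersion}}` format field.

```python
import subprocess
from typing import Tuple


def parse_api_version(text: str) -> Tuple[int, int]:
    """Turn a Docker API version string such as '1.44' into (1, 44)."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def client_is_new_enough(client: str, minimum: str = "1.44") -> bool:
    """Compare numerically, so that '1.100' ranks above '1.44'."""
    return parse_api_version(client) >= parse_api_version(minimum)


def local_client_api_version() -> str:
    """Ask the local Docker CLI for its API version (assumes Docker is installed)."""
    result = subprocess.run(
        ["docker", "version", "--format", "{{.Client.APIVersion}}"],
        capture_output=True,
        text=True,
        check=False,  # the CLI may exit non-zero when the daemon is unreachable
    )
    return result.stdout.strip()
```

Calling `client_is_new_enough(local_client_api_version())` inside the executor manager container would indicate whether its bundled CLI meets the 1.44 minimum quoted in the error message.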
### 🐳 Build Docker Base Images

We use isolated base images for secure containerized execution:

```bash
# Build base images manually
docker build -t sandbox-base-python:latest ./sandbox_base_image/python
docker build -t sandbox-base-nodejs:latest ./sandbox_base_image/nodejs

# OR use Makefile
make build
```

Then, build the executor manager image:

```bash
docker build -t sandbox-executor-manager:latest ./executor_manager
```

---

### 📦 Running with RAGFlow

1. Ensure gVisor is correctly installed.
2. Configure `docker/.env`:

   - Uncomment the sandbox-related variables.
   - Enable the sandbox profile at the bottom of the file.
3. Add the following line to `/etc/hosts` as recommended:

   ```text
   127.0.0.1 sandbox-executor-manager
   ```

4. Start the RAGFlow service.

---

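
The `/etc/hosts` step above can be verified programmatically. This is an illustrative sketch only: `hosts_entry_for` is a hypothetical helper that parses hosts-file text and returns the address mapped to a given name, assuming the usual `address name [alias ...]` layout with `#` comments.

```python
from pathlib import Path
from typing import Optional


def hosts_entry_for(hosts_text: str, hostname: str) -> Optional[str]:
    """Return the address mapped to `hostname` in hosts-file text, if any."""
    for raw_line in hosts_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        address, *names = line.split()
        if hostname in names:
            return address
    return None


def check_local_hosts(hostname: str = "sandbox-executor-manager") -> Optional[str]:
    """Look the name up in the machine's own /etc/hosts file."""
    return hosts_entry_for(Path("/etc/hosts").read_text(), hostname)
```

A return value of `None` from `check_local_hosts()` suggests the entry from step 3 is still missing.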
### 🧭 Running Standalone

#### Manual Setup

1. Initialize environment:

   ```bash
   cp .env.example .env
   ```

2. Launch:

   ```bash
   docker compose -f docker-compose.yml up
   ```

3. Test:

   ```bash
   source .venv/bin/activate
   export PYTHONPATH=$(pwd)
   uv pip install -r executor_manager/requirements.txt
   uv run tests/sandbox_security_tests_full.py
   ```

#### With Make

```bash
make          # setup + build + launch + test
```

---

### 📈 Monitoring

```bash
docker logs -f sandbox-executor-manager  # Manual
make logs                                # With Make
```

---

### 🧰 Makefile Toolbox

| Command           | Description                                      |
|-------------------|--------------------------------------------------|
| `make`            | Setup, build, launch and test all at once        |
| `make setup`      | Initialize environment and install uv            |
| `make ensure_env` | Auto-create `.env` if missing                    |
| `make ensure_uv`  | Install `uv` package manager if missing          |
| `make build`      | Build all Docker base images                     |
| `make start`      | Start services with safe env loading and testing |
| `make stop`       | Gracefully stop all services                     |
| `make restart`    | Shortcut for `stop` + `start`                    |
| `make test`       | Run full test suite                              |
| `make logs`       | Stream container logs                            |
| `make clean`      | Stop and remove orphan containers and volumes    |

---

## 🔐 Security

The RAGFlow sandbox is designed to balance security and usability, offering solid protection without compromising developer experience.

### ✅ gVisor Isolation

At its core, we use [gVisor](https://gvisor.dev/docs/architecture_guide/security/), a user-space kernel, to isolate code execution from the host system. gVisor intercepts and restricts syscalls, offering robust protection against container escapes and privilege escalations.

### 🔒 Optional seccomp Support (Advanced)

For users who need **zero-trust-level syscall control**, we support an additional `seccomp` profile. This feature restricts containers to only a predefined set of system calls, as specified in `executor_manager/seccomp-profile-default.json`.

> ⚠️ This feature is **disabled by default** to maintain compatibility and usability. Enabling it may cause compatibility issues with some dependencies.

#### To enable seccomp

1. Edit your `.env` file:

   ```dotenv
   SANDBOX_ENABLE_SECCOMP=true
   ```

2. Customize allowed syscalls in:

   ```
   executor_manager/seccomp-profile-default.json
   ```

   This profile is passed to the container with:

   ```bash
   --security-opt seccomp=/app/seccomp-profile-default.json
   ```

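
Before customizing the profile, it can help to see which syscalls it currently allows. The sketch below assumes the file follows Docker's standard seccomp JSON layout (a `defaultAction` plus a `syscalls` list whose entries carry `names` and `action`); the actual contents of `seccomp-profile-default.json` are not shown here, and `allowed_syscalls` is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import Set


def allowed_syscalls(profile: dict) -> Set[str]:
    """Collect syscall names from every rule whose action is SCMP_ACT_ALLOW."""
    allowed: Set[str] = set()
    for rule in profile.get("syscalls", []):
        if rule.get("action") == "SCMP_ACT_ALLOW":
            allowed.update(rule.get("names", []))
    return allowed


def load_profile(path: str) -> dict:
    """Read a seccomp profile from disk, e.g. the path given in step 2."""
    return json.loads(Path(path).read_text())


# A tiny illustrative profile, not the project's real one.
EXAMPLE_PROFILE = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [
        {"names": ["read", "write", "exit_group"], "action": "SCMP_ACT_ALLOW"},
        {"names": ["ptrace"], "action": "SCMP_ACT_ERRNO"},
    ],
}
```

With a deny-by-default `defaultAction`, any syscall missing from the returned set is blocked, which is the usual source of the dependency incompatibilities mentioned in the warning above.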
### 🧠 Python Code AST Inspection

In addition to sandboxing, Python code is **statically analyzed via AST (Abstract Syntax Tree)** before execution. Potentially malicious code (e.g. file operations, subprocess calls, etc.) is rejected early, providing an extra layer of protection.

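
To make the idea of AST inspection concrete, here is a minimal sketch using Python's standard `ast` module. The blocklists and the `find_violations` helper are hypothetical and chosen only for illustration; the sandbox's real rules may be broader or different.

```python
import ast
from typing import List

# Hypothetical blocklists for illustration only.
BLOCKED_MODULES = {"os", "subprocess", "shutil"}
BLOCKED_CALLS = {"open", "exec", "eval"}


def find_violations(source: str) -> List[str]:
    """Walk the syntax tree and report blocked imports and direct calls."""
    violations: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.append(f"line {node.lineno}: import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                violations.append(f"line {node.lineno}: from {node.module} import")
        elif isinstance(node, ast.Call):
            # Only plain names such as open(...) are caught by this sketch.
            if isinstance(node.func, ast.Name) and node.func.id in BLOCKED_CALLS:
                violations.append(f"line {node.lineno}: call to {node.func.id}()")
    return violations
```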

---

This security model strikes a balance between **robust isolation** and **developer usability**. While `seccomp` can be highly restrictive, our default setup aims to keep things usable for most developers, with no obscure crashes or cryptic setup required.

## 📦 Add Extra Dependencies for Supported Languages

Currently, the following languages are officially supported:

| Language | Priority |
|----------|----------|
| Python   | High     |
| Node.js  | Medium   |

### 🐍 Python

To add Python dependencies, simply edit the following file:

```bash
sandbox_base_image/python/requirements.txt
```

Add any additional packages you need, one per line (just like a normal pip requirements file).

### 🟨 Node.js

To add Node.js dependencies:

1. Navigate to the Node.js base image directory:

   ```bash
   cd sandbox_base_image/nodejs
   ```

2. Use `npm` to install the desired packages. For example:

   ```bash
   npm install lodash
   ```

3. The dependencies will be saved to `package.json` and `package-lock.json`, and included in the Docker image when rebuilt.

---

## Usage

### 🐍 A Python example

```python
def main(arg1: str, arg2: str) -> str:
    return f"result: {arg1 + arg2}"
```

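
User code only defines `main`; the sandbox is responsible for calling it with the supplied arguments. The sketch below illustrates that calling convention outside any sandbox, assuming arguments are passed as keyword arguments and the result is JSON-serialized. `run_main` is a hypothetical helper, and the real executor's wrapper may differ.

```python
import json
from typing import Any, Dict

USER_CODE = '''
def main(arg1: str, arg2: str) -> str:
    return f"result: {arg1 + arg2}"
'''


def run_main(source: str, arguments: Dict[str, Any]) -> str:
    """Define the user's functions, call main(**arguments), return JSON text.

    This runs in the current process and offers NO isolation; it only
    demonstrates how the arguments reach main().
    """
    namespace: Dict[str, Any] = {}
    exec(source, namespace)
    result = namespace["main"](**arguments)
    return json.dumps(result)
```

For instance, `run_main(USER_CODE, {"arg1": "foo", "arg2": "bar"})` calls `main(arg1="foo", arg2="bar")` and serializes its return value.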
### 🟨 JavaScript examples

A simple sync function:

```javascript
function main({arg1, arg2}) {
  return arg1 + arg2;
}
```

An async function with axios:

```javascript
const axios = require('axios');

async function main() {
  try {
    const response = await axios.get('https://github.com/infiniflow/ragflow');
    return 'Body:' + response.data;
  } catch (error) {
    return 'Error:' + error.message;
  }
}
```

---

## 📋 FAQ

### ❓Sandbox Not Working?

Follow this checklist to troubleshoot:

- [ ] **Is your machine compatible with gVisor?**

  Ensure that your system supports gVisor. Refer to the [gVisor installation guide](https://gvisor.dev/docs/user_guide/install/).

- [ ] **Is gVisor properly installed?**

  **Common error:**

  `HTTPConnectionPool(host='sandbox-executor-manager', port=9385): Read timed out.`

  Cause: `runsc` is an unknown or invalid Docker runtime.

  **Fix:**

  - Install gVisor

  - Restart Docker

  - Test with:

    ```bash
    docker run --rm --runtime=runsc hello-world
    ```

- [ ] **Is `sandbox-executor-manager` mapped in `/etc/hosts`?**

  **Common error:**

  `HTTPConnectionPool(host='none', port=9385): Max retries exceeded.`

  **Fix:**

  Add the following entry to `/etc/hosts`:

  ```text
  127.0.0.1 es01 infinity mysql minio redis sandbox-executor-manager
  ```

- [ ] **Are you running the latest executor manager image?**

  **Common error:**

  `docker: Error response from daemon: client version 1.43 is too old. Minimum supported API version is 1.44`

  **Fix:**

  Pull the refreshed image that bundles Docker CLI `29.1.0`, or rebuild it in `./sandbox/executor_manager`:

  ```bash
  docker pull infiniflow/sandbox-executor-manager:latest
  # or
  docker build -t sandbox-executor-manager:latest ./sandbox/executor_manager
  ```

- [ ] **Have you enabled sandbox-related configurations in RAGFlow?**

  Double-check that all sandbox settings are correctly enabled in your RAGFlow configuration.

- [ ] **Have you pulled the required base images for the runners?**

  **Common error:**

  `HTTPConnectionPool(host='sandbox-executor-manager', port=9385): Read timed out.`

  Cause: no runner was started.

  **Fix:**

  Pull the necessary base images:

  ```bash
  docker pull infiniflow/sandbox-base-nodejs:latest
  docker pull infiniflow/sandbox-base-python:latest
  ```

- [ ] **Did you restart the service after making changes?**

  Any changes to configuration or environment require a full service restart to take effect.

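
Several checklist items above surface as `HTTPConnectionPool` errors on port 9385. A plain TCP probe can help separate "cannot reach the executor manager at all" from "reached it, but no runner answered in time". This is a minimal sketch using only the standard library; `can_connect` is a hypothetical helper and does not call any executor manager API.

```python
import socket


def can_connect(host: str, port: int = 9385, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers name resolution failures, refused connections and timeouts.
        return False
```

If `can_connect("sandbox-executor-manager")` returns `False`, the `/etc/hosts` mapping or the container itself is the likely culprit; if it returns `True` but requests still time out, the gVisor runtime and base image items are better places to look.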
### ❓Container pool is busy?

All available runners are currently in use, executing code. Try again shortly, or increase the pool size in the configuration (`SANDBOX_EXECUTOR_MANAGER_POOL_SIZE` in `.env`) to improve availability and reduce wait times.

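
On the caller's side, "try again shortly" can be automated with exponential backoff. The sketch below is an assumption-laden illustration: `PoolBusyError` is a hypothetical exception standing in for however your client reports the busy pool, and `retry_when_busy` is not part of the sandbox.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class PoolBusyError(Exception):
    """Hypothetical error raised when the sandbox reports a busy pool."""


def retry_when_busy(
    task: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call task(), waiting base_delay * 2**n seconds after each busy reply."""
    for attempt in range(attempts):
        try:
            return task()
        except PoolBusyError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error to the caller
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

Retrying only smooths over short bursts; if the pool is busy most of the time, raising the pool size is the more direct fix.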
## 🤝 Contribution

Contributions are welcome!