mirror of https://github.com/infiniflow/ragflow.git synced 2026-01-28 22:26:36 +08:00


Zhichang Yu fd11aca8e5 feat: Implement pluggable multi-provider sandbox architecture (#12820)

## Summary

Implement a flexible sandbox provider system supporting both
self-managed (Docker) and SaaS (Aliyun Code Interpreter) backends for
secure code execution in agent workflows.

**Key Changes:**
- ✅ Aliyun Code Interpreter provider using official
`agentrun-sdk>=0.0.16`
- ✅ Self-managed provider with gVisor (runsc) security
- ✅ Arguments parameter support for dynamic code execution
- ✅ Database-only configuration (removed fallback logic)
- ✅ Configuration scripts for quick setup

Issue #12479

## Features

### 🔌 Provider Abstraction Layer

**1. Self-Managed Provider** (`agent/sandbox/providers/self_managed.py`)
- Wraps existing executor_manager HTTP API
- gVisor (runsc) for secure container isolation
- Configurable pool size, timeout, retry logic
- Languages: Python, Node.js, JavaScript
- ⚠️ **Requires**: gVisor installation, Docker, base images

**2. Aliyun Code Interpreter**
(`agent/sandbox/providers/aliyun_codeinterpreter.py`)
- SaaS integration using official agentrun-sdk
- Serverless microVM execution with auto-authentication
- Hard timeout: 30 seconds max
- Credentials: `AGENTRUN_ACCESS_KEY_ID`, `AGENTRUN_ACCESS_KEY_SECRET`,
`AGENTRUN_ACCOUNT_ID`, `AGENTRUN_REGION`
- Automatically wraps code to call `main()` function
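
The automatic `main()` wrapping can be pictured roughly as follows. This is a minimal sketch, not the provider's actual implementation: `wrap_code` is a hypothetical helper, and printing the result as JSON on stdout is an assumption.

```python
import json


def wrap_code(user_code: str, arguments: dict) -> str:
    """Append a call to main(**arguments) to the user's code (hypothetical helper)."""
    # Embed the arguments as a JSON string literal so quoting is handled by repr().
    args_literal = repr(json.dumps(arguments))
    return (
        f"{user_code}\n\n"
        "import json as _json\n"
        f"_args = _json.loads({args_literal})\n"
        "_result = main(**_args)\n"
        "print(_json.dumps(_result))\n"
    )


user_code = 'def main(name: str) -> dict:\n    return {"message": f"Hello {name}!"}'
script = wrap_code(user_code, {"name": "World"})
```

The resulting `script` is what would be sent to the sandbox in this sketch; the user only ever writes `main()`.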

**3. E2B Provider** (`agent/sandbox/providers/e2b.py`)
- Placeholder for future integration

### ⚙️ Configuration System

- `conf/system_settings.json`: Default provider =
`aliyun_codeinterpreter`
- `agent/sandbox/client.py`: Enforces database-only configuration
- Admin UI: `/admin/sandbox-settings`
- Configuration validation via `validate_config()` method
- Health checks for all providers
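
As an illustration of what configuration validation might check, here is a minimal sketch for an Aliyun-style provider. The field names and the `(is_valid, error_message)` return shape are assumptions, not the actual `validate_config()` contract; only the 30-second upper bound comes from the provider description above.

```python
from typing import Optional, Tuple

# Assumed field names, loosely mirroring the AGENTRUN_* credentials.
REQUIRED_FIELDS = ("access_key_id", "access_key_secret", "account_id", "region")
MAX_TIMEOUT_SECONDS = 30  # hard limit stated for the Aliyun provider


def validate_config(config: dict) -> Tuple[bool, Optional[str]]:
    """Return (is_valid, error_message) for a provider configuration."""
    missing = [key for key in REQUIRED_FIELDS if not config.get(key)]
    if missing:
        return False, "Missing required fields: " + ", ".join(missing)
    timeout = config.get("timeout", MAX_TIMEOUT_SECONDS)
    if not isinstance(timeout, int) or not 1 <= timeout <= MAX_TIMEOUT_SECONDS:
        return False, f"timeout must be an integer between 1 and {MAX_TIMEOUT_SECONDS}"
    return True, None
```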

### 🎯 Key Capabilities

**Arguments Parameter Support:**
All providers support passing arguments to `main()` function:
```python
# User code
def main(name: str, count: int) -> dict:
    return {"message": f"Hello {name}!" * count}

# Executed with: arguments={"name": "World", "count": 3}
# Result: {"message": "Hello World!Hello World!Hello World!"}
```

**Self-Describing Providers:**
Each provider implements `get_config_schema()` returning form
configuration for Admin UI
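
A self-describing schema could look something like the sketch below. The field layout (`type`, `label`, `default`, `required`), the `apply_defaults` helper, and the example endpoint URL are all assumptions for illustration; the real `get_config_schema()` return format may differ.

```python
from typing import Any, Dict


def get_config_schema() -> Dict[str, Dict[str, Any]]:
    """Describe the configurable fields of a hypothetical self-managed provider."""
    return {
        "endpoint": {"type": "string", "label": "Executor manager URL", "required": True},
        "pool_size": {"type": "integer", "label": "Pool size", "default": 1},
        "timeout": {"type": "integer", "label": "Timeout (seconds)", "default": 30},
    }


def apply_defaults(schema: Dict[str, Dict[str, Any]], values: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in schema defaults for fields the admin left empty."""
    merged = dict(values)
    for name, field in schema.items():
        if name not in merged and "default" in field:
            merged[name] = field["default"]
    return merged


config = apply_defaults(get_config_schema(), {"endpoint": "http://sandbox-executor-manager:9385"})
```

Because the schema is plain data, a form renderer can build inputs from it without knowing anything provider-specific.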

**Error Handling:**
Structured `ExecutionResult` with stdout, stderr, exit_code,
execution_time
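
A structured result of this kind might be modeled as a small dataclass. The four field names come from the description above; the types, the seconds unit, and the `ok` convenience property are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ExecutionResult:
    stdout: str
    stderr: str
    exit_code: int
    execution_time: float  # assumed to be seconds

    @property
    def ok(self) -> bool:
        """Hypothetical helper: treat exit code 0 as success."""
        return self.exit_code == 0


result = ExecutionResult(stdout='{"message": "Hello World!"}', stderr="", exit_code=0, execution_time=0.12)
failed = ExecutionResult(stdout="", stderr="NameError: name 'x' is not defined", exit_code=1, execution_time=0.05)
```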

## Configuration Scripts

Two scripts for quick Aliyun sandbox setup:

**Shell Script (requires jq):**
```bash
source scripts/configure_aliyun_sandbox.sh
```

**Python Script (interactive):**
```bash
python3 scripts/configure_aliyun_sandbox.py
```

## Testing

```bash
# Unit tests
uv run pytest agent/sandbox/tests/test_providers.py -v

# Aliyun provider tests
uv run pytest agent/sandbox/tests/test_aliyun_codeinterpreter.py -v

# Integration tests (requires credentials)
uv run pytest agent/sandbox/tests/test_aliyun_codeinterpreter_integration.py -v

# Quick SDK validation
python3 agent/sandbox/tests/verify_sdk.py
```

**Test Coverage:**
- 30 unit tests for provider abstraction
- Provider-specific tests for Aliyun
- Integration tests with real API
- Security tests for executor_manager

## Documentation

- `docs/develop/sandbox_spec.md` - Complete architecture specification
- `agent/sandbox/tests/MIGRATION_GUIDE.md` - Migration from legacy
sandbox
- `agent/sandbox/tests/QUICKSTART.md` - Quick start guide
- `agent/sandbox/tests/README.md` - Testing documentation

## Breaking Changes

⚠️ **Migration Required:**

1. **Directory Move**: `sandbox/` → `agent/sandbox/`
   - Update imports: `from sandbox.` → `from agent.sandbox.`

2. **Mandatory Configuration**:
   - SystemSettings must have `sandbox.provider_type` configured
   - Removed fallback default values
   - Configuration must exist in database (from
     `conf/system_settings.json`)

3. **Aliyun Credentials**:
   - Requires `AGENTRUN_*` environment variables (not `ALIYUN_*`)
   - `AGENTRUN_ACCOUNT_ID` is now required (Aliyun primary account ID)

4. **Self-Managed Provider**:
   - gVisor (runsc) must be installed for security
   - Install: `go install gvisor.dev/gvisor/runsc@latest`
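
The effect of removing fallback defaults can be sketched as below. The `settings` dict stands in for rows loaded from SystemSettings, and `resolve_provider_type` and `SandboxConfigError` are hypothetical names; the point is only that a missing value now raises instead of silently choosing a provider.

```python
class SandboxConfigError(RuntimeError):
    """Raised when the sandbox configuration is missing (hypothetical class)."""


def resolve_provider_type(settings: dict) -> str:
    """Return the configured provider type, with no fallback default."""
    provider_type = settings.get("sandbox.provider_type")
    if not provider_type:
        raise SandboxConfigError("Sandbox provider type not configured")
    return provider_type
```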

## Database Schema Changes

```python
# api/db/db_models.py
# SystemSettings.value: CharField → TextField (allows unlimited config length)

# api/db/services/system_settings_service.py
# SystemSettingsService.get_by_name(): startswith → exact match (fixes query precision)
```
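
To see why the prefix match was imprecise, consider this in-memory sketch. It does not use the real ORM; the rows and the second setting name are made up for illustration.

```python
# Stand-in for SystemSettings rows as (name, value) pairs.
rows = [
    ("sandbox.provider_type", "self_managed"),
    ("sandbox.provider_type.legacy", "e2b"),  # hypothetical extra row
]


def get_by_prefix(name: str) -> list:
    """Old behaviour: any setting whose name starts with `name` matches."""
    return [value for key, value in rows if key.startswith(name)]


def get_by_exact(name: str) -> list:
    """New behaviour: only the setting with exactly this name matches."""
    return [value for key, value in rows if key == name]
```

With a prefix match, looking up `sandbox.provider_type` returns both rows, so the caller cannot tell which value is intended; the exact match returns only one.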

## Files Changed

### Backend (Python)
- `agent/sandbox/providers/base.py` - SandboxProvider ABC interface
- `agent/sandbox/providers/manager.py` - ProviderManager
- `agent/sandbox/providers/self_managed.py` - Self-managed provider
- `agent/sandbox/providers/aliyun_codeinterpreter.py` - Aliyun provider
- `agent/sandbox/providers/e2b.py` - E2B provider (placeholder)
- `agent/sandbox/client.py` - Unified client (enforces DB-only config)
- `agent/tools/code_exec.py` - Updated to use provider system
- `admin/server/services.py` - SandboxMgr with registry & validation
- `admin/server/routes.py` - 5 sandbox API endpoints
- `conf/system_settings.json` - Default: aliyun_codeinterpreter
- `api/db/db_models.py` - TextField for SystemSettings.value
- `api/db/services/system_settings_service.py` - Exact match query

### Frontend (TypeScript/React)
- `web/src/pages/admin/sandbox-settings.tsx` - Settings UI
- `web/src/services/admin-service.ts` - Sandbox service functions
- `web/src/services/admin.service.d.ts` - Type definitions
- `web/src/utils/api.ts` - Sandbox API endpoints

### Documentation
- `docs/develop/sandbox_spec.md` - Architecture spec
- `agent/sandbox/tests/MIGRATION_GUIDE.md` - Migration guide
- `agent/sandbox/tests/QUICKSTART.md` - Quick start
- `agent/sandbox/tests/README.md` - Testing guide

### Configuration Scripts
- `scripts/configure_aliyun_sandbox.sh` - Shell script (jq)
- `scripts/configure_aliyun_sandbox.py` - Python script

### Tests
- `agent/sandbox/tests/test_providers.py` - 30 unit tests
- `agent/sandbox/tests/test_aliyun_codeinterpreter.py` - Provider tests
- `agent/sandbox/tests/test_aliyun_codeinterpreter_integration.py` -
Integration tests
- `agent/sandbox/tests/verify_sdk.py` - SDK validation

## Architecture

```
Admin UI → Admin API → SandboxMgr → ProviderManager → [SelfManaged|Aliyun|E2B]
                                      ↓
                                  SystemSettings
```
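
The `ProviderManager → [providers]` part of the diagram can be sketched as a simple registry. `SandboxProvider` and `ProviderManager` are names from this change, but the method names (`execute_code`, `register`, `create`) and the `FakeProvider` class are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class SandboxProvider(ABC):
    """Minimal stand-in for the provider interface; the method name is assumed."""

    @abstractmethod
    def execute_code(self, code: str, language: str) -> str:
        ...


class FakeProvider(SandboxProvider):
    """Toy provider that only reports what it was asked to run."""

    def execute_code(self, code: str, language: str) -> str:
        return f"ran {len(code)} chars of {language}"


class ProviderManager:
    """Maps a provider_type string to a provider class and instantiates it."""

    def __init__(self) -> None:
        self._registry: Dict[str, Type[SandboxProvider]] = {}

    def register(self, provider_type: str, provider_cls: Type[SandboxProvider]) -> None:
        self._registry[provider_type] = provider_cls

    def create(self, provider_type: str) -> SandboxProvider:
        if provider_type not in self._registry:
            raise KeyError(f"Unknown sandbox provider: {provider_type}")
        return self._registry[provider_type]()


manager = ProviderManager()
manager.register("fake", FakeProvider)
provider = manager.create("fake")
```

Callers depend only on the abstract interface, so adding a backend means registering one more class.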

## Usage

### 1. Configure Provider

**Via Admin UI:**
1. Navigate to `/admin/sandbox-settings`
2. Select provider (Aliyun Code Interpreter / Self-Managed)
3. Fill in configuration
4. Click "Test Connection" to verify
5. Click "Save" to apply

**Via Configuration Scripts:**
```bash
# Aliyun provider
export AGENTRUN_ACCESS_KEY_ID="xxx"
export AGENTRUN_ACCESS_KEY_SECRET="yyy"
export AGENTRUN_ACCOUNT_ID="zzz"
export AGENTRUN_REGION="cn-shanghai"
source scripts/configure_aliyun_sandbox.sh
```

### 2. Restart Service

```bash
cd docker
docker compose restart ragflow-server
```

### 3. Execute Code in Agent

```python
from agent.sandbox.client import execute_code

result = execute_code(
    code='def main(name: str) -> dict: return {"message": f"Hello {name}!"}',
    language="python",
    timeout=30,
    arguments={"name": "World"}
)

print(result.stdout)  # {"message": "Hello World!"}
```

## Troubleshooting

### "Container pool is busy" (Self-Managed)
- **Cause**: Pool exhausted (default: 1 container in `.env`)
- **Fix**: Increase `SANDBOX_EXECUTOR_MANAGER_POOL_SIZE` to 5+

### "Sandbox provider type not configured"
- **Cause**: Database missing configuration
- **Fix**: Run config script or set via Admin UI

### "gVisor not found"
- **Cause**: runsc not installed
- **Fix**: `go install gvisor.dev/gvisor/runsc@latest && sudo cp
~/go/bin/runsc /usr/local/bin/`

### Aliyun authentication errors
- **Cause**: Wrong environment variable names
- **Fix**: Use `AGENTRUN_*` prefix (not `ALIYUN_*`)

## Checklist

- [x] All tests passing (30 unit tests + integration tests)
- [x] Documentation updated (spec, migration guide, quickstart)
- [x] Type definitions added (TypeScript)
- [x] Admin UI implemented
- [x] Configuration validation
- [x] Health checks implemented
- [x] Error handling with structured results
- [x] Breaking changes documented
- [x] Configuration scripts created
- [x] gVisor requirements documented

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-01-28 13:28:21 +08:00

Directory contents (each entry last changed by #12820, 2026-01-28 13:28:21 +08:00):

- asserts
- executor_manager
- providers
- sandbox_base_image
- scripts
- tests
- .env.example
- client.py
- docker-compose.yml
- Makefile
- pyproject.toml
- README.md
- uv.lock

README.md

RAGFlow Sandbox

A secure, pluggable code execution backend for RAGFlow and beyond.

🔧 Features

- ✅ Seamless RAGFlow Integration: Out-of-the-box compatibility with the code component.
- 🔐 High Security: Leverages gVisor for syscall-level sandboxing.
- 🔧 Customizable Sandboxing: Easily modify seccomp settings as needed.
- 🧩 Pluggable Runtime Support: Easily extend to support any programming language.
- ⚙️ Developer Friendly: Get started with a single command using Makefile.

🏗 Architecture

🚀 Quick Start

📋 Prerequisites

Required

- Linux distro compatible with gVisor
- gVisor
- Docker >= 25.0 (API 1.44+). The executor manager now bundles Docker CLI 29.1.0 to match newer daemons.
- Docker Compose >= v2.26.1, like RAGFlow
- uv as package and project manager

Optional (Recommended)

GNU Make for simplified CLI management

⚠️ New Docker CLI requirement

If you see `client version 1.43 is too old. Minimum supported API version is 1.44`, pull the latest `infiniflow/sandbox-executor-manager:latest` (rebuilt with Docker CLI 29.1.0) or rebuild it in `./sandbox/executor_manager`. Older images shipped Docker 24.x, which cannot talk to newer Docker daemons.

🐳 Build Docker Base Images

We use isolated base images for secure containerized execution:

```bash
# Build base images manually
docker build -t sandbox-base-python:latest ./sandbox_base_image/python
docker build -t sandbox-base-nodejs:latest ./sandbox_base_image/nodejs

# OR use Makefile
make build
```

Then, build the executor manager image:

```bash
docker build -t sandbox-executor-manager:latest ./executor_manager
```

📦 Running with RAGFlow

1. Ensure gVisor is correctly installed.
2. Configure your `.env` in `docker/.env`:
   - Uncomment sandbox-related variables.
   - Enable sandbox profile at the bottom.
3. Add the following line to `/etc/hosts` as recommended:
   ```
   127.0.0.1 sandbox-executor-manager
   ```
4. Start RAGFlow service.

🧭 Running Standalone

Manual Setup

Initialize environment:
```
cp .env.example .env
```

Launch:

```bash
docker compose -f docker-compose.yml up
```

Test:

```bash
source .venv/bin/activate
export PYTHONPATH=$(pwd)
uv pip install -r executor_manager/requirements.txt
uv run tests/sandbox_security_tests_full.py
```

With Make

```bash
make          # setup + build + launch + test
```

📈 Monitoring

```bash
docker logs -f sandbox-executor-manager  # Manual
make logs                                 # With Make
```

🧰 Makefile Toolbox

| Command | Description |
|---|---|
| `make` | Setup, build, launch and test all at once |
| `make setup` | Initialize environment and install uv |
| `make ensure_env` | Auto-create `.env` if missing |
| `make ensure_uv` | Install `uv` package manager if missing |
| `make build` | Build all Docker base images |
| `make start` | Start services with safe env loading and testing |
| `make stop` | Gracefully stop all services |
| `make restart` | Shortcut for `stop` + `start` |
| `make test` | Run full test suite |
| `make logs` | Stream container logs |
| `make clean` | Stop and remove orphan containers and volumes |

🔐 Security

The RAGFlow sandbox is designed to balance security and usability, offering solid protection without compromising developer experience.

✅ gVisor Isolation

At its core, we use gVisor, a user-space kernel, to isolate code execution from the host system. gVisor intercepts and restricts syscalls, offering robust protection against container escapes and privilege escalations.

🔒 Optional seccomp Support (Advanced)

For users who need zero-trust-level syscall control, we support an additional seccomp profile. This feature restricts containers to only a predefined set of system calls, as specified in executor_manager/seccomp-profile-default.json.

⚠️ This feature is disabled by default to maintain compatibility and usability. Enabling it may cause compatibility issues with some dependencies.

To enable seccomp

Edit your .env file:
```
SANDBOX_ENABLE_SECCOMP=true
```

Customize allowed syscalls in `executor_manager/seccomp-profile-default.json`.

This profile is passed to the container with:

```
--security-opt seccomp=/app/seccomp-profile-default.json
```

🧠 Python Code AST Inspection

In addition to sandboxing, Python code is statically analyzed via AST (Abstract Syntax Tree) before execution. Potentially malicious code (e.g. file operations, subprocess calls, etc.) is rejected early, providing an extra layer of protection.
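
As a rough idea of how such a static check can work, the sketch below walks the AST with Python's standard `ast` module and reports blocked imports and calls. It is not the project's actual analyzer: the block lists and the `find_violations` helper are assumptions chosen for illustration.

```python
import ast
from typing import List

# Illustrative block lists; the real rule set is not shown in this document.
BLOCKED_MODULES = {"os", "subprocess", "shutil"}
BLOCKED_CALLS = {"open", "exec", "eval"}


def find_violations(source: str) -> List[str]:
    """Return a description of each blocked import or call found in `source`."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.append(f"import of {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                violations.append(f"import from {node.module}")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call to {node.func.id}()")
    return violations
```

A check like this runs before any code reaches a container, so obviously unsafe submissions are rejected without spending a runner. It is easy to evade by itself (for example via `__import__`), which is why it only complements the gVisor isolation.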

This security model strikes a balance between robust isolation and developer usability. While seccomp can be highly restrictive, our default setup aims to keep things usable for most developers — no obscure crashes or cryptic setup required.

📦 Add Extra Dependencies for Supported Languages

Currently, the following languages are officially supported:

| Language | Priority |
|---|---|
| Python | High |
| Node.js | Medium |

🐍 Python

To add Python dependencies, simply edit `sandbox_base_image/python/requirements.txt`.

Add any additional packages you need, one per line (just like a normal pip requirements file).

🟨 Node.js

To add Node.js dependencies:

1. Navigate to the Node.js base image directory:
   ```
   cd sandbox_base_image/nodejs
   ```
2. Use npm to install the desired packages. For example:
   ```
   npm install lodash
   ```

The dependencies will be saved to `package.json` and `package-lock.json`, and included in the Docker image when rebuilt.

Usage

🐍 A Python example

```python
def main(arg1: str, arg2: str) -> str:
    return f"result: {arg1 + arg2}"
```

🟨 JavaScript examples

A simple sync function

```javascript
function main({arg1, arg2}) {
  return arg1+arg2
}
```

Async function with axios

```javascript
const axios = require('axios');
async function main() {
  try {
    const response = await axios.get('https://github.com/infiniflow/ragflow');
    return 'Body:' + response.data;
  } catch (error) {
    return 'Error:' + error.message;
  }
}
```

📋 FAQ

❓Sandbox Not Working?

Follow this checklist to troubleshoot:

1. Is your machine compatible with gVisor?

   Ensure that your system supports gVisor. Refer to the gVisor installation guide.

2. Is gVisor properly installed?

   Common error: `HTTPConnectionPool(host='sandbox-executor-manager', port=9385): Read timed out.`

   Cause: `runsc` is an unknown or invalid Docker runtime.

   Fix:
   - Install gVisor
   - Restart Docker
   - Test with:
     ```
     docker run --rm --runtime=runsc hello-world
     ```

3. Is `sandbox-executor-manager` mapped in `/etc/hosts`?

   Common error: `HTTPConnectionPool(host='none', port=9385): Max retries exceeded.`

   Fix: Add the following entry to `/etc/hosts`:
   ```
   127.0.0.1 es01 infinity mysql minio redis sandbox-executor-manager
   ```

4. Are you running the latest executor manager image?

   Common error: `docker: Error response from daemon: client version 1.43 is too old. Minimum supported API version is 1.44`

   Fix: Pull the refreshed image that bundles Docker CLI 29.1.0, or rebuild it in `./sandbox/executor_manager`:
   ```
   docker pull infiniflow/sandbox-executor-manager:latest
   # or
   docker build -t sandbox-executor-manager:latest ./sandbox/executor_manager
   ```

5. Have you enabled sandbox-related configurations in RAGFlow?

   Double-check that all sandbox settings are correctly enabled in your RAGFlow configuration.

6. Have you pulled the required base images for the runners?

   Common error: `HTTPConnectionPool(host='sandbox-executor-manager', port=9385): Read timed out.`

   Cause: no runner was started.

   Fix: Pull the necessary base images:
   ```
   docker pull infiniflow/sandbox-base-nodejs:latest
   docker pull infiniflow/sandbox-base-python:latest
   ```

7. Did you restart the service after making changes?

   Any changes to configuration or environment require a full service restart to take effect.

❓Container pool is busy?

All available runners are currently in use, executing tasks/running code. Please try again shortly, or consider increasing the pool size in the configuration to improve availability and reduce wait times.
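
The behaviour described above can be modeled with a fixed-size pool that fails fast when no runner is idle. This is a simplified sketch, not the executor manager's code: `ContainerPool` and `PoolBusyError` are hypothetical names, and the real service may queue or retry instead.

```python
import queue
from typing import Iterable


class PoolBusyError(RuntimeError):
    """Raised when every runner is in use (hypothetical class)."""


class ContainerPool:
    """Hands out idle runner names and refuses when none are left."""

    def __init__(self, runner_names: Iterable[str]) -> None:
        self._idle: "queue.Queue[str]" = queue.Queue()
        for name in runner_names:
            self._idle.put(name)

    def acquire(self) -> str:
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            raise PoolBusyError("Container pool is busy") from None

    def release(self, name: str) -> None:
        self._idle.put(name)


pool = ContainerPool(["runner-0"])  # a pool of size 1
first = pool.acquire()
```

With a pool of one, a second concurrent request fails until the first runner is released, which is why a larger pool size reduces these errors under load.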

🤝 Contribution

Contributions are welcome!