ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-02-03 17:15:08 +08:00

Author	SHA1	Message	Date
Yesid Cano Castro	deeae8dba4	feat(connector): add Seafile as data source (#12945 ) ### What problem does this PR solve? This PR adds Seafile as a new data source connector for RAGFlow. [Seafile](https://www.seafile.com/) is an open-source, self-hosted file sync and share platform widely used by enterprises, universities, and organizations that require data sovereignty and privacy. Users who store documents in Seafile currently have no way to index and search their content through RAGFlow. This connector enables RAGFlow users to: - Connect to self-hosted Seafile servers via API token - Index documents from personal and shared libraries - Support incremental polling for updated files - Seamlessly integrate Seafile-stored documents into their RAG pipelines ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### Changes included - `SeaFileConnector` implementing `LoadConnector` and `PollConnector` interfaces - Support for API token - Recursive file traversal across libraries - Time-based filtering for incremental updates - Seafile logo (sourced from Simple Icons, CC0) - Connector configuration and registration ### Testing - Tested against self-hosted Seafile Community Edition - Verified authentication (token) - Verified document ingestion from personal and shared libraries - Verified incremental polling with time filters	2026-02-03 13:42:05 +08:00
chanx	25bb2e1616	Fix:Optimize metadata and optimize the empty state style of the agent page. (#12960 ) ### What problem does this PR solve? Fix:Optimize metadata and optimize the empty state style of the agent page. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-02-03 11:43:44 +08:00
Jimmy Ben Klieve	fafaaa26c3	feat: memory status (#12959 ) ### What problem does this PR solve? Add memory status indicator and detail message dialog ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-02-03 11:16:18 +08:00
Philipp Heyken Soares	ad06c042c4	Support operator constraints in semi-automatic metadata filtering (#12956 ) ### What problem does this PR solve? #### Summary This PR enhances the Semi-automatic metadata filtering mode by allowing users to explicitly pre-define operators (e.g., contains, =, >, etc.) for selected metadata keys. While the LLM still dynamically extracts the filter value from the user's query, it is now strictly constrained to use the operator specified in the UI configuration. Using this feature is optional. By default the operator selection is set to "automatic" resulting in the LLM choosing the operator (as presently). #### Rationale & Use Case This enhancement was driven by a concrete challenge I encountered while working with technical documentation. In my specific use case, I was trying to filter for software versions within a technical manual. In this dataset, a single document chunk often applies to multiple software versions. These versions are stored as a combined string within the metadata for each chunk. When using the standard semi-automatic filter, the LLM would inconsistently choose between the contains and equals operators. When it chose equals, it would exclude every chunk that applied to more than one version, even if the version I was searching for was clearly included in that metadata string. This led to incomplete and frustrating retrieval results. By extending the semi-automatic filter to allow pre-defining the operator for a specific key, I was able to force the use of contains for the version field. This change immediately led to significantly improved and more reliable results in my case. I believe this functionality will be equally useful for others dealing with "tagged" or multi-value metadata where the relationship between the query and the field is known, but the specific value needs to remain dynamic. 
#### Key Changes ##### Backend & Core Logic - `common/metadata_utils.py`: Updated apply_meta_data_filter to support a mixed data structure for semi_auto (handling both legacy string arrays and the new object-based format {"key": "...", "op": "..."}). - `rag/prompts/generator.py`: Extended gen_meta_filter to accept and pass operator constraints to the LLM. - `rag/prompts/meta_filter.md`: Updated the system prompt to instruct the LLM to strictly respect provided operator constraints. ##### Frontend - `web/src/components/metadata-filter/metadata-semi-auto-fields.tsx`: Enhanced the UI to include an operator dropdown for each selected metadata key, utilizing existing operator constants. - `web/src/components/metadata-filter/index.tsx`: Updated the validation schema to accommodate the new state structure. #### Test Plan - Backward Compatibility: Verified that existing semi-auto filters stored as simple strings still function correctly. - Prompt Verification: Confirmed that constraints are correctly rendered in the LLM system prompt when specified. - Added unit tests as `test/unit_test/common/test_apply_semi_auto_meta_data_filter.py` - Manual End-to-End: - Configured a "Semi-automatic" filter for a "Version" key with the "contains" operator. - Asked a version-specific query. - Result <img width="1173" height="704" alt="Screenshot 2026-02-02 145359" src="https://github.com/user-attachments/assets/510a6a61-a231-4dc2-a7fe-cdfc07219132" /> ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): --------- Co-authored-by: Philipp Heyken Soares <philipp.heyken-soares@am.ai>	2026-02-03 11:11:34 +08:00
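The mixed `semi_auto` structure described in this commit (legacy string entries alongside `{"key": "...", "op": "..."}` objects) can be sketched as a small normalization step. This is an illustration only, not the code in `common/metadata_utils.py`: the function name `normalize_semi_auto` is hypothetical, and treating a missing, empty, or `"automatic"` operator as "let the LLM choose" is an assumption based on the description above.

```python
from typing import Dict, List, Optional, Union

# Hypothetical type alias: a legacy entry is a bare key string,
# a new-style entry is an object such as {"key": "version", "op": "contains"}.
SemiAutoEntry = Union[str, Dict[str, str]]


def normalize_semi_auto(entries: List[SemiAutoEntry]) -> Dict[str, Optional[str]]:
    """Map each metadata key to its constrained operator.

    None means no constraint, i.e. the LLM picks the operator itself.
    """
    constraints: Dict[str, Optional[str]] = {}
    for entry in entries:
        if isinstance(entry, str):
            # Legacy format: key only, operator stays automatic.
            constraints[entry] = None
        elif isinstance(entry, dict) and "key" in entry:
            op = entry.get("op")
            # Assumption: "automatic" (or nothing) means unconstrained.
            constraints[entry["key"]] = None if op in (None, "", "automatic") else op
        else:
            raise ValueError(f"Unsupported semi_auto entry: {entry!r}")
    return constraints


config = ["author", {"key": "version", "op": "contains"}, {"key": "year", "op": "automatic"}]
normalized = normalize_semi_auto(config)
print(normalized)
```

Normalizing to a single shape up front keeps the prompt-building code free of format checks, which is one way to preserve backward compatibility with filters stored as plain strings.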
zhanglei	7cbe8b5b53	feat: Add a custom header to the SDK for chatting with the agent. (#12430 ) ### What problem does this PR solve? As title. ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Liu An <asiro@qq.com>	2026-02-03 11:01:18 +08:00
Josh	aa8d0a36f1	Update default Docling version to 2.71.0 to resolve table parsing issues (#12952 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-02-03 10:24:51 +08:00
Stephen Hu	6c9ca45b30	Refactor: improve close for presentation (#12957 ) ### What problem does this PR solve? improve close for presentation ### Type of change - [x] Refactoring	2026-02-03 10:24:27 +08:00
Paul Y Hui	f028f74883	Fixed 12787 with syntax error in generated MySql json path expression (#12929 ) ### What problem does this PR solve? Fixed 12787 with syntax error in generated MySql json path expression https://github.com/infiniflow/ragflow/issues/12787 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) #### What was fixed: - Changed line 237 in ob_conn.py from value_str = get_value_str(value) if value else "" to value_str = get_value_str(value) - This fixes the bug where falsy but valid values (0, False, "", [], {}) were being converted to empty strings, causing invalid SQL syntax #### What was tested: - Comprehensive unit tests covering all edge cases - Regression tests specifically for the bug scenario --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-02-03 09:50:14 +08:00
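The truthiness bug fixed in this commit can be reproduced in isolation. The sketch below is not the code from `ob_conn.py`: `get_value_str` here is a hypothetical stand-in that serializes with `json.dumps`, chosen only to show how `if value else ""` swallows falsy but valid values.

```python
import json


def get_value_str(value):
    # Hypothetical stand-in for the real helper: render a value as a JSON literal.
    return json.dumps(value)


def value_str_buggy(value):
    # Old pattern: any falsy value (0, False, "", [], {}) becomes an empty string.
    return get_value_str(value) if value else ""


def value_str_fixed(value):
    # Fixed pattern: always serialize, regardless of truthiness.
    return get_value_str(value)


for v in (0, False, "", [], {}):
    print(repr(v), "buggy:", repr(value_str_buggy(v)), "fixed:", repr(value_str_fixed(v)))
```

With the buggy variant, a filter on the value `0` would interpolate an empty string into the generated expression, which is how the invalid SQL described above could arise.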
writinwaters	59d7f3f456	Sandbox (#12951 ) ### What problem does this PR solve? Proofread the Sandbox Specification document and moved it to a dedicated folder outside of the original docs. ### Type of change - [x] Documentation Update	2026-02-03 09:43:41 +08:00
Magicbook1108	7be3dacdaa	Fix: custom delimiter in docx (#12946 ) ### What problem does this PR solve? Fix: custom delimiter in docx ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-02-03 09:43:18 +08:00
eviaaaaa	2e5a18602b	refactor: optimize agent list payload and improve multimodal detection logic (#12942 ) ## Description This PR focuses on API performance optimization and refining the model capability detection logic in the Agent/Canvas module. ### 1. Performance Optimization (Backend) - Changes: Removed `cls.model.dsl` from query fields in `UserCanvasService.get_by_tenant_ids`. - Reasoning: The `dsl` object is large and unnecessary for the Agent list view. Excluding it reduces the payload size of the `/v1/canvas/list` API, leading to faster serialization and reduced network latency. - Consistency: Full DSL data remains accessible via the individual `/v1/canvas/get/<id>` endpoint used in the detail view. ### 2. Multimodal Detection Refinement (Frontend) - Changes: Replaced `model_type === LlmModelType.Image2text` with `tags?.includes('IMAGE2TEXT')`. - Reasoning: In RAGFlow, `model_type` defines the primary role of a model (e.g., `chat`). However, many advanced Chat models are also vision-capable. Since `model_type` is a single-value field, it cannot represent these multiple capabilities. - Solution: Utilizing the `tags` field (which supports multiple attributes) to check for `IMAGE2TEXT` ensures that models like `gpt-5.2-pro` correctly display multimodal input options. ## Type of Change - [x] Bug fix (logic correction for multimodal detection) - [x] Optimization (performance improvement for list API) ## Main Changes - `api/db/services/canvas_service.py`: Optimized DB query by excluding heavy DSL fields. - `web/src/pages/agent/form/agent-form/index.tsx`: Enhanced capability detection using the tags system. ## Verification - [x] Verified Agent list loads faster with reduced response payload. - [x] Confirmed that `chat` models with the `IMAGE2TEXT` tag now correctly enable the multimodal input UI. nightly	2026-02-02 17:35:54 +08:00
Magicbook1108	0121866ce4	Fix: pdf page_number error (#12938 ) ### What problem does this PR solve? Fix: pdf page_number error #12937 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-02-02 17:35:00 +08:00
方程	8fc3986f70	fix(llm): Fix Gitee AI links and update the reranker model configuration (#12916 ) ### What problem does this PR solve? Fix Gitee AI links and update the reranker model configuration ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: franco <1787003204@q.comq>	2026-02-02 13:43:16 +08:00
eviaaaaa	1a2d69edc4	feat: Implement legacy .ppt parsing via Tika (alternative to Aspose) (#12932 ) ## What problem does this PR solve? This PR implements parsing support for legacy PowerPoint files (`.ppt`, 97-2003 format). Currently, parsing these files fails because `python-pptx` natively lacks support for the legacy OLE2 binary format. ## Context: I originally used `aspose-slides` for this purpose. However, since `aspose-slides` is no longer a project dependency, I implemented a fallback mechanism using the existing `tika-server` to ensure compatibility and stability. ## Key Changes: - Fallback Logic: Modified `rag/app/presentation.py` to catch `python-pptx` failures and automatically fall back to Tika parsing. - No New Dependencies: Utilizes the `tika` service that is already part of the RAGFlow stack. - Note: Since Tika focuses on text extraction, this implementation extracts text content but does not generate slide thumbnails. ## 🧪 Test / Verification Results ### 1. Before (The Issue) I have verified the fix using a legacy `.ppt` file (`math(1).ppt`, ~8MB). <img width="963" height="970" alt="image" src="https://github.com/user-attachments/assets/468c4ba8-f90b-4d7b-b969-9c5f5e42c474" /> ### 2. After (The Fix) With this PR, the system detects the failure in python-pptx and successfully falls back to Tika. The text is extracted correctly. <img width="1467" height="1121" alt="image" src="https://github.com/user-attachments/assets/fa0fed3b-b923-4c86-ba2c-24b3ce6ee7a6" /> Type of change - [x] New Feature (non-breaking change which adds functionality) Signed-off-by: evilhero <2278596667@qq.com> Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>	2026-02-02 13:40:51 +08:00
akie	51210a1762	Add secondary index to infinity (#12825 ) Add secondary index: 1. kb_id 2. available_int --------- Signed-off-by: zpf121 <1219290549@qq.com> Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>	2026-02-02 13:22:29 +08:00
LIRUI YU	01f9b98c3f	Fix duplicate POSTGRES enum entry causing backend startup failure (#12936 ) ### What problem does this PR solve? Fixes a duplicate POSTGRES entry in the TextFieldType enum that triggers TypeError: 'POSTGRES' already defined as 'TEXT' on import, preventing the backend from starting and resulting in 502 errors on the frontend. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-02-02 12:34:48 +08:00
Paul Y Hui	af7acb23cb	Fixed regression issue that unable to start the service (#12933 ) ### What problem does this PR solve? Fixed a regression that prevented the server from starting, as detailed below. It appears to be related to this PR: https://github.com/infiniflow/ragflow/pull/12926. ``` Error Trace Traceback (most recent call last): File "/mnt/c/Workspace/ragflow/api/ragflow_server.py", line 33, in <module> from api.apps import app File "/mnt/c/Workspace/ragflow/api/apps/__init__.py", line 26, in <module> Traceback (most recent call last): File "/mnt/c/Workspace/ragflow/rag/svr/task_executor.py", line 34, in <module> from api.db.db_models import close_connection, APIToken File "/mnt/c/Workspace/ragflow/api/db/db_models.py", line 49, in <module> from api.db.services.knowledgebase_service import KnowledgebaseService File "/mnt/c/Workspace/ragflow/api/db/services/__init__.py", line 19, in <module> class TextFieldType(Enum): File "/mnt/c/Workspace/ragflow/api/db/db_models.py", line 53, in TextFieldType from .user_service import UserService as UserService File "/mnt/c/Workspace/ragflow/api/db/services/user_service.py", line 24, in <module> POSTGRES = "TEXT" ^^^^^^^^ File "/usr/lib/python3.12/enum.py", line 443, in __setitem__ raise TypeError('%r already defined as %r' % (key, self[key])) TypeError: 'POSTGRES' already defined as 'TEXT' from api.db.db_models import DB, UserTenant File "/mnt/c/Workspace/ragflow/api/db/db_models.py", line 49, in <module> class TextFieldType(Enum): File "/mnt/c/Workspace/ragflow/api/db/db_models.py", line 53, in TextFieldType POSTGRES = "TEXT" ^^^^^^^^ File "/usr/lib/python3.12/enum.py", line 443, in __setitem__ raise TypeError('%r already defined as %r' % (key, self[key])) TypeError: 'POSTGRES' already defined as 'TEXT' ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-02-02 12:34:12 +08:00
Carve_	8bc12deb02	Fix:Duplicate enum member causes backend startup failure (#12931 ) ### What problem does this PR solve? close #12930 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Changes only delete duplicate definition in `api/db/db_models.py`	2026-02-02 12:33:07 +08:00
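The startup failure addressed by the three commits above comes from how Python's `enum` module treats a repeated member name: the class body itself raises `TypeError`, so any module that defines such an enum fails on import. A minimal reproduction, assuming an enum shaped like the one named in the traceback (the `MYSQL` member and its value are illustrative, not copied from `api/db/db_models.py`):

```python
from enum import Enum


def define_enum_with_duplicate():
    """Try to define an enum with a repeated member name and return the error raised."""
    try:
        class TextFieldType(Enum):
            MYSQL = "LONGTEXT"  # illustrative member
            POSTGRES = "TEXT"
            POSTGRES = "TEXT"  # duplicate name: rejected while the class body executes
    except TypeError as exc:
        return exc
    return None


error = define_enum_with_duplicate()
print(type(error).__name__, error)
```

Because the error fires at class-definition time, the only fix is to delete the duplicate line, which matches the change described in these commits.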
Liu An	1b587013d8	Fix: remove unused imports and f-string formatting (#12935 ) ### What problem does this PR solve? - Remove unused imports (Mock, patch, MagicMock, json, os, RAGFLOW_COLUMNS, VECTOR_FIELD_PATTERN) from multiple files - Replace f-string formatting with regular strings for console output messages in cli.py - Clean up unnecessary imports that were no longer being used in the codebase ### Type of change - [x] Refactoring	2026-02-02 12:11:39 +08:00
Se7en	332b11cf96	feat(tools): add Elasticsearch to OceanBase migration tool (#12927 ) ### What problem does this PR solve? fixes https://github.com/infiniflow/ragflow/issues/12774 Add a CLI tool for migrating RAGFlow data from Elasticsearch to OceanBase, enabling users to switch their document storage backend. - Automatic discovery and migration of all `ragflow_` indices - Schema conversion with vector dimension auto-detection - Batch processing with progress tracking and resume capability - Data consistency validation and migration report generation Note*: Due to network issues, I was unable to pull the required Docker images (Elasticsearch, OceanBase) to run the full end-to-end verification. Unit tests have been verified to pass. I will complete the e2e verification when network conditions allow, and submit a follow-up PR if any fixes are needed. ```bash ============================= test session starts ============================== platform darwin -- Python 3.13.6, pytest-9.0.2, pluggy-1.6.0 rootdir: /Users/sevenc/code/ai/oceanbase/ragflow/tools/es-to-oceanbase-migration configfile: pyproject.toml testpaths: tests plugins: anyio-4.12.1, asyncio-1.3.0, cov-7.0.0 collected 86 items tests/test_progress.py::TestMigrationProgress::test_create_basic_progress PASSED [ 1%] tests/test_progress.py::TestMigrationProgress::test_create_progress_with_counts PASSED [ 2%] tests/test_progress.py::TestMigrationProgress::test_progress_default_values PASSED [ 3%] tests/test_progress.py::TestMigrationProgress::test_progress_status_values PASSED [ 4%] tests/test_progress.py::TestProgressManager::test_create_progress_manager PASSED [ 5%] tests/test_progress.py::TestProgressManager::test_create_progress_manager_creates_dir PASSED [ 6%] tests/test_progress.py::TestProgressManager::test_create_progress PASSED [ 8%] tests/test_progress.py::TestProgressManager::test_save_and_load_progress PASSED [ 9%] tests/test_progress.py::TestProgressManager::test_load_nonexistent_progress PASSED [ 
10%] tests/test_progress.py::TestProgressManager::test_delete_progress PASSED [ 11%] tests/test_progress.py::TestProgressManager::test_update_progress PASSED [ 12%] tests/test_progress.py::TestProgressManager::test_update_progress_multiple_batches PASSED [ 13%] tests/test_progress.py::TestProgressManager::test_mark_completed PASSED [ 15%] tests/test_progress.py::TestProgressManager::test_mark_failed PASSED [ 16%] tests/test_progress.py::TestProgressManager::test_mark_paused PASSED [ 17%] tests/test_progress.py::TestProgressManager::test_can_resume_running PASSED [ 18%] tests/test_progress.py::TestProgressManager::test_can_resume_paused PASSED [ 19%] tests/test_progress.py::TestProgressManager::test_can_resume_completed PASSED [ 20%] tests/test_progress.py::TestProgressManager::test_can_resume_nonexistent PASSED [ 22%] tests/test_progress.py::TestProgressManager::test_get_resume_info PASSED [ 23%] tests/test_progress.py::TestProgressManager::test_get_resume_info_nonexistent PASSED [ 24%] tests/test_progress.py::TestProgressManager::test_progress_file_path PASSED [ 25%] tests/test_progress.py::TestProgressManager::test_progress_file_content PASSED [ 26%] tests/test_schema.py::TestRAGFlowSchemaConverter::test_analyze_ragflow_mapping PASSED [ 27%] tests/test_schema.py::TestRAGFlowSchemaConverter::test_detect_vector_size PASSED [ 29%] tests/test_schema.py::TestRAGFlowSchemaConverter::test_unknown_fields PASSED [ 30%] tests/test_schema.py::TestRAGFlowSchemaConverter::test_get_column_definitions PASSED [ 31%] tests/test_schema.py::TestRAGFlowDataConverter::test_convert_basic_document PASSED [ 32%] tests/test_schema.py::TestRAGFlowDataConverter::test_convert_with_vector PASSED [ 33%] tests/test_schema.py::TestRAGFlowDataConverter::test_convert_array_fields PASSED [ 34%] tests/test_schema.py::TestRAGFlowDataConverter::test_convert_json_fields PASSED [ 36%] tests/test_schema.py::TestRAGFlowDataConverter::test_convert_unknown_fields_to_extra PASSED [ 37%] 
tests/test_schema.py::TestRAGFlowDataConverter::test_convert_kb_id_list PASSED [ 38%] tests/test_schema.py::TestRAGFlowDataConverter::test_convert_content_with_weight_dict PASSED [ 39%] tests/test_schema.py::TestRAGFlowDataConverter::test_convert_batch PASSED [ 40%] tests/test_schema.py::TestVectorFieldPattern::test_valid_patterns PASSED [ 41%] tests/test_schema.py::TestVectorFieldPattern::test_invalid_patterns PASSED [ 43%] tests/test_schema.py::TestVectorFieldPattern::test_extract_dimension PASSED [ 44%] tests/test_schema.py::TestConstants::test_array_columns PASSED [ 45%] tests/test_schema.py::TestConstants::test_json_columns PASSED [ 46%] tests/test_schema.py::TestConstants::test_ragflow_columns_completeness PASSED [ 47%] tests/test_schema.py::TestConstants::test_fts_columns PASSED [ 48%] tests/test_schema.py::TestConstants::test_ragflow_columns_types PASSED [ 50%] tests/test_schema.py::TestRAGFlowSchemaConverterEdgeCases::test_empty_mapping PASSED [ 51%] tests/test_schema.py::TestRAGFlowSchemaConverterEdgeCases::test_mapping_without_properties PASSED [ 52%] tests/test_schema.py::TestRAGFlowSchemaConverterEdgeCases::test_multiple_vector_fields PASSED [ 53%] tests/test_schema.py::TestRAGFlowSchemaConverterEdgeCases::test_get_column_definitions_without_analysis PASSED [ 54%] tests/test_schema.py::TestRAGFlowSchemaConverterEdgeCases::test_get_vector_fields PASSED [ 55%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_convert_empty_document PASSED [ 56%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_convert_document_without_source PASSED [ 58%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_convert_boolean_to_integer PASSED [ 59%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_convert_invalid_integer PASSED [ 60%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_convert_float_field PASSED [ 61%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_convert_array_with_special_characters 
PASSED [ 62%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_convert_already_json_array PASSED [ 63%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_convert_single_value_to_array PASSED [ 65%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_detect_vector_fields_from_document PASSED [ 66%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_convert_with_default_values PASSED [ 67%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_convert_list_content PASSED [ 68%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_convert_batch_empty PASSED [ 69%] tests/test_schema.py::TestRAGFlowDataConverterEdgeCases::test_existing_extra_field_merged PASSED [ 70%] tests/test_verify.py::TestVerificationResult::test_create_basic_result PASSED [ 72%] tests/test_verify.py::TestVerificationResult::test_result_default_values PASSED [ 73%] tests/test_verify.py::TestVerificationResult::test_result_with_counts PASSED [ 74%] tests/test_verify.py::TestMigrationVerifier::test_verify_counts_match PASSED [ 75%] tests/test_verify.py::TestMigrationVerifier::test_verify_counts_mismatch PASSED [ 76%] tests/test_verify.py::TestMigrationVerifier::test_verify_samples_all_match PASSED [ 77%] tests/test_verify.py::TestMigrationVerifier::test_verify_samples_some_missing PASSED [ 79%] tests/test_verify.py::TestMigrationVerifier::test_verify_samples_data_mismatch PASSED [ 80%] tests/test_verify.py::TestMigrationVerifier::test_values_equal_none_values PASSED [ 81%] tests/test_verify.py::TestMigrationVerifier::test_values_equal_array_columns PASSED [ 82%] tests/test_verify.py::TestMigrationVerifier::test_values_equal_json_columns PASSED [ 83%] tests/test_verify.py::TestMigrationVerifier::test_values_equal_kb_id_list PASSED [ 84%] tests/test_verify.py::TestMigrationVerifier::test_values_equal_content_with_weight_dict PASSED [ 86%] tests/test_verify.py::TestMigrationVerifier::test_determine_result_passed PASSED [ 87%] 
tests/test_verify.py::TestMigrationVerifier::test_determine_result_failed_count PASSED [ 88%] tests/test_verify.py::TestMigrationVerifier::test_determine_result_failed_samples PASSED [ 89%] tests/test_verify.py::TestMigrationVerifier::test_generate_report PASSED [ 90%] tests/test_verify.py::TestMigrationVerifier::test_generate_report_with_missing PASSED [ 91%] tests/test_verify.py::TestMigrationVerifier::test_generate_report_with_mismatches PASSED [ 93%] tests/test_verify.py::TestValueComparison::test_string_comparison PASSED [ 94%] tests/test_verify.py::TestValueComparison::test_integer_comparison PASSED [ 95%] tests/test_verify.py::TestValueComparison::test_float_comparison PASSED [ 96%] tests/test_verify.py::TestValueComparison::test_boolean_comparison PASSED [ 97%] tests/test_verify.py::TestValueComparison::test_empty_array_comparison PASSED [ 98%] tests/test_verify.py::TestValueComparison::test_nested_json_comparison PASSED [100%] ======================= 86 passed, 88 warnings in 0.66s ======================== ``` ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2026-01-31 16:11:27 +08:00
NTLx	c4c3f744c0	feat: add Peewee ORM support for OceanBase as primary database (#12769 ) (#12926 ) ## Summary This PR adds Peewee ORM support for OceanBase as the primary database in RAGFlow, as requested in issue #12769. ## Changes ### Core Implementation 1. RetryingPooledOceanBaseDatabase Class - Inherits from `PooledMySQLDatabase` (OceanBase is MySQL-compatible) - Implements retry mechanism for connection issues - Handles MySQL-specific error codes (2013, 2006 for connection loss) - Provides connection pool management 2. PooledDatabase Enum - Added `OCEANBASE = RetryingPooledOceanBaseDatabase` 3. DatabaseLock Enum - Added `OCEANBASE = MysqlDatabaseLock` - OceanBase uses MySQL-style locking 4. TextFieldType Enum - Added `OCEANBASE = "LONGTEXT"` - OceanBase uses same text field type as MySQL 5. DatabaseMigrator Enum - Added `OCEANBASE = MySQLMigrator` - OceanBase uses MySQL migration tools ### Usage ```bash # Set environment variable to use OceanBase export DB_TYPE=oceanbase # Configure connection (in docker/.env or environment) OCEANBASE_HOST=localhost OCEANBASE_PORT=2881 OCEANBASE_USER=root OCEANBASE_PASSWORD=password OCEANBASE_DATABASE=ragflow ``` ### Technical Details - Location: `api/db/db_models.py` - Dependencies: No new dependencies (uses existing Peewee MySQL support) - Code Size: ~90 lines - Difficulty: Simple ### Testing - Added comprehensive unit tests in `tests/unit/test_oceanbase_peewee.py` - Tests cover: - OceanBase database class existence and inheritance - Enum values for PooledDatabase, DatabaseLock, TextFieldType - Initialization with custom retry settings - Environment variable configuration ### Acceptance Criteria ✅ Can switch to OceanBase database via `DB_TYPE=oceanbase` environment variable ✅ All database operations work normally in OceanBase environment ✅ OceanBase uses MySQL compatibility mode (no additional dependencies) ### Background This is part of the RAGFlow + OceanBase Hackathon to allow users to choose OceanBase as RAGFlow's 
primary database, leveraging OceanBase's high availability and scalability. --- ## Related Issues - Primary: https://github.com/infiniflow/ragflow/issues/12769 - Context: https://github.com/oceanbase/seekdb/issues/123 (OceanBase Developer Challenge) --- Closes infiniflow/ragflow#12769	2026-01-31 15:45:20 +08:00
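The retry behavior described for `RetryingPooledOceanBaseDatabase` (retrying on MySQL error codes 2013 and 2006) can be illustrated without a database. This is a sketch of the general pattern, not the class from `api/db/db_models.py`: `FakeOperationalError`, `execute_with_retry`, the `reconnect` callback, and the exponential backoff are all assumptions made for the example.

```python
import time

# MySQL client error codes for a lost connection, as listed in the commit message.
CONNECTION_LOST_CODES = {2006, 2013}


class FakeOperationalError(Exception):
    """Hypothetical stand-in for a driver error that carries a numeric code."""

    def __init__(self, code, message):
        super().__init__(code, message)
        self.code = code


def execute_with_retry(run, reconnect, max_retries=3, retry_delay=0.0):
    """Call run(); on a connection-loss error, reconnect and try again."""
    for attempt in range(max_retries + 1):
        try:
            return run()
        except FakeOperationalError as exc:
            # Re-raise anything that is not a lost connection, or if retries ran out.
            if exc.code not in CONNECTION_LOST_CODES or attempt == max_retries:
                raise
            reconnect()
            time.sleep(retry_delay * (2 ** attempt))  # assumed backoff policy


calls = {"run": 0, "reconnect": 0}


def flaky_query():
    calls["run"] += 1
    if calls["run"] < 3:
        raise FakeOperationalError(2013, "Lost connection to server during query")
    return "ok"


def reconnect():
    calls["reconnect"] += 1


result = execute_with_retry(flaky_query, reconnect)
print(result, calls)
```

Restricting retries to known connection-loss codes avoids masking genuine query errors, which should surface immediately.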
Carve_	23bdf25a1f	feature:Add OceanBase Storage Support for Table Parser (#12923 ) ### What problem does this PR solve? close #12770 This PR adds OceanBase as a storage backend for the Table Parser. It enables dynamic table schema storage via JSON and implements OceanBase SQL execution for text-to-SQL retrieval. ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): ### Changes - Table Parser stores row data into `chunk_data` when doc engine is OceanBase. (table.py) - OceanBase table schema adds `chunk_data` JSON column and migrates if needed. - Implemented OceanBase `sql()` to execute text-to-SQL results. (ob_conn.py) - Add `DOC_ENGINE_OCEANBASE` flag for engine detection (setting.py) ### Test 1. Set `DOC_ENGINE=oceanbase` (e.g. in `docker/.env`) <img width="1290" height="783" alt="doc_engine_ob" src="https://github.com/user-attachments/assets/7d1c609f-7bf2-4b2e-b4cc-4243e72ad4f1" /> 2. Upload an Excel file to the Knowledge Base (for this test, we used the file below). <img width="786" height="930" alt="excel" src="https://github.com/user-attachments/assets/bedf82f2-cd00-426b-8f4d-6978a151231a" /> 3. Choose Table as parsing method. <img width="2550" height="1134" alt="parse_excel" src="https://github.com/user-attachments/assets/aba11769-02be-4905-97e1-e24485e24cd0" /> 4. Ask a natural language query in chat. <img width="2550" height="1134" alt="query" src="https://github.com/user-attachments/assets/26a910a6-e503-4ac7-b66a-f5754bbb0e91" />	2026-01-31 15:11:54 +08:00
Carve_	ee23b9eb63	feature:Add OceanBase Support to Text-to-SQL Agent (#12919 ) ### What problem does this PR solve? Close #12768. This PR adds OceanBase support to RAGFlow’s Text-to-SQL (ExeSQL) component. OceanBase is integrated via MySQL compatibility mode, and the UI `db_type` options are updated accordingly. ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): ### Changes Backend - Add `oceanbase` `db_type` validation and connection logic in `exesql.py` and reuse existing MySQL compatibility mode Frontend - Add OceanBase option to the ExeSQL `db_type` selector ### How to test 1. Configure OceanBase connection in ExeSQL node (host/port/user/password/database) 2. Input: “Show 10 rows from test table” 3. Generated SQL: `SELECT * FROM test LIMIT 10;` 4. Query executes successfully and results are returned ### Screenshots - ExeSQL db_type includes OceanBase <img width="649" height="1015" alt="2" src="https://github.com/user-attachments/assets/e0a5f7b9-e282-402a-8639-64c1aef8fce6" /> - ExeSQL test OceanBase connection <img width="2247" height="1140" alt="test_ob" src="https://github.com/user-attachments/assets/f16ebd93-b48e-4d18-b53f-8496581e755d" /> - Query results from OceanBase shown in UI <img width="2550" height="1351" alt="1" src="https://github.com/user-attachments/assets/b44163dc-baab-420d-b31e-b644bdcb77a9" />	2026-01-31 15:03:40 +08:00
Liu An	c4f60b349d	Fix(test): downgrade test priorities (#12913 ) ### What problem does this PR solve? Changed test priorities in multiple test files, downgrading from p1 to p2 and p2 to p3. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-30 20:02:56 +08:00
writinwaters	eb75b1ce82	Docs: Fixed a docusaurus display issue (#12914 ) ### What problem does this PR solve? Fixed a docusaurus display issue. ### Type of change - [x] Documentation Update	2026-01-30 18:04:05 +08:00
Haipeng LI	e385b19d67	Test: Add code coverage reporting to CI (#12874 ) ### What problem does this PR solve? Add code coverage reporting to CI ### Type of change - [x] Test (please describe): coverage report --------- Co-authored-by: Liu An <asiro@qq.com>	2026-01-30 14:49:16 +08:00
Phives	87305cb08c	fix: close file handles when loading JSON mapping in doc store connectors (#12904 ) What problem does this PR solve? When loading JSON mapping/schema files, the code used json.load(open(path)) without closing the file. The file handle stayed open until garbage collection, which can leak file descriptors under load (e.g. repeated reconnects or migrations). Type of change [x] Bug Fix (non-breaking change which fixes an issue) Change Replaced json.load(open(...)) with a context manager so the file is closed after loading: with open(fp_mapping, "r") as f: ... = json.load(f) Files updated rag/utils/opensearch_conn.py – mapping load (1 place) common/doc_store/es_conn_base.py – mapping load + doc_meta_mapping load (2 places) common/doc_store/infinity_conn_base.py – schema loads in _migrate_db, doc metadata table creation, and SQL field mapping (4 places) Behavior is unchanged; only resource handling is fixed. Co-authored-by: Gittensor Miner <miner@gittensor.io>	2026-01-30 14:07:51 +08:00
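The change described in this commit, replacing `json.load(open(path))` with a context manager, can be shown as a small self-contained example. The helper name `load_mapping` and the sample mapping content are made up for illustration; only the `with open(...) as f: json.load(f)` pattern comes from the commit message.

```python
import json
import os
import tempfile


def load_mapping(fp_mapping: str) -> dict:
    # The with-block closes the file as soon as loading finishes,
    # instead of leaving the handle open until garbage collection.
    with open(fp_mapping, "r") as f:
        return json.load(f)


# Demo with a temporary file holding an illustrative mapping.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mapping.json")
    with open(path, "w") as f:
        json.dump({"settings": {"index": {"number_of_shards": 2}}}, f)
    mapping = load_mapping(path)

print(mapping)
```

The returned data is identical to what `json.load(open(path))` would produce; the difference is only in when the file descriptor is released.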
qinling0210	212d6f3660	Fix metadata in get_list() (#12906 ) ### What problem does this PR solve? test_update_document.py failed as metadata is not included in the response of get_list(), fix the issue. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-30 14:06:49 +08:00
Kevin Hu	f262d416fe	Refa: remove aspose dependency. (#12910 ) ### Type of change - [x] Refactoring	2026-01-30 14:06:19 +08:00
Kevin Hu	f1c2fac03e	Refa: remove ppt image. (#12909 ) ### What problem does this PR solve? remove `aspose` ### Type of change - [x] Refactoring	2026-01-30 13:35:42 +08:00
BitToby	73645e2f78	fix: preserve line breaks in prompt editor and add auto-save on blur (#12887 ) Closes #12762 ### What problem does this PR solve? Line break issue in Agent prompt editor: - Text with blank lines in `system_prompt` or `user_prompt` would have extra/fewer blank lines after save/reload or paste - Root cause: Mismatch between Lexical editor's paragraph nodes (`\n\n` separator) and line break nodes (`\n` separator) Auto-save issue: - Changes were only saved after 20-second debounce, causing data loss on page refresh before timer completed ### Solution 1. Line break fix: Use `LineBreakNode` consistently for all line breaks (typing Enter, paste, load) 2. Auto-save: Save immediately when prompt editor loses focus [1.webm](https://github.com/user-attachments/assets/eb2c2428-54a3-4d4e-8037-6cc34a859b83) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-30 10:29:51 +08:00
Liu An	4947e9473a	Fix(test): Update error message assertions for unsupported content type tests (#12901 ) ### What problem does this PR solve? This commit updates test cases for create, delete, and update dataset endpoints to expect consistent error messages when an unsupported content type is provided. ### Type of change - [x] Bug Fix (test)	2026-01-30 09:45:04 +08:00
Angel98518	98b6a0e6d1	feat: Add OceanBase Performance Monitoring and Health Check Integration (#12886 ) ## Description This PR implements comprehensive OceanBase performance monitoring and health check functionality as requested in issue #12772. The implementation follows the existing ES/Infinity health check patterns and provides detailed metrics for operations teams. ## Problem Currently, RAGFlow lacks detailed health monitoring for OceanBase when used as the document engine. Operations teams need visibility into: - Connection status and latency - Storage space usage - Query throughput (QPS) - Slow query statistics - Connection pool utilization ## Solution ### 1. Enhanced OBConnection Class (`rag/utils/ob_conn.py`) Added comprehensive performance monitoring methods: - `get_performance_metrics()` - Main method returning all performance metrics - `_get_storage_info()` - Retrieves database storage usage - `_get_connection_pool_stats()` - Gets connection pool statistics - `_get_slow_query_count()` - Counts queries exceeding threshold - `_estimate_qps()` - Estimates queries per second - Enhanced `health()` method with connection status ### 2. Health Check Utilities (`api/utils/health_utils.py`) Added two new functions following ES/Infinity patterns: - `get_oceanbase_status()` - Returns OceanBase status with health and performance metrics - `check_oceanbase_health()` - Comprehensive health check with detailed metrics ### 3. API Endpoint (`api/apps/system_app.py`) Added new endpoint: - `GET /v1/system/oceanbase/status` - Returns OceanBase health status and performance metrics ### 4. 
Comprehensive Unit Tests (`test/unit_test/utils/test_oceanbase_health.py`) Added 340+ lines of unit tests covering: - Health check success/failure scenarios - Performance metrics retrieval - Error handling and edge cases - Connection pool statistics - Storage information retrieval - QPS estimation - Slow query detection ## Metrics Provided - Connection Status: connected/disconnected - Latency: Query latency in milliseconds - Storage: Used and total storage space - QPS: Estimated queries per second - Slow Queries: Count of queries exceeding threshold - Connection Pool: Active connections, max connections, pool size ## Testing - All unit tests pass - Error handling tested for connection failures - Edge cases covered (missing tables, connection errors) - Follows existing code patterns and conventions ## Code Statistics - Total Lines Changed: 665+ lines - New Code: ~600 lines - Test Coverage: 340+ lines of comprehensive tests - Files Modified: 3 - Files Created: 1 (test file) ## Acceptance Criteria Met ✅ `/system/oceanbase/status` API returns OceanBase health status ✅ Monitoring metrics accurately reflect OceanBase running status ✅ Clear error messages when health checks fail ✅ Response time optimized (metrics cached where possible) ✅ Follows existing ES/Infinity health check patterns ✅ Comprehensive test coverage ## Related Files - `rag/utils/ob_conn.py` - OceanBase connection class - `api/utils/health_utils.py` - Health check utilities - `api/apps/system_app.py` - System API endpoints - `test/unit_test/utils/test_oceanbase_health.py` - Unit tests Fixes #12772 --------- Co-authored-by: Daniel <daniel@example.com>	2026-01-30 09:44:42 +08:00
Yongteng Lei	183803e56b	Pref: fix thread pool workers (#12882 ) ### What problem does this PR solve? Fixed the thread pool workers and improved the retrieval component. ### Type of change - [x] Refactoring - [x] Performance Improvement	2026-01-30 09:44:23 +08:00
writinwaters	efb136c29c	Docs: minor (#12899 ) ### What problem does this PR solve? Removed redundant command + "the current version" @JinHai-CN ### Type of change - [x] Documentation Update	2026-01-29 19:23:18 +08:00
eviaaaaa	c59ae4c7c2	Fix: codeExec return types & error handling; Update Spark model mappings (#12896 ) ## What problem does this PR solve? This PR addresses three specific issues to improve agent reliability and model support: 1. `codeExec` Output Limitation: Previously, the `codeExec` tool was strictly limited to returning `string` types. I updated the output constraint to `object` to support structured data (Dicts, Lists, etc.) required for complex downstream tasks. 2. `codeExec` Error Handling: Improved the execution logic so that when runtime errors occur, the tool captures the exception and returns the error message as the output instead of causing the process to abort or fail silently. 3. Spark Model Configuration: - Added support for the `MAX-32k` model variant. - Fixed the `Spark-Lite` mapping from `general` to `lite` to match the latest API specifications. ## Type of change - [x] Bug Fix (fixes execution logic and model mapping) - [x] New Feature / Enhancement (adds model support and improves tool flexibility) ## Key Changes ### `agent/tools/code_exec.py` - Changed the output type definition from `string` to `object`. - Refactored the execution flow to gracefully catch exceptions and return error messages as part of the tool output. ### `rag/llm/chat_model.py` - Added `"Spark-Max-32K": "max-32k"` to the model list. - Updated `"Spark-Lite"` value from `"general"` to `"lite"`. ## Checklist - [x] My code follows the style guidelines of this project. - [x] I have performed a self-review of my own code. Signed-off-by: evilhero <2278596667@qq.com>	2026-01-29 19:22:35 +08:00
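The two `codeExec` behaviours described in c59ae4c7c2 (structured output, and returning the error message instead of aborting) might look roughly like this. It is a sketch under assumptions: `run_and_capture`, `good`, and `bad` are hypothetical names, and the real tool runs code in a sandbox rather than calling a local function.

```python
from typing import Any, Callable


def run_and_capture(func: Callable[..., Any], **arguments: Any) -> dict:
    """Run func and always return a result object, even on failure."""
    try:
        # The output may be any object (dict, list, ...), not only a string.
        return {"ok": True, "output": func(**arguments)}
    except Exception as exc:
        # Surface the runtime error as the tool output instead of aborting.
        return {"ok": False, "output": f"{type(exc).__name__}: {exc}"}


def good(n: int) -> dict:
    return {"squares": [i * i for i in range(n)]}


def bad(n: int) -> dict:
    return {"ratio": 1 / (n - n)}  # always divides by zero


structured = run_and_capture(good, n=3)
failed = run_and_capture(bad, n=3)
```

Returning the error text as ordinary output lets downstream agent steps react to a failure instead of the whole run stopping.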
writinwaters	d99f6a611a	Refact: Updated UI tips. (#12898 ) ### Type of change - [x] Refactoring	2026-01-29 17:56:07 +08:00
6ba3i	7053d3683c	Feat: Add CLI retrieval test to CI workflow (#12881 ) ### What problem does this PR solve? Adds a CLI-based retrieval test to CI after the Elasticsearch HTTP API tests to validate end-to-end admin/user flows and dataset retrieval via ragflow_cli.py. This helps catch regressions in the CLI path that aren’t covered by existing API tests. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-01-29 17:55:32 +08:00
Jimmy Ben Klieve	ec88e17710	fix: task executor bar chart error (#12894 ) ### What problem does this PR solve? Fix wrong data rendered in task executor bar chart ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-29 17:34:05 +08:00
Kevin Hu	32c0161ff1	Refa: Clean the folders. (#12890 ) ### Type of change - [x] Refactoring	2026-01-29 14:23:26 +08:00
akie	d86b7f9721	Remove filter (kb_id) in infinity (#12853 ) Secondary indexes in infinity do not support IN expr --------- Signed-off-by: zpf121 <1219290549@qq.com>	2026-01-29 11:04:25 +08:00
Philipp Heyken Soares	6305c7e411	Fix metadata filter (#12861 ) ### What problem does this PR solve? ##### Summary This PR fixes a bug in the metadata filtering logic where the contains and not contains operators were behaving identically to the in and not in operators. It also standardizes the syntax for string-based operators. ##### The Issue On the main branch, the contains operator was implemented as: `matched = input in value if not isinstance(input, list) else all(i in value for i in input)` This logic is identical to the `in` operator. It checks if the metadata (`input`) exists within the filter (`value`). For a "contains" search, the logic should be reversed: _we want to check if the filter value exists within the metadata input_. ##### Solution Presented Here The operators have been rewritten using str.find(): Contains: `str(input).find(value) >= 0` Not Contains: `str(input).find(value) == -1` ##### Advantage This approach places the metadata (input) on the left side of the expression. This maintains stylistic consistency with the existing start with and end with operators in the same file, which also place the input on the left (e.g., str(input).lower().startswith(...)). ##### Considered Alternative In a previous PR we considered using the standard Python `in` operator: `value in str(input)`. The `in` operator is approximately 15% faster because it uses optimized Python bytecode (CONTAINS_OP) and avoids an attribute lookup. However following rejection of this PR we now propose the change presented here. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): --------- Co-authored-by: Philipp Heyken Soares <philipp.heyken-soares@am.ai>	2026-01-29 09:59:48 +08:00
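The difference between the old and new operator logic in 6305c7e411 can be seen in a small sketch. The two expressions are the ones quoted in the commit message; the function names, the parameter name `meta_value` (standing in for `input`), and the sample strings are assumptions made for illustration.

```python
def op_in(meta_value, value) -> bool:
    # Logic the commit message quotes for the old "contains":
    # it asks whether the metadata occurs inside the filter value.
    if not isinstance(meta_value, list):
        return meta_value in value
    return all(i in value for i in meta_value)


def op_contains(meta_value, value: str) -> bool:
    # New logic: does the filter value occur inside the metadata?
    return str(meta_value).find(value) >= 0


def op_not_contains(meta_value, value: str) -> bool:
    return str(meta_value).find(value) == -1


metadata = "annual financial report 2025"
filter_value = "report"

old_result = op_in(metadata, filter_value)
new_result = op_contains(metadata, filter_value)
negated = op_not_contains(metadata, filter_value)
```

With these inputs the old logic finds no match because the long metadata string is not a substring of `"report"`, while the rewritten operator matches.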
dependabot[bot]	47e55ab324	Chore(deps): Bump starlette from 0.46.2 to 0.49.1 in /agent/sandbox (#12878 ) Bumps [starlette](https://github.com/Kludex/starlette) from 0.46.2 to 0.49.1. Release notes, sourced from [starlette's releases](https://github.com/Kludex/starlette/releases) and [changelog](https://github.com/Kludex/starlette/blob/main/docs/release-notes.md): Version 0.49.1 (October 28, 2025) - Fixes a security vulnerability in the parsing logic of the `Range` header in `FileResponse`; see the advisory [GHSA-7f5h-v6xp-fcq8](https://github.com/Kludex/starlette/security/advisories/GHSA-7f5h-v6xp-fcq8) - Optimizes the HTTP ranges parsing logic (commit 4ea6e22b489ec388d6004cfbca52dd5b147127c5) Version 0.49.0 (October 28, 2025) - Adds an `encoding` parameter to the `Config` class ([#2996](https://redirect.github.com/Kludex/starlette/pull/2996)) - Supports multiple cookie headers in `Request.cookies` ([#3029](https://redirect.github.com/Kludex/starlette/pull/3029)) - Uses the `Literal` type for `WebSocketEndpoint` encoding values ([#3027](https://redirect.github.com/Kludex/starlette/pull/3027)) - No longer pollutes the exception context in `Middleware` when using `BaseHTTPMiddleware` ([#2976](https://redirect.github.com/Kludex/starlette/pull/2976)) Version 0.48.0 (September 13, 2025) - Adds official Python 3.14 support ([#3013](https://redirect.github.com/Kludex/starlette/pull/3013)) - Implements [RFC9110](https://www.rfc-editor.org/rfc/rfc9110) HTTP status names ([#2939](https://redirect.github.com/Kludex/starlette/pull/2939)) Version 0.47.3 (August 24, 2025) - Uses `asyncio.iscoroutinefunction` for Python 3.12 and older ([#2984](https://redirect.github.com/Kludex/starlette/pull/2984)) Version 0.47.2 (July 20, 2025) - Makes `UploadFile` check for future rollover ([#2962](https://redirect.github.com/Kludex/starlette/pull/2962)) Version 0.47.1 (June 21, 2025) - Uses `Self` in `TestClient.__enter__` ([#2951](https://redirect.github.com/Kludex/starlette/pull/2951)) - Allows async exception handlers to type-check ([#2949](https://redirect.github.com/Kludex/starlette/pull/2949)) All commits are viewable in the [compare view](https://github.com/Kludex/starlette/compare/0.46.2...0.49.1). Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-01-28 19:04:01 +08:00
dependabot[bot]	82b932dbc7	Chore(deps): Bump urllib3 from 2.4.0 to 2.6.3 in /agent/sandbox (#12877 ) Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.4.0 to 2.6.3. Release notes, sourced from [urllib3's releases](https://github.com/urllib3/urllib3/releases) and [changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst). Each release note also carries the same notice that [urllib3 is raising ~$40,000 USD](https://sethmlarson.dev/urllib3-is-fundraising-for-http2-support) to release HTTP/2 support and ensure long-term sustainable maintenance of the project. 2.6.3 (2026-01-07) - Fixed a high-severity security issue where decompression-bomb safeguards of the streaming API were bypassed when HTTP redirects were followed (CVE-2026-21441 reported by [@D47A](https://github.com/D47A), 8.9 High, [GHSA-38jv-5279-wg99](https://github.com/urllib3/urllib3/security/advisories/GHSA-38jv-5279-wg99)) - Started treating `Retry-After` times greater than 6 hours as 6 hours by default ([#3743](https://github.com/urllib3/urllib3/issues/3743)) - Fixed `urllib3.connection.VerifiedHTTPSConnection` on Emscripten ([#3752](https://github.com/urllib3/urllib3/issues/3752)) 2.6.2 (2025-12-11) - Fixed `HTTPResponse.read_chunked()` to properly handle leftover data in the decoder's buffer when reading compressed chunked responses ([#3734](https://github.com/urllib3/urllib3/issues/3734)) 2.6.1 (2025-12-08) - Restored the previously removed `HTTPResponse.getheaders()` and `HTTPResponse.getheader()` methods ([#3731](https://github.com/urllib3/urllib3/issues/3731)) 2.6.0 (2025-12-05), security - Fixed a security issue where the streaming API could improperly handle highly compressed HTTP content ("decompression bombs"), leading to excessive resource consumption even when a small amount of data was requested. Reading small chunks of compressed data is safer and much more efficient now (CVE-2025-66471 reported by [@Cycloctane](https://github.com/Cycloctane), 8.9 High, [GHSA-2xpw-w6gg-jr37](https://github.com/urllib3/urllib3/security/advisories/GHSA-2xpw-w6gg-jr37)) - Fixed a security issue where an attacker could compose an HTTP response with virtually unlimited links in the `Content-Encoding` header, potentially leading to a denial of service (DoS) attack by exhausting system resources during decoding. The number of allowed chained encodings is now limited to 5 (CVE-2025-66418 reported by [@illia-v](https://github.com/illia-v), 8.9 High, [GHSA-gm62-xv2j-4w53](https://github.com/urllib3/urllib3/security/advisories/GHSA-gm62-xv2j-4w53)) Important: if urllib3 is not installed with the optional `urllib3[brotli]` extra, but your environment contains a Brotli/brotlicffi/brotlipy package anyway, make sure to upgrade it to at least Brotli 1.2.0 or brotlicffi 1.2.0.0 to benefit from the security fixes and avoid warnings. Prefer using `urllib3[brotli]` to install a compatible Brotli package automatically. All commits are viewable in the [compare view](https://github.com/urllib3/urllib3/compare/2.4.0...2.6.3). Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-01-28 19:03:41 +08:00
LIRUI YU	c8bd413e4c	Fixed bug: Prevent 400 errors from Image2Text providers by skipping images smaller than 11px on any side during figure enhancement. (#12868) ### What problem does this PR solve? During figure enhancement, some cropped figure images are extremely small. Sending these to the Image2Text/VLM provider fails with a 400 invalid_parameter_error because the image width/height must be >10px. This aborts the enhancement step. This PR adds a minimal size guard to skip tiny crops and continue processing. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-01-28 14:59:02 +08:00
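
The size guard described in the commit above could look roughly like the sketch below. The names `MIN_SIDE_PX`, `is_large_enough`, and `filter_tiny_crops`, and the choice to represent each crop as a `(width, height)` tuple, are assumptions made for illustration; this is not the code merged in the PR.

```python
from typing import Iterable, List, Tuple

# Hypothetical threshold: the commit says width and height must be > 10px,
# so 11px is the smallest side length that is still accepted.
MIN_SIDE_PX = 11

Size = Tuple[int, int]  # (width, height) in pixels


def is_large_enough(size: Size, min_side: int = MIN_SIDE_PX) -> bool:
    """Return True when both sides reach the minimum side length."""
    width, height = size
    return width >= min_side and height >= min_side


def filter_tiny_crops(sizes: Iterable[Size]) -> List[Size]:
    """Keep only crops that are safe to send; tiny crops are skipped, not raised on."""
    return [s for s in sizes if is_large_enough(s)]
```

Skipping instead of raising matches the stated goal of the fix: one undersized crop should not abort the whole enhancement step.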
Magicbook1108	2c4499ec45	Fix: key error "content" #12844 (#12847 ) ### What problem does this PR solve? Fix: key error "content" #12844 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-01-28 14:39:34 +08:00
dive2tech	15a534909f	fix: avoid ZeroDivisionError when fulltext column weights sum to zero (#12856 ) ### What problem does this PR solve? When all fulltext_search_columns use explicit weight 0 (e.g. "col^0"), weight_sum is 0 and dividing by it raises ZeroDivisionError. Use equal weights 1/n when weight_sum <= 0 and n > 0; otherwise normalize as before. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update - [x] Refactoring	2026-01-28 14:38:03 +08:00
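
The fallback described in the commit above (equal weights `1/n` when the sum is not positive, ordinary normalization otherwise) can be sketched as follows. The helper names, the `name^weight` parsing, and the default weight of 1.0 for a column without an explicit weight are assumptions for illustration, not RAGFlow's actual implementation.

```python
from typing import Dict, List, Tuple


def parse_column(spec: str) -> Tuple[str, float]:
    """Split "name^weight" into (name, weight).

    Assumption: a column written without "^weight" gets weight 1.0.
    """
    name, sep, weight = spec.partition("^")
    return name, float(weight) if sep else 1.0


def normalize_weights(specs: List[str]) -> Dict[str, float]:
    """Normalize column weights so they sum to 1, without dividing by zero."""
    parsed = [parse_column(s) for s in specs]
    n = len(parsed)
    if n == 0:
        return {}
    weight_sum = sum(w for _, w in parsed)
    if weight_sum <= 0:
        # All weights are zero: fall back to equal weights.
        return {name: 1.0 / n for name, _ in parsed}
    return {name: w / weight_sum for name, w in parsed}
```

With this guard, a configuration such as `["title^0", "body^0"]` yields equal weights instead of raising `ZeroDivisionError`.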
qinling0210	9a5208976c	Put document metadata in ES/Infinity (#12826 ) ### What problem does this PR solve? Put document metadata in ES/Infinity. Index name of meta data: ragflow_doc_meta_{tenant_id} ### Type of change - [x] Refactoring	2026-01-28 13:29:34 +08:00
Zhichang Yu	fd11aca8e5	feat: Implement pluggable multi-provider sandbox architecture (#12820 ) ## Summary Implement a flexible sandbox provider system supporting both self-managed (Docker) and SaaS (Aliyun Code Interpreter) backends for secure code execution in agent workflows. Key Changes: - ✅ Aliyun Code Interpreter provider using official `agentrun-sdk>=0.0.16` - ✅ Self-managed provider with gVisor (runsc) security - ✅ Arguments parameter support for dynamic code execution - ✅ Database-only configuration (removed fallback logic) - ✅ Configuration scripts for quick setup Issue #12479 ## Features ### 🔌 Provider Abstraction Layer 1. Self-Managed Provider (`agent/sandbox/providers/self_managed.py`) - Wraps existing executor_manager HTTP API - gVisor (runsc) for secure container isolation - Configurable pool size, timeout, retry logic - Languages: Python, Node.js, JavaScript - ⚠️ Requires: gVisor installation, Docker, base images 2. Aliyun Code Interpreter (`agent/sandbox/providers/aliyun_codeinterpreter.py`) - SaaS integration using official agentrun-sdk - Serverless microVM execution with auto-authentication - Hard timeout: 30 seconds max - Credentials: `AGENTRUN_ACCESS_KEY_ID`, `AGENTRUN_ACCESS_KEY_SECRET`, `AGENTRUN_ACCOUNT_ID`, `AGENTRUN_REGION` - Automatically wraps code to call `main()` function 3. E2B Provider (`agent/sandbox/providers/e2b.py`) - Placeholder for future integration ### ⚙️ Configuration System - `conf/system_settings.json`: Default provider = `aliyun_codeinterpreter` - `agent/sandbox/client.py`: Enforces database-only configuration - Admin UI: `/admin/sandbox-settings` - Configuration validation via `validate_config()` method - Health checks for all providers ### 🎯 Key Capabilities Arguments Parameter Support: All providers support passing arguments to `main()` function: ```python # User code def main(name: str, count: int) -> dict: return {"message": f"Hello {name}!" 
* count} # Executed with: arguments={"name": "World", "count": 3} # Result: {"message": "Hello World!Hello World!Hello World!"} ``` Self-Describing Providers: Each provider implements `get_config_schema()` returning form configuration for Admin UI Error Handling: Structured `ExecutionResult` with stdout, stderr, exit_code, execution_time ## Configuration Scripts Two scripts for quick Aliyun sandbox setup: Shell Script (requires jq): ```bash source scripts/configure_aliyun_sandbox.sh ``` Python Script (interactive): ```bash python3 scripts/configure_aliyun_sandbox.py ``` ## Testing ```bash # Unit tests uv run pytest agent/sandbox/tests/test_providers.py -v # Aliyun provider tests uv run pytest agent/sandbox/tests/test_aliyun_codeinterpreter.py -v # Integration tests (requires credentials) uv run pytest agent/sandbox/tests/test_aliyun_codeinterpreter_integration.py -v # Quick SDK validation python3 agent/sandbox/tests/verify_sdk.py ``` Test Coverage: - 30 unit tests for provider abstraction - Provider-specific tests for Aliyun - Integration tests with real API - Security tests for executor_manager ## Documentation - `docs/develop/sandbox_spec.md` - Complete architecture specification - `agent/sandbox/tests/MIGRATION_GUIDE.md` - Migration from legacy sandbox - `agent/sandbox/tests/QUICKSTART.md` - Quick start guide - `agent/sandbox/tests/README.md` - Testing documentation ## Breaking Changes ⚠️ Migration Required: 1. Directory Move: `sandbox/` → `agent/sandbox/` - Update imports: `from sandbox.` → `from agent.sandbox.` 2. Mandatory Configuration: - SystemSettings must have `sandbox.provider_type` configured - Removed fallback default values - Configuration must exist in database (from `conf/system_settings.json`) 3. Aliyun Credentials: - Requires `AGENTRUN_` environment variables (not `ALIYUN_`) - `AGENTRUN_ACCOUNT_ID` is now required (Aliyun primary account ID) 4. 
Self-Managed Provider: - gVisor (runsc) must be installed for security - Install: `go install gvisor.dev/gvisor/runsc@latest` ## Database Schema Changes ```python # SystemSettings.value: CharField → TextField api/db/db_models.py: Changed for unlimited config length # SystemSettingsService.get_by_name(): Fixed query precision api/db/services/system_settings_service.py: startswith → exact match ``` ## Files Changed ### Backend (Python) - `agent/sandbox/providers/base.py` - SandboxProvider ABC interface - `agent/sandbox/providers/manager.py` - ProviderManager - `agent/sandbox/providers/self_managed.py` - Self-managed provider - `agent/sandbox/providers/aliyun_codeinterpreter.py` - Aliyun provider - `agent/sandbox/providers/e2b.py` - E2B provider (placeholder) - `agent/sandbox/client.py` - Unified client (enforces DB-only config) - `agent/tools/code_exec.py` - Updated to use provider system - `admin/server/services.py` - SandboxMgr with registry & validation - `admin/server/routes.py` - 5 sandbox API endpoints - `conf/system_settings.json` - Default: aliyun_codeinterpreter - `api/db/db_models.py` - TextField for SystemSettings.value - `api/db/services/system_settings_service.py` - Exact match query ### Frontend (TypeScript/React) - `web/src/pages/admin/sandbox-settings.tsx` - Settings UI - `web/src/services/admin-service.ts` - Sandbox service functions - `web/src/services/admin.service.d.ts` - Type definitions - `web/src/utils/api.ts` - Sandbox API endpoints ### Documentation - `docs/develop/sandbox_spec.md` - Architecture spec - `agent/sandbox/tests/MIGRATION_GUIDE.md` - Migration guide - `agent/sandbox/tests/QUICKSTART.md` - Quick start - `agent/sandbox/tests/README.md` - Testing guide ### Configuration Scripts - `scripts/configure_aliyun_sandbox.sh` - Shell script (jq) - `scripts/configure_aliyun_sandbox.py` - Python script ### Tests - `agent/sandbox/tests/test_providers.py` - 30 unit tests - `agent/sandbox/tests/test_aliyun_codeinterpreter.py` - Provider tests - 
`agent/sandbox/tests/test_aliyun_codeinterpreter_integration.py` - Integration tests - `agent/sandbox/tests/verify_sdk.py` - SDK validation ## Architecture ``` Admin UI → Admin API → SandboxMgr → ProviderManager → [SelfManaged\|Aliyun\|E2B] ↓ SystemSettings ``` ## Usage ### 1. Configure Provider Via Admin UI: 1. Navigate to `/admin/sandbox-settings` 2. Select provider (Aliyun Code Interpreter / Self-Managed) 3. Fill in configuration 4. Click "Test Connection" to verify 5. Click "Save" to apply Via Configuration Scripts: ```bash # Aliyun provider export AGENTRUN_ACCESS_KEY_ID="xxx" export AGENTRUN_ACCESS_KEY_SECRET="yyy" export AGENTRUN_ACCOUNT_ID="zzz" export AGENTRUN_REGION="cn-shanghai" source scripts/configure_aliyun_sandbox.sh ``` ### 2. Restart Service ```bash cd docker docker compose restart ragflow-server ``` ### 3. Execute Code in Agent ```python from agent.sandbox.client import execute_code result = execute_code( code='def main(name: str) -> dict: return {"message": f"Hello {name}!"}', language="python", timeout=30, arguments={"name": "World"} ) print(result.stdout) # {"message": "Hello World!"} ``` ## Troubleshooting ### "Container pool is busy" (Self-Managed) - Cause: Pool exhausted (default: 1 container in `.env`) - Fix: Increase `SANDBOX_EXECUTOR_MANAGER_POOL_SIZE` to 5+ ### "Sandbox provider type not configured" - Cause: Database missing configuration - Fix: Run config script or set via Admin UI ### "gVisor not found" - Cause: runsc not installed - Fix: `go install gvisor.dev/gvisor/runsc@latest && sudo cp ~/go/bin/runsc /usr/local/bin/` ### Aliyun authentication errors - Cause: Wrong environment variable names - Fix: Use `AGENTRUN_` prefix (not `ALIYUN_`) ## Checklist - [x] All tests passing (30 unit tests + integration tests) - [x] Documentation updated (spec, migration guide, quickstart) - [x] Type definitions added (TypeScript) - [x] Admin UI implemented - [x] Configuration validation - [x] Health checks implemented - [x] Error handling with 
structured results - [x] Breaking changes documented - [x] Configuration scripts created - [x] gVisor requirements documented Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-28 13:28:21 +08:00
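
To make the provider abstraction in the commit above more concrete, here is a minimal sketch of what such an interface might look like. Only the names `SandboxProvider`, `ExecutionResult`, `get_config_schema()`, `validate_config()`, and the four result fields come from the commit message; the method signatures, the schema format, and the `InProcessProvider` class are hypothetical. The toy provider runs code with `exec()` and offers no isolation at all, so it only illustrates the calling convention (define `main()`, pass `arguments`), not a real sandbox.

```python
import contextlib
import io
import json
import time
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ExecutionResult:
    # Field names follow the commit message; the types are assumptions.
    stdout: str
    stderr: str
    exit_code: int
    execution_time: float


class SandboxProvider(ABC):
    """Hypothetical shape of the provider interface (signatures are assumed)."""

    @abstractmethod
    def get_config_schema(self) -> Dict[str, Any]: ...

    @abstractmethod
    def validate_config(self, config: Dict[str, Any]) -> bool: ...

    @abstractmethod
    def execute_code(
        self, code: str, arguments: Optional[Dict[str, Any]] = None
    ) -> ExecutionResult: ...


class InProcessProvider(SandboxProvider):
    """Toy provider for illustration only: exec() in-process, NO isolation."""

    def get_config_schema(self) -> Dict[str, Any]:
        return {"timeout": {"type": "integer", "default": 30}}

    def validate_config(self, config: Dict[str, Any]) -> bool:
        return isinstance(config.get("timeout", 30), int)

    def execute_code(self, code, arguments=None):
        namespace: Dict[str, Any] = {}
        out = io.StringIO()
        start = time.perf_counter()
        try:
            with contextlib.redirect_stdout(out):
                exec(code, namespace)  # define the user's main()
                result = namespace["main"](**(arguments or {}))
                print(json.dumps(result))  # emit the return value as JSON
            stderr, exit_code = "", 0
        except Exception as exc:
            stderr, exit_code = f"{type(exc).__name__}: {exc}", 1
        elapsed = time.perf_counter() - start
        return ExecutionResult(out.getvalue(), stderr, exit_code, elapsed)
```

Calling `InProcessProvider().execute_code(code, {"name": "World", "count": 3})` with the `main(name, count)` example from the commit message returns a result whose `stdout` holds the JSON-encoded return value; any exception is reported through `stderr` and a non-zero `exit_code` instead of propagating.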
Yongteng Lei	b57c82b122	Feat: add kimi-k2.5 (#12852 ) ### What problem does this PR solve? Add kimi-k2.5 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-01-28 12:41:20 +08:00

5206 Commits