### What problem does this PR solve?
issues:
https://github.com/infiniflow/ragflow/issues/10427
https://github.com/infiniflow/ragflow/issues/8115
change:
- Support for Multiple HTTP Methods (POST / GET / PUT / PATCH / DELETE /
HEAD)
- Security Validation
1. max_body_size
2. IP whitelist
3. rate limit
4. token / basic / jwt authentication
- File Upload Support
- Unified Content-Type Handling
- Full Schema-Based Extraction & Type Validation
- Two Execution Modes: Immediate / Streaming
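As a rough illustration of the security validation chain above (a sketch only; `WebhookConfig` and `check_request` are hypothetical names, not this PR's actual code), the checks can be ordered cheapest-first:
```python
# Illustrative sketch -- names and structure are hypothetical, not the code
# in this PR. Rate limiting and basic/JWT auth would slot in similarly.
import ipaddress
from dataclasses import dataclass, field

@dataclass
class WebhookConfig:
    max_body_size: int = 1024 * 1024                       # 1 MiB body cap
    ip_whitelist: list[str] = field(default_factory=list)  # CIDR ranges
    token: str | None = None                               # static bearer token

def check_request(cfg: WebhookConfig, client_ip: str, body: bytes,
                  auth_header: str | None) -> None:
    # 1. Reject oversized payloads before parsing anything.
    if len(body) > cfg.max_body_size:
        raise ValueError("request body exceeds max_body_size")
    # 2. Enforce the IP whitelist when one is configured.
    if cfg.ip_whitelist:
        ip = ipaddress.ip_address(client_ip)
        if not any(ip in ipaddress.ip_network(c) for c in cfg.ip_whitelist):
            raise PermissionError("client IP not in whitelist")
    # 3. Token authentication (basic / JWT checks would follow the same shape).
    if cfg.token and auth_header != f"Bearer {cfg.token}":
        raise PermissionError("invalid or missing token")
```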
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
**Fixes #8706** - `InfinityException: TOO_MANY_CONNECTIONS` when running
multiple task executor workers
### Problem Description
When running RAGFlow with 8-16 task executor workers, most workers fail
to start properly. Checking logs revealed that workers were
stuck/hanging during Infinity connection initialization - only 1-2
workers would successfully register in Redis while the rest remained
blocked.
### Root Cause
The Infinity SDK `ConnectionPool` pre-allocates all connections in
`__init__`. With the default `max_size=32` and multiple workers (e.g.,
16), this creates 16×32=512 connections immediately on startup,
exceeding Infinity's default 128 connection limit. Workers hang while
waiting for connections that can never be established.
### Changes
1. **Prevent Infinity connection storm** (`rag/utils/infinity_conn.py`,
`rag/svr/task_executor.py`)
- Reduced ConnectionPool `max_size` from 32 to 4 (sufficient since
operations are synchronous)
- Added staggered startup delay (2s per worker) to spread connection
initialization
2. **Handle None children_delimiter** (`rag/app/naive.py`)
- Use `or ""` to handle explicitly set None values from parser config
3. **MinerU parser robustness** (`deepdoc/parser/mineru_parser.py`)
- Use `.get()` for optional output fields that may be missing
- Fix DISCARDED block handling: change `pass` to `continue` to skip
discarded blocks entirely
### Why `max_size=4` is sufficient
| Workers | Pool Size | Total Connections | Infinity Limit |
|---------|-----------|-------------------|----------------|
| 16 | 32 | 512 | 128 ❌ |
| 16 | 4 | 64 | 128 ✅ |
| 32 | 4 | 128 | 128 ✅ |
- All RAGFlow operations are synchronous: `get_conn()` → operation →
`release_conn()`
- No parallel `docStoreConn` operations in the codebase
- Maximum 1-2 concurrent connections needed per worker; 4 provides
safety margin
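As a sanity check on the numbers above (a sketch, not the PR's code; only the `max_size=4` default and the 2-second spacing come from the change itself):
```python
import time

INFINITY_CONN_LIMIT = 128  # Infinity's default connection limit

def start_worker(worker_id: int, workers: int, pool_size: int = 4) -> None:
    # Connection budget: every worker pre-allocates its whole pool on startup.
    total = workers * pool_size
    assert total <= INFINITY_CONN_LIMIT, \
        f"{workers} workers x {pool_size} = {total} > {INFINITY_CONN_LIMIT}"
    # Staggered startup: spread pool pre-allocation so workers don't all
    # hit Infinity at the same instant.
    time.sleep(worker_id * 2)
    # ... create ConnectionPool(max_size=pool_size) and proceed ...
```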
### MinerU DISCARDED block bug
When MinerU returns blocks with `type: "discarded"` (headers, footers,
watermarks, page numbers, artifacts), the previous code used `pass`
which left the `section` variable undefined, causing:
- **UnboundLocalError** if DISCARDED is the first block
- **Duplicate content** if DISCARDED follows another block (stale value
from previous iteration)
**Root cause confirmed via MinerU source code:**
From
[`mineru/utils/enum_class.py`](https://github.com/opendatalab/MinerU/blob/main/mineru/utils/enum_class.py#L14):
```python
class BlockType:
DISCARDED = 'discarded'
# VLM 2.5+ also has: HEADER, FOOTER, PAGE_NUMBER, ASIDE_TEXT, PAGE_FOOTNOTE
```
Per [MinerU
documentation](https://opendatalab.github.io/MinerU/reference/output_files/),
discarded blocks contain content that should be filtered out for clean
text extraction.
**Fix:** Changed `pass` to `continue` to skip discarded blocks entirely.
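In miniature, the failure mode looks like this (the loop shape is illustrative; the real code is in `deepdoc/parser/mineru_parser.py`):
```python
sections = []
for block in blocks:
    if block["type"] == "text":
        section = block["content"]
    elif block["type"] == "discarded":
        # old code: `pass` -- execution fell through to the append below with
        # `section` undefined (first block) or stale (previous block's text)
        continue  # fix: skip the discarded block entirely
    sections.append(section)
```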
### Testing
- Verified all 16 workers now register successfully in Redis
- All workers heartbeating correctly
- Document parsing works as expected
- MinerU parsing with DISCARDED blocks no longer crashes
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: user210 <user210@rt>
### What problem does this PR solve?
Support uploading encrypted files to object storage.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: virgilwong <hyhvirgil@gmail.com>
## Overview
This PR adds support for **Single Bucket Mode** in RAGFlow, allowing
users to configure MinIO/S3 to use a single bucket with a directory
structure instead of creating multiple buckets per Knowledge Base and
user folder.
## Problem Statement
The current implementation creates one bucket per Knowledge Base and one
bucket per user folder, which can be problematic when:
- Cloud providers charge per bucket
- IAM policies restrict bucket creation
- Organizations want centralized data management in a single bucket
## Solution
Added a `prefix_path` configuration option to the MinIO connector that
enables:
- Using a single bucket with directory-based organization
- Backward compatibility with existing multi-bucket deployments
- Support for MinIO, AWS S3, and other S3-compatible storage backends
## Changes
- **`rag/utils/minio_conn.py`**: Enhanced MinIO connector to support
single bucket mode with prefix paths
- **`conf/service_conf.yaml`**: Added new configuration options
(`bucket` and `prefix_path`)
- **`docker/service_conf.yaml.template`**: Updated template with single
bucket configuration examples
- **`docker/.env.single-bucket-example`**: Added example environment
variables for single bucket setup
- **`docs/single-bucket-mode.md`**: Comprehensive documentation covering
usage, migration, and troubleshooting
## Configuration Example
```yaml
minio:
user: "access-key"
password: "secret-key"
host: "minio.example.com:443"
bucket: "ragflow-bucket" # Single bucket name
prefix_path: "ragflow" # Optional prefix path
```
## Backward Compatibility
✅ Fully backward compatible - existing deployments continue to work
without any changes
- If `bucket` is not configured, uses default multi-bucket behavior
- If `bucket` is configured without `prefix_path`, uses bucket root
- If both are configured, uses `bucket/prefix_path/` structure
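A minimal sketch of those three rules (names are illustrative, not the connector's internals):
```python
def object_key(single_bucket: str | None, prefix_path: str | None,
               logical_bucket: str, filename: str) -> tuple[str, str]:
    """Return (physical_bucket, object_key) for a stored file."""
    if not single_bucket:
        # Legacy behaviour: one physical bucket per KB / user folder.
        return logical_bucket, filename
    if prefix_path:
        return single_bucket, f"{prefix_path}/{logical_bucket}/{filename}"
    return single_bucket, f"{logical_bucket}/{filename}"

# e.g. object_key("ragflow-bucket", "ragflow", "kb_42", "doc.pdf")
#      -> ("ragflow-bucket", "ragflow/kb_42/doc.pdf")
```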
## Testing
- Tested with MinIO (local and cloud)
- Verified backward compatibility with existing multi-bucket mode
- Validated IAM policy restrictions work correctly
## Documentation
Included comprehensive documentation in `docs/single-bucket-mode.md`
covering:
- Configuration examples
- Migration guide from multi-bucket to single-bucket mode
- IAM policy examples
- Troubleshooting guide
---
**Related Issue**: Addresses use cases where bucket creation is
restricted or costly
### What problem does this PR solve?
change:
Fix an async issue and remove sensitive logging.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Enhance OBConnection.search for better performance. Main changes:
1. Pass the vector array as a string to the distance function for faster
parsing.
2. Manually set max_connections to the pool size instead of using the
default value.
3. Set `fulltext_search_columns` at startup.
4. Cache the results of the table existence check, since we never drop
the table (a sketch follows this list).
5. Remove unused `group_results` logic.
6. Add the `USE_FULLTEXT_FIRST_FUSION_SEARCH` flag, and the
corresponding fusion search SQL for when it's false.
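For item 4, a minimal sketch of the caching idea (illustrative names; `check_table_exists` is a stand-in for the actual existence query):
```python
_existing_tables: set[str] = set()

def table_exists(conn, table_name: str) -> bool:
    if table_name in _existing_tables:
        return True                            # no round trip needed
    if conn.check_table_exists(table_name):    # hypothetical existence query
        _existing_tables.add(table_name)       # cache positives only: a table
        return True                            # missing now may be created later
    return False
```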
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
This Pull Request introduces native support for Google Cloud Storage
(GCS) as an optional object storage backend.
Currently, RAGFlow relies on a limited set of storage options. This
feature addresses the need for seamless integration with GCP
environments, allowing users to leverage a fully managed, highly
durable, and scalable storage service (GCS) instead of needing to deploy
and maintain third-party object storage solutions. This simplifies
deployment, especially for users running on GCP infrastructure like GKE
or Cloud Run.
The implementation uses a single GCS bucket defined via configuration,
mapping RAGFlow's internal logical storage units (or "buckets") to
folder prefixes within that GCS container to maintain data separation.
This architectural choice avoids the operational complexities associated
with dynamically creating and managing unique GCS buckets for every
logical unit.
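A minimal sketch of that mapping with the official `google-cloud-storage` client (RAGFlow's connector wraps this in its own storage interface):
```python
from google.cloud import storage

client = storage.Client()
gcs_bucket = client.bucket("my-ragflow-bucket")  # the single configured bucket

def put(logical_bucket: str, name: str, data: bytes) -> None:
    # Logical RAGFlow "buckets" become folder prefixes inside one GCS bucket.
    gcs_bucket.blob(f"{logical_bucket}/{name}").upload_from_string(data)

def get(logical_bucket: str, name: str) -> bytes:
    return gcs_bucket.blob(f"{logical_bucket}/{name}").download_as_bytes()
```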
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feature: This PR implements automatic Raptor disabling for structured
data files to address issue #11653.
**Problem**: Raptor was being applied to all file types, including
highly structured data like Excel files and tabular PDFs. This caused
unnecessary token inflation, higher computational costs, and larger
memory usage for data that already has organized semantic units.
**Solution**: Automatically skip Raptor processing for:
- Excel files (.xls, .xlsx, .xlsm, .xlsb)
- CSV files (.csv, .tsv)
- PDFs with tabular data (table parser or html4excel enabled)
**Benefits**:
- 82% faster processing for structured files
- 47% token reduction
- 52% memory savings
- Preserved data structure for downstream applications
**Usage Examples**:
```python
# Excel file - automatically skipped
should_skip_raptor(".xlsx") # True
# CSV file - automatically skipped
should_skip_raptor(".csv") # True
# Tabular PDF - automatically skipped
should_skip_raptor(".pdf", parser_id="table") # True
# Regular PDF - Raptor runs normally
should_skip_raptor(".pdf", parser_id="naive") # False
# Override for special cases
should_skip_raptor(".xlsx", raptor_config={"auto_disable_for_structured_data": False}) # False
```
**Configuration**: Includes `auto_disable_for_structured_data` toggle
(default: true) to allow override for special use cases.
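A sketch consistent with the usage examples above (the signature is assumed; the merged implementation may differ in detail):
```python
STRUCTURED_EXTS = {".xls", ".xlsx", ".xlsm", ".xlsb", ".csv", ".tsv"}

def should_skip_raptor(ext: str, parser_id: str = "naive",
                       parser_config: dict | None = None,
                       raptor_config: dict | None = None) -> bool:
    raptor_config = raptor_config or {}
    if not raptor_config.get("auto_disable_for_structured_data", True):
        return False                  # user explicitly opted back in
    if ext.lower() in STRUCTURED_EXTS:
        return True                   # spreadsheets / CSVs
    if ext.lower() == ".pdf":
        # Tabular PDFs: table parser or html4excel enabled
        return parser_id == "table" or (parser_config or {}).get("html4excel", False)
    return False
```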
**Testing**: 44 comprehensive tests, 100% passing
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Support Redis 6+ ACL authentication (username).
Close #11606
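For reference, this is what ACL auth looks like with `redis-py`, which accepts a `username` parameter for Redis 6+ ACL users (RAGFlow wires this through its own service config instead):
```python
import redis

r = redis.Redis(
    host="localhost",
    port=6379,
    username="ragflow",  # ACL user, new in Redis 6
    password="secret",
)
r.ping()
```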
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Add OceanBase doc engine. Close #5350
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Correctly check whether task executors are alive and display their status.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix MCP not handling an empty Auth field properly, which results in:
```bash
2025-11-05 11:10:41,919 INFO 51209 Negotiated protocol version: 2025-06-18
2025-11-05 11:10:41,920 INFO 51209 client_session initialized successfully
2025-11-05 11:10:41,994 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:10:41] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:10:41,999 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:10:42,000 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:10:42,001 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:10:42] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:11:30,441 INFO 51209 Negotiated protocol version: 2025-06-18
2025-11-05 11:11:30,442 INFO 51209 client_session initialized successfully
2025-11-05 11:11:30,520 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:30] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:11:30,525 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:11:30,526 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:11:30,527 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:30] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:11:31,476 INFO 51209 Negotiated protocol version: 2025-06-18
2025-11-05 11:11:31,476 INFO 51209 client_session initialized successfully
2025-11-05 11:11:31,549 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:31] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:11:31,552 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:11:31,553 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:11:31,554 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:31] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:11:51,930 ERROR 51209 unhandled errors in a TaskGroup (1 sub-exception)
+ Exception Group Traceback (most recent call last):
| File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 86, in _mcp_server_loop
| async with streamablehttp_client(url, headers) as (read_stream, write_stream, _):
| File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/contextlib.py", line 217, in __aexit__
| await self.gen.athrow(typ, value, traceback)
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/mcp/client/streamable_http.py", line 478, in streamablehttp_client
| async with anyio.create_task_group() as tg:
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 781, in __aexit__
| raise BaseExceptionGroup(
| exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
+-+---------------- 1 ----------------
| Traceback (most recent call last):
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/mcp/client/streamable_http.py", line 409, in handle_request_async
| await self._handle_post_request(ctx)
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/mcp/client/streamable_http.py", line 278, in _handle_post_request
| response.raise_for_status()
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/httpx/_models.py", line 829, in raise_for_status
| raise HTTPStatusError(message, request=request, response=self)
| httpx.HTTPStatusError: Server error '502 Bad Gateway' for url 'http://192.168.1.38:9382/mcp'
| For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/502
+------------------------------------
2025-11-05 11:11:51,942 ERROR 51209 Error fetching tools from MCP server: streamable-http: http://192.168.1.38:9382/mcp
Traceback (most recent call last):
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 168, in get_tools
return future.result(timeout=timeout)
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._get_tools_from_mcp_server) at 0x7d58f02e2c20>", line 40, in _get_tools_from_mcp_server
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 160, in _get_tools_from_mcp_server
result: ListToolsResult = await self._call_mcp_server("list_tools", timeout=timeout)
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._call_mcp_server) at 0x7d58f02e2b00>", line 63, in _call_mcp_server
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 139, in _call_mcp_server
raise result
ValueError: Connection failed (possibly due to auth error). Please check authentication settings first
2025-11-05 11:11:51,943 ERROR 51209 Test MCP error: Connection failed (possibly due to auth error). Please check authentication settings first
Traceback (most recent call last):
File "/home/xxxxxxxxx/workspace/ragflow/api/apps/mcp_server_app.py", line 429, in test_mcp
tools = tool_call_session.get_tools(timeout)
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession.get_tools) at 0x7d58f02e2cb0>", line 40, in get_tools
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 168, in get_tools
return future.result(timeout=timeout)
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._get_tools_from_mcp_server) at 0x7d58f02e2c20>", line 40, in _get_tools_from_mcp_server
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 160, in _get_tools_from_mcp_server
result: ListToolsResult = await self._call_mcp_server("list_tools", timeout=timeout)
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._call_mcp_server) at 0x7d58f02e2b00>", line 63, in _call_mcp_server
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 139, in _call_mcp_server
raise result
ValueError: Connection failed (possibly due to auth error). Please check authentication settings first
2025-11-05 11:11:51,944 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:11:51,945 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:11:51,946 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:51] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:12:20,484 INFO 51209 Negotiated protocol version: 2025-06-18
2025-11-05 11:12:20,485 INFO 51209 client_session initialized successfully
2025-11-05 11:12:20,570 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:12:20] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:12:20,573 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:12:20,574 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:12:20,575 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:12:20] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:15:02,119 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:15:02] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:16:24,967 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:16:24] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:30:24,284 ERROR 51209 Task was destroyed but it is pending!
task: <Task pending name='Task-58' coro=<MCPToolCallSession._mcp_server_loop() running at <@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._mcp_server_loop) at 0x7d58f02e29e0>:11> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[_chain_future.<locals>._call_set_state() at /home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/futures.py:392]>
2025-11-05 11:30:24,285 ERROR 51209 Task was destroyed but it is pending!
task: <Task pending name='Task-67' coro=<Queue.get() running at /home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/queues.py:159> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[_release_waiter(<Future pendi...ask_wakeup()]>)() at /home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/tasks.py:387]>
Exception ignored in: <coroutine object Queue.get at 0x7d585480ace0>
Traceback (most recent call last):
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/queues.py", line 161, in get
getter.cancel() # Just in case getter is not done yet.
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/base_events.py", line 753, in call_soon
self._check_closed()
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
```
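The shape of the fix is to drop empty auth values instead of forwarding them (a sketch, not the exact patch):
```python
def build_headers(auth: str | None) -> dict[str, str]:
    # Only send an Authorization header when the configured value is
    # non-empty, instead of forwarding an empty string upstream.
    headers: dict[str, str] = {}
    if auth and auth.strip():
        headers["Authorization"] = auth
    return headers
```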
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add `get_uuid`, `download_img`, and `hash_str2int` to `misc_utils.py`.
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- rename `rmSpace` to `remove_redundant_spaces`
- move `clean_markdown_block` to the common module
- add unit tests for `remove_redundant_spaces` and `clean_markdown_block`
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Improve file management. #10287.
Passed tests:
1. Create folders `A` and `B`.
2. Upload a file inside `A`, called `file`.
3. Create a KB, called `K`.
4. Link `file` to `K`.
5. Parse `file` inside of `K`. (OK)
6. Move `file` from `A` to `B`.
7. Parse `file` inside of `K`. (OK)
8. Move `file` from `B` to `A`.
9. Parse `file` inside of `K`. (OK)
10. Move the entire folder `A` into `B`. (B -> A -> file)
11. Parse `file` inside of `K`. (OK)
12. Delete folder `B`.
13. All clear. (There is no document inside of `K`.)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Reranking is not needed for Infinity, since Infinity normalizes each
retrieval way's score before fusion.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
- Admin service supports `SHOW SERVICE <id>`.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
issue: #10241