### What problem does this PR solve?
FIx: missing parameters in by_plaintext method for PDF naive mode
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: lih <dev_lih@139.com>
### What problem does this PR solve?
Feat: add more chunking method #11311
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Incorrect retrieval total count with pagination enabled.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Added TCADP Parser configuration fields to PDF, PPT, and spreadsheet
parsing forms
- Implemented support for setting table result type (Markdown/HTML) and
Markdown image response type (URL/Text)
- Updated TCADP Parser to handle return format settings from
configuration or parameters
- Enhanced frontend to dynamically show TCADP options based on selected
parsing method
- Modified backend to pass format parameters when calling TCADP API
- Optimized form default value logic for TCADP configuration items
- Updated multilingual resource files for new configuration options
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add OceanBase doc engine. Close#5350
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
**Cohere rerank base_url default handling**
- Background: When no rerank base URL is configured, the settings
pipeline was passing an empty string through RERANK_CFG →
TenantLLMService → CoHereRerank, so the Cohere client received
base_url="" and produced “missing protocol” errors during rerank calls.
- What changed: The CoHereRerank constructor now only forwards base_url
to the Cohere client when it isn’t empty/whitespace, causing the client
to fall back to its default API endpoint otherwise.
- Why it matters: This prevents invalid URL construction in the rerank
workflow and keeps tests/sanity checks that rely on the default Cohere
endpoint from failing when no custom base URL is specified.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Philipp Heyken Soares <philipp.heyken-soares@am.ai>
### What problem does this PR solve?
Add Gemini 3 Pro preview.
Change `GenerativeModel` to `genai`.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: manual parser with mineru #11320
Fix: missing parameter in mineru #11334
Fix: add outlines parameter for pdf parsers
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: add describe_image_with_prompt for ZHIPU AI #11289
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Jason <ggbbddjm@gmail.com>
### What problem does this PR solve?
Fix: concat images in word document. Partially solved issues in #11063
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Correctly check task executor alive and display status.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
pr:
#10854
change:
update check_embedding api
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add fault-tolerant mechanism to RAPTOR.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
GraphRAG and RAPTOR tasks do not affect document status.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add mechanism to check cancellation in Agent.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
#10056
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:
#11136
change:
not enough values to unpack (expected 3, got 2) in general chunk
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: missing file formats in hierarchical_manager #11084
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add kimi-k2-thinking and moonshot-v1-vision-preview.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The doc file cannot be parsed(#11092)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: virgilwong <hyhvirgil@gmail.com>
### What problem does this PR solve?
Fix: OpenSearch retrieval no return #11006
Add documentation #11072
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
```
admin> show version;
show_version
+-----------------------+
| version |
+-----------------------+
| v0.21.0-241-gc6cf58d5 |
+-----------------------+
admin> \q
Goodbye!
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
RAPTOR handle cancel gracefully.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
GraghRAG handle cancel gracefully. #10997.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: fix pdf_parser ignored in rag/app/naive.py #11000
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix MCP cannot handle empty Auth field properly, then result in
```bash
2025-11-05 11:10:41,919 INFO 51209 Negotiated protocol version: 2025-06-18
2025-11-05 11:10:41,920 INFO 51209 client_session initialized successfully
2025-11-05 11:10:41,994 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:10:41] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:10:41,999 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:10:42,000 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:10:42,001 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:10:42] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:11:30,441 INFO 51209 Negotiated protocol version: 2025-06-18
2025-11-05 11:11:30,442 INFO 51209 client_session initialized successfully
2025-11-05 11:11:30,520 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:30] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:11:30,525 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:11:30,526 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:11:30,527 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:30] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:11:31,476 INFO 51209 Negotiated protocol version: 2025-06-18
2025-11-05 11:11:31,476 INFO 51209 client_session initialized successfully
2025-11-05 11:11:31,549 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:31] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:11:31,552 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:11:31,553 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:11:31,554 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:31] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:11:51,930 ERROR 51209 unhandled errors in a TaskGroup (1 sub-exception)
+ Exception Group Traceback (most recent call last):
| File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 86, in _mcp_server_loop
| async with streamablehttp_client(url, headers) as (read_stream, write_stream, _):
| File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/contextlib.py", line 217, in __aexit__
| await self.gen.athrow(typ, value, traceback)
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/mcp/client/streamable_http.py", line 478, in streamablehttp_client
| async with anyio.create_task_group() as tg:
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 781, in __aexit__
| raise BaseExceptionGroup(
| exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
+-+---------------- 1 ----------------
| Traceback (most recent call last):
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/mcp/client/streamable_http.py", line 409, in handle_request_async
| await self._handle_post_request(ctx)
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/mcp/client/streamable_http.py", line 278, in _handle_post_request
| response.raise_for_status()
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/httpx/_models.py", line 829, in raise_for_status
| raise HTTPStatusError(message, request=request, response=self)
| httpx.HTTPStatusError: Server error '502 Bad Gateway' for url 'http://192.168.1.38:9382/mcp'
| For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/502
+------------------------------------
2025-11-05 11:11:51,942 ERROR 51209 Error fetching tools from MCP server: streamable-http: http://192.168.1.38:9382/mcp
Traceback (most recent call last):
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 168, in get_tools
return future.result(timeout=timeout)
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._get_tools_from_mcp_server) at 0x7d58f02e2c20>", line 40, in _get_tools_from_mcp_server
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 160, in _get_tools_from_mcp_server
result: ListToolsResult = await self._call_mcp_server("list_tools", timeout=timeout)
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._call_mcp_server) at 0x7d58f02e2b00>", line 63, in _call_mcp_server
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 139, in _call_mcp_server
raise result
ValueError: Connection failed (possibly due to auth error). Please check authentication settings first
2025-11-05 11:11:51,943 ERROR 51209 Test MCP error: Connection failed (possibly due to auth error). Please check authentication settings first
Traceback (most recent call last):
File "/home/xxxxxxxxx/workspace/ragflow/api/apps/mcp_server_app.py", line 429, in test_mcp
tools = tool_call_session.get_tools(timeout)
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession.get_tools) at 0x7d58f02e2cb0>", line 40, in get_tools
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 168, in get_tools
return future.result(timeout=timeout)
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._get_tools_from_mcp_server) at 0x7d58f02e2c20>", line 40, in _get_tools_from_mcp_server
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 160, in _get_tools_from_mcp_server
result: ListToolsResult = await self._call_mcp_server("list_tools", timeout=timeout)
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._call_mcp_server) at 0x7d58f02e2b00>", line 63, in _call_mcp_server
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 139, in _call_mcp_server
raise result
ValueError: Connection failed (possibly due to auth error). Please check authentication settings first
2025-11-05 11:11:51,944 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:11:51,945 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:11:51,946 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:51] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:12:20,484 INFO 51209 Negotiated protocol version: 2025-06-18
2025-11-05 11:12:20,485 INFO 51209 client_session initialized successfully
2025-11-05 11:12:20,570 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:12:20] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:12:20,573 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:12:20,574 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:12:20,575 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:12:20] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:15:02,119 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:15:02] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:16:24,967 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:16:24] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:30:24,284 ERROR 51209 Task was destroyed but it is pending!
task: <Task pending name='Task-58' coro=<MCPToolCallSession._mcp_server_loop() running at <@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._mcp_server_loop) at 0x7d58f02e29e0>:11> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[_chain_future.<locals>._call_set_state() at /home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/futures.py:392]>
2025-11-05 11:30:24,285 ERROR 51209 Task was destroyed but it is pending!
task: <Task pending name='Task-67' coro=<Queue.get() running at /home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/queues.py:159> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[_release_waiter(<Future pendi...ask_wakeup()]>)() at /home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/tasks.py:387]>
Exception ignored in: <coroutine object Queue.get at 0x7d585480ace0>
Traceback (most recent call last):
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/queues.py", line 161, in get
getter.cancel() # Just in case getter is not done yet.
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/base_events.py", line 753, in call_soon
self._check_closed()
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)