### What problem does this PR solve?
Added project instructions for setting up and running the application.
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
As title
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Fixed the error on the login page.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Allow a superuser (admin) to grant or revoke superuser privileges for other users.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add a tokenized content field to the Elasticsearch index so that Chinese (zh) messages can be queried.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR addresses critical memory and CPU resource management issues in high-concurrency environments (multi-worker setups):
- **GPU Memory Exhaustion (OOM):** Currently, onnxruntime-gpu uses an aggressive memory arena that does not effectively release VRAM back to the system after a task completes. In multi-process worker setups ($WS > 4), this leads to BFCArena allocation failures and OOM errors as workers "hoard" VRAM even when idle. This PR introduces an optional GPU Memory Arena Shrinkage toggle to mitigate this issue.
- **CPU Oversubscription:** ONNX intra_op and inter_op thread counts are currently hardcoded to 2. When running many workers, this causes significant CPU context-switching overhead and degrades performance. This PR makes these values configurable to match the host's actual CPU core density.
- **Multi-GPU Support:** The memory management logic has been improved to dynamically target the correct device_id, ensuring stability on systems with multiple GPUs.
- **Transparency:** Added detailed initialization logs to help administrators verify and troubleshoot their ONNX session configurations.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: shakeel <shakeel@lollylaw.com>
### What problem does this PR solve?
The OpenAI-compatible chat endpoint
(`/chats_openai/<chat_id>/chat/completions`) was not returning accurate
token
usage in streaming responses. The token counts were either missing or
inaccurate because the underlying LLM API
responses weren't being properly parsed for usage data.
This PR adds proper token counting using tiktoken (cl100k_base encoding)
as a fallback when the LLM API doesn't provide usage data in streaming
chunks. This ensures clients always receive token usage information in
the
response, which is essential for billing and quota management.
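A rough sketch of the fallback, assuming the accumulated prompt and answer text are available when the final chunk is emitted (variable names here are illustrative):

```python
import tiktoken

_ENCODER = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    """Fallback token count used when the LLM API omits usage data."""
    return len(_ENCODER.encode(text))

# Illustrative final-chunk usage payload built from the streamed text.
prompt_text = "What is RAGFlow?"
answer_text = "RAGFlow is an open-source RAG engine."
usage = {
    "prompt_tokens": count_tokens(prompt_text),
    "completion_tokens": count_tokens(answer_text),
}
usage["total_tokens"] = usage["prompt_tokens"] + usage["completion_tokens"]
```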
**Changes:**
- Add tiktoken-based token counting for streaming responses in
OpenAI-compatible endpoint
- Ensure `usage` field is always populated in the final streaming chunk
- Add unit tests for token usage calculation
Fixes #7850
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add a web search button to the chat box on the chat page.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Metadata supports precise time selection
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The minimum size of the historical message window for the
classification operator is 1. #12778
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
**Backend**
`rag/nlp/search.py`
*Before the fix:*
The top_k parameter was not applied to limit the total number of chunks, and the rerank model operated on the entire valid_idx list instead of first truncating it with `valid_idx = valid_idx[:top]`.
*After the fix:*
The top_k limit is applied to the total results before pagination, using a default of top = 1024 when top_k is not modified.
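A self-contained sketch of the intended ordering (function and variable names are illustrative, not the actual search.py API):

```python
def capped_retrieval(candidates, query, score, top_k=1024, page=1, page_size=10):
    """Illustrative only: cap candidates at top_k before reranking and pagination."""
    capped = candidates[:top_k]                       # apply the global top_k limit first
    ranked = sorted(capped, key=lambda c: score(query, c), reverse=True)  # rerank only the capped set
    start = (page - 1) * page_size
    return ranked[start:start + page_size]            # paginate afterwards
```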
session.py
*Before the fix:*
When the frontend calls the retrieval API with `search_id`, the backend
only reads `meta_data_filter` from the saved `search_config`. The
`rerank_id`, `top_k`, `similarity_threshold`, and
`vector_similarity_weight` parameters are only taken from the direct
request body. Since the frontend doesn't pass these parameters
explicitly (it only passes `search_id`), they always fall back to
default values:
- `similarity_threshold` = 0.0
- `vector_similarity_weight` = 0.3
- `top_k` = 1024
- `rerank_id` = "" (no rerank)
This means user settings saved in the Search Settings page have no
effect on actual search results.
*After the fix:*
When a `search_id` is provided, the backend now reads all relevant
configuration from the saved `search_config`, including `rerank_id`,
`top_k`, `similarity_threshold`, and `vector_similarity_weight`. Request
parameters can still override these values if explicitly provided,
allowing flexibility. The rerank model is now properly instantiated
using the configured `rerank_id`, making the rerank feature actually
work.
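A minimal sketch of the merge behaviour described above (not the literal session.py code):

```python
def resolve_retrieval_params(req: dict, search_config: dict) -> dict:
    """Saved search_config supplies defaults; explicit request parameters still win."""
    defaults = {
        "similarity_threshold": 0.0,
        "vector_similarity_weight": 0.3,
        "top_k": 1024,
        "rerank_id": "",
    }
    merged = dict(defaults)
    merged.update({k: v for k, v in search_config.items() if k in defaults})
    merged.update({k: v for k, v in req.items() if k in defaults and v is not None})
    return merged
```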
**Frontend**
`web/src/pages/next-search/search-setting.tsx`
*Before the fix:*
In search-setting.tsx, the top_k input box is only displayed when rerank is enabled (it is wrapped in the rerankModelDisabled condition). If the rerank switch is turned off, the top_k input field is hidden, but the form value remains unchanged. In other words: when rerank is enabled, users can modify top_k (default 1024); when rerank is disabled, top_k keeps its previous value but is no longer visible in the interface. The backend therefore always receives a top_k parameter; the frontend UI merely ties this configuration item to the rerank switch, and turning rerank off does not reset top_k to 1024.
*After the fix:*
Turning off the rerank switch resets top_k to 1024. In addition, top_k is now an independent form item rather than being bundled into the retrieval settings, so it can be controlled separately.
All retrieval modes now behave as expected.
Using rerank
<img width="2378" height="1565" alt="Screenshot 2026-01-21 190206"
src="https://github.com/user-attachments/assets/fa2b0df0-1334-4ca3-b169-da6c5fd59935"
/>
Not using rerank
<img width="2596" height="1559" alt="Screenshot 2026-01-21 190229"
src="https://github.com/user-attachments/assets/c5a80522-a0e1-40e7-b349-42fe86df3138"
/>
Before the fix, both cases produced the same results.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR introduces automatic table orientation detection and correction
within the PDF parser. This ensures that tables in PDFs are correctly
oriented before structure recognition, improving overall parsing
accuracy.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Aliyun OSS does not support boto's s3v4 signature version, which leads to the following error:
```
botocore.exceptions.ClientError: An error occurred (InvalidArgument) when calling the PutObject operation: aws-chunked encoding is not supported with the specified x-amz-content-sha256 value
```
According to the Aliyun OSS docs, oss_conn needs to use the s3 signature version.
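A minimal sketch of forcing the legacy signature version with boto3/botocore (endpoint and credentials are placeholders):

```python
import boto3
from botocore.client import Config

# signature_version="s3" avoids the aws-chunked PutObject encoding that Aliyun OSS rejects.
oss_conn = boto3.client(
    "s3",
    endpoint_url="https://oss-cn-hangzhou.aliyuncs.com",  # placeholder region endpoint
    aws_access_key_id="<access_key_id>",
    aws_secret_access_key="<access_key_secret>",
    config=Config(signature_version="s3"),
)
```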
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add audio-to-text and text-to-speech functions to the API.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: After deleting metadata in batches, the selected items need to be
cleared.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust the icons in the chat page's collapsible panel.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
As title.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
This PR enables the RAGFlow CLI to access RAGFlow as a normal user and to serve as a testing tool for the RAGFlow server.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Allow classification operators to be followed by other
classification operators. #9082
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Optimize the metadata code structure to implement the metadata list functionality.
#11564
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add a think button to the chat box. #12742
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Improving code formatting and consistency
- Removing debug print statements
### Type of change
- [x] Refactoring
Resolves #12572
## What problem does this PR solve?
The conversation list in chat sessions previously only supported
deleting conversations one by one. This was inefficient when users
needed to clean up multiple conversations. This PR adds batch delete
functionality to improve user experience.
## Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Specific changes
- Add selection mode with checkboxes for conversation list
- Add batch delete functionality with custom icons
- Add internationalization support (en/zh)
- Use existing removeConversation API which supports batch deletion
## UI modification status
- Default: Show [+] and [batch delete icon]
- Selection mode: Show checkboxes, keep [+] and [select all icon]
- Items selected: Show [return icon] and [red trash icon]
### Before/After Comparison
**1. Before**
<img width="982" height="1221" alt="image"
src="https://github.com/user-attachments/assets/8a80f7c0-7da6-41ec-9d1a-ac887ede96ba"
/>
**2. After**
<img width="1273" height="919" alt="新增批量删除效果图"
src="https://github.com/user-attachments/assets/e179bdf3-3779-4bd5-84b6-8e24780a22ea"
/>
---
Co-authored-by: Gongzi
---------
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
This PR adds missing web API tests (system, search, KB, LLM, plugin,
connector). It also addresses a contract mismatch that was causing test
failures: metadata updates did not persist new keys (update‑only
behavior).
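A toy illustration of the intended contract, assuming the stored metadata is a flat dict (names are hypothetical):

```python
def merge_metadata(existing: dict, updates: dict) -> dict:
    """Upsert semantics: new keys are persisted, existing keys are overwritten."""
    merged = dict(existing)
    merged.update(updates)
    return merged

assert merge_metadata({"author": "a"}, {"year": 2024}) == {"author": "a", "year": 2024}
```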
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Other (please describe): Test coverage expansion and test helper
instrumentation
### What problem does this PR solve?
This PR makes the document change‑status endpoint idempotent under the
Infinity doc store. If a document already has the requested status, the
handler returns success without touching the engine, preventing
unnecessary updates and avoiding missing‑table errors while keeping
responses consistent.
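A sketch of the idempotency guard (handler, field, and service names are illustrative):

```python
def change_status(doc, requested_status, doc_store):
    """Return success without touching the doc store when the status already matches."""
    if doc.status == requested_status:
        return {"code": 0, "data": True}      # no-op: avoids missing-table errors under Infinity
    doc_store.update_status(doc.id, requested_status)
    return {"code": 0, "data": True}
```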
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Automatically redirect to the login page if the API reports `401: Unauthorized` on ANY **Admin** page.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add missing route for navigating to `/admin/users/:id`
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix the `RuntimeError: Not within a request context` raised when `current_user` is resolved inside the thread pool executor:
```
ERROR 1819426 Unhandled exception during request
Traceback (most recent call last):
  File "/home/qinling/github.com/infiniflow/ragflow/api/apps/document_app.py", line 639, in run
    return await thread_pool_exec(_run_sync)
  File "/home/qinling/github.com/infiniflow/ragflow/common/misc_utils.py", line 132, in thread_pool_exec
    return await loop.run_in_executor(_thread_pool_executor(), func, *args)
  File "/usr/lib/python3.12/asyncio/futures.py", line 287, in __await__
    yield self  # This tells Task to wait for completion.
  File "/usr/lib/python3.12/asyncio/tasks.py", line 385, in __wakeup
    future.result()
  File "/usr/lib/python3.12/asyncio/futures.py", line 203, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/usr/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/qinling/github.com/infiniflow/ragflow/api/apps/document_app.py", line 593, in _run_sync
    if not DocumentService.accessible(doc_id, current_user.id):
  File "/home/qinling/github.com/infiniflow/ragflow/.venv/lib/python3.12/site-packages/werkzeug/local.py", line 318, in __get__
    obj = instance._get_current_object()
  File "/home/qinling/github.com/infiniflow/ragflow/.venv/lib/python3.12/site-packages/werkzeug/local.py", line 526, in _get_current_object
    return get_name(local())
  File "/home/qinling/github.com/infiniflow/ragflow/api/apps/__init__.py", line 97, in _load_user
    authorization = request.headers.get("Authorization")
  File "/home/qinling/github.com/infiniflow/ragflow/.venv/lib/python3.12/site-packages/werkzeug/local.py", line 318, in __get__
    obj = instance._get_current_object()
  File "/home/qinling/github.com/infiniflow/ragflow/.venv/lib/python3.12/site-packages/werkzeug/local.py", line 519, in _get_current_object
    raise RuntimeError(unbound_message) from None
RuntimeError: Not within a request context
```
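One way to avoid the error, sketched with names taken from the traceback (not necessarily the exact patch): resolve the context-local `current_user` while the request context is still active and pass the plain value into the worker thread.

```python
# Inside the request handler, before dispatching to the thread pool:
user_id = current_user.id            # evaluated within the Flask request context

def _run_sync():
    # Runs in a worker thread with no request context, so it must not touch current_user.
    return DocumentService.accessible(doc_id, user_id)

return await thread_pool_exec(_run_sync)
```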
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Align MySQL defaults between docker/.env and
docker/service_conf.yaml.template.
Closes #12645
### Type of change
- [x] Other (please describe): Unify MySQL configuration
### What problem does this PR solve?
When uploading multiple files at once, if any file is of an unsupported type and its blob is not removed from the result, a TypeError('Object of type bytes is not JSON serializable') exception is raised, which prevents the frontend from responding properly.
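An illustrative reproduction of the serialization failure and the simple guard against it (field names are hypothetical):

```python
import json

file_result = {"name": "report.xyz", "blob": b"\x00\x01", "error": "Unsupported file type"}

# json.dumps(file_result) would raise: TypeError: Object of type bytes is not JSON serializable
file_result.pop("blob", None)       # drop the raw bytes before building the JSON response
payload = json.dumps(file_result)   # now succeeds
```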
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
This PR fixes a `KeyError` crash when running RAPTOR tasks on documents
that don't have the expected vector field.
## Related Issue
Fixes https://github.com/infiniflow/ragflow/issues/12675
## Problem
When running RAPTOR tasks, the code assumes all chunks have the vector
field `q_<size>_vec` (e.g., `q_1024_vec`). However, chunks may not have
this field if:
1. They were indexed with a **different embedding model** (different
vector size)
2. The embedding step **failed silently** during initial parsing
3. The document was parsed before the current embedding model was
configured
This caused a crash:
```
KeyError: 'q_1024_vec'
```
## Solution
Added defensive validation in `run_raptor_for_kb()`:
1. **Check for vector field existence** before accessing it
2. **Skip chunks** that don't have the required vector field instead of
crashing
3. **Log warnings** for skipped chunks with actionable guidance
4. **Provide informative error messages** suggesting users re-parse
documents with the current embedding model
5. **Handle both scopes** (`file` and `kb` modes)
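A compact sketch of the defensive check described above (surrounding code and names are illustrative, not the literal task_executor.py changes):

```python
import logging

def filter_chunks_with_vectors(chunks: list[dict], vector_size: int) -> list[dict]:
    """Keep only chunks carrying the expected vector field; warn about the rest."""
    vctr_nm = f"q_{vector_size}_vec"   # e.g. "q_1024_vec"
    usable = []
    for ck in chunks:
        if vctr_nm not in ck:
            logging.warning("Skipping chunk %s: missing %s. Re-parse the document with "
                            "the current embedding model.", ck.get("id"), vctr_nm)
            continue
        usable.append(ck)
    return usable
```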
## Changes
- `rag/svr/task_executor.py`: Added validation and error handling in
`run_raptor_for_kb()`
## Testing
1. Create a knowledge base with an embedding model
2. Parse documents
3. Change the embedding model to one with a different vector size
4. Run RAPTOR task
5. **Before**: Crashes with `KeyError`
6. **After**: Gracefully skips incompatible chunks with informative
warnings
---
<!-- Gittensor Contribution Tag: @GlobalStar117 -->
Co-authored-by: GlobalStar117 <GlobalStar117@users.noreply.github.com>
### What problem does this PR solve?
Fix: The time zone could not be updated properly in the database. #12696
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1) Create a dataset using the table parser for Infinity
2) Answer questions in chat using SQL
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Summary
Fixes #12651
The Docker container was failing at startup with:
```
/ragflow/.venv/bin/python3: No module named pip
```
This occurred when `USE_DOCLING=true` because the `entrypoint.sh` tries
to use `uv pip install` to install docling at runtime.
## Root Cause
As explained in the issue:
1. `uv sync` creates a minimal, production-focused environment **without
pip**
2. The production stage copies the venv from builder
3. Runtime commands using `uv pip install` fail because pip is not
present
## Solution
Added `python -m ensurepip --upgrade` after `uv sync` in the Dockerfile
to ensure pip is available in the virtual environment:
```dockerfile
uv sync --python 3.12 --frozen && \
# Ensure pip is available in the venv for runtime package installation (fixes #12651)
.venv/bin/python3 -m ensurepip --upgrade
```
This is a minimal change that:
- Ensures pip is installed during build time
- Doesn't change any other behavior
- Allows runtime package installation via `uv pip install` to work
---
This is a Gittensor contribution.
gittensor:user:GlobalStar117
Co-authored-by: GlobalStar117 <GlobalStar117@users.noreply.github.com>