Compare commits


247 Commits

Author SHA1 Message Date
2d89863fdd Fix: search list permission (#9767)
### What problem does this PR solve?

Search list permission.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-27 18:50:02 +08:00
6cb3e08381 Revert: broken agent completion by #9631 (#9760)
### What problem does this PR solve?

Revert broken agent completion by #9631.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-27 17:16:55 +08:00
986b9cbb1a Docs: Update version references to v0.20.4 in READMEs and docs (#9758)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.20.3 to v0.20.4
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2025-08-27 16:56:55 +08:00
9c456adffd Added v0.20.4 release notes (#9757)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-08-27 15:29:09 +08:00
c15b138839 Create ecommerce_customer_service_workflow.json (#9755)
### What problem does this PR solve?

Update workflow template.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-27 15:15:24 +08:00
ff11348f7c Fix: Optimize the MultiSelect component and system prompt templates #3221 (#9752)
### What problem does this PR solve?

Fix: Optimize the MultiSelect component and system prompt templates
#3221

- Modify the conditional statements in the MultiSelect component, using
the ?. operator to improve code readability
- Optimize the formatting of the system prompt template to make it more
standardized and easier to read
- Update the Chinese translation, changing "ExeSQL" to "Execute SQL"

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-27 15:12:12 +08:00
cbdabbb58f Fix: Fixed the issue that the agent embedded page needs to be logged in #9750 (#9751)
### What problem does this PR solve?

Fix: Fixed the issue that the agent embedded page needs to be logged in
#9750

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-27 14:18:00 +08:00
cf0011be67 Feat: Upgrade html parser (#9675)
### What problem does this PR solve?

Parse more HTML content.

### Type of change

- [x] Other (please describe):
2025-08-27 12:43:55 +08:00
1f47001c82 Fix: Optimize tooltips and i18n #3221 (#9744)
### What problem does this PR solve?

Fix: Optimize tooltips and i18n #3221

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-27 11:46:51 +08:00
a914535344 Fix: add mode for embedded agent. (#9741)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-27 11:46:15 +08:00
ba1063c2b9 Docs: Miscellaneous updates (#9729)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-08-26 19:35:29 +08:00
2b4bca4447 Fix(i18n): Added new translations #3221 (#9727)
### What problem does this PR solve?

Fix(i18n): Added new translations #3221

- Added and updated internationalization translations in multiple
components


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-26 17:57:53 +08:00
11cf6ae313 Fix: After deleting the knowledge graph, jump to the dataset page #9722 (#9723)
### What problem does this PR solve?

Fix: After deleting the knowledge graph, jump to the dataset page #9722
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-26 17:57:41 +08:00
88db5d90d1 Fix: Try to fix the issue of not being able to log in through Oauth2 #9601 (#9717)
### What problem does this PR solve?

Fix: Try to fix the issue of not being able to log in through Oauth2
#9601

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-26 14:06:28 +08:00
209ef09dc3 Feat: add Zhipu GLM-4.5 model series (#9715)
### What problem does this PR solve?

Add Zhipu GLM-4.5 model series. #9708.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-26 13:48:00 +08:00
ycz 370c8bc25b Update llm_factories.json (#9714)
### What problem does this PR solve?

Add ZhipuAI GLM-4.5 model series.

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-26 11:49:01 +08:00
e90a959b4d Fix: Chunk error when re-parsing created file #9665 (#9711)
### What problem does this PR solve?

Fix: Chunk error when re-parsing created file

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-26 10:50:30 +08:00
ca320a8c30 Refactor: for total_token_count method use if to check first. (#9707)
### What problem does this PR solve?

For the total_token_count method, use an if check first instead of relying
on exception handling, to improve performance when exception cases occur.
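
A minimal sketch of the pattern (the response shape is hypothetical, not the actual patch): check for the field up front rather than raising and catching an exception on every miss.

```python
def total_token_count(resp: dict) -> int:
    # Check first (LBYL) instead of paying for a raised KeyError on every miss
    usage = resp.get("usage")
    if isinstance(usage, dict) and "total_tokens" in usage:
        return usage["total_tokens"]
    return 0
```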

### Type of change

- [x] Refactoring
2025-08-26 10:47:20 +08:00
ae505e6165 Fix: Optimize table style #3221 (#9703)
### What problem does this PR solve?

Fix: Optimize table style

- Modify the style of the table scrollbar and remove unnecessary
scrollbars
- Adjust the header style of the table, add background color and
hierarchy
- Optimize the style of datasets and file tables
- Add a new background color variable

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-26 10:46:54 +08:00
63b5c2292d Fix: Delete the uploaded file in the chat input box, the corresponding file ID is not deleted #9701 (#9702)
### What problem does this PR solve?

Fix: when deleting an uploaded file in the chat input box, the
corresponding file ID was not deleted #9701
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-26 09:27:49 +08:00
8d8a5f73b6 Fix: meta data filter with AND logic operations. (#9687)
### What problem does this PR solve?

Close #9648

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-25 18:29:24 +08:00
d0fa66f4d5 Docs: update API endpoint paths (#9683)
### What problem does this PR solve?

- Update API endpoint paths in docs from `/v1/` to `/api/v1/` for
consistency

### Type of change

- [x] Documentation Update
2025-08-25 17:57:24 +08:00
9dd22e141b fix: validate chunk type before processing to prevent AttributeError (#9698)
### What problem does this PR solve?

This PR fixes a critical bug in the session listing endpoint where the
application crashes with an `AttributeError` when processing chunk data
that contains non-dictionary objects.

**Error before fix:**
```json
{
  "code": 100,
  "data": null,
  "message": "AttributeError(\"'str' object has no attribute 'get'\")"
}
```

**Root cause:**
The code assumes all items in the `chunks` array are dictionary objects
and directly calls the `.get()` method on them. However, in some cases,
the chunks array contains string objects or other non-dictionary types,
causing the application to crash when attempting to call `.get()` on a
string.

**Solution:**
Added type validation to ensure each chunk is a dictionary before
processing. Non-dictionary chunks are safely skipped, preventing the
crash while maintaining functionality for valid chunk data.

This fix improves the robustness of the session listing endpoint and
ensures users can retrieve their conversation sessions without
encountering server errors due to data format inconsistencies.
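
A sketch of the described guard (field names assumed for illustration, not the exact patch):

```python
def extract_chunk_contents(chunks: list) -> list[str]:
    contents = []
    for chunk in chunks:
        if not isinstance(chunk, dict):
            # Skip strings or other non-dict items instead of crashing on .get()
            continue
        content = chunk.get("content")
        if content:
            contents.append(content)
    return contents
```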

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-25 17:57:01 +08:00
b6c1ca828e Refa: replace Chat Ollama implementation with LiteLLM (#9693)
### What problem does this PR solve?

replace Chat Ollama implementation with LiteLLM.

### Type of change

- [x] Refactoring
2025-08-25 17:56:31 +08:00
d367c7e226 Fix: Optimize dataset page layout and internationalization and default values for multi selection #3221 (#9695)
### What problem does this PR solve?

Fix: Optimize dataset page layout and internationalization, and fix
setting default values for multi-selection drop-down boxes #3221

- Adjust the style and layout of each component on the dataset page
- Add and update multilingual translation content

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-25 17:29:15 +08:00
a3aa3f0d36 Refa: improve lightrag (#9690)
### What problem does this PR solve?

Improve lightrag.
#9647

### Type of change

- [x] Refactoring
2025-08-25 17:08:44 +08:00
7b8752fe24 fix: Creating a conversation session loses the prologue (#9666)
### What problem does this PR solve?

When creating a conversation, the prologue wasn't saved in the conversation.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-25 14:09:28 +08:00
5e2c33e5b0 Fix: grow reference list (#9674)
### What problem does this PR solve?

Fix: multiple conversations caused the reference list to grow indefinitely
due to Python's mutable default argument behavior.
Explicitly initialize reference as an empty list when creating new sessions.
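
The pitfall in miniature (function names hypothetical):

```python
def append_turn_buggy(reference=[]):  # default list is created once and shared
    reference.append("turn")
    return reference

append_turn_buggy()  # ['turn']
append_turn_buggy()  # ['turn', 'turn']  <- grows across calls

def append_turn_fixed(reference=None):
    if reference is None:
        reference = []  # fresh list on every call
    reference.append("turn")
    return reference
```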

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-25 14:08:15 +08:00
e40be8e541 Feat: Exclude operator_permission field from renaming chat fields #3221 (#9692)
### What problem does this PR solve?

Feat: Exclude operator_permission field from renaming chat fields #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-25 14:06:06 +08:00
23d0b564d3 Fix: Wrap VersionDialog in DropdownProvider for proper context (#9677)
### What problem does this PR solve?

The VersionDialog component was not receiving the correct context for
dropdown handling, causing improper behavior in its interactions.
This PR wraps VersionDialog in DropdownProvider to ensure it gets the
proper context and functions as expected.

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
2025-08-25 10:18:04 +08:00
ecaa9de843 Fix: [ERROR] 'LLMBundle' object has no attribute 'language' (#9682)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9672

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-25 10:17:10 +08:00
2f74727bb9 Fix: meta data error. (#9670)
### What problem does this PR solve?



### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-25 09:41:52 +08:00
adbb038a87 Fix: Place the invitation reminder icon in a separate file #9634 (#9662)
### What problem does this PR solve?

Fix: Place the invitation reminder icon in a separate file #9634
Fix: After receiving the agent message, pull the agent data to highlight
the edges passed #9538

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-22 20:08:55 +08:00
3947da10ae Fix: unexpected LLM parameters (#9661)
### What problem does this PR solve?

Remove unexpected LLM parameters.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-22 19:33:09 +08:00
4862be28ad Fix: Search app AI summary error and the tag set cannot be selected #9649 #9652 (#9664)
### What problem does this PR solve?
Fix: Search app AI summary error and the tag set cannot be selected
#9649 #9652
- Search app AI summary error: 'dict' object has no attribute 'split'
#9649
- Fix: the tag set cannot be selected in the knowledge base #9652
- Added custom parameter options to the LlmSettingFieldItems component
- Adjusted the document preview height to improve page layout
adaptability

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-22 19:32:32 +08:00
035e8ed0f7 Fix: code executor timeout (#9671)
### What problem does this PR solve?

Code executor timeout.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-22 19:31:49 +08:00
cc167ae619 Fix: Display the invited icon in the header #9634 (#9659)
### What problem does this PR solve?

Fix: Display the invited icon in the header #9634

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-22 15:05:56 +08:00
f8847e7bcd Fix: embedded search AI summary (#9658)
### What problem does this PR solve?

Fix search app AI summary ERROR: 'dict' object has no attribute 'split'.
#9649

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-22 12:55:29 +08:00
3baebd709b Refactoring: Agent completions API change response structure (#9631)
### What problem does this PR solve?

Resolves #9549 and #9436. In v0.20.x, the Agent completions API changed
significantly, e.g., responses no longer include references.

### Type of change

- [x] Refactoring
2025-08-22 12:04:15 +08:00
3e6a4b2628 Fix: Document Previewer is not working #9606 (#9656)
### What problem does this PR solve?
Fix: Document Previewer is not working #9606
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-22 12:03:51 +08:00
312635cb13 Refactor: use async/await to handle Redis during RAPTOR (#9576)
### What problem does this PR solve?

Use async/await to handle Redis during RAPTOR processing.

### Type of change

- [x] Refactoring
- [x] Performance Improvement
2025-08-22 10:58:02 +08:00
756d454122 fix(sdk): add default empty dict for metadata_condition (#9640)
### What problem does this PR solve?

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-22 10:57:48 +08:00
a4cab371fa Update fr.ts - RAPTOR Issue prompt (#9646)
Removed a line break causing problems with execution in Raptor.

### What problem does this PR solve?

When I activate Raptor without changing anything in French, I encounter
a problem that I don't have with the English version. I noticed in the
logs that there was an extra line break, so I suggest removing it.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-22 09:54:49 +08:00
0d7e52338e Fix: Fixed an issue where knowledge base could not be shared #9634 (#9642)
### What problem does this PR solve?

Fix: Fixed an issue where knowledge base could not be shared #9634

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-22 09:34:11 +08:00
4110f7f5ce Fix: The buttons at the bottom of the dataset settings page are not visible on small screens #9638 (#9639)
### What problem does this PR solve?

Fix: The buttons at the bottom of the dataset settings page are not
visible on small screens #9638
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-21 19:25:14 +08:00
0af57ff772 fix(dataset, next-chats): Fix form data acquisition logic and optimize the chat settings interface, adding language selection (#9629)
### What problem does this PR solve?

fix(dataset): form data acquisition logic
fix(next-chats): optimize the chat settings interface and add language
selection

- Replace form.formControl.trigger with form.trigger
- Use form.getValues() instead of form.formState.values
- Add language selection to support multiple languages
- Add default chat settings values
- Add new settings: icon, description, knowledge base ID, etc.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-08-21 16:57:46 +08:00
0bd58038a8 Fixes (web): Optimized search page style and functionality #3221 (#9627)
### What problem does this PR solve?

Fixes (web): Optimized search page style and functionality #3221

- Updated search page and view title styles
- Modified dataset list and multi-select control styles
- Optimized text field and button styles
- Updated filter button icons
- Adjusted metadata filter styles
- Added default descriptions for the smart assistant

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-21 16:57:14 +08:00
0cbcfcfedf Chore: Update infinity-sdk from 0.6.0.dev4 to 0.6.0.dev5 (#9628)
### What problem does this PR solve?

Bump infinity-sdk dependency to the latest development version
(0.6.0.dev5) in both pyproject.toml and uv.lock files to incorporate
recent changes and fixes from the SDK.

### Type of change

- [x] Other (please describe): Update deps
2025-08-21 16:56:57 +08:00
fbdde0259a Feat: Allow users to parse documents directly after uploading files #3221 (#9633)
### What problem does this PR solve?

Feat: Allow users to parse documents directly after uploading files
#3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-21 16:56:22 +08:00
d482173c9b Fix (style): Optimized Datasets color scheme and layout #3221 (#9620)
### What problem does this PR solve?


Fix (style): Optimized Datasets color scheme and layout #3221

- Updated background and text colors for multiple components

- Adjusted some layout structures, such as the paging position of
dataset tables

- Unified status icons and color mapping

- Optimized responsive layout to improve compatibility across different
screen sizes

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-21 12:14:56 +08:00
929dc97509 Fix: duplicated role... (#9622)
### What problem does this PR solve?

#9611
#9603 #9597

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-21 12:14:43 +08:00
30005c0203 Fix: Remove the file size and quantity restrictions of the upload control #9613 #9598 (#9618)
### What problem does this PR solve?

Fix: Remove the file size and quantity restrictions of the upload
control #9613 #9598

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-21 10:54:17 +08:00
382458ace7 Feat: advanced markdown parsing (#9607)
### What problem does this PR solve?

Using AST parsing to handle markdown more accurately, preventing
components from being cut off by chunking. #9564
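
A sketch of the block-boundary idea using markdown-it-py as a stand-in parser (the PR's actual implementation may differ): map each top-level AST token back to its source lines, so chunk boundaries can only fall between complete blocks.

```python
from markdown_it import MarkdownIt

def split_markdown_blocks(src: str) -> list[str]:
    """Return top-level markdown blocks (tables, fences, lists stay intact)."""
    lines = src.splitlines(keepends=True)
    blocks = []
    for tok in MarkdownIt().parse(src):
        if tok.level == 0 and tok.map:  # opening token of a top-level block
            start, end = tok.map        # source line range of the whole block
            blocks.append("".join(lines[start:end]))
    return blocks
```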

<img width="1746" height="993" alt="image"
src="https://github.com/user-attachments/assets/4aaf4bf6-5714-4d48-a9cf-864f59633f7f"
/>

<img width="1739" height="982" alt="image"
src="https://github.com/user-attachments/assets/dc00233f-7a55-434f-bbb7-74ce7f57a6cf"
/>

<img width="559" height="100" alt="image"
src="https://github.com/user-attachments/assets/4a556b5b-d9c6-4544-a486-8ac342bd504e"
/>


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-21 09:36:18 +08:00
4080f6a54a Feature (web): Optimize dataset pages and segmented components #3221 (#9605)
### What problem does this PR solve?

Feature (web): Optimize dataset pages and segmented components #3221

- Add the activeClassName property to Segmented components to customize
the selected state style
- Update the icons and captions of the relevant components on the dataset
page
- Modify the parsing status column title of the dataset table
- Optimize the Segmented component style of the homepage application
section

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-21 09:32:04 +08:00
09570c7eef Feat: expand the capabilities of the MCP Server (#8707)
### What problem does this PR solve?

Expand the capabilities of the MCP Server. #8644.

Special thanks to @Drasek, this change is largely based on his original
implementation, it is super neat and well-structured to me. I basically
just integrated his code into the codebase with minimal modifications.

My main contribution is implementing a proper cache layer for dataset
and document metadata, using the LRU strategy with a 300s ± random 30s
TTL. The original code did not actually perform caching.
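
A minimal sketch of that caching strategy (an illustration, not the merged code): an LRU map whose entries expire after 300 s plus or minus a random 30 s of jitter, so cached metadata doesn't all expire at once.

```python
import random
import time
from collections import OrderedDict

class JitteredTTLCache:
    """LRU cache whose entries expire after ttl +/- jitter seconds."""

    def __init__(self, maxsize=128, ttl=300.0, jitter=30.0):
        self._data = OrderedDict()  # key -> (expires_at, value)
        self.maxsize, self.ttl, self.jitter = maxsize, ttl, jitter

    def get(self, key):
        item = self._data.get(key)
        if item is None or item[0] < time.monotonic():
            self._data.pop(key, None)   # expired or missing
            return None
        self._data.move_to_end(key)     # mark as recently used
        return item[1]

    def put(self, key, value):
        expires = time.monotonic() + self.ttl + random.uniform(-self.jitter, self.jitter)
        self._data[key] = (expires, value)
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict least recently used
```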

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Caspar Armster <caspar@armster.de>
2025-08-20 19:30:25 +08:00
312f1a0477 Fix: enlarge raptor timeout limits. (#9600)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-20 17:29:15 +08:00
1ca226e43b Feat: Updated some colors according to the design draft #3221 (#9599)
### What problem does this PR solve?

Feat: Updated some colors according to the design draft #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-20 16:32:29 +08:00
830cda6a3a Fix (web): Optimize text display effect #3221 (#9594)
### What problem does this PR solve?

Fix (web): Optimize text display effect

- Add text ellipsis and overflow hidden classes to the HomeCard component
to achieve text overflow hiding and ellipsis effects
- Add text ellipsis and overflow hidden classes to the DatasetSidebar
component to improve the display of dataset names

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-20 15:42:21 +08:00
c66dbbe433 Fix: Fixed the issue where the save button at the bottom of the chat page could not be displayed on small screens #3221 (#9596)
### What problem does this PR solve?

Fix: Fixed the issue where the save button at the bottom of the chat
page could not be displayed on small screens #3221

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-20 15:42:09 +08:00
3b218b2dc0 fix: passing empty dataset array when updating assistant (#9570)
### What problem does this PR solve?

Passing an empty array `[]` for the `dataset_ids` parameter in the
**update assistant** request triggers a misleading message "Dataset use
different embedding models", while omitting the field does not.
To fix this, we:
- Provide a default empty list: `ids = req.get("dataset_ids", [])`.  
- Replace the `is not None` check with a truthy check: `if ids:`.

**Files changed**  
`api/apps/sdk/chat.py`  
- L153: `ids = req.get("dataset_ids")` → `ids = req.get("dataset_ids",
[])`
- L156: `if ids is not None:` → `if ids:`

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-20 13:40:05 +08:00
d58ef6127f Fix: KeyError: 'globals' (#9571)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9545
Add backward-compatible logic.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-20 13:39:38 +08:00
55173c7201 Fix (web): Update the style of segmented controls and add metallic texture gradients (#9591)
### What problem does this PR solve?

Fix (web): Update the style of segmented controls and add metallic
texture gradients #3221

- Modified the selected state style of Segmented components, adding a
metallic texture gradient and lower border
- Added a metallic gradient background image in tailwind.diag.js
- Added the --metallic variable in tailwind.css to define metallic
texture colors

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-20 13:39:23 +08:00
f860bdf0ad Revert "Feat: reference should also be list after 0.20.x" (#9592)
Reverts infiniflow/ragflow#9582
2025-08-20 13:38:57 +08:00
997627861a Feat: reference should also be list after 0.20.x (#9582)
### What problem does this PR solve?

In 0.19.0 the reference is a list, and it should remain a list; otherwise
the last conversation's reference will be lost.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-20 13:38:14 +08:00
9f9d32d2cd Feat: Make the old page accessible via URL #3221 (#9589)
### What problem does this PR solve?

Feat: Make the old page accessible via URL #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-20 13:37:06 +08:00
d55f44601a Docs: Updated v0.20.3 release notes (#9583)
### What problem does this PR solve?
### Type of change

- [x] Documentation Update

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-20 10:52:50 +08:00
abb6359547 Docs: Update version references to v0.20.3 in READMEs and docs (#9581)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.20.2 to v0.20.3
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2025-08-20 10:45:44 +08:00
f55ff590d7 Fix: Fixed the issue where the model configuration page could not be scrolled #9572 (#9579)
### What problem does this PR solve?

Fix: Fixed the issue where the model configuration page could not be
scrolled #9572

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-20 10:30:08 +08:00
7d3bb3a2f9 Fix dataset card not responding to click events (#9574)
### What problem does this PR solve?

Fix home card not responding to click events

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-08-20 10:06:14 +08:00
e6cb74b27f Fix (next search): Optimize the search problem interface and related functions #3221 (#9569)
### What problem does this PR solve?

Fix (next search): Optimize the search problem interface and related
functions #3221

- Add search_id to the retrieval_test interface
- Optimize handleSearchStrChange and handleSearch callbacks to determine
whether to enable AI search based on search configuration

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-19 19:22:07 +08:00
00f54c207e Fix: Reset all data except the first one on the chat page shared with others #3221 (#9567)
### What problem does this PR solve?

Fix: Reset all data except the first one on the chat page shared with
others #3221

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-19 19:04:40 +08:00
d0dc56166c Fix: metadata filter had no effect on retrieval_test. (#9566)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-19 18:57:35 +08:00
e15e39f183 Fix: Fixed an issue where renaming a chat would create a new chat #3221 (#9563)
### What problem does this PR solve?

Fix: Fixed an issue where renaming a chat would create a new chat #3221
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-19 18:33:55 +08:00
33f3e05b75 Refa: create new name for duplicated dialog name (#9558)
### What problem does this PR solve?

Create a new name for a duplicated dialog name.

### Type of change

- [x] Refactoring
2025-08-19 18:14:04 +08:00
b8bfbac2e5 Feat: Switch the root route to the new page #3221 (#9560)
### What problem does this PR solve?

Feat: Switch the root route to the new page #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-19 17:41:03 +08:00
d5729e598f Docs: Updated workarounds for uploading file to an agent (#9561)
### What problem does this PR solve?


### Type of change


- [x] Documentation Update
2025-08-19 17:40:39 +08:00
f2c5ad170d Fix(search): Search application list supports renaming function #3221 (#9555)
### What problem does this PR solve?

Fix (search): Search application list supports renaming function #3221

- Update the search application list page and add a renaming operation
entry
- Modify the search application details interface to support obtaining
detailed information
- Optimize search settings page layout and style

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-19 17:35:32 +08:00
0aa3c4cdae Docs: Update version references to v0.20.2 in READMEs and docs (#9559)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.20.1 to v0.20.2
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2025-08-19 17:26:49 +08:00
f123587538 Feat: add meta filter to search app. (#9554)
### What problem does this PR solve?


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-19 17:25:44 +08:00
a41a646909 Fix: Fixed the issue where clicking the SQL tool test button did not request the interface #9541 (#9542)
### What problem does this PR solve?

Fix: Fixed the issue where clicking the SQL tool test button did not
request the interface #9541
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-19 16:41:32 +08:00
787e0c6786 Refa: OpenAI whisper-1 (#9552)
### What problem does this PR solve?

Refactor OpenAI to enable audio parsing.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2025-08-19 16:41:18 +08:00
05ee1be1e9 Docs: Updated v0.20.2 release notes (#9553)
### What problem does this PR solve?

### Type of change


- [x] Documentation Update
2025-08-19 16:03:42 +08:00
a0d630365c Refactor: improve VoyageRerank handling when no texts are provided (#9539)
### What problem does this PR solve?

Improve VoyageRerank handling when no texts are provided.

### Type of change

- [x] Refactoring
2025-08-19 10:31:04 +08:00
b5b8032a56 Feat: Support metadata auto filter for Search. (#9524)
### What problem does this PR solve?

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-19 10:27:24 +08:00
ccb9f0b0d7 Feature (agent): Allow the Retrieval kb_ids param to accept kb_id, and allow a list of kb_name or kb_id (#9531)
### What problem does this PR solve?

Allow the Retrieval kb_ids param to accept kb_id, and allow a list of
kb_name or kb_id.
- Add a judgment on whether the knowledge base name is a list and support
batch queries
- When the knowledge base name does not exist, try using the ID for
querying
- If both query methods fail, throw an exception

### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-08-19 09:42:39 +08:00
a0ab619aeb Fix: ensure update_progress loop always waits between iterations (#9528)
Move stop_event.wait(6) into finally block so that even when an
exception occurs, the loop still sleeps before retrying. This prevents
busy looping and excessive error logs when Redis connection fails.
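
A sketch of the pattern (the worker step is a hypothetical placeholder):

```python
import logging
import threading

stop_event = threading.Event()

def update_progress():
    while not stop_event.is_set():
        try:
            refresh_progress()  # hypothetical: the actual Redis-backed update step
        except Exception:
            logging.exception("update_progress iteration failed")
        finally:
            stop_event.wait(6)  # always sleep, even after an exception
```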

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-19 09:42:15 +08:00
32349481ef Feat: Allow agent operators to select speech-to-text models #3221 (#9534)
### What problem does this PR solve?

Feat: Allow agent operators to select speech-to-text models #3221
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-19 09:40:01 +08:00
2b9ed935f3 feat(search): Optimized search functionality and user interface #3221 (#9535)
### What problem does this PR solve?

feat(search): Optimized search functionality and user interface #3221

- Added similarity threshold adjustment function
- Optimized mind map display logic
- Adjusted search settings interface layout
- Fixed related search and document viewing functions
- Optimized time display and node selection logic

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-19 09:39:48 +08:00
188c0f614b Refa: refine search app (#9536)
### What problem does this PR solve?

Refine search app.

### Type of change

- [x] Refactoring
2025-08-19 09:33:33 +08:00
dad97869b6 Fix: search service reference (#9533)
### What problem does this PR solve?

- Update search_app.py to use SearchService instead of
KnowledgebaseService for duplicate

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-18 19:02:10 +08:00
57c8a37285 Feat: add dialog chatbots info (#9530)
### What problem does this PR solve?

Add dialog chatbots info.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-18 19:01:45 +08:00
9d0fed601d Feat: Displays the embedded page of the chat module #3221 (#9532)
### What problem does this PR solve?

Feat: Displays the embedded page of the chat module #3221
Feat: Let the agent operator support the selection of TTS models #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-18 18:02:13 +08:00
fe32952825 Fix: Gemini parameters error (#9520)
### What problem does this PR solve?

Fix Gemini parameters error.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-18 14:51:10 +08:00
5808aef28c Fix (search): Optimize the search page functionality and UI #3221 (#9525)
### What problem does this PR solve?

Fix (search): Optimize the search page functionality and UI #3221 

- Add a search list component
- Implement search settings
- Optimize search result display
- Add related search functionality
- Adjust the search input box style
- Unify internationalized text

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-18 14:50:29 +08:00
ca720bd811 Fix: save team's canvas issue. (#9518)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-18 13:05:29 +08:00
ba11312766 Feat: embedded search (#9501)
### What problem does this PR solve?

Add embedded search functionality.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-18 12:05:11 +08:00
c8bbf7452d Env: Update dependencies for proxy support (#9519)
### What problem does this PR solve?

- Update httpx dependency to include socks support in pyproject.toml
- Update lockfile with new socksio dependency

### Type of change

- [x] Update dependencies for proxy support
2025-08-18 12:04:16 +08:00
b08650bc4c Feat: Fixed the chat model setting echo issue (#9521)
### What problem does this PR solve?

Feat: Fixed the chat model setting echo issue

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-18 12:03:33 +08:00
fb77f9917b Refactor: Use Input Length In DefaultRerank (#9516)
### What problem does this PR solve?

1. Use input length to prepare res
2. Adjust torch_empty_cache code location

### Type of change

- [x] Refactoring
- [x] Performance Improvement
2025-08-18 10:00:27 +08:00
d874683ae4 Fix the bug in enablePrologue under agent task mode (#9487)
### What problem does this PR solve?

There is a problem with the implementation of the Agent begin-form:
although the enablePrologue switch and the prologue input box are hidden
in Task mode, these values are still saved in the form data. If the user
first enables the prologue and sets its content in Conversational mode,
and then switches to Task mode, these values are still saved and may be
used in some scenarios.
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-15 20:29:02 +08:00
f9e5caa8ed feat(search): Added app embedding functionality and optimized search page #3221 (#9499)
### What problem does this PR solve?
feat(search): Added app embedding functionality and optimized search
page #3221

- Added an Embed App button and related functionality
- Optimized the layout and interaction of the search settings interface
- Adjusted the search result display method
- Refactored some code to support new features
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-15 18:25:00 +08:00
99df0766fe Feat: add SMTP support for user invitation emails (#9479)
### What problem does this PR solve?

Add SMTP support for user invitation emails

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-15 18:12:20 +08:00
3b50688228 Docs: Miscellaneous updates. (#9506)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-08-15 18:10:11 +08:00
ffc095bd50 Feat: conversation completion can specify different model (#9485)
### What problem does this PR solve?

Conversation completion can specify different model

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-15 17:44:58 +08:00
799c57287c Feat: Add metadata configuration for new chats #3221 (#9502)
### What problem does this PR solve?

Feat: Add metadata configuration for new chats #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-15 17:40:16 +08:00
eef43fa25c Fix: unexpected truncated Excel files (#9500)
### What problem does this PR solve?

Handle unexpected truncated Excel files.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-15 17:00:34 +08:00
5a4dfecfbe Refactor:Standardize image conf and add private registry support (#9496)
- Unified configuration format: All services now use the same image
configuration structure for consistency.

- Private registry support: Added imagePullSecrets to enable pulling
images from private registries.

- Per-service flexibility: Each service can override image-related
parameters independently.

### What problem does this PR solve?


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-08-15 16:05:33 +08:00
7f237fee16 Fix: HTTP component re.error: bad escape \u (#9480)
### What problem does this PR solve?

When calling HTTP to request data, if the JSON string returned by the
interface contains an unescaped backslash sequence like '\u', Python's re
module tries to interpret 'u' as a Unicode escape; since no valid 4-digit
hexadecimal number follows, it raises:
re.error: bad escape \u at position 26
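
A minimal reproduction and the common fix (template and payload hypothetical): a callable replacement is inserted verbatim, so re.sub never parses escape sequences inside the payload.

```python
import re

template = "response: {res}"
payload = r'{"val": "\u"}'  # backslash sequence from an upstream JSON response

# Buggy: string replacements are scanned for escapes -> re.error: bad escape \u
# re.sub(r"\{res\}", payload, template)

# Fixed: the callable's return value is used as-is, no escape processing
print(re.sub(r"\{res\}", lambda m: payload, template))
```
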
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-15 15:48:10 +08:00
30ae78755b Feat: Delete or filter conversations #3221 (#9491)
### What problem does this PR solve?

Feat: Delete or filter conversations #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-15 12:05:27 +08:00
2114e966d8 Feat: add citation option to agent and enlarge the timeouts. (#9484)
### What problem does this PR solve?

#9422

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-15 10:05:01 +08:00
562349eb02 Feat: Upload files in the chat box #3221 (#9483)
### What problem does this PR solve?
Feat: Upload files in the chat box #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-15 10:04:37 +08:00
618d6bc924 Feat: Can directly generate an agent node by dragging and dropping the connecting line (#9226) (#9357)

### What problem does this PR solve?

Can directly generate an agent node by dragging and dropping the
connecting line (#9226)

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-08-14 17:48:02 +08:00
762aa4b8c4 fix: preserve correct MIME & unify data URL handling for vision inputs (relates #9248) (#9474)
fix: preserve correct MIME & unify data URL handling for vision inputs
(relates #9248)

- Updated image2base64() to return a full data URL
(data:image/<fmt>;base64,...) with accurate MIME
- Removed hardcoded image/jpeg in Base._image_prompt(); pass through
data URLs and default raw base64 to image/png
- Set AnthropicCV._image_prompt() raw base64 media_type default to
image/png
- Ensures MIME type matches actual image content, fixing “cannot process
base64 image” errors on vLLM/OpenAI-compatible backends

### What problem does this PR solve?

This PR fixes a compatibility issue where base64-encoded images sent to
vision models (e.g., vLLM/OpenAI-compatible backends) were rejected due
to mismatched MIME type or incorrect decoding.
Previously, the backend:
- Always converted raw base64 into data:image/jpeg;base64,... even if
the actual content was PNG.
- In some cases, base64 decoding was attempted on the full data URL
string instead of the pure base64 part.
This caused errors like:
```
cannot process base64 image
failed to decode base64 string: illegal base64 data at input byte 0
```
by strict validators such as vLLM.
With this fix, the MIME type in the request now matches the actual image
content, and data URLs are correctly handled or passed through, ensuring
vision models can decode and process images reliably.
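
A sketch of the core idea (not the exact patch): derive the data URL's MIME type from the actual image bytes rather than hardcoding image/jpeg.

```python
import base64
import io

from PIL import Image

def image_to_data_url(image_bytes: bytes) -> str:
    # Sniff the real format (PNG, JPEG, ...) instead of assuming JPEG
    fmt = (Image.open(io.BytesIO(image_bytes)).format or "PNG").lower()
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image/{fmt};base64,{b64}"
```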

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-14 17:00:56 +08:00
9cd09488ca Feat: Send data to compare the performance of different models' answers #3221 (#9477)
### What problem does this PR solve?

Feat: Send data to compare the performance of different models' answers
#3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-14 16:57:35 +08:00
f2806a8332 Update cv_model.py (#9472)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9452

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-14 13:45:38 +08:00
b6e34e3aa7 Fix: PyPDF's Manipulated FlateDecode streams can exhaust RAM (#9469)
### What problem does this PR solve?

#3951
#8463 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-14 13:45:19 +08:00
3ee9653170 Agent template: report agent using knowledge base (#9427)
### What problem does this PR solve?

Agent template: report agent using knowledge base
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-14 12:17:57 +08:00
6d1078b538 fix 'KeyError: "There is no item named 'word/NULL' in the archive"' (#9455)
### What problem does this PR solve?

Issue referring to:
https://github.com/python-openxml/python-docx/issues/797
Fix referring to:
https://github.com/python-openxml/python-docx/issues/1105#issuecomment-1298075246

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-14 12:14:03 +08:00
6e862553cb Docs: Deprecated 'Create session with agent' (#9464)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-08-14 12:13:11 +08:00
b1baa91ff0 feat(next-search): Implements document preview functionality #3221 (#9465)
### What problem does this PR solve?

feat(next-search): Implements document preview functionality

- Adds a new document preview modal component
- Implements document preview page logic
- Adds document preview-related hooks
- Optimizes document preview rendering logic
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-14 12:11:53 +08:00
b55c3d07dc Test: Update error message assertions for chunk update tests (#9468)
### What problem does this PR solve?

Modify test cases to accept additional error message format when
updating chunks.
fix actions:
https://github.com/infiniflow/ragflow/actions/runs/16942741621/job/48015850297

### Type of change

- [x] Update test cases
2025-08-14 12:11:20 +08:00
2b3318cd3d Fix: KB folder may not be there while creating virtual file (#9431)
### What problem does this PR solve?

The KB folder may not be there while creating a virtual file. #9423

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-14 09:40:30 +08:00
434b55be70 Feat: Display a separate chat multi-model comparison page #3221 (#9461)
### What problem does this PR solve?
Feat: Display a separate chat multi-model comparison page #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-14 09:39:20 +08:00
98b4c67292 Trivial. (#9460)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-14 09:39:00 +08:00
3d645ff31a Docs: Update HTTP API reference with simplified response format and parameters (#9454)
### What problem does this PR solve?

- Make `session_id` optional and add `inputs` parameter
- Remove deprecated `sync_dsl` parameter
- Update request/response examples to match current API behavior

### Type of change

- [x] Documentation Update
2025-08-13 21:02:54 +08:00
5e8cd693a5 Refa: split LLM-related services. (#9450)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2025-08-13 16:41:01 +08:00
29f297b850 Fix: update broken create agent session due to v0.20.0 changes (#9445)
### What problem does this PR solve?

 Update broken create agent session due to v0.20.0 changes. #9383


**NOTE: A session ID is no longer required to interact with the agent.**

See: #9241, #9309.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-13 16:01:54 +08:00
7235638607 Feat: Show multiple chat boxes #3221 (#9443)
### What problem does this PR solve?

Feat: Show multiple chat boxes #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-13 15:59:51 +08:00
00919fd599 Fix typo in issue template (#9444) 2025-08-13 14:27:15 +08:00
43c0792ffd Add issue template for agent scenario feature request (#9437) 2025-08-13 12:50:06 +08:00
4b1b68c5fc Fix: no doc hits after meta data filter. (#9435)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-13 12:43:31 +08:00
3492f54c7a Docs: Update HTTP API reference with new response fields (#9434)
### What problem does this PR solve?

Add `url`, `doc_type`, and `created_at` fields to the API response
example in the documentation.

### Type of change

- [x] Documentation Update
2025-08-13 12:18:39 +08:00
da5cef0686 Refactor: Improve the float comparison for LocalAIRerank (#9428)
### What problem does this PR solve?
Improve the float comparison for LocalAIRerank.
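
A sketch of the tolerant comparison (scores hypothetical): compare with math.isclose before normalizing, so an all-equal score list doesn't divide by a near-zero range.

```python
import math

scores = [0.42, 0.42, 0.42]  # hypothetical rerank scores
lo, hi = min(scores), max(scores)

if math.isclose(hi, lo):  # floats: never compare with == here
    normalized = [0.5 for _ in scores]
else:
    normalized = [(s - lo) / (hi - lo) for s in scores]
```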

### Type of change

- [x] Refactoring
2025-08-13 10:26:42 +08:00
9098efb8aa Feat: Fixed the issue where some fields in the chat configuration could not be displayed #3221 (#9430)
### What problem does this PR solve?

Feat: Fixed the issue where some fields in the chat configuration could
not be displayed #3221
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-13 10:26:26 +08:00
421657f64b Feat: allows setting multiple types of default models in service config (#9404)
### What problem does this PR solve?

Allows setting multiple types of default models in the service config.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-13 09:46:05 +08:00
7ee5e0d152 Fix KeyError in session listing endpoint when accessing conversation reference (#9419)
- Add type and boundary checks for conv["reference"] access
- Prevent KeyError: 0 when reference list is empty or malformed
- Ensure reference is list type before indexing
- Handle cases where reference items are None or missing chunks
- Maintains backward compatibility with existing data structures

This resolves crashes in /api/v1/agents/<agent_id>/sessions endpoint
when conversation reference data is not properly structured.

### What problem does this PR solve?

This PR fixes a critical `KeyError: 0` that occurs in the
`/api/v1/agents/<agent_id>/sessions` endpoint when the system attempts
to access conversation reference data that is not properly structured.

**Background Context:**
The `list_agent_session` method in `api/apps/sdk/session.py` assumes
that `conv["reference"]` is always a properly indexed list with valid
dictionary structures. However, in real-world scenarios, this data can
be:
- Not a list type (could be None, string, or other types)
- An empty list when `chunk_num` tries to access index 0
- Contains None values or malformed dictionary structures
- Missing expected "chunks" keys in reference items

**Impact Before Fix:**
When malformed reference data is encountered, the API crashes with:
```json
{
    "code": 100,
    "data": null,
    "message": "KeyError(0)"
}
```
**Solution:**
Added comprehensive safety checks including type validation, boundary
checking, null safety, and structure validation to ensure the API
gracefully handles all reference data formats while maintaining backward
compatibility.
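
An illustrative sketch of the added guards (data shape assumed from the description above):

```python
def first_reference_chunks(conv: dict) -> list:
    ref = conv.get("reference")
    if not isinstance(ref, list) or not ref:  # type + boundary check (KeyError: 0)
        return []
    first = ref[0]
    if not isinstance(first, dict):           # null / malformed item safety
        return []
    chunks = first.get("chunks")              # missing "chunks" key safety
    return chunks if isinstance(chunks, list) else []
```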

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-08-13 09:23:52 +08:00
22915223d4 Fix: citation issue. (#9424)
### What problem does this PR solve?

#8474

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-12 18:53:34 +08:00
d7b4e84cda Refa: Update LLM stream response type to Generator (#9420)
### What problem does this PR solve?

Change return type of _generate_streamly from str to Generator[str,
None, None] to properly type hint streaming responses.
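
The shape of the change as a sketch (fragments are placeholders):

```python
from typing import Generator

def _generate_streamly(prompt: str) -> Generator[str, None, None]:
    # Yields response fragments as they arrive instead of returning one str
    for fragment in ("Hel", "lo"):  # placeholder fragments
        yield fragment
```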

### Type of change

- [x] Refactoring
2025-08-12 18:05:52 +08:00
e845d5f9f8 Fix: ValueError when a file is optional but no value exists (#9414)
### What problem does this PR solve?

When the Begin component has an optional file but it does not exist, it raises an error.

### Type of change

- [x] Bug Fix

Co-authored-by: Popmio <zhengyihao036@gamil.com>
2025-08-12 17:39:03 +08:00
3d18284dd6 Feat: Added meta data to the chat configuration page #8531 (#9417)
### What problem does this PR solve?

Feat: Added meta data to the chat configuration page #8531

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-12 16:19:23 +08:00
96783aa82c Fix: remove doc error. (#9413)
### What problem does this PR solve?

Close #9407

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-12 15:55:04 +08:00
a0c2da1219 Fix: Patch LiteLLM (#9416)
### What problem does this PR solve?

Patch LiteLLM refactor. #9408

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-12 15:54:30 +08:00
79e2edc835 Fix "File contains no valid workbook part" (#9360)
### What problem does this PR solve?

fix "File contains no valid workbook part"

stacktrace:
```
Traceback (most recent call last):
  File "/ragflow/deepdoc/parser/excel_parser.py", line 54, in _load_excel_to_workbook
    return RAGFlowExcelParser._dataframe_to_workbook(df)
  File "/ragflow/deepdoc/parser/excel_parser.py", line 69, in _dataframe_to_workbook
    ws.cell(row=row_num, column=col_num, value=value)
  File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/worksheet/worksheet.py", line 246, in cell
    cell.value = value
  File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/cell/cell.py", line 218, in value
    self._bind_value(value)
  File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/cell/cell.py", line 197, in _bind_value
    value = self.check_string(value)
  File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/cell/cell.py", line 165, in check_string
    raise IllegalCharacterError(f"{value} cannot be used in worksheets.")
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-08-12 14:58:36 +08:00
57b87fa9d9 Fix: TypeError: OllamaCV.chat() got an unexpected keyword argument 'stop' (#9363)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9351
Filter out unsupported arguments before invoking.
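
One generic way to do such filtering (a sketch, not the actual patch): keep only keyword arguments the callee's signature declares.

```python
import inspect

def call_with_supported_kwargs(fn, *args, **kwargs):
    params = inspect.signature(fn).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return fn(*args, **kwargs)  # fn already accepts **kwargs
    supported = {k: v for k, v in kwargs.items() if k in params}
    return fn(*args, **supported)   # e.g. silently drops 'stop' if undeclared
```
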
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-12 14:55:27 +08:00
153e430b00 Feat: add meta data filter. (#9405)
### What problem does this PR solve?

#8531 
#7417 
#6761 
#6573
#6477

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-12 14:12:56 +08:00
3ccaa06031 Fix: Before executing the SQL, remove tags in the format [ID: number] to avoid execution errors. (#9326)
### What problem does this PR solve?

Before executing the SQL, remove tags in the format [ID: number] to
avoid execution errors.
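
The cleanup in sketch form (the SQL shown is hypothetical):

```python
import re

sql = "SELECT name FROM users WHERE id = 42 [ID: 7]"
clean_sql = re.sub(r"\[ID:\s*\d+\]", "", sql).strip()
print(clean_sql)  # SELECT name FROM users WHERE id = 42
```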

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: wangyazhou <wangyazhou@sdibd.cn>
2025-08-12 12:42:28 +08:00
569ab011c4 Add fallback to use 'calamine' parse engine in excel_parser.py (#9374)
### What problem does this PR solve?

Add a fallback to the `calamine` engine when a parse error is raised by
the default `openpyxl` / `xlrd` engine.
e.g. the following error can be fixed:
```
Traceback (most recent call last):
  File "/ragflow/deepdoc/parser/excel_parser.py", line 53, in _load_excel_to_workbook
    df = pd.read_excel(file_like_object)
  File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 495, in read_excel
    io = ExcelFile(
  File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 1567, in __init__
    self._reader = self._engines[engine](
  File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_xlrd.py", line 46, in __init__
    super().__init__(
  File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 573, in __init__
    self.book = self.load_workbook(self.handles.handle, engine_kwargs)
  File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_xlrd.py", line 63, in load_workbook
    return open_workbook(file_contents=data, **engine_kwargs)
  File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/__init__.py", line 172, in open_workbook
    bk = open_workbook_xls(
  File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/book.py", line 68, in open_workbook_xls
    bk.biff2_8_load(
  File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/book.py", line 641, in biff2_8_load
    cd.locate_named_stream(UNICODE_LITERAL(qname))
  File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/compdoc.py", line 398, in locate_named_stream
    result = self._locate_stream(
  File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/compdoc.py", line 429, in _locate_stream
    raise CompDocError("%s corruption: seen[%d] == %d" % (qname, s, self.seen[s]))
xlrd.compdoc.CompDocError: Workbook corruption: seen[2] == 4
```
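
A sketch of the fallback (assumes pandas >= 2.2 with python-calamine installed):

```python
from io import BytesIO

import pandas as pd

def read_excel_with_fallback(data: bytes) -> pd.DataFrame:
    try:
        return pd.read_excel(BytesIO(data))  # default openpyxl / xlrd path
    except Exception:
        # Corrupt-but-readable workbooks often still parse with calamine
        return pd.read_excel(BytesIO(data), engine="calamine")
```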

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-12 12:41:33 +08:00
96b1538b3e Fix: HTTP request component failed to retrieve the corresponding value (#9399)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9385
Based on my understanding, I think checking for an empty string is fine.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-12 12:27:22 +08:00
735570486f feat(next-search): Added AI summary functionality #3221 (#9402)
### What problem does this PR solve?

feat(next-search): Added AI summary functionality #3221

- Added the LlmSettingFieldItems component for AI summary settings
- Updated the SearchSetting component to integrate AI summary
functionality
- Added the updateSearch hook and related service methods
- Modified the ISearchAppDetailProps interface to add the llm_setting
field

### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-08-12 12:27:00 +08:00
da68f541b6 Feat: add full list of supported AWS Bedrock regions (#9395)
### What problem does this PR solve?


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-12 11:01:16 +08:00
83771e500c Refa: migrate chat models to LiteLLM (#9394)
### What problem does this PR solve?

All models pass the mock response tests, which means that if a model can
return the correct response, everything should work as expected.
However, not all models have been fully tested in a real environment
with a real API_KEY. I suggest actively monitoring the refactored models
over the coming period to ensure they work correctly, fixing them step
by step, or waiting to merge until most have been tested in a practical
environment.

### Type of change

- [x] Refactoring
2025-08-12 10:59:20 +08:00
a6d2119498 Refa: list canvas (#9341)
### What problem does this PR solve?

Refactor list canvas.

### Type of change

- [x] Refactoring
2025-08-12 10:58:06 +08:00
57b9f8cf52 Fix: Update test assertions and simplify test cases (#9400)
### What problem does this PR solve?

- Fix error message assertion in test_update_chunk.py to match new
ownership validation
- Simplify dataset listing test cases by removing lambda assertions for
sorting
- Fix actions:
https://github.com/infiniflow/ragflow/actions/runs/16885465524/job/47831942553

### Type of change

- [x] Fix test cases
2025-08-12 10:57:30 +08:00
5c3577c4c9 Python SDK: add meta_fields to Document class (#9387)
### What problem does this PR solve?

The Python class Document was missing "meta_fields"; e.g., when querying,
document instances came back without meta_fields.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-12 10:16:12 +08:00
76118000c1 Feat: Allow chat to use metadata #3221 (#9393)
### What problem does this PR solve?

Feat: Allow chat to use metadata #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-12 10:15:10 +08:00
9433f64fe2 Feat: added functionality to choose all datasets if no id is provided (#9184)
### What problem does this PR solve?

Using the MCP server in n8n sometimes (with smaller models) results in
errors because the LLM drops or adds a character in the list of
dataset_ids provided. It first asks for the list of datasets, and with a
larger list it makes an error recalling the list completely. Adding the
ability to simply search through all available datasets solves this and
makes data retrieval more stable. The functionality to call specific
datasets by id is unchanged; the dataset_ids are simply no longer
required (only the "question" is). As before, you can provide a list of
datasets, an empty list, or no list at all (see the sketch below).
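A minimal sketch of the fallback described above, assuming a `list_datasets()` helper; the names are illustrative, not the actual MCP server code:

```python
def retrieve(question: str, dataset_ids: list[str] | None = None) -> list[str]:
    # Illustrative only: when no dataset ids are supplied, fall back to
    # searching every available dataset instead of failing. `question` is
    # unused in this sketch.
    if not dataset_ids:
        dataset_ids = [ds["id"] for ds in list_datasets()]  # assumed helper
    return dataset_ids

def list_datasets() -> list[dict]:
    # Stub standing in for the real dataset-listing call.
    return [{"id": "kb1"}, {"id": "kb2"}]

assert retrieve("What is RAGFlow?") == ["kb1", "kb2"]
assert retrieve("What is RAGFlow?", ["kb1"]) == ["kb1"]
```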

### Type of change

- [X] New Feature (non-breaking change which adds functionality)
<img width="1897" height="880" alt="mcp error dataset id"
src="https://github.com/user-attachments/assets/71076d24-f875-4663-a69a-60839fc7a545"
/>
2025-08-11 17:20:35 +08:00
d7c9611d45 docs(sandbox): update /etc/hosts entry to include required services (#9144)
Fixes an issue where running the sandbox (code component) fails due to
unresolved hostnames. Added missing service names (es01, infinity,
mysql, minio, redis) to 127.0.0.1 in the /etc/hosts example.

Reference: https://github.com/infiniflow/ragflow/issues/8226

## What this PR does

Updates the sandbox quickstart documentation to fix a known issue where
the sandbox fails to resolve required service hostnames.

## Why

Following the original instruction leads to a `Failed to resolve 'none'`
error, as discussed in issue #8226. Adding the missing service names to
`127.0.0.1` resolves the problem.

## Related issue

https://github.com/infiniflow/ragflow/issues/8226

## Note

It might be better to add `127.0.0.1 es01 infinity mysql minio redis` to
docs/quickstart.mdx, but since no issues appeared at the time without
adding this line—and the problem occurred while working with the code
component—I added it here.

### Type of change

- [X] Documentation Update
2025-08-11 17:18:56 +08:00
79399f7f25 Support the case of one cell split by multiple columns. (#9225)
### What problem does this PR solve?
Support the case of one cell split by multiple columns. Besides, the
codes are compatible with the common cell case.
#8606 can be fixed.
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

I provide a case of one cell split by multiple columns:

[test.xlsx](https://github.com/user-attachments/files/21578693/test.xlsx)

The chunk res:
<img width="236" height="57" alt="2025-06-17 16-04-07 的屏幕截图"
src="https://github.com/user-attachments/assets/b0a499ac-349d-4c3d-8c6e-0931c8fc26de"
/>
2025-08-11 17:17:56 +08:00
23522f1ea8 Fix: handle missing dataset_ids when creating chat assistant (#9324)
- Root cause: accessing req.get("dataset_ids") returns None when the key
is absent, which breaks downstream handling.
- Fix: use req.get("dataset_ids", []) to default to an empty list (see the
sketch below).
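A minimal sketch of the defaulting pattern; the function name is illustrative:

```python
def create_chat_assistant(req: dict) -> list[str]:
    # req.get("dataset_ids") alone returns None when the key is absent;
    # supplying [] as the default keeps later iteration safe.
    dataset_ids = req.get("dataset_ids", [])
    return [str(ds_id) for ds_id in dataset_ids]

assert create_chat_assistant({}) == []
assert create_chat_assistant({"dataset_ids": ["kb1"]}) == ["kb1"]
```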
2025-08-11 17:17:20 +08:00
46dc3f1c48 Fix: Update test assertions and add GraphRAG config in dataset tests (#9386)
### What problem does this PR solve?

- Modify error message assertion in chunk update test to check for
document ownership
- Add GraphRAG configuration with `use_graphrag: False` in dataset
update tests
- Fix actions:
https://github.com/infiniflow/ragflow/actions/runs/16863637898/job/47767511582
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-11 17:15:48 +08:00
c9b156fa6d Fix: Remove default dataset_ids from Chat class initialization (#9381)
### What problem does this PR solve?

- The default dataset_ids "kb1" was removed from the Chat class. 
- The HTTP API response does not include the dataset_ids field.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-11 17:15:30 +08:00
83939b1a63 Feat: add full list of supported AWS Bedrock regions (#9378)
### What problem does this PR solve?

Add full list of supported AWS Bedrock regions.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-11 17:15:07 +08:00
7f08ba47d7 Fix "no tc element at grid_offset" (#9375)
### What problem does this PR solve?

fix "no `tc` element at grid_offset", just log warning and ignore.
stacktrace:
```
Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 620, in handle_task
    await do_handle_task(task)
  File "/ragflow/rag/svr/task_executor.py", line 553, in do_handle_task
    chunks = await build_chunks(task, progress_callback)
  File "/ragflow/rag/svr/task_executor.py", line 257, in build_chunks
    cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"],
  File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync
    return msg_from_thread.unwrap()
  File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    raise captured_error
  File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result
    return result.unwrap()
  File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    raise captured_error
  File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn
    ret = context.run(sync_fn, *args)
  File "/ragflow/rag/svr/task_executor.py", line 257, in <lambda>
    cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"],
  File "/ragflow/rag/app/naive.py", line 384, in chunk
    sections, tables = Docx()(filename, binary)
  File "/ragflow/rag/app/naive.py", line 230, in __call__
    while i < len(r.cells):
  File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 438, in cells
    return tuple(_iter_row_cells())
  File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 436, in _iter_row_cells
    yield from iter_tc_cells(tc)
  File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 424, in iter_tc_cells
    yield from iter_tc_cells(tc._tc_above)  # pyright: ignore[reportPrivateUsage]
  File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 741, in _tc_above
    return self._tr_above.tc_at_grid_offset(self.grid_offset)
  File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 98, in tc_at_grid_offset
    raise ValueError(f"no `tc` element at grid_offset={grid_offset}")
ValueError: no `tc` element at grid_offset=10
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-11 17:13:10 +08:00
ce3dd019c3 Fix broken data stream when writing image file (#9354)
### What problem does this PR solve?

fix "broken data stream when writing image file", just log warning and
ignore

Close #8379 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-11 17:07:49 +08:00
476c56868d Agent plans tasks by referring to its own prompt. (#9315)
### What problem does this PR solve?

Fixes the issue in the analyze_task execution flow where the Lead Agent
was not utilizing its own sys_prompt during task analysis, resulting in
incorrect or incomplete task planning.
https://github.com/infiniflow/ragflow/issues/9294
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-11 17:05:06 +08:00
b9c4954c2f Fix: Replace StrEnum with strenum in code_exec.py (#9376)
### What problem does this PR solve?

- The enum import was changed from Python's built-in StrEnum to the
strenum package (see the sketch below).
- Fixes the error `Warning: Failed to import module code_exec: cannot import
name 'StrEnum' from 'enum' (/usr/lib/python3.10/enum.py)`
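A minimal sketch of a version-compatible import: `enum.StrEnum` only exists on Python 3.11+, and the strenum package backports it. The enum members below are illustrative, not the actual code_exec ones.

```python
try:
    from enum import StrEnum  # available on Python >= 3.11
except ImportError:
    from strenum import StrEnum  # backport package for Python 3.10

class SandboxLanguage(StrEnum):
    # Illustrative members; the real code_exec enum may differ.
    PYTHON = "python"
    NODEJS = "nodejs"
```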

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-11 15:32:04 +08:00
a060672b31 Feat: Run ESLint while the project is running to standardize everyone's code #9377 (#9379)
### What problem does this PR solve?

Feat: Run ESLint while the project is running to standardize everyone's
code #9377

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-11 15:31:38 +08:00
f022504ef9 Support Russian in UI: update config.ts (#9361)
Add the `ru` locale.

### What problem does this PR solve?

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2025-08-11 15:30:35 +08:00
1a78b8b295 Support Russian in UI (#9362)
### What problem does this PR solve?

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-11 14:06:18 +08:00
017dd85ccf Feat: Modify the agent list return field name #3221 (#9373)
### What problem does this PR solve?

Feat: Modify the agent list return field name #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-11 14:03:52 +08:00
4c7b2ef46e Feat: New search page components and features (#9344)
### What problem does this PR solve?

Feat: New search page components and features #3221

- Added search homepage, search settings, and ongoing search components
- Implemented features such as search app list, creating search apps,
and deleting search apps
- Optimized the multi-select component, adding disabled state and suffix
display
- Adjusted navigation hooks to support search page navigation

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-11 10:34:22 +08:00
597d88bf9a Doc: updated supported model name (#9343)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-08-11 10:05:39 +08:00
9b026fc5b6 Refa: code format. (#9342)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2025-08-08 18:44:42 +08:00
90eb5fd31b Fix: canvas sharing bug. (#9339)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-08 18:31:51 +08:00
b9eeb8e64f Docs: Update version references to v0.20.1 in READMEs and docs (#9335)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.20.0 to v0.20.1
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2025-08-08 18:17:25 +08:00
4c99988c3e Revert: revert token_required decorator of agent_bot completions and inputs (#9332)
### What problem does this PR solve?

Revert token_required decorator of agent_bot completions and inputs.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2025-08-08 17:45:53 +08:00
4f2e9ef248 feat(agent): Adds prologue functionality (#9336)
### What problem does this PR solve?

feat(agent): Adds prologue functionality #3221

- Add a prologue field to the IInputs type
- Initialize the prologue state in the chat container
- Use useEffect to monitor prologue changes and add prologue responses
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-08 17:45:37 +08:00
4a3871090d Docs: v0.20.1 release notes (#9331)
### What problem does this PR solve?



### Type of change


- [x] Documentation Update
2025-08-08 17:35:12 +08:00
7ce64cb265 Update the SQL assistant workflow (#9329)
### What problem does this PR solve?
Update the SQL assistant workflow.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-08 17:06:37 +08:00
d102a6bb71 New workflow templates: choose your knowledge base (#9325)
### What problem does this PR solve?

New agent templates: you can choose your knowledge base; both workflow
and agent versions are provided.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-08 17:06:16 +08:00
a02ca16260 Fix: add prologue to api. (#9322)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-08 17:05:55 +08:00
cd3bb0ed7c Feat: Set the description of the agent, which can be null #3221 (#9327)
### What problem does this PR solve?

Feat: Set the description of the agent, which can be null #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-08 16:44:08 +08:00
86fb710e52 Feat: Add xai logo #1853 (#9321)
### What problem does this PR solve?

Feat: Add xai logo #1853

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-08 14:13:19 +08:00
7713e14d6a Update chat_model.py (#9318)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/9317
Based on
https://discuss.ai.google.dev/t/valueerror-invalid-operation-the-response-text-quick-accessor-requires-the-response-to-contain-a-valid-part-but-none-were-returned/42866
this should be handled by a retry.
### Type of change

- [x] Refactoring
2025-08-08 14:13:07 +08:00
392f5f4ce9 fix model type (#9250)
### What problem does this PR solve?
Fixes an incorrect model type.

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-08 13:43:53 +08:00
79481becea Feat: supports GPT-5 (#9320)
### What problem does this PR solve?

Supports GPT-5.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-08 11:54:40 +08:00
58a64000ea Feat: Render agent setting dialog #3221 (#9312)
### What problem does this PR solve?

Feat: Render agent setting dialog #3221
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-08 11:00:55 +08:00
1bd64dafcb Fix: update broken agent completion due to v0.20.0 changes (#9309)
### What problem does this PR solve?

Update broken agent completion due to v0.20.0 changes. #9199

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-08 10:00:16 +08:00
07354f4a1a Contribute a new workflow template: SQL Assistant (#9311)
### What problem does this PR solve?

Contribute a new workflow template: SQL Assistant

### Type of change

- [x] Other (please describe): new workflow template
2025-08-07 18:06:49 +08:00
d628234942 Feat: Restore the button's background color #3221 (#9307)
### What problem does this PR solve?

Feat: Restore the button's background color #3221
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-07 17:37:53 +08:00
5749aa30b0 Fix: model type error. (#9308)
### What problem does this PR solve?

#9240

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-07 16:14:47 +08:00
a2e1f5618d Fix: bytes style image issue. (#9304)
### What problem does this PR solve?

#9302

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-07 15:20:01 +08:00
dc48c3863d Feat: Replace color variables according to design draft #3221 (#9305)
### What problem does this PR solve?

Feat: Replace color variables according to design draft #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-07 15:19:45 +08:00
23062cb27a Feat: Configure colors according to the design draft#3221 (#9301)
### What problem does this PR solve?

Feat: Configure colors according to the design draft#3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-07 13:59:33 +08:00
63c2f5b821 Fix: virtual file cannot be displayed in KB (#9282)
### What problem does this PR solve?

Fix virtual file cannot be displayed in KB. #9265

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-07 11:08:03 +08:00
0a0bfc02a0 Refactor: naive_merge_with_images closes unused images (#9296)
### What problem does this PR solve?

naive_merge_with_images now closes images that are no longer needed (see the sketch below).
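A minimal illustrative sketch of the idea with Pillow; the function and data shape are ours, not the actual patch:

```python
from PIL import Image

def merge_sections(sections: list[tuple[str, Image.Image | None]]) -> str:
    # Once the texts are merged, image objects that will not be used any
    # further are closed to release buffers and file handles.
    merged = "\n".join(text for text, _ in sections)
    for _, img in sections:
        if img is not None:
            img.close()
    return merged
```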

### Type of change

- [x] Refactoring
2025-08-07 11:07:29 +08:00
f0c34d4454 Feat: Render chat page #3221 (#9298)
### What problem does this PR solve?

Feat: Render chat page #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-07 11:07:15 +08:00
7c719f8365 fix: Optimized popups and the search page (#9297)
### What problem does this PR solve?

fix: Optimized popups and the search page #3221 
- Added a new PortalModal component
- Refactored the Modal component, adding show and hide methods to
support popups
- Updated the search page, adding a new query function and optimizing
the search card style
- Localized, added search-related translations

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-07 11:07:04 +08:00
4fc9e42e74 fix: add missing env vars and default values of service_conf.yaml (#9289)
### What problem does this PR solve?

Add missing env var `MYSQL_MAX_PACKET` to service_conf.yaml.template,
and add default values to opendal config to fix npe.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-07 10:41:05 +08:00
35539092d0 Add **kwargs to model base class constructors (#9252)
Updated constructors for base and derived classes in chat, embedding,
rerank, sequence2txt, and tts models to accept **kwargs. This change
improves extensibility and allows passing additional parameters without
breaking existing interfaces.
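A minimal sketch of the pattern; the class names are illustrative:

```python
class BaseChatModel:
    def __init__(self, key: str, model_name: str, **kwargs):
        # Unknown keyword arguments are accepted (and ignored here) so that
        # callers can pass provider-specific options without breaking the
        # shared constructor signature.
        self.key = key
        self.model_name = model_name

class OpenAICompatibleChat(BaseChatModel):
    def __init__(self, key: str, model_name: str, base_url: str = "", **kwargs):
        super().__init__(key, model_name, **kwargs)
        self.base_url = base_url

# `timeout` is absorbed by **kwargs instead of raising a TypeError.
mdl = OpenAICompatibleChat("sk-...", "gpt-4o", base_url="https://api.example.com", timeout=30)
```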

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: IT: Sop.Son <sop.son@feavn.local>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-07 09:45:37 +08:00
581a54fbbb Feat: Search conversation by name #3221 (#9283)
### What problem does this PR solve?

Feat: Search conversation by name #3221
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-07 09:41:03 +08:00
9ca86d801e Refa: add provider info while adding model. (#9273)
### What problem does this PR solve?
#9248

### Type of change

- [x] Refactoring
2025-08-07 09:40:42 +08:00
fb0426419e Feat: Create a conversation #3221 (#9269)
### What problem does this PR solve?

Feat: Create a conversation #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-06 11:42:40 +08:00
1409bb30df Refactor: avoid re-decoding the base64 test image on every call (#9264)
### What problem does this PR solve?

Improve the logic so that the test image is not base64-decoded each time
(see the sketch below).
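A minimal sketch of decode-once caching; the base64 payload here is just the PNG magic bytes as a placeholder:

```python
import base64

_TEST_IMAGE_B64 = "iVBORw0KGgo="  # placeholder: PNG magic bytes only
_test_image_bytes: bytes | None = None

def get_test_image() -> bytes:
    # Decode lazily on first use, then reuse the cached bytes.
    global _test_image_bytes
    if _test_image_bytes is None:
        _test_image_bytes = base64.b64decode(_TEST_IMAGE_B64)
    return _test_image_bytes
```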

### Type of change

- [x] Refactoring
- [x] Performance Improvement

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-06 11:42:25 +08:00
7efeaf6548 Fix: remove an image close call that cannot be performed (#9267)
### What problem does this PR solve?


https://github.com/infiniflow/ragflow/issues/9149#issuecomment-3157129587

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-06 10:59:49 +08:00
46a35f44da Feat: add Claude Opus 4.1 (#9268)
### What problem does this PR solve?

Add Claude Opus 4.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2025-08-06 10:57:03 +08:00
a7eba61067 FIX: If chunk["content_with_weight"] contains one or more unpaired surrogate characters (such as incomplete emoji or other special characters), then calling .encode("utf-8") directly will raise a UnicodeEncodeError. (#9246)
FIX: If chunk["content_with_weight"] contains one or more unpaired
surrogate characters (such as incomplete emoji or other special
characters), then calling .encode("utf-8") directly will raise a
UnicodeEncodeError.
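A minimal sketch of a tolerant encode; the helper name is illustrative, and the actual patch may use a different error policy:

```python
def safe_utf8(text: str) -> bytes:
    # A lone surrogate (e.g. half of a split emoji) makes a plain
    # text.encode("utf-8") raise UnicodeEncodeError; errors="ignore"
    # silently drops the unpairable code units instead.
    return text.encode("utf-8", errors="ignore")

broken = "ok \ud83d"  # unpaired high surrogate
assert safe_utf8(broken) == b"ok "
```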

### What problem does this PR solve?
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-06 10:36:50 +08:00
465f7e036a Feat: advanced list dialogs (#9256)
### What problem does this PR solve?

Advanced list dialogs

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-06 10:33:52 +08:00
7a27d5e463 Feat: Added history management and paste handling features #3221 (#9266)
### What problem does this PR solve?

feat(agent): Added history management and paste handling features #3221

- Added a PasteHandlerPlugin to handle paste operations, optimizing the
multi-line text pasting experience
- Implemented the AgentHistoryManager class to manage history,
supporting undo and redo functionality
- Integrated history management functionality into the Agent component

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-06 10:29:44 +08:00
6a0d6d2565 Added French language support (#9173)
### What problem does this PR solve?
Implemented French UI translation

### Type of change
- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: ramin cedric <>
Co-authored-by: Liu An <asiro@qq.com>
2025-08-06 10:22:32 +08:00
f359f2c44e Docs: fixed errors (#9259)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-08-05 21:29:46 +08:00
9295c23170 Update readme (#9260)
### What problem does this PR solve?

Update readme

### Type of change

- [x] Documentation Update

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-08-05 20:27:43 +08:00
023b090fa4 Fix: template error. (#9258)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 19:52:59 +08:00
2124329e95 Fix: local variable issue. (#9255)
### What problem does this PR solve?

#9227

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 19:24:34 +08:00
ed9757b0c7 Feat: Limit the appearance of loops in operators in the agent canvas #3221 (#9253)
### What problem does this PR solve?
Feat: Limit the appearance of loops in operators in the agent canvas
#3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-05 19:21:24 +08:00
f235a38225 Fix: resolve the prompt problem of Customer Support Workflow (#9251)
### What problem does this PR solve?


### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 18:19:17 +08:00
550e65bb22 Fix: PlainParser using fix in presentation (#9239)
### What problem does this PR solve?

A tiny fix to the usage of `deepdoc.pdf_parser.PlainParser` in
`rag.app.presentation.chunk`; I referred to how this class is used
elsewhere. The fix is so small that opening an issue seemed unnecessary.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 17:48:18 +08:00
a264c629b5 Feat: Render dialog list #3221 (#9249)
### What problem does this PR solve?

Feat: Render dialog list #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-05 17:47:44 +08:00
e6bad45c6d Fix: update broken agent OpenAI-Compatible completion due to v0.20.0 changes (#9241)
### What problem does this PR solve?

Update broken agent OpenAI-Compatible completion due to v0.20.0 changes. #9199

Usage example:

**Referencing the input is important; otherwise, the output will be
empty.**

<img width="1273" height="711" alt="Image"
src="https://github.com/user-attachments/assets/30740be8-f4d6-400d-9fda-d2616f89063f"
/>

<img width="622" height="247" alt="Image"
src="https://github.com/user-attachments/assets/0a2ca57a-9600-4cec-9362-0cafd0ab3aee"
/>

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 17:47:25 +08:00
0a303d9ae1 Refactor: Improve the chat stream logic for NvidiaCV (#9242)
### What problem does this PR solve?

Improve the chat stream logic for NvidiaCV

### Type of change


- [x] Refactoring
2025-08-05 17:47:00 +08:00
98a83543e8 Fix: fix mismatch of assistant message and its reference (#9233)
### What problem does this PR solve?

#9232

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

1. When creating a new session, initialize an empty reference that
includes both the app API and the SDK API.
2. Fix the logic when retrieving references for historical messages: the
number of dialogue messages and reference messages may differ, but it
should match the number of assistant messages.

Co-authored-by: Li Ye <liye@unittec.com>
2025-08-05 14:32:39 +08:00
afd3a508e5 Fix: Set the maximum number of rounds for the agent to 1 #3221 (#9238)
### What problem does this PR solve?

Fix: Fixed the issue where numbers could not be displayed in the numeric
input box under white theme #3221
Fix: Set the maximum number of rounds for the agent to 1 #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-05 14:32:06 +08:00
1deb0a2d42 Fix: local variable 'response' referenced before assignment (#9230)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9227

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-05 11:00:06 +08:00
dd055deee9 Docs: Updated tips for max rounds (#9235)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-08-05 10:59:37 +08:00
a249803961 Refa: ensure Redis stream queue could be created properly (#9223)
### What problem does this PR solve?

Ensure Redis queue could be created properly.
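A minimal sketch of robust stream-group creation with redis-py; the connection details are assumptions:

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance

def ensure_stream_group(stream: str, group: str) -> None:
    # XGROUP CREATE with mkstream=True creates the stream if it is missing;
    # a BUSYGROUP reply only means the group already exists, which is fine.
    try:
        r.xgroup_create(name=stream, groupname=group, id="0", mkstream=True)
    except redis.exceptions.ResponseError as e:
        if "BUSYGROUP" not in str(e):
            raise
```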

### Type of change

- [x] Refactoring
2025-08-05 09:54:31 +08:00
6ec3f18e22 Fix: self-deployed LLM error (#9217)
### What problem does this PR solve?

Close #9197
Close #9145

### Type of change

- [x] Refactoring
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 09:49:47 +08:00
7724acbadb Perf Impr: Decouple reasoning and extraction for faster, more precise logic (#9191)
### What problem does this PR solve?

This commit refactors the core prompts to decouple the high-level
reasoning from the low-level information extraction. By making
REASON_PROMPT a dedicated strategist that only generates search queries
and re-tasking RELEVANT_EXTRACTION_PROMPT to be a specialized tool for
single-fact extraction, we eliminate redundant information
summarization. This clear separation of concerns makes the overall
reasoning process significantly faster and more precise, as each
component now has a single, well-defined responsibility.

### Type of change

- [x] Performance Improvement
2025-08-05 09:36:14 +08:00
a36ba95c1c Fix: Add prompt text to the form in the MCP module (#9222)
### What problem does this PR solve?

Fix: Add prompt text to the form in the MCP module #3221

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 09:26:59 +08:00
30ccc4a66c Fix: correct single base64 image handling in image prompt (#9220)
### What problem does this PR solve?

Correct single base64 image handling in image prompt.
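A minimal sketch of the normalization idea; the helper is illustrative, not the actual patch:

```python
def normalize_images(images) -> list[str]:
    # A single base64 string and a list of strings are both accepted;
    # downstream prompt-building code can then always iterate a list.
    if isinstance(images, str):
        return [images]
    return list(images or [])

assert normalize_images("aGVsbG8=") == ["aGVsbG8="]
assert normalize_images(["a", "b"]) == ["a", "b"]
assert normalize_images(None) == []
```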


![img_v3_02or_ec4757c2-a9d4-4774-9a76-f7c6be633ebg](https://github.com/user-attachments/assets/872a86bf-e2a8-48d1-9b71-2a0c7a35ba9e)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 09:26:42 +08:00
dda5a0080a Fix: Fixed the issue where the agent's chat box could not automatically scroll to the bottom #3221 (#9219)
### What problem does this PR solve?

Fix: Fixed the issue where the agent's chat box could not automatically
scroll to the bottom #3221

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 09:26:15 +08:00
9db999ccae v0.20.0 release notes (#9218)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-08-04 18:07:53 +08:00
5f5c6a7990 Fix: Fixed the loss of Await Response function on the share page and other style issues #3221 (#9216)
### What problem does this PR solve?

Fix: Fixed the loss of Await Response function on the share page and
other style issues #3221

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 18:06:56 +08:00
53618d13bb Fix: Fixed the issue where the prompt word edit box had no scroll bar #3221 (#9215)
### What problem does this PR solve?
Fix: Fixed the issue where the prompt word edit box had no scroll bar
#3221

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 18:06:19 +08:00
60d652d2e1 Feat: list documents supports range filtering (#9214)
### What problem does this PR solve?

list_document supports range filtering.
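A hypothetical sketch of calling the HTTP API with range filters; the range parameter names below are illustrative assumptions, not the documented ones:

```python
import requests

def list_documents(base_url: str, api_key: str, dataset_id: str,
                   create_time_from: int | None = None,
                   create_time_to: int | None = None) -> dict:
    # Assumed range-filter parameters; check the API reference for the
    # actual names and value formats.
    params = {}
    if create_time_from is not None:
        params["create_time_from"] = create_time_from
    if create_time_to is not None:
        params["create_time_to"] = create_time_to
    resp = requests.get(
        f"{base_url}/api/v1/datasets/{dataset_id}/documents",
        headers={"Authorization": f"Bearer {api_key}"},
        params=params,
    )
    resp.raise_for_status()
    return resp.json()
```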

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-04 16:35:35 +08:00
448bdda73d Fix: Web Server Accepts Invalid Data That Could Cause Problems in uv.lock (#8966)
**Context and Purpose:**

This PR automatically remediates a security vulnerability:
- **Description:** h11: h11 accepts some malformed Chunked-Encoding
bodies
- **Rule ID:** CVE-2025-43859
- **Severity:** CRITICAL
- **File:** uv.lock
- **Lines Affected:** None - None

This change is necessary to protect the application from potential
security risks associated with this vulnerability.

**Solution Implemented:**

The automated remediation process has applied the necessary changes to
the affected code in `uv.lock` to resolve the identified issue.

Please review the changes to ensure they are correct and integrate as
expected.
2025-08-04 16:09:15 +08:00
26b85a10d1 Feat: New Agent startup parameters add knowledge base parameter #9194 (#9210)
### What problem does this PR solve?

Feat: New Agent startup parameters add knowledge base parameter #9194

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-04 16:08:41 +08:00
cae11201ef Fix "out of memory" when slide.get_thumbnail() renders a huge image (#9211)
### What problem does this PR solve?

Fix "out of memory" when slide.get_thumbnail() renders a huge image.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 16:08:24 +08:00
6ad8b54754 fix "TypeError: '<' not supported between instances of 'Emu' and 'NoneType'" (#9209)

### What problem does this PR solve?

fix "TypeError: '<' not supported between instances of 'Emu' and
'NoneType'"

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 16:07:03 +08:00
83aca2d07b fix #8424 NPE in dify_retrieval.py, add log exception (#9212)
### What problem does this PR solve?

Fix #8424: NPE in dify_retrieval.py, and log the exception.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 15:36:31 +08:00
34f829e1b1 docs(agent): Correct several spelling errors, such as: Ouline -> Outline (#9188)
### What problem does this PR solve?

Correct several spelling errors, such as: Ouline -> Outline

### Type of change

- [x] Documentation Update
2025-08-04 14:53:32 +08:00
52a349349d Fix: migrate deprecated Langfuse API from v2 to v3 (#9204)
### What problem does this PR solve?

Fix:

```bash
'Langfuse' object has no attribute 'trace'
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 14:45:43 +08:00
45bf294117 Refactor: support config strong test (#9198)
### What problem does this PR solve?


https://github.com/infiniflow/ragflow/issues/9189#issuecomment-3148920950

### Type of change
- [x] Refactoring

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-04 13:54:18 +08:00
667c5812d0 Fix: Repeated images when parsing markdown files with images (#9196)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9149

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 13:35:58 +08:00
30e9212db9 Fix: enlarge the timeout limits. (#9201)
### What problem does this PR solve?

#9189

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 13:34:34 +08:00
e9cbf4611d Fix:Error when parsing files using Gemini: **ERROR**: GENERIC_ERROR - Unknown field for GenerationConfig: max_tokens (#9195)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/9177
The reason is that Gemini internally uses a different parameter name:
```
max_output_tokens (int):
    Optional. The maximum number of tokens to include in a
    response candidate.

    Note: The default value varies by model, see the
    ``Model.output_token_limit`` attribute of the ``Model``
    returned from the ``getModel`` function.

    This field is a member of `oneof`_ ``_max_output_tokens``.
```
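A minimal sketch of the key rename before calling Gemini; the helper name is illustrative:

```python
def to_gemini_gen_conf(gen_conf: dict) -> dict:
    # Gemini's GenerationConfig expects `max_output_tokens`, not the
    # OpenAI-style `max_tokens`, so rename the key before the request.
    conf = dict(gen_conf)
    if "max_tokens" in conf:
        conf["max_output_tokens"] = conf.pop("max_tokens")
    return conf

assert to_gemini_gen_conf({"max_tokens": 512}) == {"max_output_tokens": 512}
```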
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 10:06:09 +08:00
d4b1d163dd Fix: list tags api by using tenant id instead of user id (#9103)
### What problem does this PR solve?

The index name of the tag chunks is generated from the tenant id of the
knowledge base, so it should use the tenant id instead of the current
user id in the listing tags API.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 09:57:00 +08:00
fca94509e8 Feat: Add the migration script and its doc, added backup as default… (#8245)
### What problem does this PR solve?

This PR adds a data backup and migration solution for RAGFlow Docker
Compose deployments. Currently, users lack a standardized way to backup
and restore RAGFlow data volumes (MySQL, MinIO, Redis, Elasticsearch),
which is essential for data safety and environment migration.

**Solution:**
- **Migration Script** (`docker/migration.sh`) - Automates
backup/restore operations for all RAGFlow data volumes
- **Documentation**
(`docs/guides/migration/migrate_from_docker_compose.md`) - Usage guide
and best practices
- **Safety Features** - Container conflict detection and user
confirmations to prevent data loss

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update

Co-authored-by: treedy <treedy2022@icloud.com>
2025-08-04 09:43:43 +08:00
518 changed files with 24251 additions and 5677 deletions

View File

@ -0,0 +1,46 @@
name: "❤️‍🔥ᴬᴳᴱᴺᵀ Agent scenario request"
description: Propose an agent scenario request for RAGFlow.
title: "[Agent Scenario Request]: "
labels: ["❤️‍🔥ᴬᴳᴱᴺᵀ agent scenario"]
body:
- type: checkboxes
attributes:
label: Self Checks
description: "Please check the following in order to be responded in time :)"
options:
- label: I have searched for existing issues [search for existing issues](https://github.com/infiniflow/ragflow/issues), including closed ones.
required: true
- label: I confirm that I am using English to submit this report ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
required: true
- label: Non-English title submissions will be closed directly ( 非英文标题的提交将会被直接关闭 ) ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
required: true
- label: "Please do not modify this template :) and fill in all the required fields."
required: true
- type: textarea
attributes:
label: Is your feature request related to a scenario?
description: |
A clear and concise description of what the scenario is. Ex. I'm always frustrated when [...]
render: Markdown
validations:
required: false
- type: textarea
attributes:
label: Describe the feature you'd like
description: A clear and concise description of what you want to happen.
validations:
required: true
- type: textarea
attributes:
label: Documentation, adoption, use case
description: If you can, explain some scenarios in which users might use this and situations where it would be helpful. Any API designs, mockups, or diagrams are also helpful.
render: Markdown
validations:
required: false
- type: textarea
attributes:
label: Additional information
description: |
Add any other context or screenshots about the feature request here.
validations:
required: false

.gitignore (vendored): 2 changed lines
View File

@ -193,3 +193,5 @@ dist
# SvelteKit build / generate output
.svelte-kit
# Default backup dir
backup

.trivyignore (new file): 15 changed lines
View File

@ -0,0 +1,15 @@
**/*.md
**/*.min.js
**/*.min.css
**/*.svg
**/*.png
**/*.jpg
**/*.jpeg
**/*.gif
**/*.woff
**/*.woff2
**/*.map
**/*.webp
**/*.ico
**/*.ttf
**/*.eot

View File

@ -22,7 +22,7 @@
<img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
</a>
<a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.0">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
</a>
<a href="https://github.com/infiniflow/ragflow/releases/latest">
<img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -87,7 +87,9 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
## 🔥 Latest Updates
- 2025-08-01 Supports agentic workflow.
- 2025-08-08 Supports OpenAI's latest GPT-5 series models.
- 2025-08-04 Supports new models, including Kimi K2 and Grok 4.
- 2025-08-01 Supports agentic workflow and MCP.
- 2025-05-23 Adds a Python/JavaScript code executor component to Agent.
- 2025-05-05 Supports cross-language query.
- 2025-03-19 Supports using a multi-modal model to make sense of images within PDF or DOCX files.
@ -188,7 +190,7 @@ releases! 🌟
> All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
> If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.
> The command below downloads the `v0.20.0-slim` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.20.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.0` for the full edition `v0.20.0`.
> The command below downloads the `v0.20.4-slim` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.20.4-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4` for the full edition `v0.20.4`.
```bash
$ cd ragflow/docker
@ -201,8 +203,8 @@ releases! 🌟
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
|-------------------|-----------------|-----------------------|--------------------------|
| v0.20.0 | &approx;9 | :heavy_check_mark: | Stable release |
| v0.20.0-slim | &approx;2 | ❌ | Stable release |
| v0.20.4 | &approx;9 | :heavy_check_mark: | Stable release |
| v0.20.4-slim | &approx;2 | ❌ | Stable release |
| nightly | &approx;9 | :heavy_check_mark: | _Unstable_ nightly build |
| nightly-slim | &approx;2 | ❌ | _Unstable_ nightly build |

View File

@ -22,7 +22,7 @@
<img alt="Lencana Daring" src="https://img.shields.io/badge/Online-Demo-4e6b99">
</a>
<a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.0">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
</a>
<a href="https://github.com/infiniflow/ragflow/releases/latest">
<img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Rilis%20Terbaru" alt="Rilis Terbaru">
@ -80,7 +80,9 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
## 🔥 Pembaruan Terbaru
- 2025-08-01 Mendukung Alur Kerja agen.
- 2025-08-08 Mendukung model seri GPT-5 terbaru dari OpenAI.
- 2025-08-04 Mendukung model baru, termasuk Kimi K2 dan Grok 4.
- 2025-08-01 Mendukung alur kerja agen dan MCP.
- 2025-05-23 Menambahkan komponen pelaksana kode Python/JS ke Agen.
- 2025-05-05 Mendukung kueri lintas bahasa.
- 2025-03-19 Mendukung penggunaan model multi-modal untuk memahami gambar di dalam file PDF atau DOCX.
@ -179,7 +181,7 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
> Semua gambar Docker dibangun untuk platform x86. Saat ini, kami tidak menawarkan gambar Docker untuk ARM64.
> Jika Anda menggunakan platform ARM64, [silakan gunakan panduan ini untuk membangun gambar Docker yang kompatibel dengan sistem Anda](https://ragflow.io/docs/dev/build_docker_image).
> Perintah di bawah ini mengunduh edisi v0.20.0-slim dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.20.0-slim, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server. Misalnya, atur RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.0 untuk edisi lengkap v0.20.0.
> Perintah di bawah ini mengunduh edisi v0.20.4-slim dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.20.4-slim, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server. Misalnya, atur RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4 untuk edisi lengkap v0.20.4.
```bash
$ cd ragflow/docker
@ -192,8 +194,8 @@ $ docker compose -f docker-compose.yml up -d
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
| ----------------- | --------------- | --------------------- | ------------------------ |
| v0.20.0 | &approx;9 | :heavy_check_mark: | Stable release |
| v0.20.0-slim | &approx;2 | ❌ | Stable release |
| v0.20.4 | &approx;9 | :heavy_check_mark: | Stable release |
| v0.20.4-slim | &approx;2 | ❌ | Stable release |
| nightly | &approx;9 | :heavy_check_mark: | _Unstable_ nightly build |
| nightly-slim | &approx;2 | ❌ | _Unstable_ nightly build |

View File

@ -22,7 +22,7 @@
<img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
</a>
<a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.0">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
</a>
<a href="https://github.com/infiniflow/ragflow/releases/latest">
<img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -60,7 +60,9 @@
## 🔥 最新情報
- 2025-08-01 エージェントワークフローをサポートします。
- 2025-08-08 OpenAI の最新 GPT-5 シリーズモデルをサポートします。
- 2025-08-04 新モデル、キミK2およびGrok 4をサポート。
- 2025-08-01 エージェントワークフローとMCPをサポート。
- 2025-05-23 エージェントに Python/JS コードエグゼキュータコンポーネントを追加しました。
- 2025-05-05 言語間クエリをサポートしました。
- 2025-03-19 PDFまたはDOCXファイル内の画像を理解するために、多モーダルモデルを使用することをサポートします。
@ -158,7 +160,7 @@
> 現在、公式に提供されているすべての Docker イメージは x86 アーキテクチャ向けにビルドされており、ARM64 用の Docker イメージは提供されていません。
> ARM64 アーキテクチャのオペレーティングシステムを使用している場合は、[このドキュメント](https://ragflow.io/docs/dev/build_docker_image)を参照して Docker イメージを自分でビルドしてください。
> 以下のコマンドは、RAGFlow Docker イメージの v0.20.0-slim エディションをダウンロードします。異なる RAGFlow エディションの説明については、以下の表を参照してください。v0.20.0-slim とは異なるエディションをダウンロードするには、docker/.env ファイルの RAGFLOW_IMAGE 変数を適宜更新し、docker compose を使用してサーバーを起動してください。例えば、完全版 v0.20.0 をダウンロードするには、RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.0 と設定します。
> 以下のコマンドは、RAGFlow Docker イメージの v0.20.4-slim エディションをダウンロードします。異なる RAGFlow エディションの説明については、以下の表を参照してください。v0.20.4-slim とは異なるエディションをダウンロードするには、docker/.env ファイルの RAGFLOW_IMAGE 変数を適宜更新し、docker compose を使用してサーバーを起動してください。例えば、完全版 v0.20.4 をダウンロードするには、RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4 と設定します。
```bash
$ cd ragflow/docker
@ -171,8 +173,8 @@
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
| ----------------- | --------------- | --------------------- | ------------------------ |
| v0.20.0 | &approx;9 | :heavy_check_mark: | Stable release |
| v0.20.0-slim | &approx;2 | ❌ | Stable release |
| v0.20.4 | &approx;9 | :heavy_check_mark: | Stable release |
| v0.20.4-slim | &approx;2 | ❌ | Stable release |
| nightly | &approx;9 | :heavy_check_mark: | _Unstable_ nightly build |
| nightly-slim | &approx;2 | ❌ | _Unstable_ nightly build |

View File

@ -22,7 +22,7 @@
<img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
</a>
<a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.0">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
</a>
<a href="https://github.com/infiniflow/ragflow/releases/latest">
<img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -60,7 +60,9 @@
## 🔥 업데이트
- 2025-08-01 에이전트 워크플로를 지원합니다.
- 2025-08-08 OpenAI의 최신 GPT-5 시리즈 모델을 지원합니다.
- 2025-08-04 새로운 모델인 Kimi K2와 Grok 4를 포함하여 지원합니다.
- 2025-08-01 에이전트 워크플로우와 MCP를 지원합니다.
- 2025-05-23 Agent에 Python/JS 코드 실행기 구성 요소를 추가합니다.
- 2025-05-05 언어 간 쿼리를 지원합니다.
- 2025-03-19 PDF 또는 DOCX 파일 내의 이미지를 이해하기 위해 다중 모드 모델을 사용하는 것을 지원합니다.
@ -158,7 +160,7 @@
> 모든 Docker 이미지는 x86 플랫폼을 위해 빌드되었습니다. 우리는 현재 ARM64 플랫폼을 위한 Docker 이미지를 제공하지 않습니다.
> ARM64 플랫폼을 사용 중이라면, [시스템과 호환되는 Docker 이미지를 빌드하려면 이 가이드를 사용해 주세요](https://ragflow.io/docs/dev/build_docker_image).
> 아래 명령어는 RAGFlow Docker 이미지의 v0.20.0-slim 버전을 다운로드합니다. 다양한 RAGFlow 버전에 대한 설명은 다음 표를 참조하십시오. v0.20.0-slim과 다른 RAGFlow 버전을 다운로드하려면, docker/.env 파일에서 RAGFLOW_IMAGE 변수를 적절히 업데이트한 후 docker compose를 사용하여 서버를 시작하십시오. 예를 들어, 전체 버전인 v0.20.0을 다운로드하려면 RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.0로 설정합니다.
> 아래 명령어는 RAGFlow Docker 이미지의 v0.20.4-slim 버전을 다운로드합니다. 다양한 RAGFlow 버전에 대한 설명은 다음 표를 참조하십시오. v0.20.4-slim과 다른 RAGFlow 버전을 다운로드하려면, docker/.env 파일에서 RAGFLOW_IMAGE 변수를 적절히 업데이트한 후 docker compose를 사용하여 서버를 시작하십시오. 예를 들어, 전체 버전인 v0.20.4을 다운로드하려면 RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4로 설정합니다.
```bash
$ cd ragflow/docker
@ -171,8 +173,8 @@
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
| ----------------- | --------------- | --------------------- | ------------------------ |
| v0.20.0 | &approx;9 | :heavy_check_mark: | Stable release |
| v0.20.0-slim | &approx;2 | ❌ | Stable release |
| v0.20.4 | &approx;9 | :heavy_check_mark: | Stable release |
| v0.20.4-slim | &approx;2 | ❌ | Stable release |
| nightly | &approx;9 | :heavy_check_mark: | _Unstable_ nightly build |
| nightly-slim | &approx;2 | ❌ | _Unstable_ nightly build |

View File

@ -22,7 +22,7 @@
<img alt="Badge Estático" src="https://img.shields.io/badge/Online-Demo-4e6b99">
</a>
<a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.0">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
</a>
<a href="https://github.com/infiniflow/ragflow/releases/latest">
<img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Última%20Relese" alt="Última Versão">
@ -80,7 +80,9 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
## 🔥 Últimas Atualizações
- 01-08-2025 Suporta o fluxo de trabalho agêntico.
- 08-08-2025 Suporta a mais recente série GPT-5 da OpenAI.
- 04-08-2025 Suporta novos modelos, incluindo Kimi K2 e Grok 4.
- 01-08-2025 Suporta fluxo de trabalho agente e MCP.
- 23-05-2025 Adicione o componente executor de código Python/JS ao Agente.
- 05-05-2025 Suporte a consultas entre idiomas.
- 19-03-2025 Suporta o uso de um modelo multi-modal para entender imagens dentro de arquivos PDF ou DOCX.
@ -178,7 +180,7 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
> Todas as imagens Docker são construídas para plataformas x86. Atualmente, não oferecemos imagens Docker para ARM64.
> Se você estiver usando uma plataforma ARM64, por favor, utilize [este guia](https://ragflow.io/docs/dev/build_docker_image) para construir uma imagem Docker compatível com o seu sistema.
> O comando abaixo baixa a edição `v0.20.0-slim` da imagem Docker do RAGFlow. Consulte a tabela a seguir para descrições de diferentes edições do RAGFlow. Para baixar uma edição do RAGFlow diferente da `v0.20.0-slim`, atualize a variável `RAGFLOW_IMAGE` conforme necessário no **docker/.env** antes de usar `docker compose` para iniciar o servidor. Por exemplo: defina `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.0` para a edição completa `v0.20.0`.
> O comando abaixo baixa a edição `v0.20.4-slim` da imagem Docker do RAGFlow. Consulte a tabela a seguir para descrições de diferentes edições do RAGFlow. Para baixar uma edição do RAGFlow diferente da `v0.20.4-slim`, atualize a variável `RAGFLOW_IMAGE` conforme necessário no **docker/.env** antes de usar `docker compose` para iniciar o servidor. Por exemplo: defina `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4` para a edição completa `v0.20.4`.
```bash
$ cd ragflow/docker
@ -191,8 +193,8 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
| Tag da imagem RAGFlow | Tamanho da imagem (GB) | Possui modelos de incorporação? | Estável? |
| --------------------- | ---------------------- | ------------------------------- | ------------------------ |
| v0.20.0 | ~9 | :heavy_check_mark: | Lançamento estável |
| v0.20.0-slim | ~2 | ❌ | Lançamento estável |
| v0.20.4 | ~9 | :heavy_check_mark: | Lançamento estável |
| v0.20.4-slim | ~2 | ❌ | Lançamento estável |
| nightly | ~9 | :heavy_check_mark: | _Instável_ build noturno |
| nightly-slim | ~2 | ❌ | _Instável_ build noturno |

View File

@ -22,7 +22,7 @@
<img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
</a>
<a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.0">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
</a>
<a href="https://github.com/infiniflow/ragflow/releases/latest">
<img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -83,7 +83,9 @@
## 🔥 近期更新
- 2025-08-01 支援 agentic workflow
- 2025-08-08 支援 OpenAI 最新的 GPT-5 系列模型。
- 2025-08-04 支援 Kimi K2 和 Grok 4 等模型.
- 2025-08-01 支援 agentic workflow 和 MCP
- 2025-05-23 為 Agent 新增 Python/JS 程式碼執行器元件。
- 2025-05-05 支援跨語言查詢。
- 2025-03-19 PDF和DOCX中的圖支持用多模態大模型去解析得到描述.
@ -181,7 +183,7 @@
> 所有 Docker 映像檔都是為 x86 平台建置的。目前,我們不提供 ARM64 平台的 Docker 映像檔。
> 如果您使用的是 ARM64 平台,請使用 [這份指南](https://ragflow.io/docs/dev/build_docker_image) 來建置適合您系統的 Docker 映像檔。
> 執行以下指令會自動下載 RAGFlow slim Docker 映像 `v0.20.0-slim`。請參考下表查看不同 Docker 發行版的說明。如需下載不同於 `v0.20.0-slim` 的 Docker 映像,請在執行 `docker compose` 啟動服務之前先更新 **docker/.env** 檔案內的 `RAGFLOW_IMAGE` 變數。例如,你可以透過設定 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.0` 來下載 RAGFlow 鏡像的 `v0.20.0` 完整發行版。
> 執行以下指令會自動下載 RAGFlow slim Docker 映像 `v0.20.4-slim`。請參考下表查看不同 Docker 發行版的說明。如需下載不同於 `v0.20.4-slim` 的 Docker 映像,請在執行 `docker compose` 啟動服務之前先更新 **docker/.env** 檔案內的 `RAGFLOW_IMAGE` 變數。例如,你可以透過設定 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4` 來下載 RAGFlow 鏡像的 `v0.20.4` 完整發行版。
```bash
$ cd ragflow/docker
@ -194,8 +196,8 @@
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
| ----------------- | --------------- | --------------------- | ------------------------ |
| v0.20.0 | &approx;9 | :heavy_check_mark: | Stable release |
| v0.20.0-slim | &approx;2 | ❌ | Stable release |
| v0.20.4 | &approx;9 | :heavy_check_mark: | Stable release |
| v0.20.4-slim | &approx;2 | ❌ | Stable release |
| nightly | &approx;9 | :heavy_check_mark: | _Unstable_ nightly build |
| nightly-slim | &approx;2 | ❌ | _Unstable_ nightly build |

View File

@ -22,7 +22,7 @@
<img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
</a>
<a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.0">
<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
</a>
<a href="https://github.com/infiniflow/ragflow/releases/latest">
<img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -83,7 +83,9 @@
## 🔥 近期更新
- 2025-08-01 支持 agentic workflow。
- 2025-08-08 支持 OpenAI 最新的 GPT-5 系列模型.
- 2025-08-04 新增对 Kimi K2 和 Grok 4 等模型的支持.
- 2025-08-01 支持 agentic workflow 和 MCP。
- 2025-05-23 Agent 新增 Python/JS 代码执行器组件。
- 2025-05-05 支持跨语言查询。
- 2025-03-19 PDF 和 DOCX 中的图支持用多模态大模型去解析得到描述.
@ -181,7 +183,7 @@
> 请注意,目前官方提供的所有 Docker 镜像均基于 x86 架构构建,并不提供基于 ARM64 的 Docker 镜像。
> 如果你的操作系统是 ARM64 架构,请参考[这篇文档](https://ragflow.io/docs/dev/build_docker_image)自行构建 Docker 镜像。
> 运行以下命令会自动下载 RAGFlow slim Docker 镜像 `v0.20.0-slim`。请参考下表查看不同 Docker 发行版的描述。如需下载不同于 `v0.20.0-slim` 的 Docker 镜像,请在运行 `docker compose` 启动服务之前先更新 **docker/.env** 文件内的 `RAGFLOW_IMAGE` 变量。比如,你可以通过设置 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.0` 来下载 RAGFlow 镜像的 `v0.20.0` 完整发行版。
> 运行以下命令会自动下载 RAGFlow slim Docker 镜像 `v0.20.4-slim`。请参考下表查看不同 Docker 发行版的描述。如需下载不同于 `v0.20.4-slim` 的 Docker 镜像,请在运行 `docker compose` 启动服务之前先更新 **docker/.env** 文件内的 `RAGFLOW_IMAGE` 变量。比如,你可以通过设置 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4` 来下载 RAGFlow 镜像的 `v0.20.4` 完整发行版。
```bash
$ cd ragflow/docker
@ -194,8 +196,8 @@
| RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
| ----------------- | --------------- | --------------------- | ------------------------ |
| v0.20.0 | &approx;9 | :heavy_check_mark: | Stable release |
| v0.20.0-slim | &approx;2 | ❌ | Stable release |
| v0.20.4 | &approx;9 | :heavy_check_mark: | Stable release |
| v0.20.4-slim | &approx;2 | ❌ | Stable release |
| nightly | &approx;9 | :heavy_check_mark: | _Unstable_ nightly build |
| nightly-slim | &approx;2 | ❌ | _Unstable_ nightly build |

View File

@ -131,7 +131,16 @@ class Canvas:
self.path = self.dsl["path"]
self.history = self.dsl["history"]
self.globals = self.dsl["globals"]
if "globals" in self.dsl:
self.globals = self.dsl["globals"]
else:
self.globals = {
"sys.query": "",
"sys.user_id": "",
"sys.conversation_turns": 0,
"sys.files": []
}
self.retrieval = self.dsl["retrieval"]
self.memory = self.dsl.get("memory", [])
@ -417,7 +426,7 @@ class Canvas:
convs = []
if window_size <= 0:
return convs
for role, obj in self.history[window_size * -1:]:
for role, obj in self.history[window_size * -2:]:
if isinstance(obj, dict):
convs.append({"role": role, "content": obj.get("content", "")})
else:
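The widened slice reads twice as many raw entries. `self.history` stores user and assistant messages as separate `(role, obj)` tuples, so slicing by `-window_size` alone presumably covered only half a window of conversational turns. An illustration under that assumption:

```python
# Illustrative only: one (role, obj) entry per message means a window
# of N turns spans 2 * N entries.
history = [
    ("user", {"content": "q1"}), ("assistant", {"content": "a1"}),
    ("user", {"content": "q2"}), ("assistant", {"content": "a2"}),
]
window_size = 2
assert len(history[window_size * -1:]) == 2   # old slice: one turn only
assert len(history[window_size * -2:]) == 4   # new slice: two full turns
```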
@ -460,6 +469,9 @@ class Canvas:
def get_prologue(self):
return self.components["begin"]["obj"]._param.prologue
def get_mode(self):
return self.components["begin"]["obj"]._param.mode
def set_global_param(self, **kwargs):
self.globals.update(kwargs)
@ -484,7 +496,7 @@ class Canvas:
threads.append(exe.submit(FileService.parse, file["name"], FileService.get_blob(file["created_by"], file["id"]), True, file["created_by"]))
return [th.result() for th in threads]
def tool_use_callback(self, agent_id: str, func_name: str, params: dict, result: Any):
def tool_use_callback(self, agent_id: str, func_name: str, params: dict, result: Any, elapsed_time=None):
agent_ids = agent_id.split("-->")
agent_name = self.get_component_name(agent_ids[0])
path = agent_name if len(agent_ids) < 2 else agent_name+"-->"+"-->".join(agent_ids[1:])
@ -493,16 +505,16 @@ class Canvas:
if bin:
obj = json.loads(bin.encode("utf-8"))
if obj[-1]["component_id"] == agent_ids[0]:
obj[-1]["trace"].append({"path": path, "tool_name": func_name, "arguments": params, "result": result})
obj[-1]["trace"].append({"path": path, "tool_name": func_name, "arguments": params, "result": result, "elapsed_time": elapsed_time})
else:
obj.append({
"component_id": agent_ids[0],
"trace": [{"path": path, "tool_name": func_name, "arguments": params, "result": result}]
"trace": [{"path": path, "tool_name": func_name, "arguments": params, "result": result, "elapsed_time": elapsed_time}]
})
else:
obj = [{
"component_id": agent_ids[0],
"trace": [{"path": path, "tool_name": func_name, "arguments": params, "result": result}]
"trace": [{"path": path, "tool_name": func_name, "arguments": params, "result": result, "elapsed_time": elapsed_time}]
}]
REDIS_CONN.set_obj(f"{self.task_id}-{self.message_id}-logs", obj, 60*10)
except Exception as e:
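With `elapsed_time` threaded through the callback, every trace record persisted under the `{task_id}-{message_id}-logs` Redis key now carries a per-call duration next to the tool name and arguments. The record shape, reconstructed from the diff (values are purely illustrative):

```python
# Assumed shape of a single trace record after this change -- the keys
# are copied from the diff; the values below are made up.
trace_entry = {
    "path": "Research Agent-->sub-agent-id",
    "tool_name": "Retrieval",
    "arguments": {"query": "..."},
    "result": "...",
    "elapsed_time": 1.42,  # seconds; None when the caller did not time the call
}
```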


@ -22,9 +22,10 @@ from functools import partial
from typing import Any
import json_repair
from timeit import default_timer as timer
from agent.tools.base import LLMToolPluginCallSession, ToolParamBase, ToolBase, ToolMeta
from api.db.services.llm_service import LLMBundle, TenantLLMService
from api.db.services.llm_service import LLMBundle
from api.db.services.tenant_llm_service import TenantLLMService
from api.db.services.mcp_server_service import MCPServerService
from api.utils.api_utils import timeout
from rag.prompts import message_fit_in
@ -165,7 +166,7 @@ class Agent(LLM, ToolBase):
_, msg = message_fit_in([{"role": "system", "content": prompt}, *msg], int(self.chat_mdl.max_length * 0.97))
use_tools = []
ans = ""
for delta_ans, tk in self._react_with_tools_streamly(msg, use_tools):
for delta_ans, tk in self._react_with_tools_streamly(prompt, msg, use_tools):
ans += delta_ans
if ans.find("**ERROR**") >= 0:
@ -185,7 +186,7 @@ class Agent(LLM, ToolBase):
_, msg = message_fit_in([{"role": "system", "content": prompt}, *msg], int(self.chat_mdl.max_length * 0.97))
answer_without_toolcall = ""
use_tools = []
for delta_ans,_ in self._react_with_tools_streamly(msg, use_tools):
for delta_ans,_ in self._react_with_tools_streamly(prompt, msg, use_tools):
if delta_ans.find("**ERROR**") >= 0:
if self.get_exception_default_value():
self.set_output("content", self.get_exception_default_value())
@ -208,20 +209,21 @@ class Agent(LLM, ToolBase):
]):
yield delta_ans
def _react_with_tools_streamly(self, history: list[dict], use_tools):
def _react_with_tools_streamly(self, prompt, history: list[dict], use_tools):
token_count = 0
tool_metas = self.tool_meta
hist = deepcopy(history)
last_calling = ""
if len(hist) > 3:
st = timer()
user_request = full_question(messages=history, chat_mdl=self.chat_mdl)
self.callback("Multi-turn conversation optimization", {}, user_request)
self.callback("Multi-turn conversation optimization", {}, user_request, elapsed_time=timer()-st)
else:
user_request = history[-1]["content"]
def use_tool(name, args):
nonlocal hist, use_tools, token_count,last_calling,user_request
print(f"{last_calling=} == {name=}", )
logging.info(f"{last_calling=} == {name=}")
# Summary of function calling
#if all([
# isinstance(self.toolcall_session.get_tool_obj(name), Agent),
@ -243,7 +245,7 @@ class Agent(LLM, ToolBase):
def complete():
nonlocal hist
need2cite = self._canvas.get_reference()["chunks"] and self._id.find("-->") < 0
need2cite = self._param.cite and self._canvas.get_reference()["chunks"] and self._id.find("-->") < 0
cited = False
if hist[0]["role"] == "system" and need2cite:
if len(hist) < 7:
@ -262,12 +264,13 @@ class Agent(LLM, ToolBase):
if not need2cite or cited:
return
st = timer()
txt = ""
for delta_ans in self._gen_citations(entire_txt):
yield delta_ans, 0
txt += delta_ans
self.callback("gen_citations", {}, txt)
self.callback("gen_citations", {}, txt, elapsed_time=timer()-st)
def append_user_content(hist, content):
if hist[-1]["role"] == "user":
@ -275,8 +278,9 @@ class Agent(LLM, ToolBase):
else:
hist.append({"role": "user", "content": content})
task_desc = analyze_task(self.chat_mdl, user_request, tool_metas)
self.callback("analyze_task", {}, task_desc)
st = timer()
task_desc = analyze_task(self.chat_mdl, prompt, user_request, tool_metas)
self.callback("analyze_task", {}, task_desc, elapsed_time=timer()-st)
for _ in range(self._param.max_rounds + 1):
response, tk = next_step(self.chat_mdl, hist, tool_metas, task_desc)
# self.callback("next_step", {}, str(response)[:256]+"...")
@ -302,9 +306,10 @@ class Agent(LLM, ToolBase):
thr.append(executor.submit(use_tool, name, args))
st = timer()
reflection = reflect(self.chat_mdl, hist, [th.result() for th in thr])
append_user_content(hist, reflection)
self.callback("reflection", {}, str(reflection))
self.callback("reflection", {}, str(reflection), elapsed_time=timer()-st)
except Exception as e:
logging.exception(msg=f"Wrong JSON argument format in LLM ReAct response: {e}")
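The same instrumentation pattern recurs throughout this hunk: `full_question`, `analyze_task`, `reflect`, and `_gen_citations` are each wrapped in a timer whose duration is forwarded to the progress callback. A hedged sketch of the pattern (`callback` mirrors the signature shown above; `fn` stands in for any of the timed helpers):

```python
from timeit import default_timer as timer

# Sketch only: wrap a helper call in a monotonic timer and forward the
# duration to the progress callback, as the diff does at each step.
def timed_step(callback, label, fn, *args, **kwargs):
    st = timer()
    result = fn(*args, **kwargs)
    callback(label, {}, str(result), elapsed_time=timer() - st)
    return result
```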


@ -36,7 +36,7 @@ _IS_RAW_CONF = "_is_raw_conf"
class ComponentParamBase(ABC):
def __init__(self):
self.message_history_window_size = 22
self.message_history_window_size = 13
self.inputs = {}
self.outputs = {}
self.description = ""
@ -479,7 +479,7 @@ class ComponentBase(ABC):
def get_input_elements_from_text(self, txt: str) -> dict[str, dict[str, str]]:
res = {}
for r in re.finditer(self.variable_ref_patt, txt, flags=re.IGNORECASE):
for r in re.finditer(self.variable_ref_patt, txt, flags=re.IGNORECASE|re.DOTALL):
exp = r.group(1)
cpn_id, var_nm = exp.split("@") if exp.find("@")>0 else ("", exp)
res[exp] = {
@ -529,8 +529,12 @@ class ComponentBase(ABC):
@staticmethod
def string_format(content: str, kv: dict[str, str]) -> str:
for n, v in kv.items():
def repl(_match, val=v):
return str(val) if val is not None else ""
content = re.sub(
r"\{%s\}" % re.escape(n), v, content
r"\{%s\}" % re.escape(n),
repl,
content
)
return content
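Two regex fixes land in this file. Adding `re.DOTALL` presumably lets a variable reference whose expression spans newlines still match, since `.` otherwise stops at `\n`. More subtly, `string_format` now hands `re.sub` a replacement callable instead of the raw value: a plain replacement string has its backslash escapes and group references interpreted, and `None` is rejected outright. A sketch of the failure mode, with illustrative names:

```python
import re

template = "Summary: {answer}"
value = r"C:\new\table"   # contains "\n" and "\t" escape sequences

# Plain-string replacement: re.sub interprets the backslashes, so the
# output is silently corrupted (and a value like r"\1" raises re.error).
broken = re.sub(r"\{answer\}", value, template)

# Callable replacement: the return value is inserted literally, and
# None can be normalized to "" first, matching repl() above.
fixed = re.sub(r"\{answer\}", lambda m: value, template)
assert fixed == "Summary: " + value
```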


@ -39,7 +39,10 @@ class Begin(UserFillUp):
def _invoke(self, **kwargs):
for k, v in kwargs.get("inputs", {}).items():
if isinstance(v, dict) and v.get("type", "").lower().find("file") >=0:
v = self._canvas.get_files([v["value"]])
if v.get("optional") and v.get("value", None) is None:
v = None
else:
v = self._canvas.get_files([v["value"]])
else:
v = v.get("value")
self.set_output(k, v)
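Condensing the new control flow into a helper makes the intent clearer: an optional file input that was left unfilled now resolves to `None` rather than being forwarded as `get_files([None])`. A sketch mirroring the diff (field names taken from it):

```python
# Hedged sketch of the guard added to Begin._invoke.
def resolve_input(canvas, v):
    if isinstance(v, dict) and v.get("type", "").lower().find("file") >= 0:
        if v.get("optional") and v.get("value") is None:
            return None            # optional file input left empty
        return canvas.get_files([v["value"]])
    return v.get("value")
```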


@ -57,7 +57,7 @@ class Invoke(ComponentBase, ABC):
def _invoke(self, **kwargs):
args = {}
for para in self._param.variables:
if para.get("value") is not None:
if para.get("value"):
args[para["key"]] = para["value"]
else:
args[para["key"]] = self._canvas.get_variable_value(para["ref"])
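Note the semantic shift in this condition: under the truthiness check, falsy literals such as an empty string or `0` no longer win; they now fall through to `get_variable_value(para["ref"])`. A two-line illustration with a made-up parameter:

```python
# Illustrative only: how the two checks treat a falsy literal value.
para = {"key": "limit", "value": "", "ref": "begin@limit"}

old_uses_literal = para.get("value") is not None   # True  -> "" sent as-is
new_uses_literal = bool(para.get("value"))         # False -> resolved via ref
```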
@ -139,4 +139,4 @@ class Invoke(ComponentBase, ABC):
assert False, self.output()
def thoughts(self) -> str:
return "Waiting for the server respond..."
return "Waiting for the server respond..."


@ -17,14 +17,12 @@ import json
import logging
import os
import re
from typing import Any
from typing import Any, Generator
import json_repair
from copy import deepcopy
from functools import partial
from api.db import LLMType
from api.db.services.llm_service import LLMBundle, TenantLLMService
from api.db.services.llm_service import LLMBundle
from api.db.services.tenant_llm_service import TenantLLMService
from agent.component.base import ComponentBase, ComponentParamBase
from api.utils.api_utils import timeout
from rag.prompts import message_fit_in, citation_prompt
@ -129,7 +127,7 @@ class LLM(ComponentBase):
args = {}
vars = self.get_input_elements() if not self._param.debug_inputs else self._param.debug_inputs
prompt = self._param.sys_prompt
sys_prompt = self._param.sys_prompt
for k, o in vars.items():
args[k] = o["value"]
if not isinstance(args[k], str):
@ -140,21 +138,25 @@ class LLM(ComponentBase):
self.set_input_value(k, args[k])
msg = self._canvas.get_history(self._param.message_history_window_size)[:-1]
msg.extend(deepcopy(self._param.prompts))
prompt = self.string_format(prompt, args)
for p in self._param.prompts:
if msg and msg[-1]["role"] == p["role"]:
continue
msg.append(p)
sys_prompt = self.string_format(sys_prompt, args)
for m in msg:
m["content"] = self.string_format(m["content"], args)
if self._canvas.get_reference()["chunks"]:
prompt += citation_prompt()
if self._param.cite and self._canvas.get_reference()["chunks"]:
sys_prompt += citation_prompt()
return prompt, msg
return sys_prompt, msg
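The added loop replaces the blanket `extend` with a role-deduplicating append; a minimal illustration of the case it guards against, assuming (as the check suggests) that two consecutive same-role messages are undesirable because some chat APIs reject or silently merge them:

```python
# Sketch of the role check added above.
msg = [{"role": "user", "content": "latest user turn"}]
prompts = [{"role": "user", "content": "configured user prompt"}]
for p in prompts:
    if msg and msg[-1]["role"] == p["role"]:
        continue   # skipped: appending would produce user, user
    msg.append(p)
assert len(msg) == 1
```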
def _generate(self, msg:list[dict], **kwargs) -> str:
if not self.imgs:
return self.chat_mdl.chat(msg[0]["content"], msg[1:], self._param.gen_conf(), **kwargs)
return self.chat_mdl.chat(msg[0]["content"], msg[1:], self._param.gen_conf(), images=self.imgs, **kwargs)
def _generate_streamly(self, msg:list[dict], **kwargs) -> str:
def _generate_streamly(self, msg:list[dict], **kwargs) -> Generator[str, None, None]:
ans = ""
last_idx = 0
endswith_think = False


@ -54,6 +54,8 @@ class Message(ComponentBase):
if k in kwargs:
continue
v = v["value"]
if not v:
v = ""
ans = ""
if isinstance(v, partial):
for t in v():
@ -94,6 +96,8 @@ class Message(ComponentBase):
continue
v = self._canvas.get_variable_value(exp)
if not v:
v = ""
if isinstance(v, partial):
cnt = ""
for t in v():

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long


@ -89,11 +89,11 @@
"presence_penalty": 0.4,
"prompts": [
{
"content": "{sys.query}",
"content": "The user query is {sys.query}\n\nThe relevant document are {Retrieval:ShyPumasJoke@formalized_content}",
"role": "user"
}
],
"sys_prompt": "You are a highly professional product information advisor. \n\nYour only mission is to provide accurate, factual, and structured answers to all product-related queries.\n\nAbsolutely no assumptions, guesses, or fabricated content are allowed. \n\n**Key Principles:**\n\n1. **Strict Database Reliance:** \n\n - Every answer must be based solely on the verified product information stored in the database accessed through the Retrieval tool. \n\n - You are NOT allowed to invent, speculate, or infer details beyond what is retrieved. \n\n - If you cannot find relevant data, respond with: *\"I cannot find this information in our official product database. Please check back later or provide more details for further search.\"*\n\n2. **Information Accuracy and Structure:** \n\n - Provide information in a clear, concise, and professional way. \n\n - Use bullet points or numbered lists if there are multiple key points (e.g., features, price, warranty, technical specifications). \n\n - Always specify the version or model number when applicable to avoid confusion.\n\n3. **Tone and Style:** \n\n - Maintain a polite, professional, and helpful tone at all times. \n\n - Avoid marketing exaggeration or promotional language; stay strictly factual. \n\n - Do not express personal opinions; only cite official product data.\n\n4. **User Guidance:** \n\n - If the user\u2019s query is unclear or too broad, politely request clarification or guide them to provide more specific product details (e.g., product name, model, version). \n\n - Example: *\"Could you please specify the product model or category so I can retrieve the most relevant information for you?\"*\n\n5. **Response Length and Formatting:** \n\n - Keep each answer within 100\u2013150 words for general queries. \n\n - For complex or multi-step explanations, you may extend to 200\u2013250 words, but always remain clear and well-structured.\n\n6. **Critical Reminder:** \n\nYour authority and reliability depend entirely on database-driven responses. Any fabricated, speculative, or unverified content will be considered a critical failure of your role.\n\nAlways begin processing a query by accessing the Retrieval tool, confirming the data source, and then structuring your response according to the above principles.\n\n",
"sys_prompt": "You are a highly professional product information advisor. \n\nYour only mission is to provide accurate, factual, and structured answers to all product-related queries.\n\nAbsolutely no assumptions, guesses, or fabricated content are allowed. \n\n**Key Principles:**\n\n1. **Strict Database Reliance:** \n\n - Every answer must be based solely on the verified product information stored in the relevant documen.\n\n - You are NOT allowed to invent, speculate, or infer details beyond what is retrieved. \n\n - If you cannot find relevant data, respond with: *\"I cannot find this information in our official product database. Please check back later or provide more details for further search.\"*\n\n2. **Information Accuracy and Structure:** \n\n - Provide information in a clear, concise, and professional way. \n\n - Use bullet points or numbered lists if there are multiple key points (e.g., features, price, warranty, technical specifications). \n\n - Always specify the version or model number when applicable to avoid confusion.\n\n3. **Tone and Style:** \n\n - Maintain a polite, professional, and helpful tone at all times. \n\n - Avoid marketing exaggeration or promotional language; stay strictly factual. \n\n - Do not express personal opinions; only cite official product data.\n\n4. **User Guidance:** \n\n - If the user\u2019s query is unclear or too broad, politely request clarification or guide them to provide more specific product details (e.g., product name, model, version). \n\n - Example: *\"Could you please specify the product model or category so I can retrieve the most relevant information for you?\"*\n\n5. **Response Length and Formatting:** \n\n - Keep each answer within 100\u2013150 words for general queries. \n\n - For complex or multi-step explanations, you may extend to 200\u2013250 words, but always remain clear and well-structured.\n\n6. **Critical Reminder:** \n\nYour authority and reliability depend entirely on the relevant document responses. Any fabricated, speculative, or unverified content will be considered a critical failure of your role.\n\n\n",
"temperature": 0.1,
"temperatureEnabled": true,
"tools": [],
@ -699,7 +699,7 @@
"width": 200
},
"position": {
"x": 644.5771854408022,
"x": 645.6873721057459,
"y": 516.6923702571407
},
"selected": false,
@ -735,11 +735,11 @@
"presence_penalty": 0.4,
"prompts": [
{
"content": "{sys.query}",
"content": "The user query is {sys.query}\n\nThe relevant document are {Retrieval:ShyPumasJoke@formalized_content}",
"role": "user"
}
],
"sys_prompt": "You are a highly professional product information advisor. \n\nYour only mission is to provide accurate, factual, and structured answers to all product-related queries.\n\nAbsolutely no assumptions, guesses, or fabricated content are allowed. \n\n**Key Principles:**\n\n1. **Strict Database Reliance:** \n\n - Every answer must be based solely on the verified product information stored in the database accessed through the Retrieval tool. \n\n - You are NOT allowed to invent, speculate, or infer details beyond what is retrieved. \n\n - If you cannot find relevant data, respond with: *\"I cannot find this information in our official product database. Please check back later or provide more details for further search.\"*\n\n2. **Information Accuracy and Structure:** \n\n - Provide information in a clear, concise, and professional way. \n\n - Use bullet points or numbered lists if there are multiple key points (e.g., features, price, warranty, technical specifications). \n\n - Always specify the version or model number when applicable to avoid confusion.\n\n3. **Tone and Style:** \n\n - Maintain a polite, professional, and helpful tone at all times. \n\n - Avoid marketing exaggeration or promotional language; stay strictly factual. \n\n - Do not express personal opinions; only cite official product data.\n\n4. **User Guidance:** \n\n - If the user\u2019s query is unclear or too broad, politely request clarification or guide them to provide more specific product details (e.g., product name, model, version). \n\n - Example: *\"Could you please specify the product model or category so I can retrieve the most relevant information for you?\"*\n\n5. **Response Length and Formatting:** \n\n - Keep each answer within 100\u2013150 words for general queries. \n\n - For complex or multi-step explanations, you may extend to 200\u2013250 words, but always remain clear and well-structured.\n\n6. **Critical Reminder:** \n\nYour authority and reliability depend entirely on database-driven responses. Any fabricated, speculative, or unverified content will be considered a critical failure of your role.\n\nAlways begin processing a query by accessing the Retrieval tool, confirming the data source, and then structuring your response according to the above principles.\n\n",
"sys_prompt": "You are a highly professional product information advisor. \n\nYour only mission is to provide accurate, factual, and structured answers to all product-related queries.\n\nAbsolutely no assumptions, guesses, or fabricated content are allowed. \n\n**Key Principles:**\n\n1. **Strict Database Reliance:** \n\n - Every answer must be based solely on the verified product information stored in the relevant documen.\n\n - You are NOT allowed to invent, speculate, or infer details beyond what is retrieved. \n\n - If you cannot find relevant data, respond with: *\"I cannot find this information in our official product database. Please check back later or provide more details for further search.\"*\n\n2. **Information Accuracy and Structure:** \n\n - Provide information in a clear, concise, and professional way. \n\n - Use bullet points or numbered lists if there are multiple key points (e.g., features, price, warranty, technical specifications). \n\n - Always specify the version or model number when applicable to avoid confusion.\n\n3. **Tone and Style:** \n\n - Maintain a polite, professional, and helpful tone at all times. \n\n - Avoid marketing exaggeration or promotional language; stay strictly factual. \n\n - Do not express personal opinions; only cite official product data.\n\n4. **User Guidance:** \n\n - If the user\u2019s query is unclear or too broad, politely request clarification or guide them to provide more specific product details (e.g., product name, model, version). \n\n - Example: *\"Could you please specify the product model or category so I can retrieve the most relevant information for you?\"*\n\n5. **Response Length and Formatting:** \n\n - Keep each answer within 100\u2013150 words for general queries. \n\n - For complex or multi-step explanations, you may extend to 200\u2013250 words, but always remain clear and well-structured.\n\n6. **Critical Reminder:** \n\nYour authority and reliability depend entirely on the relevant document responses. Any fabricated, speculative, or unverified content will be considered a critical failure of your role.\n\n\n",
"temperature": 0.1,
"temperatureEnabled": true,
"tools": [],


@ -170,7 +170,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role": "user"
}
],
@ -250,7 +250,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role": "user"
}
],
@ -602,7 +602,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role": "user"
}
],
@ -715,7 +715,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role": "user"
}
],

File diff suppressed because one or more lines are too long


@ -0,0 +1,327 @@
{
"id": 20,
"title": "Report Agent Using Knowledge Base",
"description": "A report generation assistant using local knowledge base, with advanced capabilities in task planning, reasoning, and reflective analysis. Recommended for academic research paper Q&A",
"canvas_type": "Agent",
"dsl": {
"components": {
"Agent:NewPumasLick": {
"downstream": [
"Message:OrangeYearsShine"
],
"obj": {
"component_name": "Agent",
"params": {
"delay_after_error": 1,
"description": "",
"exception_comment": "",
"exception_default_value": "",
"exception_goto": [],
"exception_method": null,
"frequencyPenaltyEnabled": false,
"frequency_penalty": 0.5,
"llm_id": "qwen3-235b-a22b-instruct-2507@Tongyi-Qianwen",
"maxTokensEnabled": true,
"max_retries": 3,
"max_rounds": 3,
"max_tokens": 128000,
"mcp": [],
"message_history_window_size": 12,
"outputs": {
"content": {
"type": "string",
"value": ""
}
},
"parameter": "Precise",
"presencePenaltyEnabled": false,
"presence_penalty": 0.5,
"prompts": [
{
"content": "# User Query\n {sys.query}",
"role": "user"
}
],
"sys_prompt": "## Role & Task\nYou are a **\u201cKnowledge Base Retrieval Q\\&A Agent\u201d** whose goal is to break down the user\u2019s question into retrievable subtasks, and then produce a multi-source-verified, structured, and actionable research report using the internal knowledge base.\n## Execution Framework (Detailed Steps & Key Points)\n1. **Assessment & Decomposition**\n * Actions:\n * Automatically extract: main topic, subtopics, entities (people/organizations/products/technologies), time window, geographic/business scope.\n * Output as a list: N facts/data points that must be collected (*N* ranges from 5\u201320 depending on question complexity).\n2. **Query Type Determination (Rule-Based)**\n * Example rules:\n * If the question involves a single issue but requests \u201cmethod comparison/multiple explanations\u201d \u2192 use **depth-first**.\n * If the question can naturally be split into \u22653 independent sub-questions \u2192 use **breadth-first**.\n * If the question can be answered by a single fact/specification/definition \u2192 use **simple query**.\n3. **Research Plan Formulation**\n * Depth-first: define 3\u20135 perspectives (methodology/stakeholders/time dimension/technical route, etc.), assign search keywords, target document types, and output format for each perspective.\n * Breadth-first: list subtasks, prioritize them, and assign search terms.\n * Simple query: directly provide the search sentence and required fields.\n4. **Retrieval Execution**\n * After retrieval: perform coverage check (does it contain the key facts?) and quality check (source diversity, authority, latest update time).\n * If standards are not met, automatically loop: rewrite queries (synonyms/cross-domain terms) and retry \u22643 times, or flag as requiring external search.\n5. **Integration & Reasoning**\n * Build the answer using a **fact\u2013evidence\u2013reasoning** chain. For each conclusion, attach 1\u20132 strongest pieces of evidence.\n---\n## Quality Gate Checklist (Verify at Each Stage)\n* **Stage 1 (Decomposition)**:\n * [ ] Key concepts and expected outputs identified\n * [ ] Required facts/data points listed\n* **Stage 2 (Retrieval)**:\n * [ ] Meets quality standards (see above)\n * [ ] If not met: execute query iteration\n* **Stage 3 (Generation)**:\n * [ ] Each conclusion has at least one direct evidence source\n * [ ] State assumptions/uncertainties\n * [ ] Provide next-step suggestions or experiment/retrieval plans\n * [ ] Final length and depth match user expectations (comply with word count/format if specified)\n---\n## Core Principles\n1. **Strict reliance on the knowledge base**: answers must be **fully bounded** by the content retrieved from the knowledge base.\n2. **No fabrication**: do not generate, infer, or create information that is not explicitly present in the knowledge base.\n3. **Accuracy first**: prefer incompleteness over inaccurate content.\n4. **Output format**:\n * Hierarchically clear modular structure\n * Logical grouping according to the MECE principle\n * Professionally presented formatting\n * Step-by-step cognitive guidance\n * Reasonable use of headings and dividers for clarity\n * *Italicize* key parameters\n * **Bold** critical information\n5. 
**LaTeX formula requirements**:\n * Inline formulas: start and end with `$`\n * Block formulas: start and end with `$$`, each `$$` on its own line\n * Block formula content must comply with LaTeX math syntax\n * Verify formula correctness\n---\n## Additional Notes (Interaction & Failure Strategy)\n* If the knowledge base does not cover critical facts: explicitly inform the user (with sample wording)\n* For time-sensitive issues: enforce time filtering in the search request, and indicate the latest retrieval date in the answer.\n* Language requirement: answer in the user\u2019s preferred language\n",
"temperature": "0.1",
"temperatureEnabled": true,
"tools": [
{
"component_name": "Retrieval",
"name": "Retrieval",
"params": {
"cross_languages": [],
"description": "",
"empty_response": "",
"kb_ids": [],
"keywords_similarity_weight": 0.7,
"outputs": {
"formalized_content": {
"type": "string",
"value": ""
}
},
"rerank_id": "",
"similarity_threshold": 0.2,
"top_k": 1024,
"top_n": 8,
"use_kg": false
}
}
],
"topPEnabled": false,
"top_p": 0.75,
"user_prompt": "",
"visual_files_var": ""
}
},
"upstream": [
"begin"
]
},
"Message:OrangeYearsShine": {
"downstream": [],
"obj": {
"component_name": "Message",
"params": {
"content": [
"{Agent:NewPumasLick@content}"
]
}
},
"upstream": [
"Agent:NewPumasLick"
]
},
"begin": {
"downstream": [
"Agent:NewPumasLick"
],
"obj": {
"component_name": "Begin",
"params": {
"enablePrologue": true,
"inputs": {},
"mode": "conversational",
"prologue": "\u4f60\u597d\uff01 \u6211\u662f\u4f60\u7684\u52a9\u7406\uff0c\u6709\u4ec0\u4e48\u53ef\u4ee5\u5e2e\u5230\u4f60\u7684\u5417\uff1f"
}
},
"upstream": []
}
},
"globals": {
"sys.conversation_turns": 0,
"sys.files": [],
"sys.query": "",
"sys.user_id": ""
},
"graph": {
"edges": [
{
"data": {
"isHovered": false
},
"id": "xy-edge__beginstart-Agent:NewPumasLickend",
"source": "begin",
"sourceHandle": "start",
"target": "Agent:NewPumasLick",
"targetHandle": "end"
},
{
"data": {
"isHovered": false
},
"id": "xy-edge__Agent:NewPumasLickstart-Message:OrangeYearsShineend",
"markerEnd": "logo",
"source": "Agent:NewPumasLick",
"sourceHandle": "start",
"style": {
"stroke": "rgba(91, 93, 106, 1)",
"strokeWidth": 1
},
"target": "Message:OrangeYearsShine",
"targetHandle": "end",
"type": "buttonEdge",
"zIndex": 1001
},
{
"data": {
"isHovered": false
},
"id": "xy-edge__Agent:NewPumasLicktool-Tool:AllBirdsNailend",
"selected": false,
"source": "Agent:NewPumasLick",
"sourceHandle": "tool",
"target": "Tool:AllBirdsNail",
"targetHandle": "end"
}
],
"nodes": [
{
"data": {
"form": {
"enablePrologue": true,
"inputs": {},
"mode": "conversational",
"prologue": "\u4f60\u597d\uff01 \u6211\u662f\u4f60\u7684\u52a9\u7406\uff0c\u6709\u4ec0\u4e48\u53ef\u4ee5\u5e2e\u5230\u4f60\u7684\u5417\uff1f"
},
"label": "Begin",
"name": "begin"
},
"dragging": false,
"id": "begin",
"measured": {
"height": 48,
"width": 200
},
"position": {
"x": -9.569875358221438,
"y": 205.84018385864917
},
"selected": false,
"sourcePosition": "left",
"targetPosition": "right",
"type": "beginNode"
},
{
"data": {
"form": {
"content": [
"{Agent:NewPumasLick@content}"
]
},
"label": "Message",
"name": "Response"
},
"dragging": false,
"id": "Message:OrangeYearsShine",
"measured": {
"height": 56,
"width": 200
},
"position": {
"x": 734.4061285881053,
"y": 199.9706031723009
},
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "messageNode"
},
{
"data": {
"form": {
"delay_after_error": 1,
"description": "",
"exception_comment": "",
"exception_default_value": "",
"exception_goto": [],
"exception_method": null,
"frequencyPenaltyEnabled": false,
"frequency_penalty": 0.5,
"llm_id": "qwen3-235b-a22b-instruct-2507@Tongyi-Qianwen",
"maxTokensEnabled": true,
"max_retries": 3,
"max_rounds": 3,
"max_tokens": 128000,
"mcp": [],
"message_history_window_size": 12,
"outputs": {
"content": {
"type": "string",
"value": ""
}
},
"parameter": "Precise",
"presencePenaltyEnabled": false,
"presence_penalty": 0.5,
"prompts": [
{
"content": "# User Query\n {sys.query}",
"role": "user"
}
],
"sys_prompt": "## Role & Task\nYou are a **\u201cKnowledge Base Retrieval Q\\&A Agent\u201d** whose goal is to break down the user\u2019s question into retrievable subtasks, and then produce a multi-source-verified, structured, and actionable research report using the internal knowledge base.\n## Execution Framework (Detailed Steps & Key Points)\n1. **Assessment & Decomposition**\n * Actions:\n * Automatically extract: main topic, subtopics, entities (people/organizations/products/technologies), time window, geographic/business scope.\n * Output as a list: N facts/data points that must be collected (*N* ranges from 5\u201320 depending on question complexity).\n2. **Query Type Determination (Rule-Based)**\n * Example rules:\n * If the question involves a single issue but requests \u201cmethod comparison/multiple explanations\u201d \u2192 use **depth-first**.\n * If the question can naturally be split into \u22653 independent sub-questions \u2192 use **breadth-first**.\n * If the question can be answered by a single fact/specification/definition \u2192 use **simple query**.\n3. **Research Plan Formulation**\n * Depth-first: define 3\u20135 perspectives (methodology/stakeholders/time dimension/technical route, etc.), assign search keywords, target document types, and output format for each perspective.\n * Breadth-first: list subtasks, prioritize them, and assign search terms.\n * Simple query: directly provide the search sentence and required fields.\n4. **Retrieval Execution**\n * After retrieval: perform coverage check (does it contain the key facts?) and quality check (source diversity, authority, latest update time).\n * If standards are not met, automatically loop: rewrite queries (synonyms/cross-domain terms) and retry \u22643 times, or flag as requiring external search.\n5. **Integration & Reasoning**\n * Build the answer using a **fact\u2013evidence\u2013reasoning** chain. For each conclusion, attach 1\u20132 strongest pieces of evidence.\n---\n## Quality Gate Checklist (Verify at Each Stage)\n* **Stage 1 (Decomposition)**:\n * [ ] Key concepts and expected outputs identified\n * [ ] Required facts/data points listed\n* **Stage 2 (Retrieval)**:\n * [ ] Meets quality standards (see above)\n * [ ] If not met: execute query iteration\n* **Stage 3 (Generation)**:\n * [ ] Each conclusion has at least one direct evidence source\n * [ ] State assumptions/uncertainties\n * [ ] Provide next-step suggestions or experiment/retrieval plans\n * [ ] Final length and depth match user expectations (comply with word count/format if specified)\n---\n## Core Principles\n1. **Strict reliance on the knowledge base**: answers must be **fully bounded** by the content retrieved from the knowledge base.\n2. **No fabrication**: do not generate, infer, or create information that is not explicitly present in the knowledge base.\n3. **Accuracy first**: prefer incompleteness over inaccurate content.\n4. **Output format**:\n * Hierarchically clear modular structure\n * Logical grouping according to the MECE principle\n * Professionally presented formatting\n * Step-by-step cognitive guidance\n * Reasonable use of headings and dividers for clarity\n * *Italicize* key parameters\n * **Bold** critical information\n5. 
**LaTeX formula requirements**:\n * Inline formulas: start and end with `$`\n * Block formulas: start and end with `$$`, each `$$` on its own line\n * Block formula content must comply with LaTeX math syntax\n * Verify formula correctness\n---\n## Additional Notes (Interaction & Failure Strategy)\n* If the knowledge base does not cover critical facts: explicitly inform the user (with sample wording)\n* For time-sensitive issues: enforce time filtering in the search request, and indicate the latest retrieval date in the answer.\n* Language requirement: answer in the user\u2019s preferred language\n",
"temperature": "0.1",
"temperatureEnabled": true,
"tools": [
{
"component_name": "Retrieval",
"name": "Retrieval",
"params": {
"cross_languages": [],
"description": "",
"empty_response": "",
"kb_ids": [],
"keywords_similarity_weight": 0.7,
"outputs": {
"formalized_content": {
"type": "string",
"value": ""
}
},
"rerank_id": "",
"similarity_threshold": 0.2,
"top_k": 1024,
"top_n": 8,
"use_kg": false
}
}
],
"topPEnabled": false,
"top_p": 0.75,
"user_prompt": "",
"visual_files_var": ""
},
"label": "Agent",
"name": "Knowledge Base Agent"
},
"dragging": false,
"id": "Agent:NewPumasLick",
"measured": {
"height": 84,
"width": 200
},
"position": {
"x": 347.00048227952215,
"y": 186.49109364794631
},
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "agentNode"
},
{
"data": {
"form": {
"description": "This is an agent for a specific task.",
"user_prompt": "This is the order you need to send to the agent."
},
"label": "Tool",
"name": "flow.tool_10"
},
"dragging": false,
"id": "Tool:AllBirdsNail",
"measured": {
"height": 48,
"width": 200
},
"position": {
"x": 220.24819746977118,
"y": 403.31576836482583
},
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "toolNode"
}
]
},
"history": [],
"memory": [],
"messages": [],
"path": [],
"retrieval": []
},
"avatar": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAADAAAAAwCAYAAABXAvmHAAAH0klEQVR4nO2ZC1BU1wGG/3uRp/IygG+DGK0GOjE1U6cxI4tT03Y0E+kENbaJbKpj60wzgNMwnTjuEtu0miGasY+0krI202kMVEnVxtoOLG00oVa0LajVBDcSEI0REFBgkZv/3GWXfdzdvctuHs7kmzmec9//d+45914XCXc4Xwjk1+59VJGGF7C5QAFSWBvgyWmWLl7IKiny6QNL173B5YjB84bOyrpKA4B1DLySdQpLKAiZGtZ7a/KMVoQJz6UfEZyhTWwaEBmssiLvCueu6BJg8EwFqGTTAC+uvNWC9w82sRWcux/JwaSHstjywcogRt4RG0KExwWG4QsVYCebKSwe3L5lR9OOWjyzfg2WL/0a1/jncO3b2FHxGnKeWYqo+Giu8UEMrWJKWBACPMY/DG+63txhvnKshUu+DF2/hayMDFRsL+VScDb++AVc6OjAuInxXPJl2tfnIikrzUyJMi7qQmLRhOEr2fOFbX/7P6STF7BqoWevfdij4NWGQfx+57OYO2sG1wSnsek8Nm15EU8sikF6ouelXz9ph7JwDqYt+5IIZaGEkauDIrH4wPBmhjexCSEws+VdVG1M4NIoj+2xYzBuJtavWcEl/VS8dggx/ZdQvcGzQwp+cxOXsu5RBQQMVkYJM4LA/Txh+ELFMWFVPARS5kFiabZdx8Olh7l17BzdvhzZmROhdJ3j6D/nIyBgOCMlLAgA9xmF4TMV4BSbrgnrLiBl5rOsRCRRbDUsBzQFiJjY91PCBj9w+yiP1lXWsTLAjc9YQGB9I8+Yx1oTiUWFvW9QgDo2PdASaDp/EQ8/sRnhcPTVcuTMncXwQQVESL9DidscaPW+QEtAICRu9PSxFTpJiePV8AI9AsTvXZBY/Pa+wJ9ApNApIILm8S5Y4QXXQwhYFH6csemDP4G3G5v579i5d04mknknQhDYS4HCrCVr/mC3D305KnbCEpvVIia5Onw6WaWw+KAl0Np+FUXbdiMcyoqfUoeRHoFrJ1uRtnBG1/9Mf/3LtElp+VwF2wcd7woJib1vUPwMH4GWQCQJJtBa/V9cPmFD8uQUpMdNGDhY8bNYrobh8acHu270/l0ImJWRt64Wn6WACN9z5gq2lXwPW8pfweT0icP/fH23vO9QLYq3/QKyLBmFQI3CUcT9NdESEEPItKsSN3r7MBaSJoxHWZERM6ZmMLy2gDP8/pd/og418dTL37hFSUpMUC5f+UiWZcnY9s5+ixCwUiCXx2iiJdDNx6f4pgkH8Q3lbxK7h8+enoHha1cRNdMp8axiHxo6+/5bVdk8DSROYIW1X7QEIom3wHD3gEf4vu1bVYEJZeWQ0zJQvmcfyiv2QZak6raG/QWfK4Ez9mTc5v8xPMJfuojoxXmIX/9DOMe+FCWbcHu4BJJ0YEwCx0824bFNW9HesB+CqYu+jepfPYcHF+aoPXS8sQl/+vU2bgmOU2C+qRc9/YrrPPbGBtzavd0nvCxLxui4pJrBm911PFwak4CYA80cj+JCAiGUzYkmxrSY4N2c3GLi6UEIFL/wRxxqkhmHnTEpDQcrfq6ea+hcE8bNy3GFzyq4H22HW1Kd4WMSkg1jmsSRpKj0Rzhy4gNUv/y8Gjrv8SJK3OWScA+fMn/ysVPPvTmeh6nh1TcxBUJ+jEaKYr7N36x7h+Edj0pB6+WrLokn87+BrTt/p4ZPzZ6MM7/8R2//h33vOcNzdwgBMwVMbGvySQmo4a0NqOZccU7YmGXLEfPQUlUid/XT6B8YdIU/99vjsPcOdEhDsfOd4QVCwKB8yp8SWuG1njbTl83DpMWz1PCKAswuWPDI0e8WebyAJBbxNdrF7cls+hBpAb3h3XtehL/3+4u7D35rQwpP4YFTwMJ91rHpQyQFQgmf9sAMNL9Ur4afv/FBjIuPVj+n4YVTwMD96tj0IVICoYYXv/q1VJ1Sl8UveQyaRwErvOB6B5SwKhqP00gI6A0vhsycJ7/KIzxhyHqGN0ADbnNAAYOicRfCFdAb/p50Gbfuc/wy5w1D5lOghk0fuG0USlgVr7sQjoDe8C8WxKGKPy2KjzlvAQb02/sCbh+FApngX1QUtyeSuwDi0hxFByV7L+LIf3r5kvpp4PBr07Hqvn71Y85bgOG6WS2ggA1+4D6eUKKQApVsqngI6KSkqh9HzsoM/3zg8Oz5VQ9E8wjf30YFDGdkeAsCwH18oYRZGXk7C4HuYxcwe6rjQsFovzaEvoFxqNkTOPzMjGikJso8wsF77XYkLx6dAwxWxvBmBIH7aUMJi8J3w0DnTVz7dyvX6KPzVBt+kL8cmzesRq9ps2Z48bRJmOIapS7E4zM2lXNt5CcU6ID7+ocSZkqY2NRN6ysnsHbJEpR8ZwV6t5Yg+iuLELf2KVd48VwXQf3BQGUMb4ZOuH9gKFEIYJfiNrEDcXZHHV4q3YRv5i7ikgM94RlETNgihrcgBHhccCiRCf7VhBK5rAPyr9I/Y/WKPEyfksH/9NjQ2dODhsYzwcLXsypkeBtCRGLRDUUMAMyKHxEx4dtrzyP97nQMygripiQiKi4aSbPvQmKW7+OXF69ntYvBa1iPCYklZEZECsGm4ja0Ops7EJsaj4SprlU+8IJiqIjAFga3Ikx4vvAYkTGALxyWFArlsnbBC9Sz6mI5zWKNRGh3JJY7mjte4GOz+r4tkRbxQQAAAABJRU5ErkJggg=="
}


@ -169,7 +169,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role": "user"
}
],
@ -249,7 +249,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role": "user"
}
],
@ -601,7 +601,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role": "user"
}
],
@ -714,7 +714,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role": "user"
}
],
@ -912,4 +912,4 @@
"retrieval": []
},
"avatar": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/4gHYSUNDX1BST0ZJTEUAAQEAAAHIAAAAAAQwAABtbnRyUkdCIFhZWiAH4AABAAEAAAAAAABhY3NwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAA9tYAAQAAAADTLQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAlkZXNjAAAA8AAAACRyWFlaAAABFAAAABRnWFlaAAABKAAAABRiWFlaAAABPAAAABR3dHB0AAABUAAAABRyVFJDAAABZAAAAChnVFJDAAABZAAAAChiVFJDAAABZAAAAChjcHJ0AAABjAAAADxtbHVjAAAAAAAAAAEAAAAMZW5VUwAAAAgAAAAcAHMAUgBHAEJYWVogAAAAAAAAb6IAADj1AAADkFhZWiAAAAAAAABimQAAt4UAABjaWFlaIAAAAAAAACSgAAAPhAAAts9YWVogAAAAAAAA9tYAAQAAAADTLXBhcmEAAAAAAAQAAAACZmYAAPKnAAANWQAAE9AAAApbAAAAAAAAAABtbHVjAAAAAAAAAAEAAAAMZW5VUwAAACAAAAAcAEcAbwBvAGcAbABlACAASQBuAGMALgAgADIAMAAxADb/2wBDAAEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/2wBDAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/wAARCAAwADADASIAAhEBAxEB/8QAGQAAAwEBAQAAAAAAAAAAAAAABgkKBwUI/8QAMBAAAAYCAQIEBQQCAwAAAAAAAQIDBAUGBxEhCAkAEjFBFFFhcaETFiKRFyOx8PH/xAAaAQACAwEBAAAAAAAAAAAAAAACAwABBgQF/8QALBEAAgIBAgUCBAcAAAAAAAAAAQIDBBEFEgATITFRIkEGIzJhFBUWgaGx8P/aAAwDAQACEQMRAD8AfF2hez9089t7pvxgQMa1Gb6qZ6oQE9m/NEvCIStyPfJSOF/M1epzMugo/qtMqbiRc1mJjoJKCLMNIxKcsLJedfO1Ct9cI63x9fx6CA/19t+oh4LFA5HfuAgP/A8eOIsnsTBrkBHXA7+v53+Q+ficTgJft9gIgA+/P9/1r342O/YA8A8k3/if+IbAN7+2/f8AAiI6H19PGoPyESTMZQPKUAHkQEN+3r9dh78/YPGUTk2wb/qAZZIugH1OHH5DjkdfbnWw2DsOxPj+xjrnx2H39unBopJGBn9s+PHv1HXjPJtH+J+B40O9a16h/wB/92j/ALrPa/wR104UyAobHlXhuo2HrEtK4qy3CwjKOuJLRHJLSkXWrFKs/gVrJVrE8TUiH8bPrP20UEu8m4hNpMJJuTOfnbUw/kUqyZgMHGjAO9+mtDsQ53sdcB6eMhnpEjhNQxRKICAgHy5+/roOdjr7c+J6O4x07dx484/n7nzw1gexBGfIPkZ/3t39uGpqc6+fP5/Ht8vGFZCzJjWpWuBxvO2yPjrtclUUK7BqmUI4fuASeyhG5FzFI0Bw4aQ0iZNoDgzvRW4qtyFkI4XmwyEk2YNnDp0sVBu3IUyy5iqH8gqKERSIRNIii67hddRJs1at01Xbx2sgzZoLu10UFJR+4V1A5cxF3FqNcLvjwcno43uuLrOxZYjujaClcb4QQfxEizpFiQyM9olcueRnjC2ZMt9iY06zL0qytrMSqSOVGsfHMaGhZ3l4lSRI2MqE74zJvRTveNFWWIh3RWw+XCAM5icKQLrCH57T17FhErSlRXnWvyZXKQwWJ3eraD14p5YuZCFgacskK2oGkVuKO5GYTHzf7DaD12cBD3DgPOIDrWw9PnrXPgDkpVsUDGMG+DD6E9gHXIjrYjwUPQTCXYgHPhIV974+F6E1hpC14Yzmzj56YaQEeZhXsayD1zLPW7pygxaMf81Nzu1iJsnIuDIKnaJAkPldqrHaoORZ73tMVEbFdSXT9nVgRQgnBq6j8e/HCIEATpAnH5KlmRVkFRFJwks/bqImSXJ5VFyA3N6Ikh3bCW3YHp5cowOmCfTgA+xJCnrjtwHKcLvJj2ZGcTRFj19kEhckdzgEjKnABGSSzdc1Fe5byXXGNjKdvRcw5NxvLidNZFFCxUa62KrzMaChw8hhYScFJtROAgmuLByq1MsgkZYPaVVuDe0wraRaqAdJwgRQo+YR8xTlAQNx6b49w41vXiJpCalLh1jZhyrTqRM4+jstdRmYryNkydLQRWg1LNGcWd5jIFFvCythlIySa0mNu74sKRQtaWsTmupqPItw0lE52ufpyYzrSkx6cw5bLmBEpkTsz+dt8P5QFuCRtAIkBH9MuwKHICIaDQhnojMs9mKaeGcrMxXlQtAYkdVljimRrE5MqI4zL8oSqQ6wxjodBqK05qdK3Vo3aCSVkBW7bjuC1NFJJBPaqyx6fp6pWkliYLXK2XrukkRu2CCVoSWMgsdMyySKwoLFcIGWSTUMg4IBgTcICoBhRcplMcpFkhIqQp1ClMBTmA0Zfe1zpjvHfXff65bZlzXpB3jjGTgiirmPjAfs16PHqHeQ75Wbj3xxZpOEkV3LRJJSPdomUBZISJLncV2k+8D07dxXp7xsYuTapA9UkJUYWIzNhadnWEZeCXGLQQiJi1ViHfhHL2unWh+mlORsrW0JFpEFnGVfm1mU4kq0FY3eD6corJncv6dr5NLSMNXVaTUksjTiMnaq8uFfSVuDyiJ1iZpy0LOJtpa3YfkcQ5fdozyxI2m5qqcrHN61YYmHsh6v3o9ParYmYJEtlhIx6+gUbjgD23M6oqg92YL0JyF6Bps+qDValVA9h9Lj5SZI3SHXdEQlj1wiQtLLIe6pGzjO3BlBkK1hxpblLVH5wdW0BcFKf/JwRtjsot2z8omaSdxbzzk1iEjsE0AM9rrRZNRIrVyo7dGO6E+oh8axLlJ5H5VaJKx7ePRGFbW6vUeFfHQIWPTI9Tm7HHfuhqY7E6C7JFqUzM6iZXIoncNxX7+bIVdJnTT48x3OQU1krIDW3UeixVhyISzYz6cadY5Xph6TseRNTRsTElzzBn9Vlly0TAERsdgnMYyLROjyFbg5R4ZlsGaMT4yNi2Zlq1GwjZB3jq0PsaJfA3t0jL0W0Y9xf1V41lpWckXMLaZiwxuKYPqc6LlHdkeRF+Qxswx5ASDqBVrsL+2A/N6SiCbYymV2BywJiMZj3GRRMTnL+lVyHCll3R7Szv0vqXMtQ74T+HijljIScLaEpkKCB3rqMBIi0jPs5JeOKTZMZEi5VVnouzy0k3jXjWSMlY6UcVGDxlKMVDqx91SILWSi3D2KdgYy3kP8E9X/AE1SnRXBNdNRMlefT6g7aY6giK+cPLGNg0bY68rcnpsNh9PqIBve/EcPQ3WIq2dR9
3xpSgk5SAZ9R6MLAOZFUkpLSUDXp6/KPpGUkmTdswlnKnwbl5ITMdGwcXJi7LKsqzUmT5tWYmkXuF9wjBvb76b7dHheazJ9RElUJOCxViuMlUJC0Gtz6PKyjLBY4qMWUe12r1xZ6lOyT6XPEBKN2CkTDOlZd02TBdTMt7Upx2knrkdCv1UKjDKn1A7XBYH6SCOOrWn5Oi/DtRiu+GleRthDL8rXdVjZlcfWrSIxVlGGGCOnH//Z"
}
}


@ -169,7 +169,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role": "user"
}
],
@ -249,7 +249,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role": "user"
}
],
@ -601,7 +601,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role": "user"
}
],
@ -714,7 +714,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role": "user"
}
],
@ -912,4 +912,4 @@
"retrieval": []
},
"avatar": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/4gHYSUNDX1BST0ZJTEUAAQEAAAHIAAAAAAQwAABtbnRyUkdCIFhZWiAH4AABAAEAAAAAAABhY3NwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAA9tYAAQAAAADTLQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAlkZXNjAAAA8AAAACRyWFlaAAABFAAAABRnWFlaAAABKAAAABRiWFlaAAABPAAAABR3dHB0AAABUAAAABRyVFJDAAABZAAAAChnVFJDAAABZAAAAChiVFJDAAABZAAAAChjcHJ0AAABjAAAADxtbHVjAAAAAAAAAAEAAAAMZW5VUwAAAAgAAAAcAHMAUgBHAEJYWVogAAAAAAAAb6IAADj1AAADkFhZWiAAAAAAAABimQAAt4UAABjaWFlaIAAAAAAAACSgAAAPhAAAts9YWVogAAAAAAAA9tYAAQAAAADTLXBhcmEAAAAAAAQAAAACZmYAAPKnAAANWQAAE9AAAApbAAAAAAAAAABtbHVjAAAAAAAAAAEAAAAMZW5VUwAAACAAAAAcAEcAbwBvAGcAbABlACAASQBuAGMALgAgADIAMAAxADb/2wBDAAEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/2wBDAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/wAARCAAwADADASIAAhEBAxEB/8QAGQAAAwEBAQAAAAAAAAAAAAAABgkKBwUI/8QAMBAAAAYCAQIEBQQCAwAAAAAAAQIDBAUGBxEhCAkAEjFBFFFhcaETFiKRFyOx8PH/xAAaAQACAwEBAAAAAAAAAAAAAAACAwABBgQF/8QALBEAAgIBAgUCBAcAAAAAAAAAAQIDBBEFEgATITFRIkEGIzJhFBUWgaGx8P/aAAwDAQACEQMRAD8AfF2hez9089t7pvxgQMa1Gb6qZ6oQE9m/NEvCIStyPfJSOF/M1epzMugo/qtMqbiRc1mJjoJKCLMNIxKcsLJedfO1Ct9cI63x9fx6CA/19t+oh4LFA5HfuAgP/A8eOIsnsTBrkBHXA7+v53+Q+ficTgJft9gIgA+/P9/1r342O/YA8A8k3/if+IbAN7+2/f8AAiI6H19PGoPyESTMZQPKUAHkQEN+3r9dh78/YPGUTk2wb/qAZZIugH1OHH5DjkdfbnWw2DsOxPj+xjrnx2H39unBopJGBn9s+PHv1HXjPJtH+J+B40O9a16h/wB/92j/ALrPa/wR104UyAobHlXhuo2HrEtK4qy3CwjKOuJLRHJLSkXWrFKs/gVrJVrE8TUiH8bPrP20UEu8m4hNpMJJuTOfnbUw/kUqyZgMHGjAO9+mtDsQ53sdcB6eMhnpEjhNQxRKICAgHy5+/roOdjr7c+J6O4x07dx484/n7nzw1gexBGfIPkZ/3t39uGpqc6+fP5/Ht8vGFZCzJjWpWuBxvO2yPjrtclUUK7BqmUI4fuASeyhG5FzFI0Bw4aQ0iZNoDgzvRW4qtyFkI4XmwyEk2YNnDp0sVBu3IUyy5iqH8gqKERSIRNIii67hddRJs1at01Xbx2sgzZoLu10UFJR+4V1A5cxF3FqNcLvjwcno43uuLrOxZYjujaClcb4QQfxEizpFiQyM9olcueRnjC2ZMt9iY06zL0qytrMSqSOVGsfHMaGhZ3l4lSRI2MqE74zJvRTveNFWWIh3RWw+XCAM5icKQLrCH57T17FhErSlRXnWvyZXKQwWJ3eraD14p5YuZCFgacskK2oGkVuKO5GYTHzf7DaD12cBD3DgPOIDrWw9PnrXPgDkpVsUDGMG+DD6E9gHXIjrYjwUPQTCXYgHPhIV974+F6E1hpC14Yzmzj56YaQEeZhXsayD1zLPW7pygxaMf81Nzu1iJsnIuDIKnaJAkPldqrHaoORZ73tMVEbFdSXT9nVgRQgnBq6j8e/HCIEATpAnH5KlmRVkFRFJwks/bqImSXJ5VFyA3N6Ikh3bCW3YHp5cowOmCfTgA+xJCnrjtwHKcLvJj2ZGcTRFj19kEhckdzgEjKnABGSSzdc1Fe5byXXGNjKdvRcw5NxvLidNZFFCxUa62KrzMaChw8hhYScFJtROAgmuLByq1MsgkZYPaVVuDe0wraRaqAdJwgRQo+YR8xTlAQNx6b49w41vXiJpCalLh1jZhyrTqRM4+jstdRmYryNkydLQRWg1LNGcWd5jIFFvCythlIySa0mNu74sKRQtaWsTmupqPItw0lE52ufpyYzrSkx6cw5bLmBEpkTsz+dt8P5QFuCRtAIkBH9MuwKHICIaDQhnojMs9mKaeGcrMxXlQtAYkdVljimRrE5MqI4zL8oSqQ6wxjodBqK05qdK3Vo3aCSVkBW7bjuC1NFJJBPaqyx6fp6pWkliYLXK2XrukkRu2CCVoSWMgsdMyySKwoLFcIGWSTUMg4IBgTcICoBhRcplMcpFkhIqQp1ClMBTmA0Zfe1zpjvHfXff65bZlzXpB3jjGTgiirmPjAfs16PHqHeQ75Wbj3xxZpOEkV3LRJJSPdomUBZISJLncV2k+8D07dxXp7xsYuTapA9UkJUYWIzNhadnWEZeCXGLQQiJi1ViHfhHL2unWh+mlORsrW0JFpEFnGVfm1mU4kq0FY3eD6corJncv6dr5NLSMNXVaTUksjTiMnaq8uFfSVuDyiJ1iZpy0LOJtpa3YfkcQ5fdozyxI2m5qqcrHN61YYmHsh6v3o9ParYmYJEtlhIx6+gUbjgD23M6oqg92YL0JyF6Bps+qDValVA9h9Lj5SZI3SHXdEQlj1wiQtLLIe6pGzjO3BlBkK1hxpblLVH5wdW0BcFKf/JwRtjsot2z8omaSdxbzzk1iEjsE0AM9rrRZNRIrVyo7dGO6E+oh8axLlJ5H5VaJKx7ePRGFbW6vUeFfHQIWPTI9Tm7HHfuhqY7E6C7JFqUzM6iZXIoncNxX7+bIVdJnTT48x3OQU1krIDW3UeixVhyISzYz6cadY5Xph6TseRNTRsTElzzBn9Vlly0TAERsdgnMYyLROjyFbg5R4ZlsGaMT4yNi2Zlq1GwjZB3jq0PsaJfA3t0jL0W0Y9xf1V41lpWckXMLaZiwxuKYPqc6LlHdkeRF+Qxswx5ASDqBVrsL+2A/N6SiCbYymV2BywJiMZj3GRRMTnL+lVyHCll3R7Szv0vqXMtQ74T+HijljIScLaEpkKCB3rqMBIi0jPs5JeOKTZMZEi5VVnouzy0k3jXjWSMlY6UcVGDxlKMVDqx91SILWSi3D2KdgYy3kP8E9X/AE1SnRXBNdNRMlefT6g7aY6giK+cPLGNg0bY68rcnpsNh9PqIBve/EcPQ3WIq2dR9
3xpSgk5SAZ9R6MLAOZFUkpLSUDXp6/KPpGUkmTdswlnKnwbl5ITMdGwcXJi7LKsqzUmT5tWYmkXuF9wjBvb76b7dHheazJ9RElUJOCxViuMlUJC0Gtz6PKyjLBY4qMWUe12r1xZ6lOyT6XPEBKN2CkTDOlZd02TBdTMt7Upx2knrkdCv1UKjDKn1A7XBYH6SCOOrWn5Oi/DtRiu+GleRthDL8rXdVjZlcfWrSIxVlGGGCOnH//Z"
}
}


@ -0,0 +1,724 @@
{
"id": 17,
"title": "SQL Assistant",
"description": "SQL Assistant is an AI-powered tool that lets business users turn plain-English questions into fully formed SQL queries. Simply type your question (e.g., “Show me last quarters top 10 products by revenue”) and SQL Assistant generates the exact SQL, runs it against your database, and returns the results in seconds. ",
"canvas_type": "Marketing",
"dsl": {
"components": {
"Agent:WickedGoatsDivide": {
"downstream": [
"ExeSQL:TiredShirtsPull"
],
"obj": {
"component_name": "Agent",
"params": {
"delay_after_error": 1,
"description": "",
"exception_default_value": "",
"exception_goto": [],
"exception_method": "",
"frequencyPenaltyEnabled": false,
"frequency_penalty": 0.7,
"llm_id": "qwen-max@Tongyi-Qianwen",
"maxTokensEnabled": false,
"max_retries": 3,
"max_rounds": 5,
"max_tokens": 256,
"mcp": [],
"message_history_window_size": 12,
"outputs": {
"content": {
"type": "string",
"value": ""
}
},
"presencePenaltyEnabled": false,
"presence_penalty": 0.4,
"prompts": [
{
"content": "User's query: {sys.query}\n\nSchema: {Retrieval:HappyTiesFilm@formalized_content}\n\nSamples about question to SQL: {Retrieval:SmartNewsHammer@formalized_content}\n\nDescription about meanings of tables and files: {Retrieval:SweetDancersAppear@formalized_content}",
"role": "user"
}
],
"sys_prompt": "### ROLE\nYou are a Text-to-SQL assistant. \nGiven a relational database schema and a natural-language request, you must produce a **single, syntactically-correct MySQL query** that answers the request. \nReturn **nothing except the SQL statement itself**\u2014no code fences, no commentary, no explanations, no comments, no trailing semicolon if not required.\n\n\n### EXAMPLES \n-- Example 1 \nUser: List every product name and its unit price. \nSQL:\nSELECT name, unit_price FROM Products;\n\n-- Example 2 \nUser: Show the names and emails of customers who placed orders in January 2025. \nSQL:\nSELECT DISTINCT c.name, c.email\nFROM Customers c\nJOIN Orders o ON o.customer_id = c.id\nWHERE o.order_date BETWEEN '2025-01-01' AND '2025-01-31';\n\n-- Example 3 \nUser: How many orders have a status of \"Completed\" for each month in 2024? \nSQL:\nSELECT DATE_FORMAT(order_date, '%Y-%m') AS month,\n COUNT(*) AS completed_orders\nFROM Orders\nWHERE status = 'Completed'\n AND YEAR(order_date) = 2024\nGROUP BY month\nORDER BY month;\n\n-- Example 4 \nUser: Which products generated at least \\$10 000 in total revenue? \nSQL:\nSELECT p.id, p.name, SUM(oi.quantity * oi.unit_price) AS revenue\nFROM Products p\nJOIN OrderItems oi ON oi.product_id = p.id\nGROUP BY p.id, p.name\nHAVING revenue >= 10000\nORDER BY revenue DESC;\n\n\n### OUTPUT GUIDELINES\n1. Think through the schema and the request. \n2. Write **only** the final MySQL query. \n3. Do **not** wrap the query in back-ticks or markdown fences. \n4. Do **not** add explanations, comments, or additional text\u2014just the SQL.",
"temperature": 0.1,
"temperatureEnabled": false,
"tools": [],
"topPEnabled": false,
"top_p": 0.3,
"user_prompt": "",
"visual_files_var": ""
}
},
"upstream": [
"Retrieval:HappyTiesFilm",
"Retrieval:SmartNewsHammer",
"Retrieval:SweetDancersAppear"
]
},
"ExeSQL:TiredShirtsPull": {
"downstream": [
"Message:ShaggyMasksAttend"
],
"obj": {
"component_name": "ExeSQL",
"params": {
"database": "",
"db_type": "mysql",
"host": "",
"max_records": 1024,
"outputs": {
"formalized_content": {
"type": "string",
"value": ""
},
"json": {
"type": "Array<Object>",
"value": []
}
},
"password": "20010812Yy!",
"port": 3306,
"sql": "Agent:WickedGoatsDivide@content",
"username": "13637682833@163.com"
}
},
"upstream": [
"Agent:WickedGoatsDivide"
]
},
"Message:ShaggyMasksAttend": {
"downstream": [],
"obj": {
"component_name": "Message",
"params": {
"content": [
"{ExeSQL:TiredShirtsPull@formalized_content}"
]
}
},
"upstream": [
"ExeSQL:TiredShirtsPull"
]
},
"Retrieval:HappyTiesFilm": {
"downstream": [
"Agent:WickedGoatsDivide"
],
"obj": {
"component_name": "Retrieval",
"params": {
"cross_languages": [],
"empty_response": "",
"kb_ids": [
"ed31364c727211f0bdb2bafe6e7908e6"
],
"keywords_similarity_weight": 0.7,
"outputs": {
"formalized_content": {
"type": "string",
"value": ""
}
},
"query": "sys.query",
"rerank_id": "",
"similarity_threshold": 0.2,
"top_k": 1024,
"top_n": 8,
"use_kg": false
}
},
"upstream": [
"begin"
]
},
"Retrieval:SmartNewsHammer": {
"downstream": [
"Agent:WickedGoatsDivide"
],
"obj": {
"component_name": "Retrieval",
"params": {
"cross_languages": [],
"empty_response": "",
"kb_ids": [
"0f968106727311f08357bafe6e7908e6"
],
"keywords_similarity_weight": 0.7,
"outputs": {
"formalized_content": {
"type": "string",
"value": ""
}
},
"query": "sys.query",
"rerank_id": "",
"similarity_threshold": 0.2,
"top_k": 1024,
"top_n": 8,
"use_kg": false
}
},
"upstream": [
"begin"
]
},
"Retrieval:SweetDancersAppear": {
"downstream": [
"Agent:WickedGoatsDivide"
],
"obj": {
"component_name": "Retrieval",
"params": {
"cross_languages": [],
"empty_response": "",
"kb_ids": [
"4ad1f9d0727311f0827dbafe6e7908e6"
],
"keywords_similarity_weight": 0.7,
"outputs": {
"formalized_content": {
"type": "string",
"value": ""
}
},
"query": "sys.query",
"rerank_id": "",
"similarity_threshold": 0.2,
"top_k": 1024,
"top_n": 8,
"use_kg": false
}
},
"upstream": [
"begin"
]
},
"begin": {
"downstream": [
"Retrieval:HappyTiesFilm",
"Retrieval:SmartNewsHammer",
"Retrieval:SweetDancersAppear"
],
"obj": {
"component_name": "Begin",
"params": {
"enablePrologue": true,
"inputs": {},
"mode": "conversational",
"prologue": "Hi! I'm your SQL assistant. What can I do for you?"
}
},
"upstream": []
}
},
"globals": {
"sys.conversation_turns": 0,
"sys.files": [],
"sys.query": "",
"sys.user_id": ""
},
"graph": {
"edges": [
{
"data": {
"isHovered": false
},
"id": "xy-edge__beginstart-Retrieval:HappyTiesFilmend",
"source": "begin",
"sourceHandle": "start",
"target": "Retrieval:HappyTiesFilm",
"targetHandle": "end"
},
{
"id": "xy-edge__beginstart-Retrieval:SmartNewsHammerend",
"source": "begin",
"sourceHandle": "start",
"target": "Retrieval:SmartNewsHammer",
"targetHandle": "end"
},
{
"data": {
"isHovered": false
},
"id": "xy-edge__beginstart-Retrieval:SweetDancersAppearend",
"source": "begin",
"sourceHandle": "start",
"target": "Retrieval:SweetDancersAppear",
"targetHandle": "end"
},
{
"data": {
"isHovered": false
},
"id": "xy-edge__Retrieval:HappyTiesFilmstart-Agent:WickedGoatsDivideend",
"source": "Retrieval:HappyTiesFilm",
"sourceHandle": "start",
"target": "Agent:WickedGoatsDivide",
"targetHandle": "end"
},
{
"data": {
"isHovered": false
},
"id": "xy-edge__Retrieval:SmartNewsHammerstart-Agent:WickedGoatsDivideend",
"markerEnd": "logo",
"source": "Retrieval:SmartNewsHammer",
"sourceHandle": "start",
"style": {
"stroke": "rgba(91, 93, 106, 1)",
"strokeWidth": 1
},
"target": "Agent:WickedGoatsDivide",
"targetHandle": "end",
"type": "buttonEdge",
"zIndex": 1001
},
{
"data": {
"isHovered": false
},
"id": "xy-edge__Retrieval:SweetDancersAppearstart-Agent:WickedGoatsDivideend",
"markerEnd": "logo",
"source": "Retrieval:SweetDancersAppear",
"sourceHandle": "start",
"style": {
"stroke": "rgba(91, 93, 106, 1)",
"strokeWidth": 1
},
"target": "Agent:WickedGoatsDivide",
"targetHandle": "end",
"type": "buttonEdge",
"zIndex": 1001
},
{
"data": {
"isHovered": false
},
"id": "xy-edge__Agent:WickedGoatsDividestart-ExeSQL:TiredShirtsPullend",
"source": "Agent:WickedGoatsDivide",
"sourceHandle": "start",
"target": "ExeSQL:TiredShirtsPull",
"targetHandle": "end"
},
{
"data": {
"isHovered": false
},
"id": "xy-edge__ExeSQL:TiredShirtsPullstart-Message:ShaggyMasksAttendend",
"source": "ExeSQL:TiredShirtsPull",
"sourceHandle": "start",
"target": "Message:ShaggyMasksAttend",
"targetHandle": "end"
}
],
"nodes": [
{
"data": {
"form": {
"enablePrologue": true,
"inputs": {},
"mode": "conversational",
"prologue": "Hi! I'm your SQL assistant. What can I do for you?"
},
"label": "Begin",
"name": "begin"
},
"id": "begin",
"measured": {
"height": 48,
"width": 200
},
"position": {
"x": 50,
"y": 200
},
"selected": false,
"sourcePosition": "left",
"targetPosition": "right",
"type": "beginNode"
},
{
"data": {
"form": {
"cross_languages": [],
"empty_response": "",
"kb_ids": [
"ed31364c727211f0bdb2bafe6e7908e6"
],
"keywords_similarity_weight": 0.7,
"outputs": {
"formalized_content": {
"type": "string",
"value": ""
}
},
"query": "sys.query",
"rerank_id": "",
"similarity_threshold": 0.2,
"top_k": 1024,
"top_n": 8,
"use_kg": false
},
"label": "Retrieval",
"name": "Schema"
},
"dragging": false,
"id": "Retrieval:HappyTiesFilm",
"measured": {
"height": 96,
"width": 200
},
"position": {
"x": 414,
"y": 20.5
},
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "retrievalNode"
},
{
"data": {
"form": {
"cross_languages": [],
"empty_response": "",
"kb_ids": [
"0f968106727311f08357bafe6e7908e6"
],
"keywords_similarity_weight": 0.7,
"outputs": {
"formalized_content": {
"type": "string",
"value": ""
}
},
"query": "sys.query",
"rerank_id": "",
"similarity_threshold": 0.2,
"top_k": 1024,
"top_n": 8,
"use_kg": false
},
"label": "Retrieval",
"name": "Question to SQL"
},
"dragging": false,
"id": "Retrieval:SmartNewsHammer",
"measured": {
"height": 96,
"width": 200
},
"position": {
"x": 406.5,
"y": 175.5
},
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "retrievalNode"
},
{
"data": {
"form": {
"cross_languages": [],
"empty_response": "",
"kb_ids": [
"4ad1f9d0727311f0827dbafe6e7908e6"
],
"keywords_similarity_weight": 0.7,
"outputs": {
"formalized_content": {
"type": "string",
"value": ""
}
},
"query": "sys.query",
"rerank_id": "",
"similarity_threshold": 0.2,
"top_k": 1024,
"top_n": 8,
"use_kg": false
},
"label": "Retrieval",
"name": "Database Description"
},
"dragging": false,
"id": "Retrieval:SweetDancersAppear",
"measured": {
"height": 96,
"width": 200
},
"position": {
"x": 403.5,
"y": 328
},
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "retrievalNode"
},
{
"data": {
"form": {
"delay_after_error": 1,
"description": "",
"exception_default_value": "",
"exception_goto": [],
"exception_method": "",
"frequencyPenaltyEnabled": false,
"frequency_penalty": 0.7,
"llm_id": "qwen-max@Tongyi-Qianwen",
"maxTokensEnabled": false,
"max_retries": 3,
"max_rounds": 5,
"max_tokens": 256,
"mcp": [],
"message_history_window_size": 12,
"outputs": {
"content": {
"type": "string",
"value": ""
}
},
"presencePenaltyEnabled": false,
"presence_penalty": 0.4,
"prompts": [
{
"content": "User's query: {sys.query}\n\nSchema: {Retrieval:HappyTiesFilm@formalized_content}\n\nSamples about question to SQL: {Retrieval:SmartNewsHammer@formalized_content}\n\nDescription about meanings of tables and fields: {Retrieval:SweetDancersAppear@formalized_content}",
"role": "user"
}
],
"sys_prompt": "### ROLE\nYou are a Text-to-SQL assistant. \nGiven a relational database schema and a natural-language request, you must produce a **single, syntactically-correct MySQL query** that answers the request. \nReturn **nothing except the SQL statement itself**\u2014no code fences, no commentary, no explanations, no comments, no trailing semicolon if not required.\n\n\n### EXAMPLES \n-- Example 1 \nUser: List every product name and its unit price. \nSQL:\nSELECT name, unit_price FROM Products;\n\n-- Example 2 \nUser: Show the names and emails of customers who placed orders in January 2025. \nSQL:\nSELECT DISTINCT c.name, c.email\nFROM Customers c\nJOIN Orders o ON o.customer_id = c.id\nWHERE o.order_date BETWEEN '2025-01-01' AND '2025-01-31';\n\n-- Example 3 \nUser: How many orders have a status of \"Completed\" for each month in 2024? \nSQL:\nSELECT DATE_FORMAT(order_date, '%Y-%m') AS month,\n COUNT(*) AS completed_orders\nFROM Orders\nWHERE status = 'Completed'\n AND YEAR(order_date) = 2024\nGROUP BY month\nORDER BY month;\n\n-- Example 4 \nUser: Which products generated at least \\$10 000 in total revenue? \nSQL:\nSELECT p.id, p.name, SUM(oi.quantity * oi.unit_price) AS revenue\nFROM Products p\nJOIN OrderItems oi ON oi.product_id = p.id\nGROUP BY p.id, p.name\nHAVING revenue >= 10000\nORDER BY revenue DESC;\n\n\n### OUTPUT GUIDELINES\n1. Think through the schema and the request. \n2. Write **only** the final MySQL query. \n3. Do **not** wrap the query in back-ticks or markdown fences. \n4. Do **not** add explanations, comments, or additional text\u2014just the SQL.",
"temperature": 0.1,
"temperatureEnabled": false,
"tools": [],
"topPEnabled": false,
"top_p": 0.3,
"user_prompt": "",
"visual_files_var": ""
},
"label": "Agent",
"name": "SQL Generator"
},
"dragging": false,
"id": "Agent:WickedGoatsDivide",
"measured": {
"height": 84,
"width": 200
},
"position": {
"x": 981,
"y": 174
},
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "agentNode"
},
{
"data": {
"form": {
"database": "",
"db_type": "mysql",
"host": "",
"max_records": 1024,
"outputs": {
"formalized_content": {
"type": "string",
"value": ""
},
"json": {
"type": "Array<Object>",
"value": []
}
},
"password": "20010812Yy!",
"port": 3306,
"sql": "Agent:WickedGoatsDivide@content",
"username": "13637682833@163.com"
},
"label": "ExeSQL",
"name": "ExeSQL"
},
"dragging": false,
"id": "ExeSQL:TiredShirtsPull",
"measured": {
"height": 56,
"width": 200
},
"position": {
"x": 1211.5,
"y": 212.5
},
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "ragNode"
},
{
"data": {
"form": {
"content": [
"{ExeSQL:TiredShirtsPull@formalized_content}"
]
},
"label": "Message",
"name": "Message"
},
"dragging": false,
"id": "Message:ShaggyMasksAttend",
"measured": {
"height": 56,
"width": 200
},
"position": {
"x": 1447.3125,
"y": 181.5
},
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "messageNode"
},
{
"data": {
"form": {
"text": "Searches for relevant database creation statements.\n\nIt should be linked to the knowledge base into which the schema is dumped. You could use \" General \" as the parsing method, \" 2 \" as the chunk size and \" ; \" as the delimiter."
},
"label": "Note",
"name": "Note: Schema"
},
"dragHandle": ".note-drag-handle",
"dragging": false,
"height": 188,
"id": "Note:ThickClubsFloat",
"measured": {
"height": 188,
"width": 392
},
"position": {
"x": 689,
"y": -180.31251144409183
},
"resizing": false,
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "noteNode",
"width": 392
},
{
"data": {
"form": {
"text": "Searches for Question-to-SQL samples.\n\nYou could use \" Q&A \" as the parsing method.\n\nPlease check this dataset:\nhttps://huggingface.co/datasets/InfiniFlow/text2sql"
},
"label": "Note",
"name": "Note: Question to SQL"
},
"dragHandle": ".note-drag-handle",
"dragging": false,
"height": 154,
"id": "Note:ElevenLionsJoke",
"measured": {
"height": 154,
"width": 345
},
"position": {
"x": 693.5,
"y": 138
},
"resizing": false,
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "noteNode",
"width": 345
},
{
"data": {
"form": {
"text": "Searches for descriptions of the meanings of tables and fields.\n\nYou could use \" General \" as the parsing method, \" 2 \" as the chunk size and \" ### \" as the delimiter."
},
"label": "Note",
"name": "Note: Database Description"
},
"dragHandle": ".note-drag-handle",
"dragging": false,
"height": 158,
"id": "Note:ManyRosesTrade",
"measured": {
"height": 158,
"width": 408
},
"position": {
"x": 691.5,
"y": 435.69736389555317
},
"resizing": false,
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "noteNode",
"width": 408
},
{
"data": {
"form": {
"text": "The Agent learns which tables may be available based on the responses from three knowledge bases and converts the user's input into SQL statements."
},
"label": "Note",
"name": "Note: SQL Generator"
},
"dragHandle": ".note-drag-handle",
"dragging": false,
"height": 132,
"id": "Note:RudeHousesInvite",
"measured": {
"height": 132,
"width": 383
},
"position": {
"x": 1106.9254833678003,
"y": 290.5891036507015
},
"resizing": false,
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "noteNode",
"width": 383
},
{
"data": {
"form": {
"text": "Connect to your database to execute SQL statements."
},
"label": "Note",
"name": "Note: SQL Executor"
},
"dragHandle": ".note-drag-handle",
"dragging": false,
"id": "Note:HungryBatsLay",
"measured": {
"height": 136,
"width": 255
},
"position": {
"x": 1185,
"y": -30
},
"selected": false,
"sourcePosition": "right",
"targetPosition": "left",
"type": "noteNode"
}
]
},
"history": [],
"messages": [],
"path": [],
"retrieval": []
},
"avatar": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAcFBQYFBAcGBQYIBwcIChELCgkJChUPEAwRGBUaGRgVGBcbHichGx0lHRcYIi4iJSgpKywrGiAvMy8qMicqKyr/2wBDAQcICAoJChQLCxQqHBgcKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKir/wAARCAAwADADAREAAhEBAxEB/8QAGgAAAwEBAQEAAAAAAAAAAAAABQYHBAMAAf/EADIQAAEDAwMCBAMHBQAAAAAAAAECAwQFESEABjESEyJBUYEUYXEHFSNSkaGxMjNictH/xAAZAQADAQEBAAAAAAAAAAAAAAACAwQBAAX/xAAlEQACAgICAgEEAwAAAAAAAAABAgARAyESMQRBEyIycYFCkbH/2gAMAwEAAhEDEQA/AKHt2DGpNHXDLrZdWtSrIub39tZ5GbGwPA+pmDFkX7x7idvra85xqQaFNkxUTVIVJQzf8QpBFjbgEenNs681MnA9WJ6fEOKJoxVpSpFLTCo6KEZlTlLcQBIJS20hAv1D1ve+qPk52b0IsYuIGtyt7ZkVVNP+H3A5GdlN2u7GQUBSfmkk8cXH10tmLD6Yl0CG5qmTXBMZiQEMuvupUoKdc6UeEi4FsqOeBxrsKnv1AY+hJ2l5yfu6qQ6/UZtPDRHZ+Eldpsqz1hSrXJGLXwRxqxUQizFs7galPYUFDKT+h15oMuImspQpFiL+2i1A3A1bgxmixUgwlT8ZfgJ/y8P8HXdRuPZoxaqtfkQKbKqF03jtEoDeFKV1lNgfK4H764XfccVUgipvdiwKpFaXMLklFg4juuqV0m3Izg/MaEZCDYMScYqiJOd6xmqfUVfBJcWwtHV1Elfi87k51ViyhrsxL4ivQj1KrFZjTGjTJ8aShdyph5SUqFhwPzX9jpC0dXUqZK3ViHNq7oNaVJjz2Vw5LCrdKknpULZyfMf801MfI1e5NmpAGHUL12EZNFWWlhXSUuWHKgk3xomwEDuDhzLysySU9EndEVyIz3GmxJR+KpBIdCLlRHn/AFEjjIF9AMJlZ8gLZ/qUiJSg1Tu0HO4plFj4FC1h9NYfHIU7kwzgnqCJlKLiCO2s6hKytWiPJoFdfnLW7HS0or6bqXbjg2AI99XjAa3NPlL6jFTduOR5sd1+oyfjQMONqI7QOMA4V7/pqjHjC9SLNn56I1HiqrqTUKM0hbq2lpst5CQSST54xjSPJbICOHUhawISiRQ02T2Uq6AAkqFj/GquJQks1iEr/INLU82bploKSFXusG9xfjHofXQuQUNRoQqQT0ZwVEST5687iZWGgpDsebNbaTDfKVL/ALnbQU/UkKNhjXpFt0BJBVXe/wAGGG6YMlvvNkjlBGmKeJimHIVc0TY89akCKspT28C5BKgDyR7fvrCFI+q/1DQsvVfudYcVyKw49KU6tZyQbmwHFhrOKr9s0uz0CAIpbr3RKo1Rbh02C4HJISp2ZIz0pJ8IQk5Nr/QXznSX6NSnGAwHI/gD/TM+3vtAj1arJpcpgtPdPSH0kFt5wDxAWOOLgamIAFwijCfD927N2tGXuNxlK2W0occUhJWpR+QzzrPjc+pvyqT3Ftf2zbObf7YYecb6CrrDAGfy20wYMkA5Vjbtev7b3nEcXRela27d1ogoWi/rnQsjrqZzHdwzKoKUsqWz3mOnJUlZJt8uokD621w+RdzgynUkUpoUafPZXMnSHlrKluyX1Eug8XF7GwxbgWxrubMO5WmNRsCKtLfcY3rAU0nIltkBP+w0X8Jjdz//2Q=="
}
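
The template above fans one user query out to three Retrieval nodes (schema, Question-to-SQL samples, database description), folds their outputs into the SQL Generator prompt, and pipes the generated statement into ExeSQL through the `Agent:WickedGoatsDivide@content` reference. A minimal stand-alone sketch of that data flow, assuming `retrieve` and `llm` as placeholder callables for the canvas components (none of these names are RAGFlow APIs):

import pymysql

def text2sql(question, retrieve, llm, conn_params):
    # Three knowledge-base lookups, mirroring the template's Retrieval nodes.
    schema = retrieve("schema_kb", question)        # "Schema"
    samples = retrieve("q2sql_kb", question)        # "Question to SQL"
    description = retrieve("desc_kb", question)     # "Database Description"
    prompt = (
        f"User's query: {question}\n\n"
        f"Schema: {schema}\n\n"
        f"Samples about question to SQL: {samples}\n\n"
        f"Description about meanings of tables and fields: {description}"
    )
    sql = llm(prompt)  # "SQL Generator": returns a bare MySQL statement
    with pymysql.connect(**conn_params) as conn:    # "ExeSQL"
        with conn.cursor(pymysql.cursors.DictCursor) as cursor:
            cursor.execute(sql)
            return cursor.fetchmany(1024)           # max_records in the template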

View File

@ -24,6 +24,7 @@ from api.utils import hash_str2int
from rag.llm.chat_model import ToolCallSession
from rag.prompts.prompts import kb_prompt
from rag.utils.mcp_tool_call_conn import MCPToolCallSession
from timeit import default_timer as timer
class ToolParameter(TypedDict):
@ -49,12 +50,13 @@ class LLMToolPluginCallSession(ToolCallSession):
def tool_call(self, name: str, arguments: dict[str, Any]) -> Any:
assert name in self.tools_map, f"LLM tool {name} does not exist"
st = timer()
if isinstance(self.tools_map[name], MCPToolCallSession):
resp = self.tools_map[name].tool_call(name, arguments, 60)
else:
resp = self.tools_map[name].invoke(**arguments)
self.callback(name, arguments, resp)
self.callback(name, arguments, resp, elapsed_time=timer()-st)
return resp
def get_tool_obj(self, name):
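
The change above threads a wall-clock measurement through the tool callback with `timeit.default_timer`. The same pattern in isolation (the callback signature is copied from the diff; everything else is a stand-in):

from timeit import default_timer as timer

def log_callback(name, arguments, resp, elapsed_time=None):
    # Consumer of the elapsed_time keyword added in the hunk above.
    print(f"{name}({arguments}) -> {resp!r} in {elapsed_time:.4f}s")

def timed_tool_call(fn, name, arguments, callback):
    st = timer()                 # start the clock before dispatch
    resp = fn(**arguments)       # the actual tool invocation
    callback(name, arguments, resp, elapsed_time=timer() - st)
    return resp

timed_tool_call(lambda x, y: x + y, "add", {"x": 1, "y": 2}, log_callback)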

View File

@ -17,7 +17,7 @@ import base64
import logging
import os
from abc import ABC
from enum import StrEnum
from strenum import StrEnum
from typing import Optional
from pydantic import BaseModel, Field, field_validator
from agent.tools.base import ToolParamBase, ToolBase, ToolMeta
@ -67,11 +67,19 @@ class CodeExecParam(ToolParamBase):
"description": """
This tool has a sandbox that can execute code written in 'Python'/'Javascript'. It receives a piece of code and returns a JSON string.
Here's a code example for Python (`main` function MUST be included):
def main(arg1: str, arg2: str) -> dict:
def main() -> dict:
\"\"\"
Generate Fibonacci numbers within 100.
\"\"\"
def fibonacci_recursive(n):
if n <= 1:
return n
else:
return fibonacci_recursive(n-1) + fibonacci_recursive(n-2)
return {
"result": arg1 + arg2,
"result": fibonacci_recursive(100),
}
Here's a code example for Javascript (`main` function MUST be included and exported):
const axios = require('axios');
async function main(args) {
@ -148,7 +156,7 @@ class CodeExec(ToolBase, ABC):
self.set_output("_ERROR", "construct code request error: " + str(e))
try:
resp = requests.post(url=f"http://{settings.SANDBOX_HOST}:9385/run", json=code_req, timeout=10)
resp = requests.post(url=f"http://{settings.SANDBOX_HOST}:9385/run", json=code_req, timeout=os.environ.get("COMPONENT_EXEC_TIMEOUT", 10*60))
logging.info(f"http://{settings.SANDBOX_HOST}:9385/run", code_req, resp.status_code)
if resp.status_code != 200:
resp.raise_for_status()
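
One detail worth flagging in the timeout change above: `os.environ.get` returns a string whenever the variable is set, so the value should be cast before being handed to `requests`. A hedged sketch of the call (the `/run` endpoint and payload shape are taken from the diff; the cast is my addition):

import os
import requests

def run_in_sandbox(host: str, code_req: dict) -> dict:
    # Environment values arrive as strings; cast before using as a timeout.
    timeout = float(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10 * 60))
    resp = requests.post(f"http://{host}:9385/run", json=code_req, timeout=timeout)
    resp.raise_for_status()
    return resp.json()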

View File

@ -14,6 +14,7 @@
# limitations under the License.
#
import os
import re
from abc import ABC
import pandas as pd
import pymysql
@ -78,6 +79,17 @@ class ExeSQL(ToolBase, ABC):
@timeout(os.environ.get("COMPONENT_EXEC_TIMEOUT", 60))
def _invoke(self, **kwargs):
def convert_decimals(obj):
from decimal import Decimal
if isinstance(obj, Decimal):
return float(obj)  # or str(obj)
elif isinstance(obj, dict):
return {k: convert_decimals(v) for k, v in obj.items()}
elif isinstance(obj, list):
return [convert_decimals(item) for item in obj]
return obj
sql = kwargs.get("sql")
if not sql:
raise Exception("SQL for `ExeSQL` MUST not be empty.")
@ -109,7 +121,7 @@ class ExeSQL(ToolBase, ABC):
single_sql = single_sql.replace('```','')
if not single_sql:
continue
single_sql = re.sub(r"\[ID:[0-9]+\]", "", single_sql)
cursor.execute(single_sql)
if cursor.rowcount == 0:
sql_res.append({"content": "No record in the database!"})
@ -121,7 +133,11 @@ class ExeSQL(ToolBase, ABC):
single_res = pd.DataFrame([i for i in cursor.fetchmany(self._param.max_records)])
single_res.columns = [i[0] for i in cursor.description]
sql_res.append(single_res.to_dict(orient='records'))
for col in single_res.columns:
if pd.api.types.is_datetime64_any_dtype(single_res[col]):
single_res[col] = single_res[col].dt.strftime('%Y-%m-%d')
sql_res.append(convert_decimals(single_res.to_dict(orient='records')))
formalized_content.append(single_res.to_markdown(index=False, floatfmt=".6f"))
self.set_output("json", sql_res)
@ -129,4 +145,4 @@ class ExeSQL(ToolBase, ABC):
return self.output("formalized_content")
def thoughts(self) -> str:
return "Query sent—waiting for the data."
return "Query sent—waiting for the data."
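
The ExeSQL changes close two serialization gaps: `Decimal` values are not JSON-serializable, and datetime columns come back from pandas as Timestamps. A compact stand-alone version of the same normalization (`json_safe` is a hypothetical name combining both fixes):

from datetime import date, datetime
from decimal import Decimal

def json_safe(obj):
    # Mirrors convert_decimals from the diff, extended to datetimes.
    if isinstance(obj, Decimal):
        return float(obj)
    if isinstance(obj, (datetime, date)):
        return obj.strftime("%Y-%m-%d")
    if isinstance(obj, dict):
        return {k: json_safe(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [json_safe(item) for item in obj]
    return obj

rows = [{"price": Decimal("9.90"), "created": datetime(2025, 8, 27)}]
print(json_safe(rows))  # [{'price': 9.9, 'created': '2025-08-27'}]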

View File

@ -86,10 +86,16 @@ class Retrieval(ToolBase, ABC):
kb_ids.append(id)
continue
kb_nm = self._canvas.get_variable_value(id)
e, kb = KnowledgebaseService.get_by_name(kb_nm)
if not e:
raise Exception(f"Dataset({kb_nm}) does not exist.")
kb_ids.append(kb.id)
# if kb_nm is a list
kb_nm_list = kb_nm if isinstance(kb_nm, list) else [kb_nm]
for nm_or_id in kb_nm_list:
e, kb = KnowledgebaseService.get_by_name(nm_or_id,
self._canvas._tenant_id)
if not e:
e, kb = KnowledgebaseService.get_by_id(nm_or_id)
if not e:
raise Exception(f"Dataset({nm_or_id}) does not exist.")
kb_ids.append(kb.id)
filtered_kb_ids: list[str] = list(set([kb_id for kb_id in kb_ids if kb_id]))
@ -108,7 +114,9 @@ class Retrieval(ToolBase, ABC):
if self._param.rerank_id:
rerank_mdl = LLMBundle(kbs[0].tenant_id, LLMType.RERANK, self._param.rerank_id)
query = kwargs["query"]
vars = self.get_input_elements_from_text(kwargs["query"])
vars = {k:o["value"] for k,o in vars.items()}
query = self.string_format(kwargs["query"], vars)
if self._param.cross_languages:
query = cross_languages(kbs[0].tenant_id, None, query, self._param.cross_languages)
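
The Retrieval hunk now accepts a variable holding a single name, a list of names, or raw IDs: it tries a name lookup first and falls back to an ID lookup. The control flow reduced to plain functions (both lookups are stand-ins for KnowledgebaseService and return a (found, kb_id) pair):

def resolve_kb_ids(value, get_by_name, get_by_id):
    items = value if isinstance(value, list) else [value]
    kb_ids = []
    for nm_or_id in items:
        found, kb_id = get_by_name(nm_or_id)
        if not found:
            found, kb_id = get_by_id(nm_or_id)  # fall back: treat it as an ID
        if not found:
            raise Exception(f"Dataset({nm_or_id}) does not exist.")
        kb_ids.append(kb_id)
    return list(dict.fromkeys(kb_ids))  # de-duplicate; the diff uses list(set(...))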

View File

@ -20,94 +20,128 @@ BEGIN_SEARCH_RESULT = "<|begin_search_result|>"
END_SEARCH_RESULT = "<|end_search_result|>"
MAX_SEARCH_LIMIT = 6
REASON_PROMPT = (
"You are a reasoning assistant with the ability to perform dataset searches to help "
"you answer the user's question accurately. You have special tools:\n\n"
f"- To perform a search: write {BEGIN_SEARCH_QUERY} your query here {END_SEARCH_QUERY}.\n"
f"Then, the system will search and analyze relevant content, then provide you with helpful information in the format {BEGIN_SEARCH_RESULT} ...search results... {END_SEARCH_RESULT}.\n\n"
f"You can repeat the search process multiple times if necessary. The maximum number of search attempts is limited to {MAX_SEARCH_LIMIT}.\n\n"
"Once you have all the information you need, continue your reasoning.\n\n"
"-- Example 1 --\n" ########################################
"Question: \"Are both the directors of Jaws and Casino Royale from the same country?\"\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Who is the director of Jaws?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nThe director of Jaws is Steven Spielberg...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information.\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Where is Steven Spielberg from?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nSteven Allan Spielberg is an American filmmaker...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information...\n\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Who is the director of Casino Royale?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nCasino Royale is a 2006 spy film directed by Martin Campbell...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information...\n\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Where is Martin Campbell from?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nMartin Campbell (born 24 October 1943) is a New Zealand film and television director...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information...\n\n"
"Assistant:\nIt's enough to answer the question\n"
REASON_PROMPT = f"""You are an advanced reasoning agent. Your goal is to answer the user's question by breaking it down into a series of verifiable steps.
"-- Example 2 --\n" #########################################
"Question: \"When was the founder of craigslist born?\"\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Who was the founder of craigslist?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nCraigslist was founded by Craig Newmark...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information.\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY} When was Craig Newmark born?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nCraig Newmark was born on December 6, 1952...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information...\n\n"
"Assistant:\nIt's enough to answer the question\n"
"**Remember**:\n"
f"- You have a dataset to search, so you just provide a proper search query.\n"
f"- Use {BEGIN_SEARCH_QUERY} to request a dataset search and end with {END_SEARCH_QUERY}.\n"
"- The language of query MUST be as the same as 'Question' or 'search result'.\n"
"- If no helpful information can be found, rewrite the search query to be less and precise keywords.\n"
"- When done searching, continue your reasoning.\n\n"
'Please answer the following question. You should think step by step to solve it.\n\n'
)
You have access to a powerful search tool to find information.
RELEVANT_EXTRACTION_PROMPT = """**Task Instruction:**
**Your Task:**
1. Analyze the user's question.
2. If you need information, issue a search query to find a specific fact.
3. Review the search results.
4. Repeat the search process until you have all the facts needed to answer the question.
5. Once you have gathered sufficient information, synthesize the facts and provide the final answer directly.
You are tasked with reading and analyzing web pages based on the following inputs: **Previous Reasoning Steps**, **Current Search Query**, and **Searched Web Pages**. Your objective is to extract relevant and helpful information for **Current Search Query** from the **Searched Web Pages** and seamlessly integrate this information into the **Previous Reasoning Steps** to continue reasoning for the original question.
**Tool Usage:**
- To search, you MUST write your query between the special tokens: {BEGIN_SEARCH_QUERY}your query{END_SEARCH_QUERY}.
- The system will provide results between {BEGIN_SEARCH_RESULT}search results{END_SEARCH_RESULT}.
- You have a maximum of {MAX_SEARCH_LIMIT} search attempts.
**Guidelines:**
---
**Example 1: Multi-hop Question**
1. **Analyze the Searched Web Pages:**
- Carefully review the content of each searched web page.
- Identify factual information that is relevant to the **Current Search Query** and can aid in the reasoning process for the original question.
**Question:** "Are both the directors of Jaws and Casino Royale from the same country?"
2. **Extract Relevant Information:**
- Select the information from the Searched Web Pages that directly contributes to advancing the **Previous Reasoning Steps**.
- Ensure that the extracted information is accurate and relevant.
**Your Thought Process & Actions:**
First, I need to identify the director of Jaws.
{BEGIN_SEARCH_QUERY}who is the director of Jaws?{END_SEARCH_QUERY}
[System returns search results]
{BEGIN_SEARCH_RESULT}
Jaws is a 1975 American thriller film directed by Steven Spielberg.
{END_SEARCH_RESULT}
Okay, the director of Jaws is Steven Spielberg. Now I need to find out his nationality.
{BEGIN_SEARCH_QUERY}where is Steven Spielberg from?{END_SEARCH_QUERY}
[System returns search results]
{BEGIN_SEARCH_RESULT}
Steven Allan Spielberg is an American filmmaker. Born in Cincinnati, Ohio...
{END_SEARCH_RESULT}
So, Steven Spielberg is from the USA. Next, I need to find the director of Casino Royale.
{BEGIN_SEARCH_QUERY}who is the director of Casino Royale 2006?{END_SEARCH_QUERY}
[System returns search results]
{BEGIN_SEARCH_RESULT}
Casino Royale is a 2006 spy film directed by Martin Campbell.
{END_SEARCH_RESULT}
The director of Casino Royale is Martin Campbell. Now I need his nationality.
{BEGIN_SEARCH_QUERY}where is Martin Campbell from?{END_SEARCH_QUERY}
[System returns search results]
{BEGIN_SEARCH_RESULT}
Martin Campbell (born 24 October 1943) is a New Zealand film and television director.
{END_SEARCH_RESULT}
I have all the information. Steven Spielberg is from the USA, and Martin Campbell is from New Zealand. They are not from the same country.
3. **Output Format:**
- **If the web pages provide helpful information for current search query:** Present the information beginning with `**Final Information**` as shown below.
- The language of query **MUST BE** as the same as 'Search Query' or 'Web Pages'.\n"
**Final Information**
Final Answer: No, the directors of Jaws and Casino Royale are not from the same country. Steven Spielberg is from the USA, and Martin Campbell is from New Zealand.
[Helpful information]
---
**Example 2: Simple Fact Retrieval**
- **If the web pages do not provide any helpful information for current search query:** Output the following text.
**Question:** "When was the founder of craigslist born?"
**Final Information**
**Your Thought Process & Actions:**
First, I need to know who founded craigslist.
{BEGIN_SEARCH_QUERY}who founded craigslist?{END_SEARCH_QUERY}
[System returns search results]
{BEGIN_SEARCH_RESULT}
Craigslist was founded in 1995 by Craig Newmark.
{END_SEARCH_RESULT}
The founder is Craig Newmark. Now I need his birth date.
{BEGIN_SEARCH_QUERY}when was Craig Newmark born?{END_SEARCH_QUERY}
[System returns search results]
{BEGIN_SEARCH_RESULT}
Craig Newmark was born on December 6, 1952.
{END_SEARCH_RESULT}
I have found the answer.
No helpful information found.
Final Answer: The founder of craigslist, Craig Newmark, was born on December 6, 1952.
**Inputs:**
- **Previous Reasoning Steps:**
{prev_reasoning}
---
**Important Rules:**
- **One Fact at a Time:** Decompose the problem and issue one search query at a time to find a single, specific piece of information.
- **Be Precise:** Formulate clear and precise search queries. If a search fails, rephrase it.
- **Synthesize at the End:** Do not provide the final answer until you have completed all necessary searches.
- **Language Consistency:** Your search queries should be in the same language as the user's question.
- **Current Search Query:**
{search_query}
Now, begin your work. Please answer the following question by thinking step-by-step.
"""
- **Searched Web Pages:**
{document}
RELEVANT_EXTRACTION_PROMPT = """You are a highly efficient information extraction module. Your sole purpose is to extract the single most relevant piece of information from the provided `Searched Web Pages` that directly answers the `Current Search Query`.
"""
**Your Task:**
1. Read the `Current Search Query` to understand what specific information is needed.
2. Scan the `Searched Web Pages` to find the answer to that query.
3. Extract only the essential, factual information that answers the query. Be concise.
**Context (For Your Information Only):**
The `Previous Reasoning Steps` are provided to give you context on the overall goal, but your primary focus MUST be on answering the `Current Search Query`. Do not use information from the previous steps in your output.
**Output Format:**
Your response must follow one of two formats precisely.
1. **If a direct and relevant answer is found:**
- Start your response immediately with `Final Information`.
- Provide only the extracted fact(s). Do not add any extra conversational text.
*Example:*
`Current Search Query`: Where is Martin Campbell from?
`Searched Web Pages`: [Long article snippet about Martin Campbell's career, which includes the sentence "Martin Campbell (born 24 October 1943) is a New Zealand film and television director..."]
*Your Output:*
Final Information
Martin Campbell is a New Zealand film and television director.
2. **If no relevant answer that directly addresses the query is found in the web pages:**
- Start your response immediately with `Final Information`.
- Write the exact phrase: `No helpful information found.`
---
**BEGIN TASK**
**Inputs:**
- **Previous Reasoning Steps:**
{prev_reasoning}
- **Current Search Query:**
{search_query}
- **Searched Web Pages:**
{document}
"""

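The rewritten prompts keep the same sentinel-token protocol: the model emits search queries between special tokens and the orchestrator extracts them. A sketch of the extraction side, assuming the query sentinels mirror the result sentinels shown above:

import re

BEGIN_SEARCH_QUERY = "<|begin_search_query|>"  # assumed; only the result tokens appear above
END_SEARCH_QUERY = "<|end_search_query|>"

def extract_search_queries(model_output: str) -> list[str]:
    # Non-greedy match between sentinels; DOTALL tolerates multi-line queries.
    pattern = re.escape(BEGIN_SEARCH_QUERY) + r"(.*?)" + re.escape(END_SEARCH_QUERY)
    return [q.strip() for q in re.findall(pattern, model_output, flags=re.DOTALL)]

out = "First, I need the director. <|begin_search_query|>who directed Jaws?<|end_search_query|>"
print(extract_search_queries(out))  # ['who directed Jaws?']
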
View File

@ -29,6 +29,7 @@ from api.db.db_models import close_connection
from api.db.services import UserService
from api.utils import CustomJSONEncoder, commands
from flask_mail import Mail
from flask_session import Session
from flask_login import LoginManager
from api import settings
@ -40,6 +41,7 @@ __all__ = ["app"]
Request.json = property(lambda self: self.get_json(force=True, silent=True))
app = Flask(__name__)
smtp_mail_server = Mail()
# Add this at the beginning of your file to configure Swagger UI
swagger_config = {
@ -146,16 +148,16 @@ def load_user(web_request):
if authorization:
try:
access_token = str(jwt.loads(authorization))
if not access_token or not access_token.strip():
logging.warning("Authentication attempt with empty access token")
return None
# Access tokens should be UUIDs (32 hex characters)
if len(access_token.strip()) < 32:
logging.warning(f"Authentication attempt with invalid token format: {len(access_token)} chars")
return None
user = UserService.query(
access_token=access_token, status=StatusEnum.VALID.value
)
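
The load_user change rejects obviously malformed tokens before any database round trip. The same guard as a free function (the 32-character lower bound follows the comment that access tokens are UUID hex strings):

import logging

def is_plausible_access_token(access_token) -> bool:
    if not access_token or not str(access_token).strip():
        logging.warning("Authentication attempt with empty access token")
        return False
    if len(str(access_token).strip()) < 32:  # UUIDs are 32 hex characters
        logging.warning("Authentication attempt with invalid token format")
        return False
    return True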

View File

@ -32,8 +32,7 @@ from api.db.services.user_service import TenantService
from api.db.services.user_canvas_version import UserCanvasVersionService
from api.settings import RetCode
from api.utils import get_uuid
from api.utils.api_utils import get_json_result, server_error_response, validate_request, get_data_error_result, \
get_error_data_result
from api.utils.api_utils import get_json_result, server_error_response, validate_request, get_data_error_result
from agent.canvas import Canvas
from peewee import MySQLDatabase, PostgresqlDatabase
from api.db.db_models import APIToken
@ -62,7 +61,7 @@ def canvas_list():
@login_required
def rm():
for i in request.json["canvas_ids"]:
if not UserCanvasService.query(user_id=current_user.id,id=i):
if not UserCanvasService.accessible(i, current_user.id):
return get_json_result(
data=False, message='Only owner of canvas authorized for this operation.',
code=RetCode.OPERATING_ERROR)
@ -75,23 +74,23 @@ def rm():
@login_required
def save():
req = request.json
req["user_id"] = current_user.id
if not isinstance(req["dsl"], str):
req["dsl"] = json.dumps(req["dsl"], ensure_ascii=False)
req["dsl"] = json.loads(req["dsl"])
if "id" not in req:
req["user_id"] = current_user.id
if UserCanvasService.query(user_id=current_user.id, title=req["title"].strip()):
return get_data_error_result(message=f"{req['title'].strip()} already exists.")
req["id"] = get_uuid()
if not UserCanvasService.save(**req):
return get_data_error_result(message="Fail to save canvas.")
else:
if not UserCanvasService.query(user_id=current_user.id, id=req["id"]):
if not UserCanvasService.accessible(req["id"], current_user.id):
return get_json_result(
data=False, message='Only owner of canvas authorized for this operation.',
code=RetCode.OPERATING_ERROR)
UserCanvasService.update_by_id(req["id"], req)
# save version
# save version
UserCanvasVersionService.insert( user_canvas_id=req["id"], dsl=req["dsl"], title="{0}_{1}".format(req["title"], time.strftime("%Y_%m_%d_%H_%M_%S")))
UserCanvasVersionService.delete_all_versions(req["id"])
return get_json_result(data=req)
@ -100,9 +99,9 @@ def save():
@manager.route('/get/<canvas_id>', methods=['GET']) # noqa: F821
@login_required
def get(canvas_id):
e, c = UserCanvasService.get_by_tenant_id(canvas_id)
if not e or c["user_id"] != current_user.id:
if not UserCanvasService.accessible(canvas_id, current_user.id):
return get_data_error_result(message="canvas not found.")
e, c = UserCanvasService.get_by_tenant_id(canvas_id)
return get_json_result(data=c)
@ -116,6 +115,12 @@ def getsse(canvas_id):
if not objs:
return get_data_error_result(message='Authentication error: API key is invalid!')
tenant_id = objs[0].tenant_id
if not UserCanvasService.query(user_id=tenant_id, id=canvas_id):
return get_json_result(
data=False,
message='Only owner of canvas authorized for this operation.',
code=RetCode.OPERATING_ERROR
)
e, c = UserCanvasService.get_by_id(canvas_id)
if not e or c.user_id != tenant_id:
return get_data_error_result(message="canvas not found.")
@ -131,14 +136,15 @@ def run():
files = req.get("files", [])
inputs = req.get("inputs", {})
user_id = req.get("user_id", current_user.id)
e, cvs = UserCanvasService.get_by_id(req["id"])
if not e:
return get_data_error_result(message="canvas not found.")
if not UserCanvasService.query(user_id=current_user.id, id=req["id"]):
if not UserCanvasService.accessible(req["id"], current_user.id):
return get_json_result(
data=False, message='Only owner of canvas authorized for this operation.',
code=RetCode.OPERATING_ERROR)
e, cvs = UserCanvasService.get_by_id(req["id"])
if not e:
return get_data_error_result(message="canvas not found.")
if not isinstance(cvs.dsl, str):
cvs.dsl = json.dumps(cvs.dsl, ensure_ascii=False)
@ -172,14 +178,14 @@ def run():
@login_required
def reset():
req = request.json
if not UserCanvasService.accessible(req["id"], current_user.id):
return get_json_result(
data=False, message='Only owner of canvas authorized for this operation.',
code=RetCode.OPERATING_ERROR)
try:
e, user_canvas = UserCanvasService.get_by_id(req["id"])
if not e:
return get_data_error_result(message="canvas not found.")
if not UserCanvasService.query(user_id=current_user.id, id=req["id"]):
return get_json_result(
data=False, message='Only owner of canvas authorized for this operation.',
code=RetCode.OPERATING_ERROR)
canvas = Canvas(json.dumps(user_canvas.dsl), current_user.id)
canvas.reset()
@ -290,15 +296,12 @@ def input_form():
@login_required
def debug():
req = request.json
if not UserCanvasService.accessible(req["id"], current_user.id):
return get_json_result(
data=False, message='Only owner of canvas authorized for this operation.',
code=RetCode.OPERATING_ERROR)
try:
e, user_canvas = UserCanvasService.get_by_id(req["id"])
if not e:
return get_data_error_result(message="canvas not found.")
if not UserCanvasService.query(user_id=current_user.id, id=req["id"]):
return get_json_result(
data=False, message='Only owner of canvas authorized for this operation.',
code=RetCode.OPERATING_ERROR)
canvas = Canvas(json.dumps(user_canvas.dsl), current_user.id)
canvas.reset()
canvas.message_id = get_uuid()
@ -350,7 +353,7 @@ def test_db_connect():
if req["db_type"] != 'mssql':
db.connect()
db.close()
return get_json_result(data="Database Connection Successful!")
except Exception as e:
return server_error_response(e)
@ -372,7 +375,7 @@ def getlistversion(canvas_id):
@login_required
def getversion( version_id):
try:
e, version = UserCanvasVersionService.get_by_id(version_id)
if version:
return get_json_result(data=version.to_dict())
@ -382,7 +385,7 @@ def getversion( version_id):
@manager.route('/listteam', methods=['GET']) # noqa: F821
@login_required
def list_kbs():
def list_canvas():
keywords = request.args.get("keywords", "")
page_number = int(request.args.get("page", 1))
items_per_page = int(request.args.get("page_size", 150))
@ -390,10 +393,10 @@ def list_kbs():
desc = request.args.get("desc", True)
try:
tenants = TenantService.get_joined_tenants_by_user_id(current_user.id)
kbs, total = UserCanvasService.get_by_tenant_ids(
canvas, total = UserCanvasService.get_by_tenant_ids(
[m["tenant_id"] for m in tenants], current_user.id, page_number,
items_per_page, orderby, desc, keywords)
return get_json_result(data={"kbs": kbs, "total": total})
return get_json_result(data={"canvas": canvas, "total": total})
except Exception as e:
return server_error_response(e)
@ -404,6 +407,12 @@ def list_kbs():
def setting():
req = request.json
req["user_id"] = current_user.id
if not UserCanvasService.accessible(req["id"], current_user.id):
return get_json_result(
data=False, message='Only owner of canvas authorized for this operation.',
code=RetCode.OPERATING_ERROR)
e,flow = UserCanvasService.get_by_id(req["id"])
if not e:
return get_data_error_result(message="canvas not found.")
@ -415,10 +424,7 @@ def setting():
flow["permission"] = req["permission"]
if req["avatar"]:
flow["avatar"] = req["avatar"]
if not UserCanvasService.query(user_id=current_user.id, id=req["id"]):
return get_json_result(
data=False, message='Only owner of canvas authorized for this operation.',
code=RetCode.OPERATING_ERROR)
num= UserCanvasService.update_by_id(req["id"], flow)
return get_json_result(data=num)
@ -441,8 +447,10 @@ def trace():
@login_required
def sessions(canvas_id):
tenant_id = current_user.id
if not UserCanvasService.query(user_id=tenant_id, id=canvas_id):
return get_error_data_result(message=f"You don't own the agent {canvas_id}.")
if not UserCanvasService.accessible(canvas_id, tenant_id):
return get_json_result(
data=False, message='Only owner of canvas authorized for this operation.',
code=RetCode.OPERATING_ERROR)
user_id = request.args.get("user_id")
page_number = int(request.args.get("page", 1))
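
The recurring edit in this file swaps the owner-only check `UserCanvasService.query(user_id=..., id=...)` for `UserCanvasService.accessible(canvas_id, user_id)` and runs it before the record is even fetched. The implementation of `accessible` is not shown here; a plausible shape, with the team-sharing branch as an explicit assumption:

def accessible(canvas_id, user_id, get_canvas, team_members) -> bool:
    """True if the user owns the canvas or (assumed) shares its owner's team."""
    canvas = get_canvas(canvas_id)      # returns None when missing
    if canvas is None:
        return False
    if canvas["user_id"] == user_id:
        return True                     # owner always has access
    # Assumption: permission == "team" opens the canvas to teammates.
    return canvas.get("permission") == "team" and user_id in team_members(canvas["user_id"])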

View File

@ -23,15 +23,18 @@ from flask_login import current_user, login_required
from api import settings
from api.db import LLMType, ParserType
from api.db.services.dialog_service import meta_filter
from api.db.services.document_service import DocumentService
from api.db.services.knowledgebase_service import KnowledgebaseService
from api.db.services.llm_service import LLMBundle
from api.db.services.search_service import SearchService
from api.db.services.user_service import UserTenantService
from api.utils.api_utils import get_data_error_result, get_json_result, server_error_response, validate_request
from rag.app.qa import beAdoc, rmPrefix
from rag.app.tag import label_question
from rag.nlp import rag_tokenizer, search
from rag.prompts import cross_languages, keyword_extraction
from rag.prompts.prompts import gen_meta_filter
from rag.settings import PAGERANK_FLD
from rag.utils import rmSpace
@ -288,13 +291,26 @@ def retrieval_test():
if isinstance(kb_ids, str):
kb_ids = [kb_ids]
doc_ids = req.get("doc_ids", [])
similarity_threshold = float(req.get("similarity_threshold", 0.0))
vector_similarity_weight = float(req.get("vector_similarity_weight", 0.3))
use_kg = req.get("use_kg", False)
top = int(req.get("top_k", 1024))
langs = req.get("cross_languages", [])
tenant_ids = []
if req.get("search_id", ""):
search_config = SearchService.get_detail(req.get("search_id", "")).get("search_config", {})
meta_data_filter = search_config.get("meta_data_filter", {})
metas = DocumentService.get_meta_by_kbs(kb_ids)
if meta_data_filter.get("method") == "auto":
chat_mdl = LLMBundle(current_user.id, LLMType.CHAT, llm_name=search_config.get("chat_id", ""))
filters = gen_meta_filter(chat_mdl, metas, question)
doc_ids.extend(meta_filter(metas, filters))
if not doc_ids:
doc_ids = None
elif meta_data_filter.get("method") == "manual":
doc_ids.extend(meta_filter(metas, meta_data_filter["manual"]))
if not doc_ids:
doc_ids = None
try:
tenants = UserTenantService.query(user_id=current_user.id)
for kb_id in kb_ids:
@ -327,7 +343,9 @@ def retrieval_test():
labels = label_question(question, [kb])
ranks = settings.retrievaler.retrieval(question, embd_mdl, tenant_ids, kb_ids, page, size,
similarity_threshold, vector_similarity_weight, top,
float(req.get("similarity_threshold", 0.0)),
float(req.get("vector_similarity_weight", 0.3)),
top,
doc_ids, rerank_mdl=rerank_mdl, highlight=req.get("highlight"),
rank_feature=labels
)
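
The retrieval_test change narrows `doc_ids` with a metadata filter whose method is either "auto" (a chat model turns the question into filter conditions) or "manual". The dispatch, stripped of the service layer (`gen_filters` and `meta_filter` stand in for gen_meta_filter and meta_filter):

def apply_meta_filter(search_config, metas, question, doc_ids, gen_filters, meta_filter):
    method = search_config.get("meta_data_filter", {}).get("method")
    if method == "auto":
        doc_ids.extend(meta_filter(metas, gen_filters(metas, question)))
    elif method == "manual":
        doc_ids.extend(meta_filter(metas, search_config["meta_data_filter"]["manual"]))
    return doc_ids or None  # None downstream means "no document restriction"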

View File

@ -17,22 +17,19 @@ import json
import re
import traceback
from copy import deepcopy
import trio
from flask import Response, request
from flask_login import current_user, login_required
from api import settings
from api.db import LLMType
from api.db.db_models import APIToken
from api.db.services.conversation_service import ConversationService, structure_answer
from api.db.services.dialog_service import DialogService, ask, chat
from api.db.services.knowledgebase_service import KnowledgebaseService
from api.db.services.llm_service import LLMBundle, TenantService
from api.db.services.user_service import UserTenantService
from api.db.services.dialog_service import DialogService, ask, chat, gen_mindmap
from api.db.services.llm_service import LLMBundle
from api.db.services.search_service import SearchService
from api.db.services.tenant_llm_service import TenantLLMService
from api.db.services.user_service import TenantService, UserTenantService
from api.utils.api_utils import get_data_error_result, get_json_result, server_error_response, validate_request
from graphrag.general.mind_map_extractor import MindMapExtractor
from rag.app.tag import label_question
from rag.prompts.prompt_template import load_prompt
from rag.prompts.prompts import chunks_format
@ -66,7 +63,14 @@ def set_conversation():
e, dia = DialogService.get_by_id(req["dialog_id"])
if not e:
return get_data_error_result(message="Dialog not found")
conv = {"id": conv_id, "dialog_id": req["dialog_id"], "name": name, "message": [{"role": "assistant", "content": dia.prompt_config["prologue"]}],"user_id": current_user.id}
conv = {
"id": conv_id,
"dialog_id": req["dialog_id"],
"name": name,
"message": [{"role": "assistant", "content": dia.prompt_config["prologue"]}],
"user_id": current_user.id,
"reference": [],
}
ConversationService.save(**conv)
return get_json_result(data=conv)
except Exception as e:
@ -173,6 +177,21 @@ def completion():
continue
msg.append(m)
message_id = msg[-1].get("id")
chat_model_id = req.get("llm_id", "")
req.pop("llm_id", None)
chat_model_config = {}
for model_config in [
"temperature",
"top_p",
"frequency_penalty",
"presence_penalty",
"max_tokens",
]:
config = req.get(model_config)
if config:
chat_model_config[model_config] = config
try:
e, conv = ConversationService.get_by_id(req["conversation_id"])
if not e:
@ -186,23 +205,26 @@ def completion():
if not conv.reference:
conv.reference = []
else:
for ref in conv.reference:
if isinstance(ref, list):
continue
ref["chunks"] = chunks_format(ref)
if not conv.reference:
conv.reference = []
conv.reference = [r for r in conv.reference if r]
conv.reference.append({"chunks": [], "doc_aggs": []})
if chat_model_id:
if not TenantLLMService.get_api_key(tenant_id=dia.tenant_id, model_name=chat_model_id):
req.pop("chat_model_id", None)
req.pop("chat_model_config", None)
return get_data_error_result(message=f"Cannot use specified model {chat_model_id}.")
dia.llm_id = chat_model_id
dia.llm_setting = chat_model_config
is_embedded = bool(chat_model_id)
def stream():
nonlocal dia, msg, req, conv
try:
for ans in chat(dia, msg, True, **req):
ans = structure_answer(conv, ans, message_id, conv.id)
yield "data:" + json.dumps({"code": 0, "message": "", "data": ans}, ensure_ascii=False) + "\n\n"
ConversationService.update_by_id(conv.id, conv.to_dict())
if not is_embedded:
ConversationService.update_by_id(conv.id, conv.to_dict())
except Exception as e:
traceback.print_exc()
yield "data:" + json.dumps({"code": 500, "message": str(e), "data": {"answer": "**ERROR**: " + str(e), "reference": []}}, ensure_ascii=False) + "\n\n"
@ -220,7 +242,8 @@ def completion():
answer = None
for ans in chat(dia, msg, **req):
answer = structure_answer(conv, ans, message_id, conv.id)
ConversationService.update_by_id(conv.id, conv.to_dict())
if not is_embedded:
ConversationService.update_by_id(conv.id, conv.to_dict())
break
return get_json_result(data=answer)
except Exception as e:
@ -316,10 +339,18 @@ def ask_about():
req = request.json
uid = current_user.id
search_id = req.get("search_id", "")
search_app = None
search_config = {}
if search_id:
search_app = SearchService.get_detail(search_id)
if search_app:
search_config = search_app.get("search_config", {})
def stream():
nonlocal req, uid
try:
for ans in ask(req["question"], req["kb_ids"], uid):
for ans in ask(req["question"], req["kb_ids"], uid, search_config=search_config):
yield "data:" + json.dumps({"code": 0, "message": "", "data": ans}, ensure_ascii=False) + "\n\n"
except Exception as e:
yield "data:" + json.dumps({"code": 500, "message": str(e), "data": {"answer": "**ERROR**: " + str(e), "reference": []}}, ensure_ascii=False) + "\n\n"
@ -338,18 +369,14 @@ def ask_about():
@validate_request("question", "kb_ids")
def mindmap():
req = request.json
kb_ids = req["kb_ids"]
e, kb = KnowledgebaseService.get_by_id(kb_ids[0])
if not e:
return get_data_error_result(message="Knowledgebase not found!")
search_id = req.get("search_id", "")
search_app = SearchService.get_detail(search_id) if search_id else {}
search_config = search_app.get("search_config", {}) if search_app else {}
kb_ids = search_config.get("kb_ids", [])
kb_ids.extend(req["kb_ids"])
kb_ids = list(set(kb_ids))
embd_mdl = LLMBundle(kb.tenant_id, LLMType.EMBEDDING, llm_name=kb.embd_id)
chat_mdl = LLMBundle(current_user.id, LLMType.CHAT)
question = req["question"]
ranks = settings.retrievaler.retrieval(question, embd_mdl, kb.tenant_id, kb_ids, 1, 12, 0.3, 0.3, aggs=False, rank_feature=label_question(question, [kb]))
mindmap = MindMapExtractor(chat_mdl)
mind_map = trio.run(mindmap, [c["content_with_weight"] for c in ranks["chunks"]])
mind_map = mind_map.output
mind_map = gen_mindmap(req["question"], kb_ids, search_app.get("tenant_id", current_user.id), search_config)
if "error" in mind_map:
return server_error_response(Exception(mind_map["error"]))
return get_json_result(data=mind_map)
@ -360,41 +387,20 @@ def mindmap():
@validate_request("question")
def related_questions():
req = request.json
search_id = req.get("search_id", "")
search_config = {}
if search_id:
if search_app := SearchService.get_detail(search_id):
search_config = search_app.get("search_config", {})
question = req["question"]
chat_mdl = LLMBundle(current_user.id, LLMType.CHAT)
prompt = """
Role: You are an AI language model assistant tasked with generating 5-10 related questions based on a users original query. These questions should help expand the search query scope and improve search relevance.
Instructions:
Input: You are provided with a users question.
Output: Generate 5-10 alternative questions that are related to the original user question. These alternatives should help retrieve a broader range of relevant documents from a vector database.
Context: Focus on rephrasing the original question in different ways, making sure the alternative questions are diverse but still connected to the topic of the original query. Do not create overly obscure, irrelevant, or unrelated questions.
Fallback: If you cannot generate any relevant alternatives, do not return any questions.
Guidance:
1. Each alternative should be unique but still relevant to the original query.
2. Keep the phrasing clear, concise, and easy to understand.
3. Avoid overly technical jargon or specialized terms unless directly relevant.
4. Ensure that each question contributes towards improving search results by broadening the search angle, not narrowing it.
chat_id = search_config.get("chat_id", "")
chat_mdl = LLMBundle(current_user.id, LLMType.CHAT, chat_id)
Example:
Original Question: What are the benefits of electric vehicles?
Alternative Questions:
1. How do electric vehicles impact the environment?
2. What are the advantages of owning an electric car?
3. What is the cost-effectiveness of electric vehicles?
4. How do electric vehicles compare to traditional cars in terms of fuel efficiency?
5. What are the environmental benefits of switching to electric cars?
6. How do electric vehicles help reduce carbon emissions?
7. Why are electric vehicles becoming more popular?
8. What are the long-term savings of using electric vehicles?
9. How do electric vehicles contribute to sustainability?
10. What are the key benefits of electric vehicles for consumers?
Reason:
Rephrasing the original query into multiple alternative questions helps the user explore different aspects of their search topic, improving the quality of search results.
These questions guide the search engine to provide a more comprehensive set of relevant documents.
"""
gen_conf = search_config.get("llm_setting", {"temperature": 0.9})
prompt = load_prompt("related_question")
ans = chat_mdl.chat(
prompt,
[
@ -406,6 +412,6 @@ Related search terms:
""",
}
],
{"temperature": 0.9},
gen_conf,
)
return get_json_result(data=[re.sub(r"^[0-9]\. ", "", a) for a in ans.split("\n") if re.match(r"^[0-9]\. ", a)])
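
In the completion handler, an embedded caller may now override the dialog's model: `llm_id` is popped from the request and only a fixed whitelist of sampling parameters is collected. The collection step in isolation:

ALLOWED_KEYS = ("temperature", "top_p", "frequency_penalty", "presence_penalty", "max_tokens")

def extract_chat_model_config(req: dict):
    chat_model_id = req.pop("llm_id", "")  # empty string keeps the dialog default
    # Truthiness test mirrors the diff: falsy values (None, 0) are dropped.
    config = {k: req[k] for k in ALLOWED_KEYS if req.get(k)}
    return chat_model_id, config

model_id, cfg = extract_chat_model_config({"llm_id": "qwen-max", "temperature": 0.2, "foo": 1})
print(model_id, cfg)  # qwen-max {'temperature': 0.2}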

View File

@ -16,9 +16,10 @@
from flask import request
from flask_login import login_required, current_user
from api.db.services import duplicate_name
from api.db.services.dialog_service import DialogService
from api.db import StatusEnum
from api.db.services.llm_service import TenantLLMService
from api.db.services.tenant_llm_service import TenantLLMService
from api.db.services.knowledgebase_service import KnowledgebaseService
from api.db.services.user_service import TenantService, UserTenantService
from api import settings
@ -32,7 +33,8 @@ from api.utils.api_utils import get_json_result
@login_required
def set_dialog():
req = request.json
dialog_id = req.get("dialog_id")
dialog_id = req.get("dialog_id", "")
is_create = not dialog_id
name = req.get("name", "New Dialog")
if not isinstance(name, str):
return get_data_error_result(message="Dialog name must be string.")
@ -40,6 +42,15 @@ def set_dialog():
return get_data_error_result(message="Dialog name can't be empty.")
if len(name.encode("utf-8")) > 255:
return get_data_error_result(message=f"Dialog name length is {len(name)} which is larger than 255")
if is_create and DialogService.query(tenant_id=current_user.id, name=name.strip()):
name = name.strip()
name = duplicate_name(
DialogService.query,
name=name,
tenant_id=current_user.id,
status=StatusEnum.VALID.value)
description = req.get("description", "A helpful dialog")
icon = req.get("icon", "")
top_n = req.get("top_n", 6)
@ -50,17 +61,19 @@ def set_dialog():
similarity_threshold = req.get("similarity_threshold", 0.1)
vector_similarity_weight = req.get("vector_similarity_weight", 0.3)
llm_setting = req.get("llm_setting", {})
meta_data_filter = req.get("meta_data_filter", {})
prompt_config = req["prompt_config"]
if not req.get("kb_ids", []) and not prompt_config.get("tavily_api_key") and "{knowledge}" in prompt_config['system']:
return get_data_error_result(message="Please remove `{knowledge}` in system prompt since no knowledge base/Tavily used here.")
if not is_create:
if not req.get("kb_ids", []) and not prompt_config.get("tavily_api_key") and "{knowledge}" in prompt_config['system']:
return get_data_error_result(message="Please remove `{knowledge}` in system prompt since no knowledge base/Tavily used here.")
for p in prompt_config["parameters"]:
if p["optional"]:
continue
if prompt_config["system"].find("{%s}" % p["key"]) < 0:
return get_data_error_result(
message="Parameter '{}' is not used".format(p["key"]))
for p in prompt_config["parameters"]:
if p["optional"]:
continue
if prompt_config["system"].find("{%s}" % p["key"]) < 0:
return get_data_error_result(
message="Parameter '{}' is not used".format(p["key"]))
try:
e, tenant = TenantService.get_by_id(current_user.id)
@ -83,6 +96,7 @@ def set_dialog():
"llm_id": llm_id,
"llm_setting": llm_setting,
"prompt_config": prompt_config,
"meta_data_filter": meta_data_filter,
"top_n": top_n,
"top_k": top_k,
"rerank_id": rerank_id,
@ -153,6 +167,43 @@ def list_dialogs():
return server_error_response(e)
@manager.route('/next', methods=['POST']) # noqa: F821
@login_required
def list_dialogs_next():
keywords = request.args.get("keywords", "")
page_number = int(request.args.get("page", 0))
items_per_page = int(request.args.get("page_size", 0))
parser_id = request.args.get("parser_id")
orderby = request.args.get("orderby", "create_time")
if request.args.get("desc", "true").lower() == "false":
desc = False
else:
desc = True
req = request.get_json()
owner_ids = req.get("owner_ids", [])
try:
if not owner_ids:
# tenants = TenantService.get_joined_tenants_by_user_id(current_user.id)
# tenants = [tenant["tenant_id"] for tenant in tenants]
tenants = [] # keep it here
dialogs, total = DialogService.get_by_tenant_ids(
tenants, current_user.id, page_number,
items_per_page, orderby, desc, keywords, parser_id)
else:
tenants = owner_ids
dialogs, total = DialogService.get_by_tenant_ids(
tenants, current_user.id, 0,
0, orderby, desc, keywords, parser_id)
dialogs = [dialog for dialog in dialogs if dialog["tenant_id"] in tenants]
total = len(dialogs)
if page_number and items_per_page:
dialogs = dialogs[(page_number-1)*items_per_page:page_number*items_per_page]
return get_json_result(data={"dialogs": dialogs, "total": total})
except Exception as e:
return server_error_response(e)
@manager.route('/rm', methods=['POST']) # noqa: F821
@login_required
@validate_request("dialog_ids")
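
On create, set_dialog now de-duplicates the requested name with `duplicate_name` instead of rejecting it outright. The helper's body is not shown in this diff; a plausible minimal version that probes with the same query-style callable (the suffix format is an assumption):

def duplicate_name(query, name, **kwargs):
    # Sketch only; the real api.db.services.duplicate_name may differ.
    candidate, n = name, 0
    while query(name=candidate, **kwargs):
        n += 1
        candidate = f"{name}({n})"  # assumed suffix format
    return candidate

taken = {"New Dialog", "New Dialog(1)"}
print(duplicate_name(lambda name, **kw: name in taken, "New Dialog"))  # New Dialog(2)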

View File

@ -166,6 +166,17 @@ def create():
if DocumentService.query(name=req["name"], kb_id=kb_id):
return get_data_error_result(message="Duplicated document name in the same knowledgebase.")
kb_root_folder = FileService.get_kb_folder(kb.tenant_id)
if not kb_root_folder:
return get_data_error_result(message="Cannot find the root folder.")
kb_folder = FileService.new_a_file_from_kb(
kb.tenant_id,
kb.name,
kb_root_folder["id"],
)
if not kb_folder:
return get_data_error_result(message="Cannot find the kb folder for this file.")
doc = DocumentService.insert(
{
"id": get_uuid(),
@ -180,6 +191,9 @@ def create():
"size": 0,
}
)
FileService.add_file_from_kb(doc.to_dict(), kb_folder["id"], kb.tenant_id)
return get_json_result(data=doc.to_json())
except Exception as e:
return server_error_response(e)
@ -206,6 +220,8 @@ def list_docs():
desc = False
else:
desc = True
create_time_from = int(request.args.get("create_time_from", 0))
create_time_to = int(request.args.get("create_time_to", 0))
req = request.get_json()
@ -226,6 +242,14 @@ def list_docs():
try:
docs, tol = DocumentService.get_by_kb_id(kb_id, page_number, items_per_page, orderby, desc, keywords, run_status, types, suffix)
if create_time_from or create_time_to:
filtered_docs = []
for doc in docs:
doc_create_time = doc.get("create_time", 0)
if (create_time_from == 0 or doc_create_time >= create_time_from) and (create_time_to == 0 or doc_create_time <= create_time_to):
filtered_docs.append(doc)
docs = filtered_docs
for doc_item in docs:
if doc_item["thumbnail"] and not doc_item["thumbnail"].startswith(IMG_BASE64_PREFIX):
doc_item["thumbnail"] = f"/v1/document/image/{kb_id}-{doc_item['thumbnail']}"
@ -657,6 +681,11 @@ def set_meta():
return get_json_result(data=False, message="No authorization.", code=settings.RetCode.AUTHENTICATION_ERROR)
try:
meta = json.loads(req["meta"])
if not isinstance(meta, dict):
return get_json_result(data=False, message="Only dictionary type supported.", code=settings.RetCode.ARGUMENT_ERROR)
for k,v in meta.items():
if not isinstance(v, str) and not isinstance(v, int) and not isinstance(v, float):
return get_json_result(data=False, message=f"The type is not supported: {v}", code=settings.RetCode.ARGUMENT_ERROR)
except Exception as e:
return get_json_result(data=False, message=f"Json syntax error: {e}", code=settings.RetCode.ARGUMENT_ERROR)
if not isinstance(meta, dict):
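
list_docs gains an optional creation-time window where 0 means "unbounded" on that side. The filter above, restated as a pure function:

def filter_by_create_time(docs, create_time_from=0, create_time_to=0):
    if not (create_time_from or create_time_to):
        return docs  # no window requested
    return [
        doc for doc in docs
        if (create_time_from == 0 or doc.get("create_time", 0) >= create_time_from)
        and (create_time_to == 0 or doc.get("create_time", 0) <= create_time_to)
    ]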

View File

@ -247,7 +247,10 @@ def list_tags(kb_id):
code=settings.RetCode.AUTHENTICATION_ERROR
)
tags = settings.retrievaler.all_tags(current_user.id, [kb_id])
tenants = UserTenantService.get_tenants_by_user_id(current_user.id)
tags = []
for tenant in tenants:
tags += settings.retrievaler.all_tags(tenant["tenant_id"], [kb_id])
return get_json_result(data=tags)
@ -263,7 +266,10 @@ def list_tags_from_kbs():
code=settings.RetCode.AUTHENTICATION_ERROR
)
tags = settings.retrievaler.all_tags(current_user.id, kb_ids)
tenants = UserTenantService.get_tenants_by_user_id(current_user.id)
tags = []
for tenant in tenants:
tags += settings.retrievaler.all_tags(tenant["tenant_id"], kb_ids)
return get_json_result(data=tags)
@ -345,6 +351,7 @@ def knowledge_graph(kb_id):
obj["graph"]["edges"] = sorted(filtered_edges, key=lambda x: x.get("weight", 0), reverse=True)[:128]
return get_json_result(data=obj)
@manager.route('/<kb_id>/knowledge_graph', methods=['DELETE']) # noqa: F821
@login_required
def delete_knowledge_graph(kb_id):
@ -358,3 +365,17 @@ def delete_knowledge_graph(kb_id):
settings.docStoreConn.delete({"knowledge_graph_kwd": ["graph", "subgraph", "entity", "relation"]}, search.index_name(kb.tenant_id), kb_id)
return get_json_result(data=True)
@manager.route("/get_meta", methods=["GET"]) # noqa: F821
@login_required
def get_meta():
kb_ids = request.args.get("kb_ids", "").split(",")
for kb_id in kb_ids:
if not KnowledgebaseService.accessible(kb_id, current_user.id):
return get_json_result(
data=False,
message='No authorization.',
code=settings.RetCode.AUTHENTICATION_ERROR
)
return get_json_result(data=DocumentService.get_meta_by_kbs(kb_ids))
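
The tag endpoints above switch from querying only the current user's index to unioning tags across every tenant the user has joined. The aggregation shape (`get_tenants` and `all_tags` are stand-ins for the services in the diff):

def tags_for_user(user_id, kb_ids, get_tenants, all_tags):
    tags = []
    for tenant in get_tenants(user_id):  # every tenant the user has joined
        tags += all_tags(tenant["tenant_id"], kb_ids)
    return tags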

View File

@ -15,16 +15,16 @@
#
import logging
import json
import base64
from flask import request
from flask_login import login_required, current_user
from api.db.services.llm_service import LLMFactoriesService, TenantLLMService, LLMService
from api.db.services.tenant_llm_service import LLMFactoriesService, TenantLLMService
from api.db.services.llm_service import LLMService
from api import settings
from api.utils.api_utils import server_error_response, get_data_error_result, validate_request
from api.db import StatusEnum, LLMType
from api.db.db_models import TenantLLM
from api.utils.api_utils import get_json_result
from api.utils.base64_image import test_image_base64
from api.utils.base64_image import test_image
from rag.llm import EmbeddingModel, ChatModel, RerankModel, CvModel, TTSModel
@ -58,6 +58,7 @@ def set_api_key():
# test if api key works
chat_passed, embd_passed, rerank_passed = False, False, False
factory = req["llm_factory"]
extra = {"provider": factory}
msg = ""
for llm in LLMService.query(fid=factory):
if not embd_passed and llm.model_type == LLMType.EMBEDDING.value:
@ -74,7 +75,7 @@ def set_api_key():
elif not chat_passed and llm.model_type == LLMType.CHAT.value:
assert factory in ChatModel, f"Chat model from {factory} is not supported yet."
mdl = ChatModel[factory](
req["api_key"], llm.llm_name, base_url=req.get("base_url"))
req["api_key"], llm.llm_name, base_url=req.get("base_url"), **extra)
try:
m, tc = mdl.chat(None, [{"role": "user", "content": "Hello! How are you doing!"}],
{"temperature": 0.9, 'max_tokens': 50})
@ -82,7 +83,7 @@ def set_api_key():
raise Exception(m)
chat_passed = True
except Exception as e:
msg += f"\nFail to access model({llm.llm_name}) using this api key." + str(
msg += f"\nFail to access model({llm.fid}/{llm.llm_name}) using this api key." + str(
e)
elif not rerank_passed and llm.model_type == LLMType.RERANK:
assert factory in RerankModel, f"Re-rank model from {factory} is not supported yet."
@ -95,7 +96,7 @@ def set_api_key():
rerank_passed = True
logging.debug(f'passed model rerank {llm.llm_name}')
except Exception as e:
msg += f"\nFail to access model({llm.llm_name}) using this api key." + str(
msg += f"\nFail to access model({llm.fid}/{llm.llm_name}) using this api key." + str(
e)
if any([embd_passed, chat_passed, rerank_passed]):
msg = ''
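The new extra = {"provider": factory} dict is splatted into the model constructors via **extra. A minimal sketch of what the receiving side can look like; the class and attribute names are illustrative, not the actual ChatModel API:

class ExampleChatClient:
    # Illustrative only: shows how a constructor can absorb the optional provider hint.
    def __init__(self, key, model_name, base_url=None, **extra):
        self.key = key
        self.model_name = model_name
        self.base_url = base_url
        self.provider = extra.get("provider")  # present only when the caller passes it

mdl = ExampleChatClient("sk-xxx", "demo-model", provider="OpenAI")
assert mdl.provider == "OpenAI"

Passing the hint as a keyword argument keeps the positional signature unchanged for existing call sites.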
@ -205,6 +206,7 @@ def add_llm():
msg = ""
mdl_nm = llm["llm_name"].split("___")[0]
extra = {"provider": factory}
if llm["model_type"] == LLMType.EMBEDDING.value:
assert factory in EmbeddingModel, f"Embedding model from {factory} is not supported yet."
mdl = EmbeddingModel[factory](
@ -222,7 +224,8 @@ def add_llm():
mdl = ChatModel[factory](
key=llm['api_key'],
model_name=mdl_nm,
base_url=llm["api_base"]
base_url=llm["api_base"],
**extra,
)
try:
m, tc = mdl.chat(None, [{"role": "user", "content": "Hello! How are you doing!"}], {
@ -230,7 +233,7 @@ def add_llm():
if not tc and m.find("**ERROR**:") >= 0:
raise Exception(m)
except Exception as e:
msg += f"\nFail to access model({mdl_nm})." + str(
msg += f"\nFail to access model({factory}/{mdl_nm})." + str(
e)
elif llm["model_type"] == LLMType.RERANK:
assert factory in RerankModel, f"RE-rank model from {factory} is not supported yet."
@ -244,9 +247,9 @@ def add_llm():
if len(arr) == 0:
raise Exception("Not known.")
except KeyError:
msg += f"{factory} dose not support this model({mdl_nm})"
msg += f"{factory} dose not support this model({factory}/{mdl_nm})"
except Exception as e:
msg += f"\nFail to access model({mdl_nm})." + str(
msg += f"\nFail to access model({factory}/{mdl_nm})." + str(
e)
elif llm["model_type"] == LLMType.IMAGE2TEXT.value:
assert factory in CvModel, f"Image to text model from {factory} is not supported yet."
@ -256,12 +259,12 @@ def add_llm():
base_url=llm["api_base"]
)
try:
image_data = base64.b64decode(test_image_base64)
image_data = test_image
m, tc = mdl.describe(image_data)
if not m and not tc:
raise Exception(m)
except Exception as e:
msg += f"\nFail to access model({mdl_nm})." + str(e)
msg += f"\nFail to access model({factory}/{mdl_nm})." + str(e)
elif llm["model_type"] == LLMType.TTS:
assert factory in TTSModel, f"TTS model from {factory} is not supported yet."
mdl = TTSModel[factory](
@ -271,7 +274,7 @@ def add_llm():
for resp in mdl.tts("Hello~ Ragflower!"):
pass
except RuntimeError as e:
msg += f"\nFail to access model({mdl_nm})." + str(e)
msg += f"\nFail to access model({factory}/{mdl_nm})." + str(e)
else:
# TODO: check other type of models
pass
@ -313,12 +316,12 @@ def delete_factory():
def my_llms():
try:
include_details = request.args.get('include_details', 'false').lower() == 'true'
if include_details:
res = {}
objs = TenantLLMService.query(tenant_id=current_user.id)
factories = LLMFactoriesService.query(status=StatusEnum.VALID.value)
for o in objs:
o_dict = o.to_dict()
factory_tags = None
@ -326,13 +329,13 @@ def my_llms():
if f.name == o_dict["llm_factory"]:
factory_tags = f.tags
break
if o_dict["llm_factory"] not in res:
res[o_dict["llm_factory"]] = {
"tags": factory_tags,
"llm": []
}
res[o_dict["llm_factory"]]["llm"].append({
"type": o_dict["model_type"],
"name": o_dict["llm_name"],
@ -353,14 +356,12 @@ def my_llms():
"name": o["llm_name"],
"used_token": o["used_tokens"]
})
return get_json_result(data=res)
except Exception as e:
return server_error_response(e)
@manager.route('/list', methods=['GET']) # noqa: F821
@login_required
def list_app():

View File

@ -21,7 +21,7 @@ from api import settings
from api.db import StatusEnum
from api.db.services.dialog_service import DialogService
from api.db.services.knowledgebase_service import KnowledgebaseService
from api.db.services.llm_service import TenantLLMService
from api.db.services.tenant_llm_service import TenantLLMService
from api.db.services.user_service import TenantService
from api.utils import get_uuid
from api.utils.api_utils import check_duplicate_ids, get_error_data_result, get_result, token_required
@ -99,7 +99,7 @@ def create(tenant_id):
Here is the knowledge base:
{knowledge}
The above is the knowledge base.""",
"prologue": "Hi! I'm your assistant, what can I do for you?",
"prologue": "Hi! I'm your assistant. What can I do for you?",
"parameters": [{"key": "knowledge", "optional": False}],
"empty_response": "Sorry! No relevant content was found in the knowledge base!",
"quote": True,
@ -139,7 +139,7 @@ def create(tenant_id):
res["llm"] = res.pop("llm_setting")
res["llm"]["model_name"] = res.pop("llm_id")
del res["kb_ids"]
res["dataset_ids"] = req["dataset_ids"]
res["dataset_ids"] = req.get("dataset_ids", [])
res["avatar"] = res.pop("icon")
return get_result(data=res)
@ -150,10 +150,10 @@ def update(tenant_id, chat_id):
if not DialogService.query(tenant_id=tenant_id, id=chat_id, status=StatusEnum.VALID.value):
return get_error_data_result(message="You do not own the chat")
req = request.json
ids = req.get("dataset_ids")
ids = req.get("dataset_ids", [])
if "show_quotation" in req:
req["do_refer"] = req.pop("show_quotation")
if ids is not None:
if ids:
for kb_id in ids:
kbs = KnowledgebaseService.accessible(kb_id=kb_id, user_id=tenant_id)
if not kbs:

View File

@ -1,4 +1,4 @@
#
# Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
@ -13,6 +13,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
import logging
from flask import request, jsonify
from api.db import LLMType
@ -22,6 +24,7 @@ from api.db.services.llm_service import LLMBundle
from api import settings
from api.utils.api_utils import validate_request, build_error_result, apikey_required
from rag.app.tag import label_question
from api.db.services.dialog_service import meta_filter
@manager.route('/dify/retrieval', methods=['POST']) # noqa: F821
@ -35,18 +38,23 @@ def retrieval(tenant_id):
retrieval_setting = req.get("retrieval_setting", {})
similarity_threshold = float(retrieval_setting.get("score_threshold", 0.0))
top = int(retrieval_setting.get("top_k", 1024))
metadata_condition = req.get("metadata_condition",{})
metas = DocumentService.get_meta_by_kbs([kb_id])
doc_ids = []
try:
e, kb = KnowledgebaseService.get_by_id(kb_id)
if not e:
return build_error_result(message="Knowledgebase not found!", code=settings.RetCode.NOT_FOUND)
if kb.tenant_id != tenant_id:
return build_error_result(message="Knowledgebase not found!", code=settings.RetCode.NOT_FOUND)
embd_mdl = LLMBundle(kb.tenant_id, LLMType.EMBEDDING.value, llm_name=kb.embd_id)
logging.debug("metadata_condition: %s", metadata_condition)
logging.debug("converted conditions: %s", convert_conditions(metadata_condition))
doc_ids.extend(meta_filter(metas, convert_conditions(metadata_condition)))
logging.debug("doc_ids: %s", doc_ids)
if not doc_ids and metadata_condition is not None:
doc_ids = ['-999']
ranks = settings.retrievaler.retrieval(
question,
embd_mdl,
@ -57,6 +65,7 @@ def retrieval(tenant_id):
similarity_threshold=similarity_threshold,
vector_similarity_weight=0.3,
top=top,
doc_ids=doc_ids,
rank_feature=label_question(question, [kb])
)
@ -65,6 +74,7 @@ def retrieval(tenant_id):
[tenant_id],
[kb_id],
embd_mdl,
doc_ids,
LLMBundle(kb.tenant_id, LLMType.CHAT))
if ck["content_with_weight"]:
ranks["chunks"].insert(0, ck)
@ -73,11 +83,13 @@ def retrieval(tenant_id):
for c in ranks["chunks"]:
e, doc = DocumentService.get_by_id(c["doc_id"])
c.pop("vector", None)
meta = getattr(doc, 'meta_fields', {})
meta["doc_id"] = c["doc_id"]
records.append({
"content": c["content_with_weight"],
"score": c["similarity"],
"title": c["docnm_kwd"],
"metadata": doc.meta_fields
"metadata": meta
})
return jsonify({"records": records})
@ -87,4 +99,22 @@ def retrieval(tenant_id):
message='No chunk found! Check the chunk status please!',
code=settings.RetCode.NOT_FOUND
)
logging.exception(e)
return build_error_result(message=str(e), code=settings.RetCode.SERVER_ERROR)
def convert_conditions(metadata_condition):
if metadata_condition is None:
metadata_condition = {}
op_mapping = {
"is": "=",
"not is": ""
}
return [
{
"op": op_mapping.get(cond["comparison_operator"], cond["comparison_operator"]),
"key": cond["name"],
"value": cond["value"]
}
for cond in metadata_condition.get("conditions", [])
]
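A quick illustration of the mapping, assuming the ≠ operator restored in op_mapping above: a Dify-style metadata_condition is converted into the internal filter format that meta_filter consumes:

condition = {
    "conditions": [
        {"name": "author", "comparison_operator": "is", "value": "alice"},
        {"name": "year", "comparison_operator": "not is", "value": "2023"},
    ]
}
print(convert_conditions(condition))
# [{'op': '=', 'key': 'author', 'value': 'alice'},
#  {'op': '≠', 'key': 'year', 'value': '2023'}]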

View File

@ -32,13 +32,14 @@ from api.db.services.document_service import DocumentService
from api.db.services.file2document_service import File2DocumentService
from api.db.services.file_service import FileService
from api.db.services.knowledgebase_service import KnowledgebaseService
from api.db.services.llm_service import LLMBundle, TenantLLMService
from api.db.services.llm_service import LLMBundle
from api.db.services.tenant_llm_service import TenantLLMService
from api.db.services.task_service import TaskService, queue_tasks
from api.utils.api_utils import check_duplicate_ids, construct_json_result, get_error_data_result, get_parser_config, get_result, server_error_response, token_required
from rag.app.qa import beAdoc, rmPrefix
from rag.app.tag import label_question
from rag.nlp import rag_tokenizer, search
from rag.prompts import keyword_extraction, cross_languages
from rag.prompts import cross_languages, keyword_extraction
from rag.utils import rmSpace
from rag.utils.storage_factory import STORAGE_IMPL
@ -456,6 +457,18 @@ def list_docs(dataset_id, tenant_id):
required: false
default: true
description: Order in descending order.
- in: query
name: create_time_from
type: integer
required: false
default: 0
description: Unix timestamp for filtering documents created after this time. 0 means no filter.
- in: query
name: create_time_to
type: integer
required: false
default: 0
description: Unix timestamp for filtering documents created before this time. 0 means no filter.
- in: header
name: Authorization
type: string
@ -517,6 +530,17 @@ def list_docs(dataset_id, tenant_id):
desc = True
docs, tol = DocumentService.get_list(dataset_id, page, page_size, orderby, desc, keywords, id, name)
create_time_from = int(request.args.get("create_time_from", 0))
create_time_to = int(request.args.get("create_time_to", 0))
if create_time_from or create_time_to:
filtered_docs = []
for doc in docs:
doc_create_time = doc.get("create_time", 0)
if (create_time_from == 0 or doc_create_time >= create_time_from) and (create_time_to == 0 or doc_create_time <= create_time_to):
filtered_docs.append(doc)
docs = filtered_docs
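For example, a client could combine both bounds to fetch documents created inside a window. All values below are hypothetical, and the bounds must use the same unit as the stored create_time field:

import requests

resp = requests.get(
    "http://localhost:9380/api/v1/datasets/<dataset_id>/documents",  # placeholder dataset id
    headers={"Authorization": "Bearer <API_KEY>"},  # placeholder key
    params={"create_time_from": 1735689600000, "create_time_to": 1738368000000},
)
print(resp.json())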
# rename key's name
renamed_doc_list = []
key_mapping = {

View File

@ -21,6 +21,7 @@ import tiktoken
from flask import Response, jsonify, request
from agent.canvas import Canvas
from api import settings
from api.db import LLMType, StatusEnum
from api.db.db_models import APIToken
from api.db.services.api_service import API4ConversationService
@ -28,13 +29,18 @@ from api.db.services.canvas_service import UserCanvasService, completionOpenAI
from api.db.services.canvas_service import completion as agent_completion
from api.db.services.conversation_service import ConversationService, iframe_completion
from api.db.services.conversation_service import completion as rag_completion
from api.db.services.dialog_service import DialogService, ask, chat
from api.db.services.file_service import FileService
from api.db.services.dialog_service import DialogService, ask, chat, gen_mindmap, meta_filter
from api.db.services.document_service import DocumentService
from api.db.services.knowledgebase_service import KnowledgebaseService
from api.db.services.llm_service import LLMBundle
from api.db.services.search_service import SearchService
from api.db.services.user_service import UserTenantService
from api.utils import get_uuid
from api.utils.api_utils import check_duplicate_ids, get_data_openai, get_error_data_result, get_result, token_required, validate_request
from api.utils.api_utils import check_duplicate_ids, get_data_openai, get_error_data_result, get_json_result, get_result, server_error_response, token_required, validate_request
from rag.app.tag import label_question
from rag.prompts import chunks_format
from rag.prompts.prompt_template import load_prompt
from rag.prompts.prompts import cross_languages, gen_meta_filter, keyword_extraction
@manager.route("/chats/<chat_id>/sessions", methods=["POST"]) # noqa: F821
@ -51,6 +57,7 @@ def create(tenant_id, chat_id):
"name": req.get("name", "New session"),
"message": [{"role": "assistant", "content": dia[0].prompt_config.get("prologue")}],
"user_id": req.get("user_id", ""),
"reference": [{}],
}
if not conv.get("name"):
return get_error_data_result(message="`name` can not be empty.")
@ -68,11 +75,7 @@ def create(tenant_id, chat_id):
@manager.route("/agents/<agent_id>/sessions", methods=["POST"]) # noqa: F821
@token_required
def create_agent_session(tenant_id, agent_id):
req = request.json
if not request.is_json:
req = request.form
files = request.files
user_id = request.args.get("user_id", "")
user_id = request.args.get("user_id", tenant_id)
e, cvs = UserCanvasService.get_by_id(agent_id)
if not e:
return get_error_data_result("Agent not found.")
@ -81,45 +84,12 @@ def create_agent_session(tenant_id, agent_id):
if not isinstance(cvs.dsl, str):
cvs.dsl = json.dumps(cvs.dsl, ensure_ascii=False)
canvas = Canvas(cvs.dsl, tenant_id)
session_id = get_uuid()
canvas = Canvas(cvs.dsl, tenant_id, agent_id)
canvas.reset()
query = canvas.get_preset_param()
if query:
for ele in query:
if not ele["optional"]:
if ele["type"] == "file":
if files is None or not files.get(ele["key"]):
return get_error_data_result(f"`{ele['key']}` with type `{ele['type']}` is required")
upload_file = files.get(ele["key"])
file_content = FileService.parse_docs([upload_file], user_id)
file_name = upload_file.filename
ele["value"] = file_name + "\n" + file_content
else:
if req is None or not req.get(ele["key"]):
return get_error_data_result(f"`{ele['key']}` with type `{ele['type']}` is required")
ele["value"] = req[ele["key"]]
else:
if ele["type"] == "file":
if files is not None and files.get(ele["key"]):
upload_file = files.get(ele["key"])
file_content = FileService.parse_docs([upload_file], user_id)
file_name = upload_file.filename
ele["value"] = file_name + "\n" + file_content
else:
if "value" in ele:
ele.pop("value")
else:
if req is not None and req.get(ele["key"]):
ele["value"] = req[ele["key"]]
else:
if "value" in ele:
ele.pop("value")
for ans in canvas.run(stream=False):
pass
cvs.dsl = json.loads(str(canvas))
conv = {"id": get_uuid(), "dialog_id": cvs.id, "user_id": user_id, "message": [{"role": "assistant", "content": canvas.get_prologue()}], "source": "agent", "dsl": cvs.dsl}
conv = {"id": session_id, "dialog_id": cvs.id, "user_id": user_id, "message": [{"role": "assistant", "content": canvas.get_prologue()}], "source": "agent", "dsl": cvs.dsl}
API4ConversationService.save(**conv)
conv["agent_id"] = conv.pop("dialog_id")
return get_result(data=conv)
@ -435,14 +405,38 @@ def agents_completion_openai_compatibility(tenant_id, agent_id):
)
)
# Get the last user message as the question
question = next((m["content"] for m in reversed(messages) if m["role"] == "user"), "")
if req.get("stream", True):
return Response(completionOpenAI(tenant_id, agent_id, question, session_id=req.get("id", req.get("metadata", {}).get("id", "")), stream=True), mimetype="text/event-stream")
stream = req.pop("stream", False)
if stream:
resp = Response(
completionOpenAI(
tenant_id,
agent_id,
question,
session_id=req.get("id", req.get("metadata", {}).get("id", "")),
stream=True,
**req,
),
mimetype="text/event-stream",
)
resp.headers.add_header("Cache-control", "no-cache")
resp.headers.add_header("Connection", "keep-alive")
resp.headers.add_header("X-Accel-Buffering", "no")
resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
return resp
else:
# For non-streaming, just return the response directly
response = next(completionOpenAI(tenant_id, agent_id, question, session_id=req.get("id", req.get("metadata", {}).get("id", "")), stream=False))
response = next(
completionOpenAI(
tenant_id,
agent_id,
question,
session_id=req.get("id", req.get("metadata", {}).get("id", "")),
stream=False,
**req,
)
)
return jsonify(response)
@ -450,41 +444,47 @@ def agents_completion_openai_compatibility(tenant_id, agent_id):
@token_required
def agent_completions(tenant_id, agent_id):
req = request.json
cvs = UserCanvasService.query(user_id=tenant_id, id=agent_id)
if not cvs:
return get_error_data_result(f"You don't own the agent {agent_id}")
if req.get("session_id"):
dsl = cvs[0].dsl
if not isinstance(dsl, str):
dsl = json.dumps(dsl)
conv = API4ConversationService.query(id=req["session_id"], dialog_id=agent_id)
if not conv:
return get_error_data_result(f"You don't own the session {req['session_id']}")
# If an update to UserCanvas is detected, update the API4Conversation.dsl
sync_dsl = req.get("sync_dsl", False)
if sync_dsl is True and cvs[0].update_time > conv[0].update_time:
current_dsl = conv[0].dsl
new_dsl = json.loads(dsl)
state_fields = ["history", "messages", "path", "reference"]
states = {field: current_dsl.get(field, []) for field in state_fields}
current_dsl.update(new_dsl)
current_dsl.update(states)
API4ConversationService.update_by_id(req["session_id"], {"dsl": current_dsl})
else:
req["question"] = ""
ans = {}
if req.get("stream", True):
resp = Response(agent_completion(tenant_id, agent_id, **req), mimetype="text/event-stream")
def generate():
for answer in agent_completion(tenant_id=tenant_id, agent_id=agent_id, **req):
if isinstance(answer, str):
try:
ans = json.loads(answer[5:]) # remove "data:"
except Exception:
continue
if ans.get("event") != "message" or not ans.get("data", {}).get("reference", None):
continue
yield answer
yield "data:[DONE]\n\n"
if req.get("stream", True):
resp = Response(generate(), mimetype="text/event-stream")
resp.headers.add_header("Cache-control", "no-cache")
resp.headers.add_header("Connection", "keep-alive")
resp.headers.add_header("X-Accel-Buffering", "no")
resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
return resp
try:
for answer in agent_completion(tenant_id, agent_id, **req):
return get_result(data=answer)
except Exception as e:
return get_error_data_result(str(e))
full_content = ""
for answer in agent_completion(tenant_id=tenant_id, agent_id=agent_id, **req):
try:
ans = json.loads(answer[5:])
if ans["event"] == "message":
full_content += ans["data"]["content"]
if ans.get("data", {}).get("reference", None):
ans["data"]["content"] = full_content
return get_result(data=ans)
except Exception as e:
return get_result(data=f"**ERROR**: {str(e)}")
return get_result(data=ans)
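The streaming branch now forwards only message events that carry a reference and closes with data:[DONE]. A minimal sketch of a client consuming that stream; the URL and token are placeholders:

import json
import requests

url = "http://localhost:9380/api/v1/agents/<agent_id>/completions"  # placeholder
headers = {"Authorization": "Bearer <API_KEY>"}
with requests.post(url, headers=headers, json={"question": "hi", "stream": True}, stream=True) as r:
    for line in r.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data:"):
            continue
        payload = line[5:].strip()
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        print(event["data"]["content"])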
@manager.route("/chats/<chat_id>/sessions", methods=["GET"]) # noqa: F821
@ -512,16 +512,16 @@ def list_session(tenant_id, chat_id):
if "prompt" in info:
info.pop("prompt")
conv["chat_id"] = conv.pop("dialog_id")
if conv["reference"]:
ref_messages = conv["reference"]
if ref_messages:
messages = conv["messages"]
message_num = 0
while message_num < len(messages) and message_num < len(conv["reference"]):
if message_num != 0 and messages[message_num]["role"] != "user":
if message_num >= len(conv["reference"]):
break
ref_num = 0
while message_num < len(messages) and ref_num < len(ref_messages):
if messages[message_num]["role"] != "user":
chunk_list = []
if "chunks" in conv["reference"][message_num]:
chunks = conv["reference"][message_num]["chunks"]
if "chunks" in ref_messages[ref_num]:
chunks = ref_messages[ref_num]["chunks"]
for chunk in chunks:
new_chunk = {
"id": chunk.get("chunk_id", chunk.get("id")),
@ -535,6 +535,7 @@ def list_session(tenant_id, chat_id):
chunk_list.append(new_chunk)
messages[message_num]["reference"] = chunk_list
ref_num += 1
message_num += 1
del conv["reference"]
return get_result(data=convs)
@ -566,16 +567,24 @@ def list_agent_session(tenant_id, agent_id):
if "prompt" in info:
info.pop("prompt")
conv["agent_id"] = conv.pop("dialog_id")
# Fix for session listing endpoint
if conv["reference"]:
messages = conv["messages"]
message_num = 0
chunk_num = 0
# Ensure reference is a list type to prevent KeyError
if not isinstance(conv["reference"], list):
conv["reference"] = []
while message_num < len(messages):
if message_num != 0 and messages[message_num]["role"] != "user":
chunk_list = []
if "chunks" in conv["reference"][chunk_num]:
# Add boundary and type checks to prevent KeyError
if chunk_num < len(conv["reference"]) and conv["reference"][chunk_num] is not None and isinstance(conv["reference"][chunk_num], dict) and "chunks" in conv["reference"][chunk_num]:
chunks = conv["reference"][chunk_num]["chunks"]
for chunk in chunks:
# Ensure chunk is a dictionary before calling get method
if not isinstance(chunk, dict):
continue
new_chunk = {
"id": chunk.get("chunk_id", chunk.get("id")),
"content": chunk.get("content_with_weight", chunk.get("content")),
@ -809,6 +818,29 @@ def chatbot_completions(dialog_id):
return get_result(data=answer)
@manager.route("/chatbots/<dialog_id>/info", methods=["GET"]) # noqa: F821
def chatbots_inputs(dialog_id):
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!"')
token = token[1]
objs = APIToken.query(beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
e, dialog = DialogService.get_by_id(dialog_id)
if not e:
return get_error_data_result(f"Can't find dialog by ID: {dialog_id}")
return get_result(
data={
"title": dialog.name,
"avatar": dialog.icon,
"prologue": dialog.prompt_config.get("prologue", ""),
}
)
@manager.route("/agentbots/<agent_id>/completions", methods=["POST"]) # noqa: F821
def agent_bot_completions(agent_id):
req = request.json
@ -848,10 +880,231 @@ def begin_inputs(agent_id):
return get_error_data_result(f"Can't find agent by ID: {agent_id}")
canvas = Canvas(json.dumps(cvs.dsl), objs[0].tenant_id)
return get_result(data={
"title": cvs.title,
"avatar": cvs.avatar,
"inputs": canvas.get_component_input_form("begin")
})
return get_result(data={"title": cvs.title, "avatar": cvs.avatar, "inputs": canvas.get_component_input_form("begin"), "prologue": canvas.get_prologue(), "mode": canvas.get_mode()})
@manager.route("/searchbots/ask", methods=["POST"]) # noqa: F821
@validate_request("question", "kb_ids")
def ask_about_embedded():
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!"')
token = token[1]
objs = APIToken.query(beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
req = request.json
uid = objs[0].tenant_id
search_id = req.get("search_id", "")
search_config = {}
if search_id:
if search_app := SearchService.get_detail(search_id):
search_config = search_app.get("search_config", {})
def stream():
nonlocal req, uid
try:
for ans in ask(req["question"], req["kb_ids"], uid, search_config=search_config):
yield "data:" + json.dumps({"code": 0, "message": "", "data": ans}, ensure_ascii=False) + "\n\n"
except Exception as e:
yield "data:" + json.dumps({"code": 500, "message": str(e), "data": {"answer": "**ERROR**: " + str(e), "reference": []}}, ensure_ascii=False) + "\n\n"
yield "data:" + json.dumps({"code": 0, "message": "", "data": True}, ensure_ascii=False) + "\n\n"
resp = Response(stream(), mimetype="text/event-stream")
resp.headers.add_header("Cache-control", "no-cache")
resp.headers.add_header("Connection", "keep-alive")
resp.headers.add_header("X-Accel-Buffering", "no")
resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
return resp
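The embedded search endpoint streams frames shaped like {"code": 0, "message": "", "data": ...} and ends with a data: true sentinel. A sketch of consuming it with the beta token (all values are placeholders):

import json
import requests

resp = requests.post(
    "http://localhost:9380/api/v1/searchbots/ask",  # placeholder host/prefix
    headers={"Authorization": "Bearer <BETA_TOKEN>"},
    json={"question": "What is RAGFlow?", "kb_ids": ["<kb_id>"]},
    stream=True,
)
for line in resp.iter_lines(decode_unicode=True):
    if line and line.startswith("data:"):
        frame = json.loads(line[5:])
        if frame["data"] is True:  # final sentinel frame
            break
        print(frame["data"].get("answer", ""))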
@manager.route("/searchbots/retrieval_test", methods=["POST"]) # noqa: F821
@validate_request("kb_id", "question")
def retrieval_test_embedded():
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!"')
token = token[1]
objs = APIToken.query(beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
req = request.json
page = int(req.get("page", 1))
size = int(req.get("size", 30))
question = req["question"]
kb_ids = req["kb_id"]
if isinstance(kb_ids, str):
kb_ids = [kb_ids]
doc_ids = req.get("doc_ids", [])
similarity_threshold = float(req.get("similarity_threshold", 0.0))
vector_similarity_weight = float(req.get("vector_similarity_weight", 0.3))
use_kg = req.get("use_kg", False)
top = int(req.get("top_k", 1024))
langs = req.get("cross_languages", [])
tenant_ids = []
tenant_id = objs[0].tenant_id
if not tenant_id:
return get_error_data_result(message="permission denined.")
if req.get("search_id", ""):
search_config = SearchService.get_detail(req.get("search_id", "")).get("search_config", {})
meta_data_filter = search_config.get("meta_data_filter", {})
metas = DocumentService.get_meta_by_kbs(kb_ids)
if meta_data_filter.get("method") == "auto":
chat_mdl = LLMBundle(tenant_id, LLMType.CHAT, llm_name=search_config.get("chat_id", ""))
filters = gen_meta_filter(chat_mdl, metas, question)
doc_ids.extend(meta_filter(metas, filters))
if not doc_ids:
doc_ids = None
elif meta_data_filter.get("method") == "manual":
doc_ids.extend(meta_filter(metas, meta_data_filter["manual"]))
if not doc_ids:
doc_ids = None
try:
tenants = UserTenantService.query(user_id=tenant_id)
for kb_id in kb_ids:
for tenant in tenants:
if KnowledgebaseService.query(tenant_id=tenant.tenant_id, id=kb_id):
tenant_ids.append(tenant.tenant_id)
break
else:
return get_json_result(data=False, message="Only owner of knowledgebase authorized for this operation.", code=settings.RetCode.OPERATING_ERROR)
e, kb = KnowledgebaseService.get_by_id(kb_ids[0])
if not e:
return get_error_data_result(message="Knowledgebase not found!")
if langs:
question = cross_languages(kb.tenant_id, None, question, langs)
embd_mdl = LLMBundle(kb.tenant_id, LLMType.EMBEDDING.value, llm_name=kb.embd_id)
rerank_mdl = None
if req.get("rerank_id"):
rerank_mdl = LLMBundle(kb.tenant_id, LLMType.RERANK.value, llm_name=req["rerank_id"])
if req.get("keyword", False):
chat_mdl = LLMBundle(kb.tenant_id, LLMType.CHAT)
question += keyword_extraction(chat_mdl, question)
labels = label_question(question, [kb])
ranks = settings.retrievaler.retrieval(
question, embd_mdl, tenant_ids, kb_ids, page, size, similarity_threshold, vector_similarity_weight, top, doc_ids, rerank_mdl=rerank_mdl, highlight=req.get("highlight"), rank_feature=labels
)
if use_kg:
ck = settings.kg_retrievaler.retrieval(question, tenant_ids, kb_ids, embd_mdl, LLMBundle(kb.tenant_id, LLMType.CHAT))
if ck["content_with_weight"]:
ranks["chunks"].insert(0, ck)
for c in ranks["chunks"]:
c.pop("vector", None)
ranks["labels"] = labels
return get_json_result(data=ranks)
except Exception as e:
if str(e).find("not_found") > 0:
return get_json_result(data=False, message="No chunk found! Check the chunk status please!", code=settings.RetCode.DATA_ERROR)
return server_error_response(e)
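A representative request body for the embedded retrieval test, using the field names read by the handler above; every value is hypothetical:

payload = {
    "kb_id": ["<kb_id>"],        # a single string is also accepted
    "question": "pricing tiers",
    "page": 1,
    "size": 30,
    "similarity_threshold": 0.2,
    "vector_similarity_weight": 0.3,
    "top_k": 1024,
    "use_kg": False,
    "search_id": "<search_id>",  # optional: pulls meta_data_filter settings from the search app
}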
@manager.route("/searchbots/related_questions", methods=["POST"]) # noqa: F821
@validate_request("question")
def related_questions_embedded():
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!"')
token = token[1]
objs = APIToken.query(beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
req = request.json
tenant_id = objs[0].tenant_id
if not tenant_id:
return get_error_data_result(message="permission denined.")
search_id = req.get("search_id", "")
search_config = {}
if search_id:
if search_app := SearchService.get_detail(search_id):
search_config = search_app.get("search_config", {})
question = req["question"]
chat_id = search_config.get("chat_id", "")
chat_mdl = LLMBundle(tenant_id, LLMType.CHAT, chat_id)
gen_conf = search_config.get("llm_setting", {"temperature": 0.9})
prompt = load_prompt("related_question")
ans = chat_mdl.chat(
prompt,
[
{
"role": "user",
"content": f"""
Keywords: {question}
Related search terms:
""",
}
],
gen_conf,
)
return get_json_result(data=[re.sub(r"^[0-9]\. ", "", a) for a in ans.split("\n") if re.match(r"^[0-9]\. ", a)])
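The post-processing keeps only the lines the model numbered like "1. ...". A self-contained check of that regex logic:

import re

ans = "1. vector databases\n2. hybrid search\nnot a numbered line\n3. rerankers"
related = [re.sub(r"^[0-9]\. ", "", a) for a in ans.split("\n") if re.match(r"^[0-9]\. ", a)]
print(related)  # ['vector databases', 'hybrid search', 'rerankers']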
@manager.route("/searchbots/detail", methods=["GET"]) # noqa: F821
def detail_share_embedded():
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!"')
token = token[1]
objs = APIToken.query(beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
search_id = request.args["search_id"]
tenant_id = objs[0].tenant_id
if not tenant_id:
return get_error_data_result(message="permission denined.")
try:
tenants = UserTenantService.query(user_id=tenant_id)
for tenant in tenants:
if SearchService.query(tenant_id=tenant.tenant_id, id=search_id):
break
else:
return get_json_result(data=False, message="Has no permission for this operation.", code=settings.RetCode.OPERATING_ERROR)
search = SearchService.get_detail(search_id)
if not search:
return get_error_data_result(message="Can't find this Search App!")
return get_json_result(data=search)
except Exception as e:
return server_error_response(e)
@manager.route("/searchbots/mindmap", methods=["POST"]) # noqa: F821
@validate_request("question", "kb_ids")
def mindmap():
token = request.headers.get("Authorization").split()
if len(token) != 2:
return get_error_data_result(message='Authorization is not valid!"')
token = token[1]
objs = APIToken.query(beta=token)
if not objs:
return get_error_data_result(message='Authentication error: API key is invalid!"')
tenant_id = objs[0].tenant_id
req = request.json
search_id = req.get("search_id", "")
search_app = SearchService.get_detail(search_id) if search_id else {}
mind_map = gen_mindmap(req["question"], req["kb_ids"], tenant_id, search_app.get("search_config", {}))
if "error" in mind_map:
return server_error_response(Exception(mind_map["error"]))
return get_json_result(data=mind_map)

View File

@ -22,7 +22,6 @@ from api.constants import DATASET_NAME_LIMIT
from api.db import StatusEnum
from api.db.db_models import DB
from api.db.services import duplicate_name
from api.db.services.knowledgebase_service import KnowledgebaseService
from api.db.services.search_service import SearchService
from api.db.services.user_service import TenantService, UserTenantService
from api.utils import get_uuid
@ -47,7 +46,7 @@ def create():
return get_data_error_result(message="Authorizationd identity.")
search_name = search_name.strip()
search_name = duplicate_name(KnowledgebaseService.query, name=search_name, tenant_id=current_user.id, status=StatusEnum.VALID.value)
search_name = duplicate_name(SearchService.query, name=search_name, tenant_id=current_user.id, status=StatusEnum.VALID.value)
req["id"] = get_uuid()
req["name"] = search_name
@ -156,8 +155,9 @@ def list_search_app():
owner_ids = req.get("owner_ids", [])
try:
if not owner_ids:
tenants = TenantService.get_joined_tenants_by_user_id(current_user.id)
tenants = [m["tenant_id"] for m in tenants]
# tenants = TenantService.get_joined_tenants_by_user_id(current_user.id)
# tenants = [m["tenant_id"] for m in tenants]
tenants = []
search_apps, total = SearchService.get_by_tenant_ids(tenants, current_user.id, page_number, items_per_page, orderby, desc, keywords)
else:
tenants = owner_ids

View File

@ -18,12 +18,14 @@ from flask import request
from flask_login import login_required, current_user
from api import settings
from api.apps import smtp_mail_server
from api.db import UserTenantRole, StatusEnum
from api.db.db_models import UserTenant
from api.db.services.user_service import UserTenantService, UserService
from api.utils import get_uuid, delta_seconds
from api.utils.api_utils import get_json_result, validate_request, server_error_response, get_data_error_result
from api.utils.web_utils import send_invite_email
@manager.route("/<tenant_id>/user/list", methods=["GET"]) # noqa: F821
@ -78,6 +80,20 @@ def create(tenant_id):
role=UserTenantRole.INVITE,
status=StatusEnum.VALID.value)
if smtp_mail_server and settings.SMTP_CONF:
from threading import Thread
user_name = ""
_, user = UserService.get_by_id(current_user.id)
if user:
user_name = user.nickname
Thread(
target=send_invite_email,
args=(invite_user_email, settings.MAIL_FRONTEND_URL, tenant_id, user_name or current_user.email),
daemon=True
).start()
usr = invite_users[0].to_dict()
usr = {k: v for k, v in usr.items() if k in ["id", "avatar", "email", "nickname"]}

View File

@ -28,7 +28,8 @@ from api.apps.auth import get_auth_client
from api.db import FileType, UserTenantRole
from api.db.db_models import TenantLLM
from api.db.services.file_service import FileService
from api.db.services.llm_service import LLMService, TenantLLMService
from api.db.services.llm_service import get_init_tenant_llm
from api.db.services.tenant_llm_service import TenantLLMService
from api.db.services.user_service import TenantService, UserService, UserTenantService
from api.utils import (
current_timestamp,
@ -619,33 +620,8 @@ def user_register(user_id, user):
"size": 0,
"location": "",
}
tenant_llm = []
for llm in LLMService.query(fid=settings.LLM_FACTORY):
tenant_llm.append(
{
"tenant_id": user_id,
"llm_factory": settings.LLM_FACTORY,
"llm_name": llm.llm_name,
"model_type": llm.model_type,
"api_key": settings.API_KEY,
"api_base": settings.LLM_BASE_URL,
"max_tokens": llm.max_tokens if llm.max_tokens else 8192,
}
)
if settings.LIGHTEN != 1:
for buildin_embedding_model in settings.BUILTIN_EMBEDDING_MODELS:
mdlnm, fid = TenantLLMService.split_model_name_and_factory(buildin_embedding_model)
tenant_llm.append(
{
"tenant_id": user_id,
"llm_factory": fid,
"llm_name": mdlnm,
"model_type": "embedding",
"api_key": "",
"api_base": "",
"max_tokens": 1024 if buildin_embedding_model == "BAAI/bge-large-zh-v1.5@BAAI" else 512,
}
)
tenant_llm = get_init_tenant_llm(user_id)
if not UserService.save(**user):
return
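Both registration paths now delegate to get_init_tenant_llm. Judging from the inline block this commit removes, the helper plausibly looks like the sketch below; this is a reconstruction, not the actual implementation:

def get_init_tenant_llm_sketch(user_id):
    # Hypothetical reconstruction of get_init_tenant_llm, inferred from the removed code.
    tenant_llm = []
    for llm in LLMService.query(fid=settings.LLM_FACTORY):
        tenant_llm.append({
            "tenant_id": user_id,
            "llm_factory": settings.LLM_FACTORY,
            "llm_name": llm.llm_name,
            "model_type": llm.model_type,
            "api_key": settings.API_KEY,
            "api_base": settings.LLM_BASE_URL,
            "max_tokens": llm.max_tokens if llm.max_tokens else 8192,
        })
    if settings.LIGHTEN != 1:
        for mdl in settings.BUILTIN_EMBEDDING_MODELS:
            mdlnm, fid = TenantLLMService.split_model_name_and_factory(mdl)
            tenant_llm.append({
                "tenant_id": user_id,
                "llm_factory": fid,
                "llm_name": mdlnm,
                "model_type": "embedding",
                "api_key": "",
                "api_base": "",
                "max_tokens": 1024 if mdl == "BAAI/bge-large-zh-v1.5@BAAI" else 512,
            })
    return tenant_llm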

View File

@ -742,8 +742,9 @@ class Dialog(DataBaseModel):
prompt_type = CharField(max_length=16, null=False, default="simple", help_text="simple|advanced", index=True)
prompt_config = JSONField(
null=False,
default={"system": "", "prologue": "Hi! I'm your assistant, what can I do for you?", "parameters": [], "empty_response": "Sorry! No relevant content was found in the knowledge base!"},
default={"system": "", "prologue": "Hi! I'm your assistant. What can I do for you?", "parameters": [], "empty_response": "Sorry! No relevant content was found in the knowledge base!"},
)
meta_data_filter = JSONField(null=True, default={})
similarity_threshold = FloatField(default=0.2)
vector_similarity_weight = FloatField(default=0.3)
@ -871,7 +872,7 @@ class Search(DataBaseModel):
default={
"kb_ids": [],
"doc_ids": [],
"similarity_threshold": 0.0,
"similarity_threshold": 0.2,
"vector_similarity_weight": 0.3,
"use_kg": False,
# rerank settings
@ -880,11 +881,12 @@ class Search(DataBaseModel):
# chat settings
"summary": False,
"chat_id": "",
# Kept here for reference; no need to set default values
"llm_setting": {
"temperature": 0.1,
"top_p": 0.3,
"frequency_penalty": 0.7,
"presence_penalty": 0.4,
# "temperature": 0.1,
# "top_p": 0.3,
# "frequency_penalty": 0.7,
# "presence_penalty": 0.4,
},
"chat_settingcross_languages": [],
"highlight": False,
@ -1015,4 +1017,8 @@ def migrate_db():
migrate(migrator.add_column("api_4_conversation", "errors", TextField(null=True, help_text="errors")))
except Exception:
pass
try:
migrate(migrator.add_column("dialog", "meta_data_filter", JSONField(null=True, default={})))
except Exception:
pass
logging.disable(logging.NOTSET)

View File

@ -27,7 +27,8 @@ from api.db.services import UserService
from api.db.services.canvas_service import CanvasTemplateService
from api.db.services.document_service import DocumentService
from api.db.services.knowledgebase_service import KnowledgebaseService
from api.db.services.llm_service import LLMFactoriesService, LLMService, TenantLLMService, LLMBundle
from api.db.services.tenant_llm_service import LLMFactoriesService, TenantLLMService
from api.db.services.llm_service import LLMService, LLMBundle, get_init_tenant_llm
from api.db.services.user_service import TenantService, UserTenantService
from api import settings
from api.utils.file_utils import get_project_base_directory
@ -63,12 +64,8 @@ def init_superuser():
"invited_by": user_info["id"],
"role": UserTenantRole.OWNER
}
tenant_llm = []
for llm in LLMService.query(fid=settings.LLM_FACTORY):
tenant_llm.append(
{"tenant_id": user_info["id"], "llm_factory": settings.LLM_FACTORY, "llm_name": llm.llm_name,
"model_type": llm.model_type,
"api_key": settings.API_KEY, "api_base": settings.LLM_BASE_URL})
tenant_llm = get_init_tenant_llm(user_info["id"])
if not UserService.save(**user_info):
logging.error("can't init admin.")
@ -103,7 +100,7 @@ def init_llm_factory():
except Exception:
pass
factory_llm_infos = settings.FACTORY_LLM_INFOS
for factory_llm_info in factory_llm_infos:
info = deepcopy(factory_llm_info)
llm_infos = info.pop("llm")

View File

@ -16,7 +16,6 @@
import json
import logging
import time
import traceback
from uuid import uuid4
from agent.canvas import Canvas
from api.db import TenantPermission
@ -54,12 +53,12 @@ class UserCanvasService(CommonService):
agents = agents.paginate(page_number, items_per_page)
return list(agents.dicts())
@classmethod
@DB.connection_context()
def get_by_tenant_id(cls, pid):
try:
fields = [
cls.model.id,
cls.model.avatar,
@ -83,7 +82,7 @@ class UserCanvasService(CommonService):
except Exception as e:
logging.exception(e)
return False, None
@classmethod
@DB.connection_context()
def get_by_tenant_ids(cls, joined_tenant_ids, user_id,
@ -103,14 +102,14 @@ class UserCanvasService(CommonService):
]
if keywords:
agents = cls.model.select(*fields).join(User, on=(cls.model.user_id == User.id)).where(
((cls.model.user_id.in_(joined_tenant_ids) & (cls.model.permission ==
TenantPermission.TEAM.value)) | (
cls.model.user_id == user_id)),
(fn.LOWER(cls.model.title).contains(keywords.lower()))
)
else:
agents = cls.model.select(*fields).join(User, on=(cls.model.user_id == User.id)).where(
((cls.model.user_id.in_(joined_tenant_ids) & (cls.model.permission ==
TenantPermission.TEAM.value)) | (
cls.model.user_id == user_id))
)
@ -122,9 +121,22 @@ class UserCanvasService(CommonService):
agents = agents.paginate(page_number, items_per_page)
return list(agents.dicts()), count
@classmethod
@DB.connection_context()
def accessible(cls, canvas_id, tenant_id):
from api.db.services.user_service import UserTenantService
e, c = UserCanvasService.get_by_tenant_id(canvas_id)
if not e:
return False
tids = [t.tenant_id for t in UserTenantService.query(user_id=tenant_id)]
if c["user_id"] != canvas_id and c["user_id"] not in tids:
return False
return True
def completion(tenant_id, agent_id, session_id=None, **kwargs):
query = kwargs.get("query", "")
query = kwargs.get("query", "") or kwargs.get("question", "")
files = kwargs.get("files", [])
inputs = kwargs.get("inputs", {})
user_id = kwargs.get("user_id", "")
@ -152,7 +164,8 @@ def completion(tenant_id, agent_id, session_id=None, **kwargs):
"user_id": user_id,
"message": [],
"source": "agent",
"dsl": cvs.dsl
"dsl": cvs.dsl,
"reference": []
}
API4ConversationService.save(**conv)
conv = API4Conversation(**conv)
@ -173,223 +186,103 @@ def completion(tenant_id, agent_id, session_id=None, **kwargs):
conv.message.append({"role": "assistant", "content": txt, "created_at": time.time(), "id": message_id})
conv.reference = canvas.get_reference()
conv.errors = canvas.error
API4ConversationService.append_message(conv.id, conv.to_dict())
conv.dsl = str(canvas)
conv = conv.to_dict()
API4ConversationService.append_message(conv["id"], conv)
def completionOpenAI(tenant_id, agent_id, question, session_id=None, stream=True, **kwargs):
"""Main function for OpenAI-compatible completions, structured similarly to the completion function."""
tiktokenenc = tiktoken.get_encoding("cl100k_base")
e, cvs = UserCanvasService.get_by_id(agent_id)
if not e:
yield get_data_openai(
id=session_id,
model=agent_id,
content="**ERROR**: Agent not found."
)
return
if cvs.user_id != tenant_id:
yield get_data_openai(
id=session_id,
model=agent_id,
content="**ERROR**: You do not own the agent"
)
return
if not isinstance(cvs.dsl, str):
cvs.dsl = json.dumps(cvs.dsl, ensure_ascii=False)
canvas = Canvas(cvs.dsl, tenant_id)
canvas.reset()
message_id = str(uuid4())
# Handle new session creation
if not session_id:
query = canvas.get_preset_param()
if query:
for ele in query:
if not ele["optional"]:
if not kwargs.get(ele["key"]):
yield get_data_openai(
id=None,
model=agent_id,
content=f"`{ele['key']}` is required",
completion_tokens=len(tiktokenenc.encode(f"`{ele['key']}` is required")),
prompt_tokens=len(tiktokenenc.encode(question if question else ""))
)
return
ele["value"] = kwargs[ele["key"]]
if ele["optional"]:
if kwargs.get(ele["key"]):
ele["value"] = kwargs[ele['key']]
else:
if "value" in ele:
ele.pop("value")
cvs.dsl = json.loads(str(canvas))
session_id = get_uuid()
conv = {
"id": session_id,
"dialog_id": cvs.id,
"user_id": kwargs.get("user_id", "") if isinstance(kwargs, dict) else "",
"message": [{"role": "assistant", "content": canvas.get_prologue(), "created_at": time.time()}],
"source": "agent",
"dsl": cvs.dsl
}
canvas.messages.append({"role": "user", "content": question, "id": message_id})
canvas.add_user_input(question)
API4ConversationService.save(**conv)
conv = API4Conversation(**conv)
if not conv.message:
conv.message = []
conv.message.append({
"role": "user",
"content": question,
"id": message_id
})
if not conv.reference:
conv.reference = []
conv.reference.append({"chunks": [], "doc_aggs": []})
# Handle existing session
else:
e, conv = API4ConversationService.get_by_id(session_id)
if not e:
yield get_data_openai(
id=session_id,
model=agent_id,
content="**ERROR**: Session not found!"
)
return
canvas = Canvas(json.dumps(conv.dsl), tenant_id)
canvas.messages.append({"role": "user", "content": question, "id": message_id})
canvas.add_user_input(question)
if not conv.message:
conv.message = []
conv.message.append({
"role": "user",
"content": question,
"id": message_id
})
if not conv.reference:
conv.reference = []
conv.reference.append({"chunks": [], "doc_aggs": []})
# Process request based on stream mode
final_ans = {"reference": [], "content": ""}
prompt_tokens = len(tiktokenenc.encode(str(question)))
user_id = kwargs.get("user_id", "")
if stream:
completion_tokens = 0
try:
completion_tokens = 0
for ans in canvas.run(stream=True, bypass_begin=True):
if ans.get("running_status"):
completion_tokens += len(tiktokenenc.encode(ans.get("content", "")))
yield "data: " + json.dumps(
get_data_openai(
id=session_id,
model=agent_id,
content=ans["content"],
object="chat.completion.chunk",
completion_tokens=completion_tokens,
prompt_tokens=prompt_tokens
),
ensure_ascii=False
) + "\n\n"
for ans in completion(
tenant_id=tenant_id,
agent_id=agent_id,
session_id=session_id,
query=question,
user_id=user_id,
**kwargs
):
if isinstance(ans, str):
try:
ans = json.loads(ans[5:]) # remove "data:"
except Exception as e:
logging.exception(f"Agent OpenAI-Compatible completionOpenAI parse answer failed: {e}")
continue
if ans.get("event") != "message" or not ans.get("data", {}).get("reference", None):
continue
for k in ans.keys():
final_ans[k] = ans[k]
completion_tokens += len(tiktokenenc.encode(final_ans.get("content", "")))
content_piece = ans["data"]["content"]
completion_tokens += len(tiktokenenc.encode(content_piece))
yield "data: " + json.dumps(
get_data_openai(
id=session_id,
id=session_id or str(uuid4()),
model=agent_id,
content=final_ans["content"],
object="chat.completion.chunk",
finish_reason="stop",
content=content_piece,
prompt_tokens=prompt_tokens,
completion_tokens=completion_tokens,
prompt_tokens=prompt_tokens
stream=True
),
ensure_ascii=False
) + "\n\n"
# Update conversation
canvas.messages.append({"role": "assistant", "content": final_ans["content"], "created_at": time.time(), "id": message_id})
canvas.history.append(("assistant", final_ans["content"]))
if final_ans.get("reference"):
canvas.reference.append(final_ans["reference"])
conv.dsl = json.loads(str(canvas))
API4ConversationService.append_message(conv.id, conv.to_dict())
yield "data: [DONE]\n\n"
except Exception as e:
traceback.print_exc()
conv.dsl = json.loads(str(canvas))
API4ConversationService.append_message(conv.id, conv.to_dict())
yield "data: " + json.dumps(
get_data_openai(
id=session_id,
id=session_id or str(uuid4()),
model=agent_id,
content="**ERROR**: " + str(e),
content=f"**ERROR**: {str(e)}",
finish_reason="stop",
completion_tokens=len(tiktokenenc.encode("**ERROR**: " + str(e))),
prompt_tokens=prompt_tokens
prompt_tokens=prompt_tokens,
completion_tokens=len(tiktokenenc.encode(f"**ERROR**: {str(e)}")),
stream=True
),
ensure_ascii=False
) + "\n\n"
yield "data: [DONE]\n\n"
else: # Non-streaming mode
else:
try:
all_answer_content = ""
for answer in canvas.run(stream=False, bypass_begin=True):
if answer.get("running_status"):
all_content = ""
for ans in completion(
tenant_id=tenant_id,
agent_id=agent_id,
session_id=session_id,
query=question,
user_id=user_id,
**kwargs
):
if isinstance(ans, str):
ans = json.loads(ans[5:])
if ans.get("event") != "message" or not ans.get("data", {}).get("reference", None):
continue
final_ans["content"] = "\n".join(answer["content"]) if "content" in answer else ""
final_ans["reference"] = answer.get("reference", [])
all_answer_content += final_ans["content"]
final_ans["content"] = all_answer_content
# Update conversation
canvas.messages.append({"role": "assistant", "content": final_ans["content"], "created_at": time.time(), "id": message_id})
canvas.history.append(("assistant", final_ans["content"]))
if final_ans.get("reference"):
canvas.reference.append(final_ans["reference"])
conv.dsl = json.loads(str(canvas))
API4ConversationService.append_message(conv.id, conv.to_dict())
# Return the response in OpenAI format
all_content += ans["data"]["content"]
completion_tokens = len(tiktokenenc.encode(all_content))
yield get_data_openai(
id=session_id,
id=session_id or str(uuid4()),
model=agent_id,
content=final_ans["content"],
finish_reason="stop",
completion_tokens=len(tiktokenenc.encode(final_ans["content"])),
prompt_tokens=prompt_tokens,
param=canvas.get_preset_param() # Added param info like in completion
)
except Exception as e:
traceback.print_exc()
conv.dsl = json.loads(str(canvas))
API4ConversationService.append_message(conv.id, conv.to_dict())
yield get_data_openai(
id=session_id,
model=agent_id,
content="**ERROR**: " + str(e),
completion_tokens=completion_tokens,
content=all_content,
finish_reason="stop",
completion_tokens=len(tiktokenenc.encode("**ERROR**: " + str(e))),
prompt_tokens=prompt_tokens
param=None
)
except Exception as e:
yield get_data_openai(
id=session_id or str(uuid4()),
model=agent_id,
prompt_tokens=prompt_tokens,
completion_tokens=len(tiktokenenc.encode(f"**ERROR**: {str(e)}")),
content=f"**ERROR**: {str(e)}",
finish_reason="stop",
param=None
)
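Since completionOpenAI now adapts the native completion stream into OpenAI-style chunks, any OpenAI client can drive an agent. A sketch with the official SDK; the base_url prefix and key are assumptions:

from openai import OpenAI

client = OpenAI(
    api_key="<RAGFLOW_API_KEY>",  # placeholder
    base_url="http://localhost:9380/api/v1/agents_openai/<agent_id>",  # assumed route prefix
)
stream = client.chat.completions.create(
    model="<agent_id>",
    messages=[{"role": "user", "content": "Summarize the latest release."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta and delta.content:
        print(delta.content, end="")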

View File

@ -22,21 +22,27 @@ from datetime import datetime
from functools import partial
from timeit import default_timer as timer
import trio
from langfuse import Langfuse
from peewee import fn
from agentic_reasoning import DeepResearcher
from api import settings
from api.db import LLMType, ParserType, StatusEnum
from api.db.db_models import DB, Dialog
from api.db.services.common_service import CommonService
from api.db.services.document_service import DocumentService
from api.db.services.knowledgebase_service import KnowledgebaseService
from api.db.services.langfuse_service import TenantLangfuseService
from api.db.services.llm_service import LLMBundle, TenantLLMService
from api.db.services.llm_service import LLMBundle
from api.db.services.tenant_llm_service import TenantLLMService
from api.utils import current_timestamp, datetime_format
from graphrag.general.mind_map_extractor import MindMapExtractor
from rag.app.resume import forbidden_select_fields4resume
from rag.app.tag import label_question
from rag.nlp.search import index_name
from rag.prompts import chunks_format, citation_prompt, cross_languages, full_question, kb_prompt, keyword_extraction, message_fit_in
from rag.prompts.prompts import gen_meta_filter, PROMPT_JINJA_ENV, ASK_SUMMARY
from rag.utils import num_tokens_from_string, rmSpace
from rag.utils.tavily_conn import Tavily
@ -95,6 +101,66 @@ class DialogService(CommonService):
return list(chats.dicts())
@classmethod
@DB.connection_context()
def get_by_tenant_ids(cls, joined_tenant_ids, user_id, page_number, items_per_page, orderby, desc, keywords, parser_id=None):
from api.db.db_models import User
fields = [
cls.model.id,
cls.model.tenant_id,
cls.model.name,
cls.model.description,
cls.model.language,
cls.model.llm_id,
cls.model.llm_setting,
cls.model.prompt_type,
cls.model.prompt_config,
cls.model.similarity_threshold,
cls.model.vector_similarity_weight,
cls.model.top_n,
cls.model.top_k,
cls.model.do_refer,
cls.model.rerank_id,
cls.model.kb_ids,
cls.model.icon,
cls.model.status,
User.nickname,
User.avatar.alias("tenant_avatar"),
cls.model.update_time,
cls.model.create_time,
]
if keywords:
dialogs = (
cls.model.select(*fields)
.join(User, on=(cls.model.tenant_id == User.id))
.where(
(cls.model.tenant_id.in_(joined_tenant_ids) | (cls.model.tenant_id == user_id)) & (cls.model.status == StatusEnum.VALID.value),
(fn.LOWER(cls.model.name).contains(keywords.lower())),
)
)
else:
dialogs = (
cls.model.select(*fields)
.join(User, on=(cls.model.tenant_id == User.id))
.where(
(cls.model.tenant_id.in_(joined_tenant_ids) | (cls.model.tenant_id == user_id)) & (cls.model.status == StatusEnum.VALID.value),
)
)
if parser_id:
dialogs = dialogs.where(cls.model.parser_id == parser_id)
if desc:
dialogs = dialogs.order_by(cls.model.getter_by(orderby).desc())
else:
dialogs = dialogs.order_by(cls.model.getter_by(orderby).asc())
count = dialogs.count()
if page_number and items_per_page:
dialogs = dialogs.paginate(page_number, items_per_page)
return list(dialogs.dicts()), count
def chat_solo(dialog, messages, stream=True):
if TenantLLMService.llm_id2llm_type(dialog.llm_id) == "image2text":
@ -189,6 +255,55 @@ def repair_bad_citation_formats(answer: str, kbinfos: dict, idx: set):
return answer, idx
def meta_filter(metas: dict, filters: list[dict]):
doc_ids = set([])
def filter_out(v2docs, operator, value):
ids = []
for input, docids in v2docs.items():
try:
input = float(input)
value = float(value)
except Exception:
input = str(input)
value = str(value)
for conds in [
(operator == "contains", str(value).lower() in str(input).lower()),
(operator == "not contains", str(value).lower() not in str(input).lower()),
(operator == "start with", str(input).lower().startswith(str(value).lower())),
(operator == "end with", str(input).lower().endswith(str(value).lower())),
(operator == "empty", not input),
(operator == "not empty", input),
(operator == "=", input == value),
(operator == "", input != value),
(operator == ">", input > value),
(operator == "<", input < value),
(operator == "", input >= value),
(operator == "", input <= value),
]:
try:
if all(conds):
ids.extend(docids)
break
except Exception:
pass
return ids
for k, v2docs in metas.items():
for f in filters:
if k != f["key"]:
continue
ids = filter_out(v2docs, f["op"], f["value"])
if not doc_ids:
doc_ids = set(ids)
else:
doc_ids = doc_ids & set(ids)
if not doc_ids:
return []
return list(doc_ids)
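meta_filter intersects the candidate doc-id sets across filters, so every condition must hold (AND semantics). A self-contained illustration of the expected shape of metas, assuming the comparison operators restored above:

# metas maps field name -> {field value -> [doc ids carrying that value]}
metas = {
    "author": {"alice": ["d1", "d2"], "bob": ["d3"]},
    "year": {"2023": ["d1"], "2024": ["d2", "d3"]},
}
filters = [
    {"key": "author", "op": "=", "value": "alice"},
    {"key": "year", "op": "=", "value": "2024"},
]
print(meta_filter(metas, filters))  # ['d2'], the only doc matching both filters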
def chat(dialog, messages, stream=True, **kwargs):
assert messages[-1]["role"] == "user", "The last content of this conversation is not from user."
if not dialog.kb_ids and not dialog.prompt_config.get("tavily_api_key"):
@ -208,12 +323,14 @@ def chat(dialog, messages, stream=True, **kwargs):
check_llm_ts = timer()
langfuse_tracer = None
trace_context = {}
langfuse_keys = TenantLangfuseService.filter_by_tenant(tenant_id=dialog.tenant_id)
if langfuse_keys:
langfuse = Langfuse(public_key=langfuse_keys.public_key, secret_key=langfuse_keys.secret_key, host=langfuse_keys.host)
if langfuse.auth_check():
langfuse_tracer = langfuse
langfuse.trace = langfuse_tracer.trace(name=f"{dialog.name}-{llm_model_config['llm_name']}")
trace_id = langfuse_tracer.create_trace_id()
trace_context = {"trace_id": trace_id}
check_langfuse_tracer_ts = timer()
kbs, embd_mdl, rerank_mdl, chat_mdl, tts_mdl = get_models(dialog)
@ -224,9 +341,10 @@ def chat(dialog, messages, stream=True, **kwargs):
retriever = settings.retrievaler
questions = [m["content"] for m in messages if m["role"] == "user"][-3:]
attachments = kwargs["doc_ids"].split(",") if "doc_ids" in kwargs else None
attachments = kwargs["doc_ids"].split(",") if "doc_ids" in kwargs else []
if "doc_ids" in messages[-1]:
attachments = messages[-1]["doc_ids"]
prompt_config = dialog.prompt_config
field_map = KnowledgebaseService.get_field_map(dialog.kb_ids)
# try to use sql if field mapping is good to go
@ -253,6 +371,18 @@ def chat(dialog, messages, stream=True, **kwargs):
if prompt_config.get("cross_languages"):
questions = [cross_languages(dialog.tenant_id, dialog.llm_id, questions[0], prompt_config["cross_languages"])]
if dialog.meta_data_filter:
metas = DocumentService.get_meta_by_kbs(dialog.kb_ids)
if dialog.meta_data_filter.get("method") == "auto":
filters = gen_meta_filter(chat_mdl, metas, questions[-1])
attachments.extend(meta_filter(metas, filters))
if not attachments:
attachments = None
elif dialog.meta_data_filter.get("method") == "manual":
attachments.extend(meta_filter(metas, dialog.meta_data_filter["manual"]))
if not attachments:
attachments = None
if prompt_config.get("keyword", False):
questions[-1] += keyword_extraction(chat_mdl, questions[-1])
@ -260,17 +390,26 @@ def chat(dialog, messages, stream=True, **kwargs):
thought = ""
kbinfos = {"total": 0, "chunks": [], "doc_aggs": []}
knowledges = []
if attachments is not None and "knowledge" in [p["key"] for p in prompt_config["parameters"]]:
tenant_ids = list(set([kb.tenant_id for kb in kbs]))
if prompt_config.get("reasoning", False):
reasoner = DeepResearcher(
chat_mdl,
prompt_config,
partial(retriever.retrieval, embd_mdl=embd_mdl, tenant_ids=tenant_ids, kb_ids=dialog.kb_ids, page=1, page_size=dialog.top_n, similarity_threshold=0.2, vector_similarity_weight=0.3),
partial(
retriever.retrieval,
embd_mdl=embd_mdl,
tenant_ids=tenant_ids,
kb_ids=dialog.kb_ids,
page=1,
page_size=dialog.top_n,
similarity_threshold=0.2,
vector_similarity_weight=0.3,
doc_ids=attachments,
),
)
for think in reasoner.thinking(kbinfos, " ".join(questions)):
@ -400,17 +539,19 @@ def chat(dialog, messages, stream=True, **kwargs):
f" - Token speed: {int(tk_num / (generate_result_time_cost / 1000.0))}/s"
)
langfuse_output = "\n" + re.sub(r"^.*?(### Query:.*)", r"\1", prompt, flags=re.DOTALL)
langfuse_output = {"time_elapsed:": re.sub(r"\n", " \n", langfuse_output), "created_at": time.time()}
# Add a condition check to call the end method only if langfuse_tracer exists
if langfuse_tracer and "langfuse_generation" in locals():
langfuse_generation.end(output=langfuse_output)
langfuse_output = "\n" + re.sub(r"^.*?(### Query:.*)", r"\1", prompt, flags=re.DOTALL)
langfuse_output = {"time_elapsed:": re.sub(r"\n", " \n", langfuse_output), "created_at": time.time()}
langfuse_generation.update(output=langfuse_output)
langfuse_generation.end()
return {"answer": think + answer, "reference": refs, "prompt": re.sub(r"\n", " \n", prompt), "created_at": time.time()}
if langfuse_tracer:
langfuse_generation = langfuse_tracer.trace.generation(name="chat", model=llm_model_config["llm_name"], input={"prompt": prompt, "prompt4citation": prompt4citation, "messages": msg})
langfuse_generation = langfuse_tracer.start_generation(
trace_context=trace_context, name="chat", model=llm_model_config["llm_name"], input={"prompt": prompt, "prompt4citation": prompt4citation, "messages": msg}
)
if stream:
last_ans = ""
@ -556,7 +697,14 @@ def tts(tts_mdl, text):
return binascii.hexlify(bin).decode("utf-8")
def ask(question, kb_ids, tenant_id, chat_llm_name=None):
def ask(question, kb_ids, tenant_id, chat_llm_name=None, search_config={}):
doc_ids = search_config.get("doc_ids", [])
rerank_mdl = None
kb_ids = search_config.get("kb_ids", kb_ids)
chat_llm_name = search_config.get("chat_id", chat_llm_name)
rerank_id = search_config.get("rerank_id", "")
meta_data_filter = search_config.get("meta_data_filter")
kbs = KnowledgebaseService.get_by_ids(kb_ids)
embedding_list = list(set([kb.embd_id for kb in kbs]))
@ -565,30 +713,46 @@ def ask(question, kb_ids, tenant_id, chat_llm_name=None):
embd_mdl = LLMBundle(tenant_id, LLMType.EMBEDDING, embedding_list[0])
chat_mdl = LLMBundle(tenant_id, LLMType.CHAT, chat_llm_name)
if rerank_id:
rerank_mdl = LLMBundle(tenant_id, LLMType.RERANK, rerank_id)
max_tokens = chat_mdl.max_length
tenant_ids = list(set([kb.tenant_id for kb in kbs]))
kbinfos = retriever.retrieval(question, embd_mdl, tenant_ids, kb_ids, 1, 12, 0.1, 0.3, aggs=False, rank_feature=label_question(question, kbs))
if meta_data_filter:
metas = DocumentService.get_meta_by_kbs(kb_ids)
if meta_data_filter.get("method") == "auto":
filters = gen_meta_filter(chat_mdl, metas, question)
doc_ids.extend(meta_filter(metas, filters))
if not doc_ids:
doc_ids = None
elif meta_data_filter.get("method") == "manual":
doc_ids.extend(meta_filter(metas, meta_data_filter["manual"]))
if not doc_ids:
doc_ids = None
kbinfos = retriever.retrieval(
question=question,
embd_mdl=embd_mdl,
tenant_ids=tenant_ids,
kb_ids=kb_ids,
page=1,
page_size=12,
similarity_threshold=search_config.get("similarity_threshold", 0.1),
vector_similarity_weight=search_config.get("vector_similarity_weight", 0.3),
top=search_config.get("top_k", 1024),
doc_ids=doc_ids,
aggs=False,
rerank_mdl=rerank_mdl,
rank_feature=label_question(question, kbs)
)
knowledges = kb_prompt(kbinfos, max_tokens)
prompt = """
Role: You're a smart assistant. Your name is Miss R.
Task: Summarize the information from knowledge bases and answer user's question.
Requirements and restriction:
- DO NOT make things up, especially for numbers.
- If the information from knowledge is irrelevant with user's question, JUST SAY: Sorry, no relevant information provided.
- Answer with markdown format text.
- Answer in language of user's question.
- DO NOT make things up, especially for numbers.
sys_prompt = PROMPT_JINJA_ENV.from_string(ASK_SUMMARY).render(knowledge="\n".join(knowledges))
### Information from knowledge bases
%s
The above is information from knowledge bases.
""" % "\n".join(knowledges)
msg = [{"role": "user", "content": question}]
def decorate_answer(answer):
nonlocal knowledges, kbinfos, prompt
nonlocal knowledges, kbinfos, sys_prompt
answer, idx = retriever.insert_citations(answer, [ck["content_ltks"] for ck in kbinfos["chunks"]], [ck["vector"] for ck in kbinfos["chunks"]], embd_mdl, tkweight=0.7, vtweight=0.3)
idx = set([kbinfos["chunks"][int(i)]["doc_id"] for i in idx])
recall_docs = [d for d in kbinfos["doc_aggs"] if d["doc_id"] in idx]
@ -606,7 +770,55 @@ def ask(question, kb_ids, tenant_id, chat_llm_name=None):
return {"answer": answer, "reference": refs}
answer = ""
for ans in chat_mdl.chat_streamly(prompt, msg, {"temperature": 0.1}):
for ans in chat_mdl.chat_streamly(sys_prompt, msg, {"temperature": 0.1}):
answer = ans
yield {"answer": answer, "reference": {}}
yield decorate_answer(answer)
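Note on the reworked ask(): every retrieval knob now falls back to a default when the caller's search_config omits it, and an empty doc_ids list is normalized to None so it does not filter out everything. A minimal standalone sketch of that resolution (defaults mirror the call above; the helper name is ours):
def resolve_retrieval_params(search_config: dict | None = None) -> dict:
    # Sketch only: keys mirror those read by ask(); values fall back to its defaults.
    search_config = search_config or {}
    return {
        "similarity_threshold": search_config.get("similarity_threshold", 0.1),
        "vector_similarity_weight": search_config.get("vector_similarity_weight", 0.3),
        "top": search_config.get("top_k", 1024),
        # An empty list would exclude every document; None means "no restriction".
        "doc_ids": search_config.get("doc_ids", []) or None,
    }
# Example: an empty config yields the defaults.
assert resolve_retrieval_params()["top"] == 1024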
def gen_mindmap(question, kb_ids, tenant_id, search_config={}):
meta_data_filter = search_config.get("meta_data_filter", {})
doc_ids = search_config.get("doc_ids", [])
rerank_id = search_config.get("rerank_id", "")
rerank_mdl = None
kbs = KnowledgebaseService.get_by_ids(kb_ids)
if not kbs:
return {"error": "No KB selected"}
embedding_list = list(set([kb.embd_id for kb in kbs]))
tenant_ids = list(set([kb.tenant_id for kb in kbs]))
embd_mdl = LLMBundle(tenant_id, LLMType.EMBEDDING, llm_name=embedding_list[0])
chat_mdl = LLMBundle(tenant_id, LLMType.CHAT, llm_name=search_config.get("chat_id", ""))
if rerank_id:
rerank_mdl = LLMBundle(tenant_id, LLMType.RERANK, rerank_id)
if meta_data_filter:
metas = DocumentService.get_meta_by_kbs(kb_ids)
if meta_data_filter.get("method") == "auto":
filters = gen_meta_filter(chat_mdl, metas, question)
doc_ids.extend(meta_filter(metas, filters))
if not doc_ids:
doc_ids = None
elif meta_data_filter.get("method") == "manual":
doc_ids.extend(meta_filter(metas, meta_data_filter["manual"]))
if not doc_ids:
doc_ids = None
ranks = settings.retrievaler.retrieval(
question=question,
embd_mdl=embd_mdl,
tenant_ids=tenant_ids,
kb_ids=kb_ids,
page=1,
page_size=12,
similarity_threshold=search_config.get("similarity_threshold", 0.2),
vector_similarity_weight=search_config.get("vector_similarity_weight", 0.3),
top=search_config.get("top_k", 1024),
doc_ids=doc_ids,
aggs=False,
rerank_mdl=rerank_mdl,
rank_feature=label_question(question, kbs),
)
mindmap = MindMapExtractor(chat_mdl)
mind_map = trio.run(mindmap, [c["content_with_weight"] for c in ranks["chunks"]])
return mind_map.output
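ask() and gen_mindmap() now share the same meta_data_filter handling: "auto" asks the chat model to propose filters, "manual" applies the user-supplied ones, and an empty result degrades to no restriction. A condensed sketch of that shared flow, with the imported helpers passed in as parameters:
def resolve_doc_ids(meta_data_filter, metas, question, chat_mdl, doc_ids, gen_meta_filter, meta_filter):
    """Sketch of the doc-id resolution shared by ask() and gen_mindmap()."""
    if meta_data_filter.get("method") == "auto":
        # Let the chat model propose filters from the question.
        filters = gen_meta_filter(chat_mdl, metas, question)
        doc_ids.extend(meta_filter(metas, filters))
    elif meta_data_filter.get("method") == "manual":
        doc_ids.extend(meta_filter(metas, meta_data_filter["manual"]))
    # Empty means "match nothing" to the retriever, so collapse it to None.
    return doc_ids or None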

View File

@ -243,7 +243,7 @@ class DocumentService(CommonService):
from api.db.services.task_service import TaskService
cls.clear_chunk_num(doc.id)
try:
TaskService.filter_delete(Task.doc_id == doc.id)
TaskService.filter_delete([Task.doc_id == doc.id])
page = 0
page_size = 1000
all_chunk_ids = []
@ -574,6 +574,25 @@ class DocumentService(CommonService):
def update_meta_fields(cls, doc_id, meta_fields):
return cls.update_by_id(doc_id, {"meta_fields": meta_fields})
@classmethod
@DB.connection_context()
def get_meta_by_kbs(cls, kb_ids):
fields = [
cls.model.id,
cls.model.meta_fields,
]
meta = {}
for r in cls.model.select(*fields).where(cls.model.kb_id.in_(kb_ids)):
doc_id = r.id
for k, v in r.meta_fields.items():
if k not in meta:
meta[k] = {}
v = str(v)
if v not in meta[k]:
meta[k][v] = []
meta[k][v].append(doc_id)
return meta
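For reference, get_meta_by_kbs inverts per-document meta_fields into a field -> value -> doc_ids index; downstream meta_filter calls match against this shape. A toy reproduction:
# Sketch: the index shape produced by get_meta_by_kbs, built from toy rows.
rows = [
    ("doc1", {"author": "alice", "year": 2024}),
    ("doc2", {"author": "bob", "year": 2024}),
]
meta: dict[str, dict[str, list[str]]] = {}
for doc_id, meta_fields in rows:
    for k, v in meta_fields.items():
        meta.setdefault(k, {}).setdefault(str(v), []).append(doc_id)
# {'author': {'alice': ['doc1'], 'bob': ['doc2']}, 'year': {'2024': ['doc1', 'doc2']}}
print(meta)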
@classmethod
@DB.connection_context()
def update_progress(cls):

View File

@ -227,10 +227,13 @@ class FileService(CommonService):
# tenant_id: Tenant ID
# Returns:
# Knowledge base folder dictionary
for root in cls.model.select().where((cls.model.tenant_id == tenant_id), (cls.model.parent_id == cls.model.id)):
for folder in cls.model.select().where((cls.model.tenant_id == tenant_id), (cls.model.parent_id == root.id), (cls.model.name == KNOWLEDGEBASE_FOLDER_NAME)):
return folder.to_dict()
assert False, "Can't find the KB folder. Database init error."
root_folder = cls.get_root_folder(tenant_id)
root_id = root_folder["id"]
kb_folder = cls.model.select().where((cls.model.tenant_id == tenant_id), (cls.model.parent_id == root_id), (cls.model.name == KNOWLEDGEBASE_FOLDER_NAME)).first()
if not kb_folder:
kb_folder = cls.new_a_file_from_kb(tenant_id, KNOWLEDGEBASE_FOLDER_NAME, root_id)
return kb_folder
return kb_folder.to_dict()
@classmethod
@DB.connection_context()
@ -499,10 +502,9 @@ class FileService(CommonService):
@staticmethod
def get_blob(user_id, location):
bname = f"{user_id}-downloads"
return STORAGE_IMPL.get(bname, location)
return STORAGE_IMPL.get(bname, location)
@staticmethod
def put_blob(user_id, location, blob):
bname = f"{user_id}-downloads"
return STORAGE_IMPL.put(bname, location, blob)
return STORAGE_IMPL.put(bname, location, blob)

View File

@ -13,240 +13,78 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
import inspect
import logging
import re
from functools import partial
from typing import Generator
from langfuse import Langfuse
from api import settings
from api.db import LLMType
from api.db.db_models import DB, LLM, LLMFactories, TenantLLM
from api.db.db_models import LLM
from api.db.services.common_service import CommonService
from api.db.services.langfuse_service import TenantLangfuseService
from api.db.services.user_service import TenantService
from rag.llm import ChatModel, CvModel, EmbeddingModel, RerankModel, Seq2txtModel, TTSModel
class LLMFactoriesService(CommonService):
model = LLMFactories
from api.db.services.tenant_llm_service import LLM4Tenant, TenantLLMService
class LLMService(CommonService):
model = LLM
class TenantLLMService(CommonService):
model = TenantLLM
def get_init_tenant_llm(user_id):
from api import settings
tenant_llm = []
@classmethod
@DB.connection_context()
def get_api_key(cls, tenant_id, model_name):
mdlnm, fid = TenantLLMService.split_model_name_and_factory(model_name)
if not fid:
objs = cls.query(tenant_id=tenant_id, llm_name=mdlnm)
else:
objs = cls.query(tenant_id=tenant_id, llm_name=mdlnm, llm_factory=fid)
seen = set()
factory_configs = []
for factory_config in [
settings.CHAT_CFG,
settings.EMBEDDING_CFG,
settings.ASR_CFG,
settings.IMAGE2TEXT_CFG,
settings.RERANK_CFG,
]:
factory_name = factory_config["factory"]
if factory_name not in seen:
seen.add(factory_name)
factory_configs.append(factory_config)
if (not objs) and fid:
if fid == "LocalAI":
mdlnm += "___LocalAI"
elif fid == "HuggingFace":
mdlnm += "___HuggingFace"
elif fid == "OpenAI-API-Compatible":
mdlnm += "___OpenAI-API"
elif fid == "VLLM":
mdlnm += "___VLLM"
objs = cls.query(tenant_id=tenant_id, llm_name=mdlnm, llm_factory=fid)
if not objs:
return
return objs[0]
@classmethod
@DB.connection_context()
def get_my_llms(cls, tenant_id):
fields = [cls.model.llm_factory, LLMFactories.logo, LLMFactories.tags, cls.model.model_type, cls.model.llm_name, cls.model.used_tokens]
objs = cls.model.select(*fields).join(LLMFactories, on=(cls.model.llm_factory == LLMFactories.name)).where(cls.model.tenant_id == tenant_id, ~cls.model.api_key.is_null()).dicts()
return list(objs)
@staticmethod
def split_model_name_and_factory(model_name):
arr = model_name.split("@")
if len(arr) < 2:
return model_name, None
if len(arr) > 2:
return "@".join(arr[0:-1]), arr[-1]
# model name must be xxx@yyy
try:
model_factories = settings.FACTORY_LLM_INFOS
model_providers = set([f["name"] for f in model_factories])
if arr[-1] not in model_providers:
return model_name, None
return arr[0], arr[-1]
except Exception as e:
logging.exception(f"TenantLLMService.split_model_name_and_factory got exception: {e}")
return model_name, None
@classmethod
@DB.connection_context()
def get_model_config(cls, tenant_id, llm_type, llm_name=None):
e, tenant = TenantService.get_by_id(tenant_id)
if not e:
raise LookupError("Tenant not found")
if llm_type == LLMType.EMBEDDING.value:
mdlnm = tenant.embd_id if not llm_name else llm_name
elif llm_type == LLMType.SPEECH2TEXT.value:
mdlnm = tenant.asr_id
elif llm_type == LLMType.IMAGE2TEXT.value:
mdlnm = tenant.img2txt_id if not llm_name else llm_name
elif llm_type == LLMType.CHAT.value:
mdlnm = tenant.llm_id if not llm_name else llm_name
elif llm_type == LLMType.RERANK:
mdlnm = tenant.rerank_id if not llm_name else llm_name
elif llm_type == LLMType.TTS:
mdlnm = tenant.tts_id if not llm_name else llm_name
else:
assert False, "LLM type error"
model_config = cls.get_api_key(tenant_id, mdlnm)
mdlnm, fid = TenantLLMService.split_model_name_and_factory(mdlnm)
if not model_config: # for some cases seems fid mismatch
model_config = cls.get_api_key(tenant_id, mdlnm)
if model_config:
model_config = model_config.to_dict()
llm = LLMService.query(llm_name=mdlnm) if not fid else LLMService.query(llm_name=mdlnm, fid=fid)
if not llm and fid: # for some cases seems fid mismatch
llm = LLMService.query(llm_name=mdlnm)
if llm:
model_config["is_tools"] = llm[0].is_tools
if not model_config:
if llm_type in [LLMType.EMBEDDING, LLMType.RERANK]:
llm = LLMService.query(llm_name=mdlnm) if not fid else LLMService.query(llm_name=mdlnm, fid=fid)
if llm and llm[0].fid in ["Youdao", "FastEmbed", "BAAI"]:
model_config = {"llm_factory": llm[0].fid, "api_key": "", "llm_name": mdlnm, "api_base": ""}
if not model_config:
if mdlnm == "flag-embedding":
model_config = {"llm_factory": "Tongyi-Qianwen", "api_key": "", "llm_name": llm_name, "api_base": ""}
else:
if not mdlnm:
raise LookupError(f"Type of {llm_type} model is not set.")
raise LookupError("Model({}) not authorized".format(mdlnm))
return model_config
@classmethod
@DB.connection_context()
def model_instance(cls, tenant_id, llm_type, llm_name=None, lang="Chinese", **kwargs):
model_config = TenantLLMService.get_model_config(tenant_id, llm_type, llm_name)
if llm_type == LLMType.EMBEDDING.value:
if model_config["llm_factory"] not in EmbeddingModel:
return
return EmbeddingModel[model_config["llm_factory"]](model_config["api_key"], model_config["llm_name"], base_url=model_config["api_base"])
if llm_type == LLMType.RERANK:
if model_config["llm_factory"] not in RerankModel:
return
return RerankModel[model_config["llm_factory"]](model_config["api_key"], model_config["llm_name"], base_url=model_config["api_base"])
if llm_type == LLMType.IMAGE2TEXT.value:
if model_config["llm_factory"] not in CvModel:
return
return CvModel[model_config["llm_factory"]](model_config["api_key"], model_config["llm_name"], lang, base_url=model_config["api_base"], **kwargs)
if llm_type == LLMType.CHAT.value:
if model_config["llm_factory"] not in ChatModel:
return
return ChatModel[model_config["llm_factory"]](model_config["api_key"], model_config["llm_name"], base_url=model_config["api_base"], **kwargs)
if llm_type == LLMType.SPEECH2TEXT:
if model_config["llm_factory"] not in Seq2txtModel:
return
return Seq2txtModel[model_config["llm_factory"]](key=model_config["api_key"], model_name=model_config["llm_name"], lang=lang, base_url=model_config["api_base"])
if llm_type == LLMType.TTS:
if model_config["llm_factory"] not in TTSModel:
return
return TTSModel[model_config["llm_factory"]](
model_config["api_key"],
model_config["llm_name"],
base_url=model_config["api_base"],
for factory_config in factory_configs:
for llm in LLMService.query(fid=factory_config["factory"]):
tenant_llm.append(
{
"tenant_id": user_id,
"llm_factory": factory_config["factory"],
"llm_name": llm.llm_name,
"model_type": llm.model_type,
"api_key": factory_config["api_key"],
"api_base": factory_config["base_url"],
"max_tokens": llm.max_tokens if llm.max_tokens else 8192,
}
)
@classmethod
@DB.connection_context()
def increase_usage(cls, tenant_id, llm_type, used_tokens, llm_name=None):
e, tenant = TenantService.get_by_id(tenant_id)
if not e:
logging.error(f"Tenant not found: {tenant_id}")
return 0
llm_map = {
LLMType.EMBEDDING.value: tenant.embd_id if not llm_name else llm_name,
LLMType.SPEECH2TEXT.value: tenant.asr_id,
LLMType.IMAGE2TEXT.value: tenant.img2txt_id,
LLMType.CHAT.value: tenant.llm_id if not llm_name else llm_name,
LLMType.RERANK.value: tenant.rerank_id if not llm_name else llm_name,
LLMType.TTS.value: tenant.tts_id if not llm_name else llm_name,
}
mdlnm = llm_map.get(llm_type)
if mdlnm is None:
logging.error(f"LLM type error: {llm_type}")
return 0
llm_name, llm_factory = TenantLLMService.split_model_name_and_factory(mdlnm)
try:
num = (
cls.model.update(used_tokens=cls.model.used_tokens + used_tokens)
.where(cls.model.tenant_id == tenant_id, cls.model.llm_name == llm_name, cls.model.llm_factory == llm_factory if llm_factory else True)
.execute()
if settings.LIGHTEN != 1:
for buildin_embedding_model in settings.BUILTIN_EMBEDDING_MODELS:
mdlnm, fid = TenantLLMService.split_model_name_and_factory(buildin_embedding_model)
tenant_llm.append(
{
"tenant_id": user_id,
"llm_factory": fid,
"llm_name": mdlnm,
"model_type": "embedding",
"api_key": "",
"api_base": "",
"max_tokens": 1024 if buildin_embedding_model == "BAAI/bge-large-zh-v1.5@BAAI" else 512,
}
)
except Exception:
logging.exception("TenantLLMService.increase_usage got exception,Failed to update used_tokens for tenant_id=%s, llm_name=%s", tenant_id, llm_name)
return 0
return num
@classmethod
@DB.connection_context()
def get_openai_models(cls):
objs = cls.model.select().where((cls.model.llm_factory == "OpenAI"), ~(cls.model.llm_name == "text-embedding-3-small"), ~(cls.model.llm_name == "text-embedding-3-large")).dicts()
return list(objs)
@staticmethod
def llm_id2llm_type(llm_id: str) -> str | None:
llm_id, *_ = TenantLLMService.split_model_name_and_factory(llm_id)
llm_factories = settings.FACTORY_LLM_INFOS
for llm_factory in llm_factories:
for llm in llm_factory["llm"]:
if llm_id == llm["llm_name"]:
return llm["model_type"].split(",")[-1]
unique = {}
for item in tenant_llm:
key = (item["tenant_id"], item["llm_factory"], item["llm_name"])
if key not in unique:
unique[key] = item
return list(unique.values())
class LLMBundle:
class LLMBundle(LLM4Tenant):
def __init__(self, tenant_id, llm_type, llm_name=None, lang="Chinese", **kwargs):
self.tenant_id = tenant_id
self.llm_type = llm_type
self.llm_name = llm_name
self.mdl = TenantLLMService.model_instance(tenant_id, llm_type, llm_name, lang=lang, **kwargs)
assert self.mdl, "Can't find model for {}/{}/{}".format(tenant_id, llm_type, llm_name)
model_config = TenantLLMService.get_model_config(tenant_id, llm_type, llm_name)
self.max_length = model_config.get("max_tokens", 8192)
self.is_tools = model_config.get("is_tools", False)
self.verbose_tool_use = kwargs.get("verbose_tool_use")
langfuse_keys = TenantLangfuseService.filter_by_tenant(tenant_id=tenant_id)
if langfuse_keys:
langfuse = Langfuse(public_key=langfuse_keys.public_key, secret_key=langfuse_keys.secret_key, host=langfuse_keys.host)
if langfuse.auth_check():
self.langfuse = langfuse
self.trace = self.langfuse.trace(name=f"{self.llm_type}-{self.llm_name}")
else:
self.langfuse = None
super().__init__(tenant_id, llm_type, llm_name, lang, **kwargs)
def bind_tools(self, toolcall_session, tools):
if not self.is_tools:
@ -256,7 +94,7 @@ class LLMBundle:
def encode(self, texts: list):
if self.langfuse:
generation = self.trace.generation(name="encode", model=self.llm_name, input={"texts": texts})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="encode", model=self.llm_name, input={"texts": texts})
embeddings, used_tokens = self.mdl.encode(texts)
llm_name = getattr(self, "llm_name", None)
@ -264,13 +102,14 @@ class LLMBundle:
logging.error("LLMBundle.encode can't update token usage for {}/EMBEDDING used_tokens: {}".format(self.tenant_id, used_tokens))
if self.langfuse:
generation.end(usage_details={"total_tokens": used_tokens})
generation.update(usage_details={"total_tokens": used_tokens})
generation.end()
return embeddings, used_tokens
def encode_queries(self, query: str):
if self.langfuse:
generation = self.trace.generation(name="encode_queries", model=self.llm_name, input={"query": query})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="encode_queries", model=self.llm_name, input={"query": query})
emd, used_tokens = self.mdl.encode_queries(query)
llm_name = getattr(self, "llm_name", None)
@ -278,65 +117,70 @@ class LLMBundle:
logging.error("LLMBundle.encode_queries can't update token usage for {}/EMBEDDING used_tokens: {}".format(self.tenant_id, used_tokens))
if self.langfuse:
generation.end(usage_details={"total_tokens": used_tokens})
generation.update(usage_details={"total_tokens": used_tokens})
generation.end()
return emd, used_tokens
def similarity(self, query: str, texts: list):
if self.langfuse:
generation = self.trace.generation(name="similarity", model=self.llm_name, input={"query": query, "texts": texts})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="similarity", model=self.llm_name, input={"query": query, "texts": texts})
sim, used_tokens = self.mdl.similarity(query, texts)
if not TenantLLMService.increase_usage(self.tenant_id, self.llm_type, used_tokens):
logging.error("LLMBundle.similarity can't update token usage for {}/RERANK used_tokens: {}".format(self.tenant_id, used_tokens))
if self.langfuse:
generation.end(usage_details={"total_tokens": used_tokens})
generation.update(usage_details={"total_tokens": used_tokens})
generation.end()
return sim, used_tokens
def describe(self, image, max_tokens=300):
if self.langfuse:
generation = self.trace.generation(name="describe", metadata={"model": self.llm_name})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="describe", metadata={"model": self.llm_name})
txt, used_tokens = self.mdl.describe(image)
if not TenantLLMService.increase_usage(self.tenant_id, self.llm_type, used_tokens):
logging.error("LLMBundle.describe can't update token usage for {}/IMAGE2TEXT used_tokens: {}".format(self.tenant_id, used_tokens))
if self.langfuse:
generation.end(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.update(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.end()
return txt
def describe_with_prompt(self, image, prompt):
if self.langfuse:
generation = self.trace.generation(name="describe_with_prompt", metadata={"model": self.llm_name, "prompt": prompt})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="describe_with_prompt", metadata={"model": self.llm_name, "prompt": prompt})
txt, used_tokens = self.mdl.describe_with_prompt(image, prompt)
if not TenantLLMService.increase_usage(self.tenant_id, self.llm_type, used_tokens):
logging.error("LLMBundle.describe can't update token usage for {}/IMAGE2TEXT used_tokens: {}".format(self.tenant_id, used_tokens))
if self.langfuse:
generation.end(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.update(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.end()
return txt
def transcription(self, audio):
if self.langfuse:
generation = self.trace.generation(name="transcription", metadata={"model": self.llm_name})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="transcription", metadata={"model": self.llm_name})
txt, used_tokens = self.mdl.transcription(audio)
if not TenantLLMService.increase_usage(self.tenant_id, self.llm_type, used_tokens):
logging.error("LLMBundle.transcription can't update token usage for {}/SEQUENCE2TXT used_tokens: {}".format(self.tenant_id, used_tokens))
if self.langfuse:
generation.end(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.update(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.end()
return txt
def tts(self, text: str) -> Generator[bytes, None, None]:
if self.langfuse:
span = self.trace.span(name="tts", input={"text": text})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="tts", input={"text": text})
for chunk in self.mdl.tts(text):
if isinstance(chunk, int):
@ -346,7 +190,7 @@ class LLMBundle:
yield chunk
if self.langfuse:
span.end()
generation.end()
def _remove_reasoning_content(self, txt: str) -> str:
first_think_start = txt.find("<think>")
@ -361,16 +205,34 @@ class LLMBundle:
return txt
return txt[last_think_end + len("</think>") :]
@staticmethod
def _clean_param(chat_partial, **kwargs):
func = chat_partial.func
sig = inspect.signature(func)
keyword_args = []
support_var_args = False
for param in sig.parameters.values():
if param.kind == inspect.Parameter.VAR_KEYWORD or param.kind == inspect.Parameter.VAR_POSITIONAL:
support_var_args = True
elif param.kind == inspect.Parameter.KEYWORD_ONLY:
keyword_args.append(param.name)
def chat(self, system: str, history: list, gen_conf: dict={}, **kwargs) -> str:
use_kwargs = kwargs
if not support_var_args:
use_kwargs = {k: v for k, v in kwargs.items() if k in keyword_args}
return use_kwargs
def chat(self, system: str, history: list, gen_conf: dict = {}, **kwargs) -> str:
if self.langfuse:
generation = self.trace.generation(name="chat", model=self.llm_name, input={"system": system, "history": history})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="chat", model=self.llm_name, input={"system": system, "history": history})
chat_partial = partial(self.mdl.chat, system, history, gen_conf)
if self.is_tools and self.mdl.is_tools:
chat_partial = partial(self.mdl.chat_with_tools, system, history, gen_conf)
txt, used_tokens = chat_partial(**kwargs)
use_kwargs = self._clean_param(chat_partial, **kwargs)
txt, used_tokens = chat_partial(**use_kwargs)
txt = self._remove_reasoning_content(txt)
if not self.verbose_tool_use:
@ -380,25 +242,27 @@ class LLMBundle:
logging.error("LLMBundle.chat can't update token usage for {}/CHAT llm_name: {}, used_tokens: {}".format(self.tenant_id, self.llm_name, used_tokens))
if self.langfuse:
generation.end(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.update(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.end()
return txt
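The _clean_param guard above inspects the wrapped chat callable and keeps only the keyword-only arguments it declares, unless it accepts *args or **kwargs, so provider adapters with narrow signatures no longer raise TypeError on extra kwargs. A self-contained sketch of the same inspect-based filtering (names are ours):
import inspect
from functools import partial

def clean_kwargs(fn, **kwargs):
    """Keep only kwargs the target accepts, unless it takes *args/**kwargs."""
    params = inspect.signature(fn).parameters.values()
    if any(p.kind in (inspect.Parameter.VAR_KEYWORD, inspect.Parameter.VAR_POSITIONAL) for p in params):
        return kwargs  # fn can absorb anything
    allowed = {p.name for p in params if p.kind == inspect.Parameter.KEYWORD_ONLY}
    return {k: v for k, v in kwargs.items() if k in allowed}

def chat(system, history, gen_conf, *, images=None):  # no **kwargs
    return f"images={images}"

chat_partial = partial(chat, "sys", [], {})
print(chat_partial(**clean_kwargs(chat, images=[1], stop="x")))  # "stop" is dropped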
def chat_streamly(self, system: str, history: list, gen_conf: dict={}, **kwargs):
def chat_streamly(self, system: str, history: list, gen_conf: dict = {}, **kwargs):
if self.langfuse:
generation = self.trace.generation(name="chat_streamly", model=self.llm_name, input={"system": system, "history": history})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="chat_streamly", model=self.llm_name, input={"system": system, "history": history})
ans = ""
chat_partial = partial(self.mdl.chat_streamly, system, history, gen_conf)
total_tokens = 0
if self.is_tools and self.mdl.is_tools:
chat_partial = partial(self.mdl.chat_streamly_with_tools, system, history, gen_conf)
for txt in chat_partial(**kwargs):
use_kwargs = self._clean_param(chat_partial, **kwargs)
for txt in chat_partial(**use_kwargs):
if isinstance(txt, int):
total_tokens = txt
if self.langfuse:
generation.end(output={"output": ans})
generation.update(output={"output": ans})
generation.end()
break
if txt.endswith("</think>"):

View File

@ -71,6 +71,8 @@ class SearchService(CommonService):
.first()
.to_dict()
)
if not search:
return {}
return search
@classmethod

View File

@ -0,0 +1,252 @@
#
# Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import logging
from langfuse import Langfuse
from api import settings
from api.db import LLMType
from api.db.db_models import DB, LLMFactories, TenantLLM
from api.db.services.common_service import CommonService
from api.db.services.langfuse_service import TenantLangfuseService
from api.db.services.user_service import TenantService
from rag.llm import ChatModel, CvModel, EmbeddingModel, RerankModel, Seq2txtModel, TTSModel
class LLMFactoriesService(CommonService):
model = LLMFactories
class TenantLLMService(CommonService):
model = TenantLLM
@classmethod
@DB.connection_context()
def get_api_key(cls, tenant_id, model_name):
mdlnm, fid = TenantLLMService.split_model_name_and_factory(model_name)
if not fid:
objs = cls.query(tenant_id=tenant_id, llm_name=mdlnm)
else:
objs = cls.query(tenant_id=tenant_id, llm_name=mdlnm, llm_factory=fid)
if (not objs) and fid:
if fid == "LocalAI":
mdlnm += "___LocalAI"
elif fid == "HuggingFace":
mdlnm += "___HuggingFace"
elif fid == "OpenAI-API-Compatible":
mdlnm += "___OpenAI-API"
elif fid == "VLLM":
mdlnm += "___VLLM"
objs = cls.query(tenant_id=tenant_id, llm_name=mdlnm, llm_factory=fid)
if not objs:
return
return objs[0]
@classmethod
@DB.connection_context()
def get_my_llms(cls, tenant_id):
fields = [cls.model.llm_factory, LLMFactories.logo, LLMFactories.tags, cls.model.model_type, cls.model.llm_name, cls.model.used_tokens]
objs = cls.model.select(*fields).join(LLMFactories, on=(cls.model.llm_factory == LLMFactories.name)).where(cls.model.tenant_id == tenant_id, ~cls.model.api_key.is_null()).dicts()
return list(objs)
@staticmethod
def split_model_name_and_factory(model_name):
arr = model_name.split("@")
if len(arr) < 2:
return model_name, None
if len(arr) > 2:
return "@".join(arr[0:-1]), arr[-1]
# model name must be xxx@yyy
try:
model_factories = settings.FACTORY_LLM_INFOS
model_providers = set([f["name"] for f in model_factories])
if arr[-1] not in model_providers:
return model_name, None
return arr[0], arr[-1]
except Exception as e:
logging.exception(f"TenantLLMService.split_model_name_and_factory got exception: {e}")
return model_name, None
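split_model_name_and_factory only treats the trailing "@" segment as a factory when it names a known provider, so model ids that legitimately contain "@" pass through unchanged. Roughly, with an illustrative provider set:
# Sketch of the "@"-splitting contract (provider list is illustrative).
KNOWN_PROVIDERS = {"Tongyi-Qianwen", "OpenAI", "BAAI"}

def split(model_name: str):
    arr = model_name.split("@")
    if len(arr) < 2:
        return model_name, None
    if arr[-1] not in KNOWN_PROVIDERS:
        return model_name, None  # trailing part is not a factory name
    return "@".join(arr[:-1]), arr[-1]

assert split("qwen-max@Tongyi-Qianwen") == ("qwen-max", "Tongyi-Qianwen")
assert split("plain-model") == ("plain-model", None)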
@classmethod
@DB.connection_context()
def get_model_config(cls, tenant_id, llm_type, llm_name=None):
from api.db.services.llm_service import LLMService
e, tenant = TenantService.get_by_id(tenant_id)
if not e:
raise LookupError("Tenant not found")
if llm_type == LLMType.EMBEDDING.value:
mdlnm = tenant.embd_id if not llm_name else llm_name
elif llm_type == LLMType.SPEECH2TEXT.value:
mdlnm = tenant.asr_id
elif llm_type == LLMType.IMAGE2TEXT.value:
mdlnm = tenant.img2txt_id if not llm_name else llm_name
elif llm_type == LLMType.CHAT.value:
mdlnm = tenant.llm_id if not llm_name else llm_name
elif llm_type == LLMType.RERANK:
mdlnm = tenant.rerank_id if not llm_name else llm_name
elif llm_type == LLMType.TTS:
mdlnm = tenant.tts_id if not llm_name else llm_name
else:
assert False, "LLM type error"
model_config = cls.get_api_key(tenant_id, mdlnm)
mdlnm, fid = TenantLLMService.split_model_name_and_factory(mdlnm)
if not model_config: # for some cases seems fid mismatch
model_config = cls.get_api_key(tenant_id, mdlnm)
if model_config:
model_config = model_config.to_dict()
llm = LLMService.query(llm_name=mdlnm) if not fid else LLMService.query(llm_name=mdlnm, fid=fid)
if not llm and fid: # for some cases seems fid mismatch
llm = LLMService.query(llm_name=mdlnm)
if llm:
model_config["is_tools"] = llm[0].is_tools
if not model_config:
if llm_type in [LLMType.EMBEDDING, LLMType.RERANK]:
llm = LLMService.query(llm_name=mdlnm) if not fid else LLMService.query(llm_name=mdlnm, fid=fid)
if llm and llm[0].fid in ["Youdao", "FastEmbed", "BAAI"]:
model_config = {"llm_factory": llm[0].fid, "api_key": "", "llm_name": mdlnm, "api_base": ""}
if not model_config:
if mdlnm == "flag-embedding":
model_config = {"llm_factory": "Tongyi-Qianwen", "api_key": "", "llm_name": llm_name, "api_base": ""}
else:
if not mdlnm:
raise LookupError(f"Type of {llm_type} model is not set.")
raise LookupError("Model({}) not authorized".format(mdlnm))
return model_config
@classmethod
@DB.connection_context()
def model_instance(cls, tenant_id, llm_type, llm_name=None, lang="Chinese", **kwargs):
model_config = TenantLLMService.get_model_config(tenant_id, llm_type, llm_name)
kwargs.update({"provider": model_config["llm_factory"]})
if llm_type == LLMType.EMBEDDING.value:
if model_config["llm_factory"] not in EmbeddingModel:
return
return EmbeddingModel[model_config["llm_factory"]](model_config["api_key"], model_config["llm_name"], base_url=model_config["api_base"])
if llm_type == LLMType.RERANK:
if model_config["llm_factory"] not in RerankModel:
return
return RerankModel[model_config["llm_factory"]](model_config["api_key"], model_config["llm_name"], base_url=model_config["api_base"])
if llm_type == LLMType.IMAGE2TEXT.value:
if model_config["llm_factory"] not in CvModel:
return
return CvModel[model_config["llm_factory"]](model_config["api_key"], model_config["llm_name"], lang, base_url=model_config["api_base"], **kwargs)
if llm_type == LLMType.CHAT.value:
if model_config["llm_factory"] not in ChatModel:
return
return ChatModel[model_config["llm_factory"]](model_config["api_key"], model_config["llm_name"], base_url=model_config["api_base"], **kwargs)
if llm_type == LLMType.SPEECH2TEXT:
if model_config["llm_factory"] not in Seq2txtModel:
return
return Seq2txtModel[model_config["llm_factory"]](key=model_config["api_key"], model_name=model_config["llm_name"], lang=lang, base_url=model_config["api_base"])
if llm_type == LLMType.TTS:
if model_config["llm_factory"] not in TTSModel:
return
return TTSModel[model_config["llm_factory"]](
model_config["api_key"],
model_config["llm_name"],
base_url=model_config["api_base"],
)
@classmethod
@DB.connection_context()
def increase_usage(cls, tenant_id, llm_type, used_tokens, llm_name=None):
e, tenant = TenantService.get_by_id(tenant_id)
if not e:
logging.error(f"Tenant not found: {tenant_id}")
return 0
llm_map = {
LLMType.EMBEDDING.value: tenant.embd_id if not llm_name else llm_name,
LLMType.SPEECH2TEXT.value: tenant.asr_id,
LLMType.IMAGE2TEXT.value: tenant.img2txt_id,
LLMType.CHAT.value: tenant.llm_id if not llm_name else llm_name,
LLMType.RERANK.value: tenant.rerank_id if not llm_name else llm_name,
LLMType.TTS.value: tenant.tts_id if not llm_name else llm_name,
}
mdlnm = llm_map.get(llm_type)
if mdlnm is None:
logging.error(f"LLM type error: {llm_type}")
return 0
llm_name, llm_factory = TenantLLMService.split_model_name_and_factory(mdlnm)
try:
num = (
cls.model.update(used_tokens=cls.model.used_tokens + used_tokens)
.where(cls.model.tenant_id == tenant_id, cls.model.llm_name == llm_name, cls.model.llm_factory == llm_factory if llm_factory else True)
.execute()
)
except Exception:
logging.exception("TenantLLMService.increase_usage got exception,Failed to update used_tokens for tenant_id=%s, llm_name=%s", tenant_id, llm_name)
return 0
return num
@classmethod
@DB.connection_context()
def get_openai_models(cls):
objs = cls.model.select().where((cls.model.llm_factory == "OpenAI"), ~(cls.model.llm_name == "text-embedding-3-small"), ~(cls.model.llm_name == "text-embedding-3-large")).dicts()
return list(objs)
@staticmethod
def llm_id2llm_type(llm_id: str) -> str | None:
from api.db.services.llm_service import LLMService
llm_id, *_ = TenantLLMService.split_model_name_and_factory(llm_id)
llm_factories = settings.FACTORY_LLM_INFOS
for llm_factory in llm_factories:
for llm in llm_factory["llm"]:
if llm_id == llm["llm_name"]:
return llm["model_type"].split(",")[-1]
for llm in LLMService.query(llm_name=llm_id):
return llm.model_type
llm = TenantLLMService.get_or_none(llm_name=llm_id)
if llm:
return llm.model_type
for llm in TenantLLMService.query(llm_name=llm_id):
return llm.model_type
class LLM4Tenant:
def __init__(self, tenant_id, llm_type, llm_name=None, lang="Chinese", **kwargs):
self.tenant_id = tenant_id
self.llm_type = llm_type
self.llm_name = llm_name
self.mdl = TenantLLMService.model_instance(tenant_id, llm_type, llm_name, lang=lang, **kwargs)
assert self.mdl, "Can't find model for {}/{}/{}".format(tenant_id, llm_type, llm_name)
model_config = TenantLLMService.get_model_config(tenant_id, llm_type, llm_name)
self.max_length = model_config.get("max_tokens", 8192)
self.is_tools = model_config.get("is_tools", False)
self.verbose_tool_use = kwargs.get("verbose_tool_use")
langfuse_keys = TenantLangfuseService.filter_by_tenant(tenant_id=tenant_id)
self.langfuse = None
if langfuse_keys:
langfuse = Langfuse(public_key=langfuse_keys.public_key, secret_key=langfuse_keys.secret_key, host=langfuse_keys.host)
if langfuse.auth_check():
self.langfuse = langfuse
trace_id = self.langfuse.create_trace_id()
self.trace_context = {"trace_id": trace_id}
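The Langfuse calls throughout this diff follow one lifecycle: create a trace id once per bundle, open each call with start_generation(trace_context=...), record results with update(), then end(). Condensed into a single helper, as a sketch using only the calls visible above:
def traced_call(langfuse, trace_context, name, model, run, **inputs):
    """Sketch: run(**inputs) must return (output, used_tokens)."""
    generation = None
    if langfuse:
        generation = langfuse.start_generation(trace_context=trace_context, name=name, model=model, input=inputs)
    output, used_tokens = run(**inputs)
    if generation:
        generation.update(output={"output": output}, usage_details={"total_tokens": used_tokens})
        generation.end()
    return output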

View File

@ -33,7 +33,7 @@ import uuid
from werkzeug.serving import run_simple
from api import settings
from api.apps import app
from api.apps import app, smtp_mail_server
from api.db.runtime_config import RuntimeConfig
from api.db.services.document_service import DocumentService
from api import utils
@ -59,11 +59,14 @@ def update_progress():
if redis_lock.acquire():
DocumentService.update_progress()
redis_lock.release()
stop_event.wait(6)
except Exception:
logging.exception("update_progress exception")
finally:
redis_lock.release()
try:
redis_lock.release()
except Exception:
logging.exception("update_progress exception")
stop_event.wait(6)
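The loop above now releases the Redis lock in a finally block and swallows release failures, so a broken connection cannot kill the updater thread. The pattern in isolation (the lock object is assumed to expose acquire/release):
import logging

def run_once(redis_lock, do_work):
    """Sketch: acquire/work/release with a crash-proof release, as in the loop above."""
    try:
        if redis_lock.acquire():
            do_work()
    except Exception:
        logging.exception("update_progress exception")
    finally:
        try:
            redis_lock.release()
        except Exception:
            logging.exception("update_progress exception")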
def signal_handler(sig, frame):
logging.info("Received interrupt signal, shutting down...")
@ -74,11 +77,11 @@ def signal_handler(sig, frame):
if __name__ == '__main__':
logging.info(r"""
____ ___ ______ ______ __
____ ___ ______ ______ __
/ __ \ / | / ____// ____// /____ _ __
/ /_/ // /| | / / __ / /_ / // __ \| | /| / /
/ _, _// ___ |/ /_/ // __/ / // /_/ /| |/ |/ /
/_/ |_|/_/ |_|\____//_/ /_/ \____/ |__/|__/
/ _, _// ___ |/ /_/ // __/ / // /_/ /| |/ |/ /
/_/ |_|/_/ |_|\____//_/ /_/ \____/ |__/|__/
""")
logging.info(
@ -137,6 +140,18 @@ if __name__ == '__main__':
else:
threading.Timer(1.0, delayed_start_update_progress).start()
# init smtp server
if settings.SMTP_CONF:
app.config["MAIL_SERVER"] = settings.MAIL_SERVER
app.config["MAIL_PORT"] = settings.MAIL_PORT
app.config["MAIL_USE_SSL"] = settings.MAIL_USE_SSL
app.config["MAIL_USE_TLS"] = settings.MAIL_USE_TLS
app.config["MAIL_USERNAME"] = settings.MAIL_USERNAME
app.config["MAIL_PASSWORD"] = settings.MAIL_PASSWORD
app.config["MAIL_DEFAULT_SENDER"] = settings.MAIL_DEFAULT_SENDER
smtp_mail_server.init_app(app)
# start http server
try:
logging.info("RAGFlow HTTP server start...")

View File

@ -38,6 +38,11 @@ EMBEDDING_MDL = ""
RERANK_MDL = ""
ASR_MDL = ""
IMAGE2TEXT_MDL = ""
CHAT_CFG = ""
EMBEDDING_CFG = ""
RERANK_CFG = ""
ASR_CFG = ""
IMAGE2TEXT_CFG = ""
API_KEY = None
PARSERS = None
HOST_IP = None
@ -70,26 +75,36 @@ REGISTER_ENABLED = 1
# sandbox-executor-manager
SANDBOX_ENABLED = 0
SANDBOX_HOST = None
STRONG_TEST_COUNT = int(os.environ.get("STRONG_TEST_COUNT", "8"))
BUILTIN_EMBEDDING_MODELS = ["BAAI/bge-large-zh-v1.5@BAAI", "maidalun1020/bce-embedding-base_v1@Youdao"]
SMTP_CONF = None
MAIL_SERVER = ""
MAIL_PORT = 0
MAIL_USE_SSL = True
MAIL_USE_TLS = False
MAIL_USERNAME = ""
MAIL_PASSWORD = ""
MAIL_DEFAULT_SENDER = ()
MAIL_FRONTEND_URL = ""
def get_or_create_secret_key():
secret_key = os.environ.get("RAGFLOW_SECRET_KEY")
if secret_key and len(secret_key) >= 32:
return secret_key
# Check if there's a configured secret key
configured_key = get_base_config(RAG_FLOW_SERVICE_NAME, {}).get("secret_key")
if configured_key and configured_key != str(date.today()) and len(configured_key) >= 32:
return configured_key
# Generate a new secure key and warn about it
import logging
new_key = secrets.token_hex(32)
logging.warning(
"SECURITY WARNING: Using auto-generated SECRET_KEY. "
f"Generated key: {new_key}"
)
logging.warning(f"SECURITY WARNING: Using auto-generated SECRET_KEY. Generated key: {new_key}")
return new_key
@ -98,10 +113,10 @@ def init_settings():
LIGHTEN = int(os.environ.get("LIGHTEN", "0"))
DATABASE_TYPE = os.getenv("DB_TYPE", "mysql")
DATABASE = decrypt_database_config(name=DATABASE_TYPE)
LLM = get_base_config("user_default_llm", {})
LLM_DEFAULT_MODELS = LLM.get("default_models", {})
LLM_FACTORY = LLM.get("factory")
LLM_BASE_URL = LLM.get("base_url")
LLM = get_base_config("user_default_llm", {}) or {}
LLM_DEFAULT_MODELS = LLM.get("default_models", {}) or {}
LLM_FACTORY = LLM.get("factory", "") or ""
LLM_BASE_URL = LLM.get("base_url", "") or ""
try:
REGISTER_ENABLED = int(os.environ.get("REGISTER_ENABLED", "1"))
except Exception:
@ -114,29 +129,34 @@ def init_settings():
FACTORY_LLM_INFOS = []
global CHAT_MDL, EMBEDDING_MDL, RERANK_MDL, ASR_MDL, IMAGE2TEXT_MDL
global CHAT_CFG, EMBEDDING_CFG, RERANK_CFG, ASR_CFG, IMAGE2TEXT_CFG
if not LIGHTEN:
EMBEDDING_MDL = BUILTIN_EMBEDDING_MODELS[0]
if LLM_DEFAULT_MODELS:
CHAT_MDL = LLM_DEFAULT_MODELS.get("chat_model", CHAT_MDL)
EMBEDDING_MDL = LLM_DEFAULT_MODELS.get("embedding_model", EMBEDDING_MDL)
RERANK_MDL = LLM_DEFAULT_MODELS.get("rerank_model", RERANK_MDL)
ASR_MDL = LLM_DEFAULT_MODELS.get("asr_model", ASR_MDL)
IMAGE2TEXT_MDL = LLM_DEFAULT_MODELS.get("image2text_model", IMAGE2TEXT_MDL)
# factory can be specified in the config name with "@". LLM_FACTORY will be used if not specified
CHAT_MDL = CHAT_MDL + (f"@{LLM_FACTORY}" if "@" not in CHAT_MDL and CHAT_MDL != "" else "")
EMBEDDING_MDL = EMBEDDING_MDL + (f"@{LLM_FACTORY}" if "@" not in EMBEDDING_MDL and EMBEDDING_MDL != "" else "")
RERANK_MDL = RERANK_MDL + (f"@{LLM_FACTORY}" if "@" not in RERANK_MDL and RERANK_MDL != "" else "")
ASR_MDL = ASR_MDL + (f"@{LLM_FACTORY}" if "@" not in ASR_MDL and ASR_MDL != "" else "")
IMAGE2TEXT_MDL = IMAGE2TEXT_MDL + (f"@{LLM_FACTORY}" if "@" not in IMAGE2TEXT_MDL and IMAGE2TEXT_MDL != "" else "")
global API_KEY, PARSERS, HOST_IP, HOST_PORT, SECRET_KEY
API_KEY = LLM.get("api_key")
PARSERS = LLM.get(
"parsers", "naive:General,qa:Q&A,resume:Resume,manual:Manual,table:Table,paper:Paper,book:Book,laws:Laws,presentation:Presentation,picture:Picture,one:One,audio:Audio,email:Email,tag:Tag"
)
chat_entry = _parse_model_entry(LLM_DEFAULT_MODELS.get("chat_model", CHAT_MDL))
embedding_entry = _parse_model_entry(LLM_DEFAULT_MODELS.get("embedding_model", EMBEDDING_MDL))
rerank_entry = _parse_model_entry(LLM_DEFAULT_MODELS.get("rerank_model", RERANK_MDL))
asr_entry = _parse_model_entry(LLM_DEFAULT_MODELS.get("asr_model", ASR_MDL))
image2text_entry = _parse_model_entry(LLM_DEFAULT_MODELS.get("image2text_model", IMAGE2TEXT_MDL))
CHAT_CFG = _resolve_per_model_config(chat_entry, LLM_FACTORY, API_KEY, LLM_BASE_URL)
EMBEDDING_CFG = _resolve_per_model_config(embedding_entry, LLM_FACTORY, API_KEY, LLM_BASE_URL)
RERANK_CFG = _resolve_per_model_config(rerank_entry, LLM_FACTORY, API_KEY, LLM_BASE_URL)
ASR_CFG = _resolve_per_model_config(asr_entry, LLM_FACTORY, API_KEY, LLM_BASE_URL)
IMAGE2TEXT_CFG = _resolve_per_model_config(image2text_entry, LLM_FACTORY, API_KEY, LLM_BASE_URL)
CHAT_MDL = CHAT_CFG.get("model", "") or ""
EMBEDDING_MDL = EMBEDDING_CFG.get("model", "") or ""
RERANK_MDL = RERANK_CFG.get("model", "") or ""
ASR_MDL = ASR_CFG.get("model", "") or ""
IMAGE2TEXT_MDL = IMAGE2TEXT_CFG.get("model", "") or ""
HOST_IP = get_base_config(RAG_FLOW_SERVICE_NAME, {}).get("host", "127.0.0.1")
HOST_PORT = get_base_config(RAG_FLOW_SERVICE_NAME, {}).get("http_port")
@ -169,12 +189,28 @@ def init_settings():
retrievaler = search.Dealer(docStoreConn)
from graphrag import search as kg_search
kg_retrievaler = kg_search.KGSearch(docStoreConn)
if int(os.environ.get("SANDBOX_ENABLED", "0")):
global SANDBOX_HOST
SANDBOX_HOST = os.environ.get("SANDBOX_HOST", "sandbox-executor-manager")
global SMTP_CONF, MAIL_SERVER, MAIL_PORT, MAIL_USE_SSL, MAIL_USE_TLS
global MAIL_USERNAME, MAIL_PASSWORD, MAIL_DEFAULT_SENDER, MAIL_FRONTEND_URL
SMTP_CONF = get_base_config("smtp", {})
MAIL_SERVER = SMTP_CONF.get("mail_server", "")
MAIL_PORT = SMTP_CONF.get("mail_port", 0)
MAIL_USE_SSL = SMTP_CONF.get("mail_use_ssl", True)
MAIL_USE_TLS = SMTP_CONF.get("mail_use_tls", False)
MAIL_USERNAME = SMTP_CONF.get("mail_username", "")
MAIL_PASSWORD = SMTP_CONF.get("mail_password", "")
mail_default_sender = SMTP_CONF.get("mail_default_sender", [])
if mail_default_sender and len(mail_default_sender) >= 2:
MAIL_DEFAULT_SENDER = (mail_default_sender[0], mail_default_sender[1])
MAIL_FRONTEND_URL = SMTP_CONF.get("mail_frontend_url", "")
class CustomEnum(Enum):
@classmethod
@ -209,3 +245,34 @@ class RetCode(IntEnum, CustomEnum):
SERVER_ERROR = 500
FORBIDDEN = 403
NOT_FOUND = 404
def _parse_model_entry(entry):
if isinstance(entry, str):
return {"name": entry, "factory": None, "api_key": None, "base_url": None}
if isinstance(entry, dict):
name = entry.get("name") or entry.get("model") or ""
return {
"name": name,
"factory": entry.get("factory"),
"api_key": entry.get("api_key"),
"base_url": entry.get("base_url"),
}
return {"name": "", "factory": None, "api_key": None, "base_url": None}
def _resolve_per_model_config(entry_dict, backup_factory, backup_api_key, backup_base_url):
name = (entry_dict.get("name") or "").strip()
m_factory = entry_dict.get("factory") or backup_factory or ""
m_api_key = entry_dict.get("api_key") or backup_api_key or ""
m_base_url = entry_dict.get("base_url") or backup_base_url or ""
if name and "@" not in name and m_factory:
name = f"{name}@{m_factory}"
return {
"model": name,
"factory": m_factory,
"api_key": m_api_key,
"base_url": m_base_url,
}
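Taken together, _parse_model_entry and _resolve_per_model_config let user_default_llm entries be either bare strings or dicts with per-model overrides, falling back to the section-level factory, api_key, and base_url. A worked example with illustrative values, assuming the two functions above are in scope:
# Illustrative inputs; a per-model api_key overrides the section-level one.
entry = {"name": "qwen-max", "api_key": "sk-model-specific"}
cfg = _resolve_per_model_config(_parse_model_entry(entry), "Tongyi-Qianwen", "sk-section", "https://example.invalid")
# -> {"model": "qwen-max@Tongyi-Qianwen", "factory": "Tongyi-Qianwen",
#     "api_key": "sk-model-specific", "base_url": "https://example.invalid"}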

View File

@ -17,6 +17,7 @@ import asyncio
import functools
import json
import logging
import os
import queue
import random
import threading
@ -48,7 +49,8 @@ from werkzeug.http import HTTP_STATUS_CODES
from api import settings
from api.constants import REQUEST_MAX_WAIT_SEC, REQUEST_WAIT_SEC
from api.db.db_models import APIToken
from api.db.services.llm_service import LLMService, TenantLLMService
from api.db.services.llm_service import LLMService
from api.db.services.tenant_llm_service import TenantLLMService
from api.utils import CustomJSONEncoder, get_uuid, json_dumps
from rag.utils.mcp_tool_call_conn import MCPToolCallSession, close_multiple_mcp_toolcall_sessions
@ -352,7 +354,7 @@ def get_parser_config(chunk_method, parser_config):
if not chunk_method:
chunk_method = "naive"
# Define default configurations for each chunk method
# Define default configurations for each chunking method
key_mapping = {
"naive": {"chunk_token_num": 512, "delimiter": r"\n", "html4excel": False, "layout_recognize": "DeepDOC", "raptor": {"use_raptor": False}, "graphrag": {"use_graphrag": False}},
"qa": {"raptor": {"use_raptor": False}, "graphrag": {"use_graphrag": False}},
@ -402,8 +404,22 @@ def get_data_openai(
finish_reason=None,
object="chat.completion",
param=None,
stream=False
):
total_tokens = prompt_tokens + completion_tokens
if stream:
return {
"id": f"{id}",
"object": "chat.completion.chunk",
"model": model,
"choices": [{
"delta": {"content": content},
"finish_reason": finish_reason,
"index": 0,
}],
}
return {
"id": f"{id}",
"object": object,
@ -414,9 +430,21 @@ def get_data_openai(
"prompt_tokens": prompt_tokens,
"completion_tokens": completion_tokens,
"total_tokens": total_tokens,
"completion_tokens_details": {"reasoning_tokens": 0, "accepted_prediction_tokens": 0, "rejected_prediction_tokens": 0},
"completion_tokens_details": {
"reasoning_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0,
},
},
"choices": [{"message": {"role": "assistant", "content": content}, "logprobs": None, "finish_reason": finish_reason, "index": 0}],
"choices": [{
"message": {
"role": "assistant",
"content": content
},
"logprobs": None,
"finish_reason": finish_reason,
"index": 0,
}],
}
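With the new stream flag, the same helper emits either a chat.completion.chunk delta or a full chat.completion body. A hypothetical call sketching both shapes; parameters not visible in this hunk (e.g. prompt_tokens, completion_tokens) are assumed to default to 0:
# Hypothetical usage; names follow the hunk above, token counts assumed to default to 0.
chunk = get_data_openai(id="run-1", model="gpt-4.1", content="Hel", finish_reason=None, stream=True)
assert chunk["object"] == "chat.completion.chunk"
assert chunk["choices"][0]["delta"]["content"] == "Hel"

full = get_data_openai(id="run-1", model="gpt-4.1", content="Hello", finish_reason="stop")
assert full["object"] == "chat.completion"
assert full["choices"][0]["message"]["content"] == "Hello"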
@ -640,7 +668,10 @@ def timeout(seconds: float | int = None, attempts: int = 2, *, exception: Option
for a in range(attempts):
try:
result = result_queue.get(timeout=seconds)
if os.environ.get("ENABLE_TIMEOUT_ASSERTION"):
result = result_queue.get(timeout=seconds)
else:
result = result_queue.get()
if isinstance(result, Exception):
raise result
return result
@ -655,7 +686,10 @@ def timeout(seconds: float | int = None, attempts: int = 2, *, exception: Option
for a in range(attempts):
try:
with trio.fail_after(seconds):
if os.environ.get("ENABLE_TIMEOUT_ASSERTION"):
with trio.fail_after(seconds):
return await func(*args, **kwargs)
else:
return await func(*args, **kwargs)
except trio.TooSlowError:
if a < attempts - 1:
@ -687,7 +721,13 @@ def timeout(seconds: float | int = None, attempts: int = 2, *, exception: Option
async def is_strong_enough(chat_model, embedding_model):
@timeout(30, 2)
count = settings.STRONG_TEST_COUNT
if not chat_model or not embedding_model:
return
if isinstance(count, int) and count <= 0:
return
@timeout(60, 2)
async def _is_strong_enough():
nonlocal chat_model, embedding_model
if embedding_model:
@ -701,5 +741,5 @@ async def is_strong_enough(chat_model, embedding_model):
# Pressure test for GraphRAG task
async with trio.open_nursery() as nursery:
for _ in range(32):
for _ in range(count):
nursery.start_soon(_is_strong_enough)
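Two behavior changes land here: the timeout decorator only enforces deadlines when ENABLE_TIMEOUT_ASSERTION is set, and the GraphRAG pressure test fans out STRONG_TEST_COUNT tasks instead of a hard-coded 32. A minimal sketch of the env-gated wait on the sync path:
import os
import queue
import threading

def get_result(result_queue: "queue.Queue", seconds: float):
    # Only enforce the deadline when the assertion flag is set, as in this diff.
    if os.environ.get("ENABLE_TIMEOUT_ASSERTION"):
        return result_queue.get(timeout=seconds)
    return result_queue.get()

q = queue.Queue()
threading.Timer(0.1, q.put, args=("done",)).start()
print(get_result(q, seconds=5))  # "done"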

View File

@ -1 +1,3 @@
import base64
test_image_base64 = "iVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAIAAAD/gAIDAAAA6ElEQVR4nO3QwQ3AIBDAsIP9d25XIC+EZE8QZc18w5l9O+AlZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBWYFZgVmBT+IYAHHLHkdEgAAAABJRU5ErkJggg=="
test_image = base64.b64decode(test_image_base64)

View File

@ -21,6 +21,9 @@ import re
import socket
from urllib.parse import urlparse
from api.apps import smtp_mail_server
from flask_mail import Message
from flask import render_template_string
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.options import Options
@ -31,6 +34,7 @@ from selenium.webdriver.support.ui import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager
CONTENT_TYPE_MAP = {
# Office
"docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
@ -172,3 +176,26 @@ def get_float(req: dict, key: str, default: float | int = 10.0) -> float:
return parsed if parsed > 0 else default
except (TypeError, ValueError):
return default
INVITE_EMAIL_TMPL = """
<p>Hi {{email}},</p>
<p>{{inviter}} has invited you to join their team (ID: {{tenant_id}}).</p>
<p>Click the link below to complete your registration:<br>
<a href="{{invite_url}}">{{invite_url}}</a></p>
<p>If you did not request this, please ignore this email.</p>
"""
def send_invite_email(to_email, invite_url, tenant_id, inviter):
from api.apps import app
with app.app_context():
msg = Message(subject="RAGFlow Invitation",
recipients=[to_email])
msg.html = render_template_string(
INVITE_EMAIL_TMPL,
email=to_email,
invite_url=invite_url,
tenant_id=tenant_id,
inviter=inviter,
)
smtp_mail_server.send(msg)
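A hypothetical call, assuming settings.SMTP_CONF was populated and smtp_mail_server.init_app(app) ran at startup; all values are placeholders:
# Placeholder values; requires a configured SMTP server.
send_invite_email(
    to_email="new.user@example.com",
    invite_url="https://ragflow.example.com/register?invite=abc123",
    tenant_id="tenant-42",
    inviter="alice@example.com",
)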

View File

@ -6,6 +6,34 @@
"tags": "LLM,TEXT EMBEDDING,TTS,TEXT RE-RANK,SPEECH2TEXT,MODERATION",
"status": "1",
"llm": [
{
"llm_name": "gpt-5",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-mini",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-nano",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gpt-5-chat-latest",
"tags": "LLM,CHAT,400k,IMAGE2TEXT",
"max_tokens": 400000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "gpt-4.1",
"tags": "LLM,CHAT,1M,IMAGE2TEXT",
@ -477,6 +505,24 @@
"tags": "RE-RANK,4k",
"max_tokens": 4000,
"model_type": "rerank"
},
{
"llm_name": "qwen-audio-asr",
"tags": "SPEECH2TEXT,8k",
"max_tokens": 8000,
"model_type": "speech2text"
},
{
"llm_name": "qwen-audio-asr-latest",
"tags": "SPEECH2TEXT,8k",
"max_tokens": 8000,
"model_type": "speech2text"
},
{
"llm_name": "qwen-audio-asr-1204",
"tags": "SPEECH2TEXT,8k",
"max_tokens": 8000,
"model_type": "speech2text"
}
]
},
@ -486,23 +532,65 @@
"tags": "LLM,TEXT EMBEDDING,SPEECH2TEXT,MODERATION",
"status": "1",
"llm": [
{
"llm_name": "glm-4.5",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4.5-x",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4.5-air",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4.5-airx",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4.5-flash",
"tags": "LLM,CHAT,128k",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4.5v",
"tags": "LLM,IMAGE2TEXT,64,",
"max_tokens": 64000,
"model_type": "image2text",
"is_tools": false
},
{
"llm_name": "glm-4-plus",
"tags": "LLM,CHAT,",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4-0520",
"tags": "LLM,CHAT,",
"tags": "LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "glm-4",
"tags": "LLM,CHAT,",
"tags":"LLM,CHAT,128K",
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
@ -1118,60 +1206,35 @@
"llm_name": "gemini-2.5-flash",
"tags": "LLM,CHAT,1024K,IMAGE2TEXT",
"max_tokens": 1048576,
"model_type": "image2text",
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-pro",
"tags": "LLM,CHAT,IMAGE2TEXT,1024K",
"max_tokens": 1048576,
"model_type": "image2text",
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.5-flash-preview-05-20",
"llm_name": "gemini-2.5-flash-lite",
"tags": "LLM,CHAT,1024K,IMAGE2TEXT",
"max_tokens": 1048576,
"model_type": "image2text",
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-2.0-flash-001",
"tags": "LLM,CHAT,1024K",
"max_tokens": 1048576,
"model_type": "image2text",
"is_tools": true
},
{
"llm_name": "gemini-2.0-flash-thinking-exp-01-21",
"llm_name": "gemini-2.0-flash",
"tags": "LLM,CHAT,1024K",
"max_tokens": 1048576,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "gemini-1.5-flash",
"tags": "LLM,IMAGE2TEXT,1024K",
"llm_name": "gemini-2.0-flash-lite",
"tags": "LLM,CHAT,1024K",
"max_tokens": 1048576,
"model_type": "image2text"
},
{
"llm_name": "gemini-2.5-pro-preview-05-06",
"tags": "LLM,IMAGE2TEXT,1024K",
"max_tokens": 1048576,
"model_type": "image2text"
},
{
"llm_name": "gemini-1.5-pro",
"tags": "LLM,IMAGE2TEXT,2048K",
"max_tokens": 2097152,
"model_type": "image2text"
},
{
"llm_name": "gemini-1.5-flash-8b",
"tags": "LLM,IMAGE2TEXT,1024K",
"max_tokens": 1048576,
"model_type": "image2text",
"model_type": "chat",
"is_tools": true
},
{
@ -2598,234 +2661,255 @@
"tags": "LLM,TEXT EMBEDDING,TEXT RE-RANK,IMAGE2TEXT",
"status": "1",
"llm": [
{
"llm_name": "Qwen3-Embedding-8B",
"tags": "TEXT EMBEDDING,TEXT RE-RANK,32k",
"max_tokens": 32000,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "Qwen3-Embedding-4B",
"tags": "TEXT EMBEDDING,TEXT RE-RANK,32k",
"max_tokens": 32000,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "Qwen3-Embedding-0.6B",
"tags": "TEXT EMBEDDING,TEXT RE-RANK,32k",
"max_tokens": 32000,
"model_type": "embedding",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen3-235B-A22B",
"tags": "LLM,CHAT,128k",
"max_tokens": 8192,
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-30B-A3B",
"tags": "LLM,CHAT,128k",
"max_tokens": 8192,
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-32B",
"tags": "LLM,CHAT,128k",
"max_tokens": 8192,
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-14B",
"tags": "LLM,CHAT,128k",
"max_tokens": 8192,
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen3-8B",
"tags": "LLM,CHAT,64k",
"max_tokens": 8192,
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/QVQ-72B-Preview",
"tags": "LLM,CHAT,IMAGE2TEXT,32k",
"max_tokens": 16384,
"max_tokens": 32000,
"model_type": "image2text",
"is_tools": false
},
{
"llm_name": "Pro/deepseek-ai/DeepSeek-R1",
"tags": "LLM,CHAT,64k",
"max_tokens": 16384,
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-R1",
"tags": "LLM,CHAT,64k",
"max_tokens": 16384,
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/deepseek-ai/DeepSeek-V3",
"tags": "LLM,CHAT,64k",
"max_tokens": 8192,
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-V3",
"tags": "LLM,CHAT,64k",
"max_tokens": 8192,
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/deepseek-ai/DeepSeek-V3-1226",
"tags": "LLM,CHAT,64k",
"max_tokens": 4096,
"max_tokens": 64000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
"tags": "LLM,CHAT,32k",
"max_tokens": 16384,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
"tags": "LLM,CHAT,32k",
"max_tokens": 16384,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
"tags": "LLM,CHAT,32k",
"max_tokens": 16384,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
"tags": "LLM,CHAT,32k",
"max_tokens": 16384,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
"tags": "LLM,CHAT,32k",
"max_tokens": 16384,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
"tags": "LLM,CHAT,32k",
"max_tokens": 16384,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "deepseek-ai/DeepSeek-V2.5",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/QwQ-32B",
"tags": "LLM,CHAT,32k",
"max_tokens": 32768,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-VL-72B-Instruct",
"tags": "LLM,CHAT,IMAGE2TEXT,128k",
"max_tokens": 4096,
"max_tokens": 128000,
"model_type": "image2text",
"is_tools": true
},
{
"llm_name": "Pro/Qwen/Qwen2.5-VL-7B-Instruct",
"tags": "LLM,CHAT,IMAGE2TEXT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "image2text",
"is_tools": true
},
{
"llm_name": "THUDM/GLM-Z1-32B-0414",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "THUDM/GLM-4-32B-0414",
"tags": "LLM,CHAT,32k",
"max_tokens": 8192,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "THUDM/GLM-Z1-9B-0414",
"tags": "LLM,CHAT,32k",
"max_tokens": 8192,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "THUDM/GLM-4-9B-0414",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "THUDM/chatglm3-6b",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Pro/THUDM/glm-4-9b-chat",
"tags": "LLM,CHAT,128k",
"max_tokens": 4096,
"max_tokens": 128000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "THUDM/GLM-Z1-Rumination-32B-0414",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "THUDM/glm-4-9b-chat",
"tags": "LLM,CHAT,128k",
"max_tokens": 4096,
"max_tokens": 128000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/QwQ-32B-Preview",
"tags": "LLM,CHAT,32k",
"max_tokens": 8192,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen2.5-Coder-32B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen2-VL-72B-Instruct",
"tags": "LLM,IMAGE2TEXT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "image2text",
"is_tools": false
},
{
"llm_name": "Qwen/Qwen2.5-72B-Instruct-128Kt",
"tags": "LLM,IMAGE2TEXT,128k",
"max_tokens": 4096,
"max_tokens": 128000,
"model_type": "image2text",
"is_tools": false
},
@ -2839,98 +2923,98 @@
{
"llm_name": "Qwen/Qwen2.5-72B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-32B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-14B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-7B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2.5-Coder-7B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "internlm/internlm2_5-20b-chat",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "internlm/internlm2_5-7b-chat",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2-7B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Qwen/Qwen2-1.5B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/Qwen/Qwen2.5-Coder-7B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Pro/Qwen/Qwen2-VL-7B-Instruct",
"tags": "LLM,CHAT,IMAGE2TEXT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "image2text",
"is_tools": false
},
{
"llm_name": "Pro/Qwen/Qwen2.5-7B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "Pro/Qwen/Qwen2-7B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": false
},
{
"llm_name": "Pro/Qwen/Qwen2-1.5B-Instruct",
"tags": "LLM,CHAT,32k",
"max_tokens": 4096,
"max_tokens": 32000,
"model_type": "chat",
"is_tools": false
},
@ -3267,45 +3351,52 @@
"status": "1",
"llm": [
{
"llm_name": "claude-opus-4-20250514",
"tags": "LLM,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "image2text",
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-20250514",
"tags": "LLM,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "image2text",
"is_tools": true
},
{
"llm_name": "claude-3-7-sonnet-20250219",
"tags": "LLM,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "image2text",
"is_tools": true
},
{
"llm_name": "claude-3-5-sonnet-20241022",
"tags": "LLM,IMAGE2TEXT,200k",
"llm_name": "claude-opus-4-1-20250805",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-opus-20240229",
"tags": "LLM,IMAGE2TEXT,200k",
"llm_name": "claude-opus-4-20250514",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-sonnet-4-20250514",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-7-sonnet-20250219",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-5-sonnet-20241022",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-5-haiku-20241022",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "chat",
"is_tools": true
},
{
"llm_name": "claude-3-haiku-20240307",
"tags": "LLM,IMAGE2TEXT,200k",
"tags": "LLM,CHAT,IMAGE2TEXT,200k",
"max_tokens": 204800,
"model_type": "image2text",
"model_type": "chat",
"is_tools": true
}
]

View File

@ -64,9 +64,21 @@ redis:
# config:
# oss_table: 'opendal_storage'
# user_default_llm:
# factory: 'Tongyi-Qianwen'
# api_key: 'sk-xxxxxxxxxxxxx'
# base_url: ''
# factory: 'BAAI'
# api_key: 'backup'
# base_url: 'backup_base_url'
# default_models:
# chat_model:
# name: 'qwen2.5-7b-instruct'
# factory: 'xxxx'
# api_key: 'xxxx'
# base_url: 'https://api.xx.com'
# embedding_model:
# name: 'bge-m3'
# rerank_model: 'bge-reranker-v2'
# asr_model:
# model: 'whisper-large-v3' # alias of name
# image2text_model: ''
# oauth:
# oauth2:
# display_name: "OAuth2"
@ -101,3 +113,14 @@ redis:
# switch: false
# component: false
# dataset: false
# smtp:
# mail_server: ""
# mail_port: 465
# mail_use_ssl: true
# mail_use_tls: false
# mail_username: ""
# mail_password: ""
# mail_default_sender:
# - "RAGFlow" # display name
# - "" # sender email address
# mail_frontend_url: "https://your-frontend.example.com"

View File

@ -14,13 +14,15 @@
# limitations under the License.
#
from .pdf_parser import RAGFlowPdfParser as PdfParser, PlainParser
from .docx_parser import RAGFlowDocxParser as DocxParser
from .excel_parser import RAGFlowExcelParser as ExcelParser
from .ppt_parser import RAGFlowPptParser as PptParser
from .html_parser import RAGFlowHtmlParser as HtmlParser
from .json_parser import RAGFlowJsonParser as JsonParser
from .markdown_parser import MarkdownElementExtractor
from .markdown_parser import RAGFlowMarkdownParser as MarkdownParser
from .pdf_parser import PlainParser
from .pdf_parser import RAGFlowPdfParser as PdfParser
from .ppt_parser import RAGFlowPptParser as PptParser
from .txt_parser import RAGFlowTxtParser as TxtParser
__all__ = [
@ -33,4 +35,6 @@ __all__ = [
"JsonParser",
"MarkdownParser",
"TxtParser",
]
"MarkdownElementExtractor",
]

View File

@ -12,6 +12,7 @@
#
import logging
import re
import sys
from io import BytesIO
@ -20,6 +21,8 @@ from openpyxl import Workbook, load_workbook
from rag.nlp import find_codec
# copied from `/openpyxl/cell/cell.py`
ILLEGAL_CHARACTERS_RE = re.compile(r'[\000-\010]|[\013-\014]|[\016-\037]')
class RAGFlowExcelParser:
@ -50,13 +53,29 @@ class RAGFlowExcelParser:
logging.info(f"openpyxl load error: {e}, try pandas instead")
try:
file_like_object.seek(0)
df = pd.read_excel(file_like_object)
return RAGFlowExcelParser._dataframe_to_workbook(df)
try:
df = pd.read_excel(file_like_object)
return RAGFlowExcelParser._dataframe_to_workbook(df)
except Exception as ex:
logging.info(f"pandas with default engine load error: {ex}, try calamine instead")
file_like_object.seek(0)
df = pd.read_excel(file_like_object, engine='calamine')
return RAGFlowExcelParser._dataframe_to_workbook(df)
except Exception as e_pandas:
raise Exception(f"pandas.read_excel error: {e_pandas}, original openpyxl error: {e}")
@staticmethod
def _clean_dataframe(df: pd.DataFrame):
def clean_string(s):
if isinstance(s, str):
return ILLEGAL_CHARACTERS_RE.sub(" ", s)
return s
return df.apply(lambda col: col.map(clean_string))
@staticmethod
def _dataframe_to_workbook(df):
df = RAGFlowExcelParser._clean_dataframe(df)
wb = Workbook()
ws = wb.active
ws.title = "Data"
@ -71,9 +90,17 @@ class RAGFlowExcelParser:
return wb
def html(self, fnm, chunk_rows=256):
from html import escape
file_like_object = BytesIO(fnm) if not isinstance(fnm, str) else fnm
wb = RAGFlowExcelParser._load_excel_to_workbook(file_like_object)
tb_chunks = []
def _fmt(v):
if v is None:
return ""
return str(v).strip()
for sheetname in wb.sheetnames:
ws = wb[sheetname]
rows = list(ws.rows)
@ -82,7 +109,7 @@ class RAGFlowExcelParser:
tb_rows_0 = "<tr>"
for t in list(rows[0]):
tb_rows_0 += f"<th>{t.value}</th>"
tb_rows_0 += f"<th>{escape(_fmt(t.value))}</th>"
tb_rows_0 += "</tr>"
for chunk_i in range((len(rows) - 1) // chunk_rows + 1):
@ -90,7 +117,7 @@ class RAGFlowExcelParser:
tb += f"<table><caption>{sheetname}</caption>"
tb += tb_rows_0
for r in list(
rows[1 + chunk_i * chunk_rows: 1 + (chunk_i + 1) * chunk_rows]
rows[1 + chunk_i * chunk_rows: min(1 + (chunk_i + 1) * chunk_rows, len(rows))]
):
tb += "<tr>"
for i, c in enumerate(r):

View File

@ -15,35 +15,200 @@
# limitations under the License.
#
from rag.nlp import find_codec
import readability
import html_text
from rag.nlp import find_codec, rag_tokenizer
import uuid
import chardet
from bs4 import BeautifulSoup, NavigableString, Tag, Comment
import html
def get_encoding(file):
with open(file,'rb') as f:
tmp = chardet.detect(f.read())
return tmp['encoding']
BLOCK_TAGS = [
"h1", "h2", "h3", "h4", "h5", "h6",
"p", "div", "article", "section", "aside",
"ul", "ol", "li",
"table", "pre", "code", "blockquote",
"figure", "figcaption"
]
TITLE_TAGS = {"h1": "#", "h2": "##", "h3": "###", "h4": "#####", "h5": "#####", "h6": "######"}
class RAGFlowHtmlParser:
def __call__(self, fnm, binary=None):
def __call__(self, fnm, binary=None, chunk_token_num=512):
if binary:
encoding = find_codec(binary)
txt = binary.decode(encoding, errors="ignore")
else:
with open(fnm, "r",encoding=get_encoding(fnm)) as f:
txt = f.read()
return self.parser_txt(txt)
return self.parser_txt(txt, chunk_token_num)
@classmethod
def parser_txt(cls, txt):
def parser_txt(cls, txt, chunk_token_num=512):
if not isinstance(txt, str):
raise TypeError("txt type should be string!")
html_doc = readability.Document(txt)
title = html_doc.title()
content = html_text.extract_text(html_doc.summary(html_partial=True))
txt = f"{title}\n{content}"
sections = txt.split("\n")
temp_sections = []
soup = BeautifulSoup(txt, "html5lib")
# delete <style> and <script> tags
for style_tag in soup.find_all(["style", "script"]):
style_tag.decompose()
# delete <script> tags nested inside <div> elements
for div_tag in soup.find_all("div"):
for script_tag in div_tag.find_all("script"):
script_tag.decompose()
# delete inline style
for tag in soup.find_all(True):
if 'style' in tag.attrs:
del tag.attrs['style']
# delete HTML comment
for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
comment.extract()
cls.read_text_recursively(soup.body, temp_sections, chunk_token_num=chunk_token_num)
block_txt_list, table_list = cls.merge_block_text(temp_sections)
sections = cls.chunk_block(block_txt_list, chunk_token_num=chunk_token_num)
for table in table_list:
sections.append(table.get("content", ""))
return sections
@classmethod
def split_table(cls, html_table, chunk_token_num=512):
soup = BeautifulSoup(html_table, "html.parser")
rows = soup.find_all("tr")
tables = []
current_table = []
current_count = 0
table_str_list = []
for row in rows:
tks_str = rag_tokenizer.tokenize(str(row))
token_count = len(tks_str.split(" ")) if tks_str else 0
if current_count + token_count > chunk_token_num:
tables.append(current_table)
current_table = []
current_count = 0
current_table.append(row)
current_count += token_count
if current_table:
tables.append(current_table)
for table_rows in tables:
new_table = soup.new_tag("table")
for row in table_rows:
new_table.append(row)
table_str_list.append(str(new_table))
return table_str_list
@classmethod
def read_text_recursively(cls, element, parser_result, chunk_token_num=512, parent_name=None, block_id=None):
if isinstance(element, NavigableString):
content = element.strip()
def is_valid_html(content):
try:
soup = BeautifulSoup(content, "html.parser")
return bool(soup.find())
except Exception:
return False
return_info = []
if content:
if is_valid_html(content):
soup = BeautifulSoup(content, "html.parser")
child_info = cls.read_text_recursively(soup, parser_result, chunk_token_num, element.name, block_id)
parser_result.extend(child_info)
else:
info = {"content": element.strip(), "tag_name": "inner_text", "metadata": {"block_id": block_id}}
if parent_name:
info["tag_name"] = parent_name
return_info.append(info)
return return_info
elif isinstance(element, Tag):
if str.lower(element.name) == "table":
table_info_list = []
table_id = str(uuid.uuid1())
table_list = [html.unescape(str(element))]
for t in table_list:
table_info_list.append({"content": t, "tag_name": "table",
"metadata": {"table_id": table_id, "index": table_list.index(t)}})
return table_info_list
else:
block_id = None
if str.lower(element.name) in BLOCK_TAGS:
block_id = str(uuid.uuid1())
for child in element.children:
child_info = cls.read_text_recursively(child, parser_result, chunk_token_num, element.name,
block_id)
parser_result.extend(child_info)
return []
@classmethod
def merge_block_text(cls, parser_result):
block_content = []
current_content = ""
table_info_list = []
last_block_id = None
for item in parser_result:
content = item.get("content")
tag_name = item.get("tag_name")
title_flag = tag_name in TITLE_TAGS
block_id = item.get("metadata", {}).get("block_id")
if block_id:
if title_flag:
content = f"{TITLE_TAGS[tag_name]} {content}"
if last_block_id != block_id:
if last_block_id is not None:
block_content.append(current_content)
current_content = content
last_block_id = block_id
else:
current_content += (" " if current_content else "") + content
else:
if tag_name == "table":
table_info_list.append(item)
else:
current_content += (" " if current_content else "" + content)
if current_content:
block_content.append(current_content)
return block_content, table_info_list
@classmethod
def chunk_block(cls, block_txt_list, chunk_token_num=512):
chunks = []
current_block = ""
current_token_count = 0
for block in block_txt_list:
tks_str = rag_tokenizer.tokenize(block)
block_token_count = len(tks_str.split(" ")) if tks_str else 0
if block_token_count > chunk_token_num:
if current_block:
chunks.append(current_block)
start = 0
tokens = tks_str.split(" ")
while start < len(tokens):
end = start + chunk_token_num
split_tokens = tokens[start:end]
chunks.append(" ".join(split_tokens))
start = end
current_block = ""
current_token_count = 0
else:
if current_token_count + block_token_count <= chunk_token_num:
current_block += ("\n" if current_block else "") + block
current_token_count += block_token_count
else:
chunks.append(current_block)
current_block = block
current_token_count = block_token_count
if current_block:
chunks.append(current_block)
return chunks
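if __name__ == "__main__":
    # Usage sketch (assumptions: this file is run directly and a local
    # page.html exists). Text is returned as chunks of roughly 512 tokens,
    # followed by any tables as separate HTML chunks.
    sections = RAGFlowHtmlParser()("page.html", chunk_token_num=512)
    for sec in sections:
        print(sec[:80])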

View File

@ -17,8 +17,10 @@
import re
import mistune
from markdown import markdown
class RAGFlowMarkdownParser:
def __init__(self, chunk_token_num=128):
self.chunk_token_num = int(chunk_token_num)
@ -35,40 +37,44 @@ class RAGFlowMarkdownParser:
table_list.append(raw_table)
if separate_tables:
# Skip this match (i.e., remove it)
new_text += working_text[last_end:match.start()] + "\n\n"
new_text += working_text[last_end : match.start()] + "\n\n"
else:
# Replace with rendered HTML
html_table = markdown(raw_table, extensions=['markdown.extensions.tables']) if render else raw_table
new_text += working_text[last_end:match.start()] + html_table + "\n\n"
html_table = markdown(raw_table, extensions=["markdown.extensions.tables"]) if render else raw_table
new_text += working_text[last_end : match.start()] + html_table + "\n\n"
last_end = match.end()
new_text += working_text[last_end:]
return new_text
if "|" in markdown_text: # for optimize performance
if "|" in markdown_text: # for optimize performance
# Standard Markdown table
border_table_pattern = re.compile(
r'''
r"""
(?:\n|^)
(?:\|.*?\|.*?\|.*?\n)
(?:\|(?:\s*[:-]+[-| :]*\s*)\|.*?\n)
(?:\|.*?\|.*?\|.*?\n)+
''', re.VERBOSE)
""",
re.VERBOSE,
)
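# e.g. this pattern matches a standard bordered table such as:
# | col A | col B |
# | ----- | ----- |
# | 1     | 2     |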
working_text = replace_tables_with_rendered_html(border_table_pattern, tables)
# Borderless Markdown table
no_border_table_pattern = re.compile(
r'''
r"""
(?:\n|^)
(?:\S.*?\|.*?\n)
(?:(?:\s*[:-]+[-| :]*\s*).*?\n)
(?:\S.*?\|.*?\n)+
''', re.VERBOSE)
""",
re.VERBOSE,
)
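# e.g. this pattern matches a borderless table such as:
# col A | col B
# ----- | -----
# 1     | 2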
working_text = replace_tables_with_rendered_html(no_border_table_pattern, tables)
if "<table>" in working_text.lower(): # for optimize performance
#HTML table extraction - handle possible html/body wrapper tags
if "<table>" in working_text.lower(): # for optimize performance
# HTML table extraction - handle possible html/body wrapper tags
html_table_pattern = re.compile(
r'''
r"""
(?:\n|^)
\s*
(?:
@ -83,9 +89,10 @@ class RAGFlowMarkdownParser:
)
\s*
(?=\n|$)
''',
re.VERBOSE | re.DOTALL | re.IGNORECASE
""",
re.VERBOSE | re.DOTALL | re.IGNORECASE,
)
def replace_html_tables():
nonlocal working_text
new_text = ""
@ -94,9 +101,9 @@ class RAGFlowMarkdownParser:
raw_table = match.group()
tables.append(raw_table)
if separate_tables:
new_text += working_text[last_end:match.start()] + "\n\n"
new_text += working_text[last_end : match.start()] + "\n\n"
else:
new_text += working_text[last_end:match.start()] + raw_table + "\n\n"
new_text += working_text[last_end : match.start()] + raw_table + "\n\n"
last_end = match.end()
new_text += working_text[last_end:]
working_text = new_text
@ -104,3 +111,163 @@ class RAGFlowMarkdownParser:
replace_html_tables()
return working_text, tables
class MarkdownElementExtractor:
def __init__(self, markdown_content):
self.markdown_content = markdown_content
self.lines = markdown_content.split("\n")
self.ast_parser = mistune.create_markdown(renderer="ast")
self.ast_nodes = self.ast_parser(markdown_content)
def extract_elements(self):
"""Extract individual elements (headers, code blocks, lists, etc.)"""
sections = []
i = 0
while i < len(self.lines):
line = self.lines[i]
if re.match(r"^#{1,6}\s+.*$", line):
# header
element = self._extract_header(i)
sections.append(element["content"])
i = element["end_line"] + 1
elif line.strip().startswith("```"):
# code block
element = self._extract_code_block(i)
sections.append(element["content"])
i = element["end_line"] + 1
elif re.match(r"^\s*[-*+]\s+.*$", line) or re.match(r"^\s*\d+\.\s+.*$", line):
# list block
element = self._extract_list_block(i)
sections.append(element["content"])
i = element["end_line"] + 1
elif line.strip().startswith(">"):
# blockquote
element = self._extract_blockquote(i)
sections.append(element["content"])
i = element["end_line"] + 1
elif line.strip():
# text block (paragraphs and inline elements until next block element)
element = self._extract_text_block(i)
sections.append(element["content"])
i = element["end_line"] + 1
else:
i += 1
sections = [section for section in sections if section.strip()]
return sections
def _extract_header(self, start_pos):
return {
"type": "header",
"content": self.lines[start_pos],
"start_line": start_pos,
"end_line": start_pos,
}
def _extract_code_block(self, start_pos):
end_pos = start_pos
content_lines = [self.lines[start_pos]]
# Find the end of the code block
for i in range(start_pos + 1, len(self.lines)):
content_lines.append(self.lines[i])
end_pos = i
if self.lines[i].strip().startswith("```"):
break
return {
"type": "code_block",
"content": "\n".join(content_lines),
"start_line": start_pos,
"end_line": end_pos,
}
def _extract_list_block(self, start_pos):
end_pos = start_pos
content_lines = []
i = start_pos
while i < len(self.lines):
line = self.lines[i]
# check if this line is a list item or continuation of a list
if (
re.match(r"^\s*[-*+]\s+.*$", line)
or re.match(r"^\s*\d+\.\s+.*$", line)
or (i > start_pos and not line.strip())
or (i > start_pos and re.match(r"^\s{2,}[-*+]\s+.*$", line))
or (i > start_pos and re.match(r"^\s{2,}\d+\.\s+.*$", line))
or (i > start_pos and re.match(r"^\s+\w+.*$", line))
):
content_lines.append(line)
end_pos = i
i += 1
else:
break
return {
"type": "list_block",
"content": "\n".join(content_lines),
"start_line": start_pos,
"end_line": end_pos,
}
def _extract_blockquote(self, start_pos):
end_pos = start_pos
content_lines = []
i = start_pos
while i < len(self.lines):
line = self.lines[i]
if line.strip().startswith(">") or (i > start_pos and not line.strip()):
content_lines.append(line)
end_pos = i
i += 1
else:
break
return {
"type": "blockquote",
"content": "\n".join(content_lines),
"start_line": start_pos,
"end_line": end_pos,
}
def _extract_text_block(self, start_pos):
"""Extract a text block (paragraphs, inline elements) until next block element"""
end_pos = start_pos
content_lines = [self.lines[start_pos]]
i = start_pos + 1
while i < len(self.lines):
line = self.lines[i]
# stop if we encounter a block element
if re.match(r"^#{1,6}\s+.*$", line) or line.strip().startswith("```") or re.match(r"^\s*[-*+]\s+.*$", line) or re.match(r"^\s*\d+\.\s+.*$", line) or line.strip().startswith(">"):
break
elif not line.strip():
# check if the next line is a block element
if i + 1 < len(self.lines) and (
re.match(r"^#{1,6}\s+.*$", self.lines[i + 1])
or self.lines[i + 1].strip().startswith("```")
or re.match(r"^\s*[-*+]\s+.*$", self.lines[i + 1])
or re.match(r"^\s*\d+\.\s+.*$", self.lines[i + 1])
or self.lines[i + 1].strip().startswith(">")
):
break
else:
content_lines.append(line)
end_pos = i
i += 1
else:
content_lines.append(line)
end_pos = i
i += 1
return {
"type": "text_block",
"content": "\n".join(content_lines),
"start_line": start_pos,
"end_line": end_pos,
}
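if __name__ == "__main__":
    # Usage sketch with a small hypothetical document: each structural
    # element (header, paragraph, list block, blockquote) comes back as its
    # own section string, in document order.
    md = "# Title\n\nIntro paragraph.\n\n- item one\n- item two\n\n> a quote\n"
    for section in MarkdownElementExtractor(md).extract_elements():
        print(repr(section))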

View File

@ -87,7 +87,7 @@ class RAGFlowPptParser:
break
texts = []
for shape in sorted(
slide.shapes, key=lambda x: ((x.top if x.top is not None else 0) // 10, x.left)):
slide.shapes, key=lambda x: ((x.top if x.top is not None else 0) // 10, x.left if x.left is not None else 0)):
try:
txt = self.__extract(shape)
if txt:
@ -96,4 +96,4 @@ class RAGFlowPptParser:
logging.exception(e)
txts.append("\n".join(texts))
return txts
return txts

View File

@ -62,6 +62,8 @@ MYSQL_DBNAME=rag_flow
# The port used to expose the MySQL service to the host machine,
# allowing EXTERNAL access to the MySQL database running inside the Docker container.
MYSQL_PORT=5455
# The maximum size of communication packets sent to the MySQL server
MYSQL_MAX_PACKET=1073741824
# The hostname where the MinIO service is exposed
MINIO_HOST=minio
@ -91,13 +93,13 @@ REDIS_PASSWORD=infini_rag_flow
SVR_HTTP_PORT=9380
# The RAGFlow Docker image to download.
# Defaults to the v0.20.0-slim edition, which is the RAGFlow Docker image without embedding models.
RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.0-slim
# Defaults to the v0.20.4-slim edition, which is the RAGFlow Docker image without embedding models.
RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4-slim
#
# To download the RAGFlow Docker image with embedding models, uncomment the following line instead:
# RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.0
# RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4
#
# The Docker image of the v0.20.0 edition includes built-in embedding models:
# The Docker image of the v0.20.4 edition includes built-in embedding models:
# - BAAI/bge-large-zh-v1.5
# - maidalun1020/bce-embedding-base_v1
#

View File

@ -79,8 +79,8 @@ The [.env](./.env) file contains important environment variables for Docker.
- `RAGFLOW-IMAGE`
The Docker image edition. Available editions:
- `infiniflow/ragflow:v0.20.0-slim` (default): The RAGFlow Docker image without embedding models.
- `infiniflow/ragflow:v0.20.0`: The RAGFlow Docker image with embedding models including:
- `infiniflow/ragflow:v0.20.4-slim` (default): The RAGFlow Docker image without embedding models.
- `infiniflow/ragflow:v0.20.4`: The RAGFlow Docker image with embedding models including:
- Built-in embedding models:
- `BAAI/bge-large-zh-v1.5`
- `maidalun1020/bce-embedding-base_v1`

298
docker/migration.sh Normal file
View File

@ -0,0 +1,298 @@
#!/bin/bash
# RAGFlow Data Migration Script
# Usage: ./migration.sh [backup|restore] [backup_folder]
#
# This script helps you backup and restore RAGFlow Docker volumes
# including MySQL, MinIO, Redis, and Elasticsearch data.
set -e # Exit on any error
# Errors are additionally handled manually below for a better debugging experience
# Default values
DEFAULT_BACKUP_FOLDER="backup"
VOLUMES=("docker_mysql_data" "docker_minio_data" "docker_redis_data" "docker_esdata01")
BACKUP_FILES=("mysql_backup.tar.gz" "minio_backup.tar.gz" "redis_backup.tar.gz" "es_backup.tar.gz")
# Function to display help information
show_help() {
echo "RAGFlow Data Migration Tool"
echo ""
echo "USAGE:"
echo " $0 <operation> [backup_folder]"
echo ""
echo "OPERATIONS:"
echo " backup - Create backup of all RAGFlow data volumes"
echo " restore - Restore RAGFlow data volumes from backup"
echo " help - Show this help message"
echo ""
echo "PARAMETERS:"
echo " backup_folder - Name of backup folder (default: '$DEFAULT_BACKUP_FOLDER')"
echo ""
echo "EXAMPLES:"
echo " $0 backup # Backup to './backup' folder"
echo " $0 backup my_backup # Backup to './my_backup' folder"
echo " $0 restore # Restore from './backup' folder"
echo " $0 restore my_backup # Restore from './my_backup' folder"
echo ""
echo "DOCKER VOLUMES:"
echo " - docker_mysql_data (MySQL database)"
echo " - docker_minio_data (MinIO object storage)"
echo " - docker_redis_data (Redis cache)"
echo " - docker_esdata01 (Elasticsearch indices)"
}
# Function to check if Docker is running
check_docker() {
if ! docker info >/dev/null 2>&1; then
echo "❌ Error: Docker is not running or not accessible"
echo "Please start Docker and try again"
exit 1
fi
}
# Function to check if volume exists
volume_exists() {
local volume_name=$1
docker volume inspect "$volume_name" >/dev/null 2>&1
}
# Function to check if any containers are using the target volumes
check_containers_using_volumes() {
echo "🔍 Checking for running containers that might be using target volumes..."
# Get all running containers
local running_containers=$(docker ps --format "{{.Names}}")
if [ -z "$running_containers" ]; then
echo "✅ No running containers found"
return 0
fi
# Check each running container for volume usage
local containers_using_volumes=()
local volume_usage_details=()
for container in $running_containers; do
# Get container's mount information
local mounts=$(docker inspect "$container" --format '{{range .Mounts}}{{.Source}}{{"|"}}{{end}}' 2>/dev/null || echo "")
# Check if any of our target volumes are used by this container
for volume in "${VOLUMES[@]}"; do
if echo "$mounts" | grep -q "$volume"; then
containers_using_volumes+=("$container")
volume_usage_details+=("$container -> $volume")
break
fi
done
done
# If any containers are using our volumes, show error and exit
if [ ${#containers_using_volumes[@]} -gt 0 ]; then
echo ""
echo "❌ ERROR: Found running containers using target volumes!"
echo ""
echo "📋 Running containers status:"
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Image}}"
echo ""
echo "🔗 Volume usage details:"
for detail in "${volume_usage_details[@]}"; do
echo " - $detail"
done
echo ""
echo "🛑 SOLUTION: Stop the containers before performing backup/restore operations:"
echo " docker-compose -f docker/<your-docker-compose-file>.yml down"
echo ""
echo "💡 After backup/restore, you can restart with:"
echo " docker-compose -f docker/<your-docker-compose-file>.yml up -d"
echo ""
exit 1
fi
echo "✅ No containers are using target volumes, safe to proceed"
return 0
}
# Function to confirm user action
confirm_action() {
local message=$1
echo -n "$message (y/N): "
read -r response
case "$response" in
[yY]|[yY][eE][sS]) return 0 ;;
*) return 1 ;;
esac
}
# Function to perform backup
perform_backup() {
local backup_folder=$1
echo "🚀 Starting RAGFlow data backup..."
echo "📁 Backup folder: $backup_folder"
echo ""
# Check if any containers are using the volumes
check_containers_using_volumes
# Create backup folder if it doesn't exist
mkdir -p "$backup_folder"
# Backup each volume
for i in "${!VOLUMES[@]}"; do
local volume="${VOLUMES[$i]}"
local backup_file="${BACKUP_FILES[$i]}"
local step=$((i + 1))
echo "📦 Step $step/4: Backing up $volume..."
if volume_exists "$volume"; then
docker run --rm \
-v "$volume":/source \
-v "$(pwd)/$backup_folder":/backup \
alpine tar czf "/backup/$backup_file" -C /source .
echo "✅ Successfully backed up $volume to $backup_folder/$backup_file"
else
echo "⚠️ Warning: Volume $volume does not exist, skipping..."
fi
echo ""
done
echo "🎉 Backup completed successfully!"
echo "📍 Backup location: $(pwd)/$backup_folder"
# List backup files with sizes
echo ""
echo "📋 Backup files created:"
for backup_file in "${BACKUP_FILES[@]}"; do
if [ -f "$backup_folder/$backup_file" ]; then
local size=$(ls -lh "$backup_folder/$backup_file" | awk '{print $5}')
echo " - $backup_file ($size)"
fi
done
}
# Function to perform restore
perform_restore() {
local backup_folder=$1
echo "🔄 Starting RAGFlow data restore..."
echo "📁 Backup folder: $backup_folder"
echo ""
# Check if any containers are using the volumes
check_containers_using_volumes
# Check if backup folder exists
if [ ! -d "$backup_folder" ]; then
echo "❌ Error: Backup folder '$backup_folder' does not exist"
exit 1
fi
# Check if all backup files exist
local missing_files=()
for backup_file in "${BACKUP_FILES[@]}"; do
if [ ! -f "$backup_folder/$backup_file" ]; then
missing_files+=("$backup_file")
fi
done
if [ ${#missing_files[@]} -gt 0 ]; then
echo "❌ Error: Missing backup files:"
for file in "${missing_files[@]}"; do
echo " - $file"
done
echo "Please ensure all backup files are present in '$backup_folder'"
exit 1
fi
# Check for existing volumes and warn user
local existing_volumes=()
for volume in "${VOLUMES[@]}"; do
if volume_exists "$volume"; then
existing_volumes+=("$volume")
fi
done
if [ ${#existing_volumes[@]} -gt 0 ]; then
echo "⚠️ WARNING: The following Docker volumes already exist:"
for volume in "${existing_volumes[@]}"; do
echo " - $volume"
done
echo ""
echo "🔴 IMPORTANT: Restoring will OVERWRITE existing data!"
echo "💡 Recommendation: Create a backup of your current data first:"
echo " $0 backup current_backup_$(date +%Y%m%d_%H%M%S)"
echo ""
if ! confirm_action "Do you want to continue with the restore operation?"; then
echo "❌ Restore operation cancelled by user"
exit 0
fi
fi
# Create volumes and restore data
for i in "${!VOLUMES[@]}"; do
local volume="${VOLUMES[$i]}"
local backup_file="${BACKUP_FILES[$i]}"
local step=$((i + 1))
echo "🔧 Step $step/4: Restoring $volume..."
# Create volume if it doesn't exist
if ! volume_exists "$volume"; then
echo " 📋 Creating Docker volume: $volume"
docker volume create "$volume"
else
echo " 📋 Using existing Docker volume: $volume"
fi
# Restore data
echo " 📥 Restoring data from $backup_file..."
docker run --rm \
-v "$volume":/target \
-v "$(pwd)/$backup_folder":/backup \
alpine tar xzf "/backup/$backup_file" -C /target
echo "✅ Successfully restored $volume"
echo ""
done
echo "🎉 Restore completed successfully!"
echo "💡 You can now start your RAGFlow services"
}
# Main script logic
main() {
# Check if Docker is available
check_docker
# Parse command line arguments
local operation=${1:-}
local backup_folder=${2:-$DEFAULT_BACKUP_FOLDER}
# Handle help or no arguments
if [ -z "$operation" ] || [ "$operation" = "help" ] || [ "$operation" = "-h" ] || [ "$operation" = "--help" ]; then
show_help
exit 0
fi
# Validate operation
case "$operation" in
backup)
perform_backup "$backup_folder"
;;
restore)
perform_restore "$backup_folder"
;;
*)
echo "❌ Error: Invalid operation '$operation'"
echo ""
show_help
exit 1
;;
esac
}
# Run main function with all arguments
main "$@"

View File

@ -6,3 +6,7 @@ proxy_set_header Connection "";
proxy_buffering off;
proxy_read_timeout 3600s;
proxy_send_timeout 3600s;
proxy_buffer_size 1024k;
proxy_buffers 16 1024k;
proxy_busy_buffers_size 2048k;
proxy_temp_file_write_size 2048k;

View File

@ -9,6 +9,7 @@ mysql:
port: 3306
max_connections: 900
stale_timeout: 300
max_allowed_packet: ${MYSQL_MAX_PACKET:-1073741824}
minio:
user: '${MINIO_USER:-rag_flow}'
password: '${MINIO_PASSWORD:-infini_rag_flow}'

View File

@ -99,8 +99,8 @@ RAGFlow utilizes MinIO as its object storage solution, leveraging its scalabilit
- `RAGFLOW-IMAGE`
The Docker image edition. Available editions:
- `infiniflow/ragflow:v0.20.0-slim` (default): The RAGFlow Docker image without embedding models.
- `infiniflow/ragflow:v0.20.0`: The RAGFlow Docker image with embedding models including:
- `infiniflow/ragflow:v0.20.4-slim` (default): The RAGFlow Docker image without embedding models.
- `infiniflow/ragflow:v0.20.4`: The RAGFlow Docker image with embedding models including:
- Built-in embedding models:
- `BAAI/bge-large-zh-v1.5`
- `maidalun1020/bce-embedding-base_v1`

View File

@ -11,7 +11,7 @@ An API key is required for the RAGFlow server to authenticate your HTTP/Python o
2. Click **API** to switch to the **API** page.
3. Obtain a RAGFlow API key:
![ragflow_api_key](https://github.com/user-attachments/assets/f461ed61-04c6-4faf-b3d8-6b5fa56be4e7)
![ragflow_api_key](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/ragflow_api_key.jpg)
:::tip NOTE
See the [RAGFlow HTTP API reference](../references/http_api_reference.md) or the [RAGFlow Python API reference](../references/python_api_reference.md) for a complete reference of RAGFlow's HTTP or Python APIs.

View File

@ -77,7 +77,7 @@ After building the infiniflow/ragflow:nightly-slim image, you are ready to launc
1. Edit Docker Compose Configuration
Open the `docker/.env` file. Find the `RAGFLOW_IMAGE` setting and change the image reference from `infiniflow/ragflow:v0.20.0-slim` to `infiniflow/ragflow:nightly-slim` to use the pre-built image.
Open the `docker/.env` file. Find the `RAGFLOW_IMAGE` setting and change the image reference from `infiniflow/ragflow:v0.20.4-slim` to `infiniflow/ragflow:nightly-slim` to use the pre-built image.
2. Launch the Service

View File

@ -30,17 +30,17 @@ The "garbage in garbage out" status quo remains unchanged despite the fact that
Each RAGFlow release is available in two editions:
- **Slim edition**: excludes built-in embedding models and is identified by a **-slim** suffix added to the version name. Example: `infiniflow/ragflow:v0.20.0-slim`
- **Full edition**: includes built-in embedding models and has no suffix added to the version name. Example: `infiniflow/ragflow:v0.20.0`
- **Slim edition**: excludes built-in embedding models and is identified by a **-slim** suffix added to the version name. Example: `infiniflow/ragflow:v0.20.4-slim`
- **Full edition**: includes built-in embedding models and has no suffix added to the version name. Example: `infiniflow/ragflow:v0.20.4`
---
### Which embedding models can be deployed locally?
RAGFlow offers two Docker image editions, `v0.20.0-slim` and `v0.20.0`:
RAGFlow offers two Docker image editions, `v0.20.4-slim` and `v0.20.4`:
- `infiniflow/ragflow:v0.20.0-slim` (default): The RAGFlow Docker image without embedding models.
- `infiniflow/ragflow:v0.20.0`: The RAGFlow Docker image with embedding models including:
- `infiniflow/ragflow:v0.20.4-slim` (default): The RAGFlow Docker image without embedding models.
- `infiniflow/ragflow:v0.20.4`: The RAGFlow Docker image with embedding models including:
- Built-in embedding models:
- `BAAI/bge-large-zh-v1.5`
- `maidalun1020/bce-embedding-base_v1`

View File

@ -9,7 +9,7 @@ The component equipped with reasoning, tool usage, and multi-agent collaboration
---
An **Agent** component fine-tunes the LLM and sets its prompt. From v0.20.0 onwards, an **Agent** component can work independently, with the following capabilities:
An **Agent** component fine-tunes the LLM and sets its prompt. From v0.20.4 onwards, an **Agent** component can work independently, with the following capabilities:
- Autonomous reasoning with reflection and adjustment based on environmental feedback.
- Use of tools or subagents to complete tasks.
@ -82,7 +82,7 @@ An integer specifying the number of previous dialogue rounds to input into the L
This feature is used for multi-turn dialogue *only*.
:::
### Max retrieves
### Max retries
Defines the maximum number of attempts the agent will make to retry a failed task or operation before stopping or reporting failure.
@ -92,7 +92,11 @@ The waiting period in seconds that the agent observes before retrying a failed t
### Max rounds
Defines the maximum number of reflection rounds of the selected chat model. Defaults to 5 rounds.
Defines the maximum number of reflection rounds of the selected chat model. Defaults to 1 round.
:::tip NOTE
Increasing this value will significantly extend your agent's response time.
:::
### Output

View File

@ -9,7 +9,7 @@ A component that retrieves information from specified datasets.
## Scenarios
A **Retrieval** component is essential in most RAG scenarios, where information is extracted from designated knowledge bases before being sent to the LLM for content generation. As of v0.20.0, a **Retrieval** component can operate either as a workflow component or as a tool of an **Agent**, enabling the Agent to control its invocation and search queries.
A **Retrieval** component is essential in most RAG scenarios, where information is extracted from designated knowledge bases before being sent to the LLM for content generation. As of v0.20.4, a **Retrieval** component can operate either as a workflow component or as a tool of an **Agent**, enabling the Agent to control its invocation and search queries.
## Configurations

View File

@ -63,7 +63,7 @@ docker build -t sandbox-executor-manager:latest ./executor_manager
3. Add the following entry to your /etc/hosts file to resolve the executor manager service:
```bash
127.0.0.1 sandbox-executor-manager
127.0.0.1 es01 infinity mysql minio redis sandbox-executor-manager
```
4. Start the RAGFlow service as usual.

View File

@ -48,7 +48,7 @@ You start an AI conversation by creating an assistant.
- If no target language is selected, the system will search only in the language of your query, which may cause relevant information in other languages to be missed.
- **Variable** refers to the variables (keys) to be used in the system prompt. `{knowledge}` is a reserved variable. Click **Add** to add more variables for the system prompt.
- If you are uncertain about the logic behind **Variable**, leave it *as-is*.
- As of v0.20.0, if you add custom variables here, the only way you can pass in their values is to call:
- As of v0.20.4, if you add custom variables here, the only way you can pass in their values is to call:
- HTTP method [Converse with chat assistant](../../references/http_api_reference.md#converse-with-chat-assistant), or
- Python method [Converse with chat assistant](../../references/python_api_reference.md#converse-with-chat-assistant).

View File

@ -128,7 +128,7 @@ See [Run retrieval test](./run_retrieval_test.md) for details.
## Search for knowledge base
As of RAGFlow v0.20.0, the search feature is still in a rudimentary form, supporting only knowledge base search by name.
As of RAGFlow v0.20.4, the search feature is still in a rudimentary form, supporting only knowledge base search by name.
![search knowledge base](https://github.com/infiniflow/ragflow/assets/93570324/836ae94c-2438-42be-879e-c7ad2a59693e)

View File

@ -87,4 +87,4 @@ RAGFlow's file management allows you to download an uploaded file:
![download_file](https://github.com/infiniflow/ragflow/assets/93570324/cf3b297f-7d9b-4522-bf5f-4f45743e4ed5)
> As of RAGFlow v0.20.0, bulk download is not supported, nor can you download an entire folder.
> As of RAGFlow v0.20.4, bulk download is not supported, nor can you download an entire folder.

View File

@ -0,0 +1,108 @@
# Data Migration Guide
A common scenario is processing large datasets on a powerful instance (e.g., with a GPU) and then migrating the entire RAGFlow service to a different production environment (e.g., a CPU-only server). This guide explains how to safely back up and restore your data using our provided migration script.
## Identifying Your Data
By default, RAGFlow uses Docker volumes to store all persistent data, including your database, uploaded files, and search indexes. You can see these volumes by running:
```bash
docker volume ls
```
The output will look similar to this:
```text
DRIVER VOLUME NAME
local docker_esdata01
local docker_minio_data
local docker_mysql_data
local docker_redis_data
```
These volumes contain all the data you need to migrate.
## Step 1: Stop RAGFlow Services
Before starting the migration, you must stop all running RAGFlow services on the **source machine**. Navigate to the project's root directory and run:
```bash
docker-compose -f docker/docker-compose.yml down
```
**Important:** Do **not** use the `-v` flag (e.g., `docker-compose down -v`), as this will delete all your data volumes. The migration script includes a check and will prevent you from running it if services are active.
## Step 2: Back Up Your Data
We provide a convenient script to package all your data volumes into a single backup folder.
For a quick reference of the script's commands and options, you can run:
```bash
bash docker/migration.sh help
```
To create a backup, run the following command from the project's root directory:
```bash
bash docker/migration.sh backup
```
This will create a `backup/` folder in your project root containing compressed archives of your data volumes.
You can also specify a custom name for your backup folder:
```bash
bash docker/migration.sh backup my_ragflow_backup
```
This will create a folder named `my_ragflow_backup/` instead.
## Step 3: Transfer the Backup Folder
Copy the entire backup folder (e.g., `backup/` or `my_ragflow_backup/`) from your source machine to the RAGFlow project directory on your **target machine**. You can use tools like `scp`, `rsync`, or a physical drive for the transfer.
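For example, with `rsync` over SSH (hostname and paths are illustrative):
```bash
rsync -avz backup/ user@target-host:/path/to/ragflow/backup/
```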
## Step 4: Restore Your Data
On the **target machine**, ensure that RAGFlow services are not running. Then, use the migration script to restore your data from the backup folder.
If your backup folder is named `backup/`, run:
```bash
bash docker/migration.sh restore
```
If you used a custom name, specify it in the command:
```bash
bash docker/migration.sh restore my_ragflow_backup
```
The script will automatically create the necessary Docker volumes and unpack the data.
**Note:** If the script detects that Docker volumes with the same names already exist on the target machine, it will warn you that restoring will overwrite the existing data and ask for confirmation before proceeding.
## Step 5: Start RAGFlow Services
Once the restore process is complete, you can start the RAGFlow services on your new machine:
```bash
docker-compose -f docker/docker-compose.yml up -d
```
**Note:** If you have previously brought up a service with docker-compose on the target machine, back up its data first (as described in this guide), then run:
```bash
# Back up first with `sh docker/migration.sh backup backup_dir_name` before running the lines below.
# !!! The -v flag on the next line deletes the original Docker volumes.
docker-compose -f docker/docker-compose.yml down -v
docker-compose -f docker/docker-compose.yml up -d
```
Your RAGFlow instance is now running with all the data from your original machine.

View File

@ -18,7 +18,7 @@ RAGFlow ships with a built-in [Langfuse](https://langfuse.com) integration so th
Langfuse stores traces, spans and prompt payloads in a purpose-built observability backend and offers filtering and visualisations on top.
:::info NOTE
• RAGFlow **≥ 0.20.0** (contains the Langfuse connector)
• RAGFlow **≥ 0.20.4** (contains the Langfuse connector)
• A Langfuse workspace (cloud or self-hosted) with a _Project Public Key_ and _Secret Key_
:::

View File

@ -66,10 +66,10 @@ To upgrade RAGFlow, you must upgrade **both** your code **and** your Docker imag
git clone https://github.com/infiniflow/ragflow.git
```
2. Switch to the latest, officially published release, e.g., `v0.20.0`:
2. Switch to the latest, officially published release, e.g., `v0.20.4`:
```bash
git checkout -f v0.20.0
git checkout -f v0.20.4
```
3. Update **ragflow/docker/.env**:
@ -83,14 +83,14 @@ To upgrade RAGFlow, you must upgrade **both** your code **and** your Docker imag
<TabItem value="slim">
```bash
RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.0-slim
RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4-slim
```
</TabItem>
<TabItem value="full">
```bash
RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.0
RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4
```
</TabItem>
@ -114,10 +114,10 @@ No, you do not need to. Upgrading RAGFlow in itself will *not* remove your uploa
1. From an environment with Internet access, pull the required Docker image.
2. Save the Docker image to a **.tar** file.
```bash
docker save -o ragflow.v0.20.0.tar infiniflow/ragflow:v0.20.0
docker save -o ragflow.v0.20.4.tar infiniflow/ragflow:v0.20.4
```
3. Copy the **.tar** file to the target server.
4. Load the **.tar** file into Docker:
```bash
docker load -i ragflow.v0.20.0.tar
docker load -i ragflow.v0.20.4.tar
```

View File

@ -44,7 +44,7 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
`vm.max_map_count`. This value sets the maximum number of memory map areas a process may have. Its default value is 65530. While most applications require fewer than a thousand maps, reducing this value can result in abnormal behaviors, and the system will throw out-of-memory errors when a process reaches the limitation.
RAGFlow v0.20.0 uses Elasticsearch or [Infinity](https://github.com/infiniflow/infinity) for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.
RAGFlow v0.20.4 uses Elasticsearch or [Infinity](https://github.com/infiniflow/infinity) for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.
<Tabs
defaultValue="linux"
@ -184,13 +184,13 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
```bash
$ git clone https://github.com/infiniflow/ragflow.git
$ cd ragflow/docker
$ git checkout -f v0.20.0
$ git checkout -f v0.20.4
```
3. Use the pre-built Docker images and start up the server:
:::tip NOTE
The command below downloads the `v0.20.0-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.20.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.0` for the full edition `v0.20.0`.
The command below downloads the `v0.20.4-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.20.4-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4` for the full edition `v0.20.4`.
:::
```bash
@ -207,8 +207,8 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
| RAGFlow image tag | Image size (GB) | Has embedding models and Python packages? | Stable? |
| ------------------- | --------------- | ----------------------------------------- | ------------------------ |
| `v0.20.0` | &approx;9 | :heavy_check_mark: | Stable release |
| `v0.20.0-slim` | &approx;2 | ❌ | Stable release |
| `v0.20.4` | &approx;9 | :heavy_check_mark: | Stable release |
| `v0.20.4-slim` | &approx;2 | ❌ | Stable release |
| `nightly` | &approx;9 | :heavy_check_mark: | *Unstable* nightly build |
| `nightly-slim` | &approx;2 | ❌ | *Unstable* nightly build |
@ -217,7 +217,7 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
```
:::danger IMPORTANT
The embedding models included in `v0.20.0` and `nightly` are:
The embedding models included in `v0.20.4` and `nightly` are:
- BAAI/bge-large-zh-v1.5
- maidalun1020/bce-embedding-base_v1

View File

@ -19,7 +19,7 @@ import TOCInline from '@theme/TOCInline';
### Cross-language search
Cross-language search (also known as cross-lingual retrieval) is a feature introduced in version 0.20.0. It enables users to submit queries in one language (for example, English) and retrieve relevant documents written in other languages such as Chinese or Spanish. This feature is powered by the system's default chat model, which translates queries to ensure accurate matching of semantic meaning across languages.
Cross-language search (also known as cross-lingual retrieval) is a feature introduced in version 0.20.4. It enables users to submit queries in one language (for example, English) and retrieve relevant documents written in other languages such as Chinese or Spanish. This feature is powered by the system's default chat model, which translates queries to ensure accurate matching of semantic meaning across languages.
By enabling cross-language search, users can effortlessly access a broader range of information regardless of language barriers, significantly enhancing the system's usability and inclusiveness.

File diff suppressed because it is too large

View File

@ -5,7 +5,7 @@ slug: /python_api_reference
# Python API
A complete reference for RAGFlow's Python APIs. Before proceeding, please ensure you [have your RAGFlow API key ready for authentication](../guides/models/llm_api_key_setup.md).
A complete reference for RAGFlow's Python APIs. Before proceeding, please ensure you [have your RAGFlow API key ready for authentication](https://ragflow.io/docs/dev/acquire_ragflow_api_key).
:::tip NOTE
Run the following command to download the Python SDK:
@ -507,7 +507,16 @@ print(doc)
### List documents
```python
Dataset.list_documents(id:str =None, keywords: str=None, page: int=1, page_size:int = 30, order_by:str = "create_time", desc: bool = True) -> list[Document]
Dataset.list_documents(
id: str = None,
keywords: str = None,
page: int = 1,
page_size: int = 30,
order_by: str = "create_time",
desc: bool = True,
create_time_from: int = 0,
create_time_to: int = 0
) -> list[Document]
```
Lists documents in the current dataset.
@ -541,6 +550,12 @@ The field by which documents should be sorted. Available options:
Indicates whether the retrieved documents should be sorted in descending order. Defaults to `True`.
##### create_time_from: `int`
Unix timestamp for filtering documents created after this time. 0 means no filter. Defaults to 0.
##### create_time_to: `int`
Unix timestamp for filtering documents created before this time. 0 means no filter. Defaults to 0.
#### Returns
- Success: A list of `Document` objects.
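A usage sketch of the new time-range filters (the dataset name and timestamp values are illustrative):
```python
from ragflow_sdk import RAGFlow

rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag.list_datasets(name="kb_1")[0]
# Return only documents created within the given window (Unix seconds).
docs = dataset.list_documents(
    keywords="ragflow",
    create_time_from=1735689600,  # 2025-01-01 00:00:00 UTC
    create_time_to=1740787200,    # 2025-03-01 00:00:00 UTC
)
for doc in docs:
    print(doc.name)
```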

View File

@ -9,8 +9,8 @@ Key features, improvements and bug fixes in the latest releases.
:::info
Each RAGFlow release is available in two editions:
- **Slim edition**: excludes built-in embedding models and is identified by a **-slim** suffix added to the version name. Example: `infiniflow/ragflow:v0.19.1-slim`
- **Full edition**: includes built-in embedding models and has no suffix added to the version name. Example: `infiniflow/ragflow:v0.19.1`
- **Slim edition**: excludes built-in embedding models and is identified by a **-slim** suffix added to the version name. Example: `infiniflow/ragflow:v0.20.4-slim`
- **Full edition**: includes built-in embedding models and has no suffix added to the version name. Example: `infiniflow/ragflow:v0.20.4`
:::
:::danger IMPORTANT
@ -22,6 +22,129 @@ The embedding models included in a full edition are:
These two embedding models are optimized specifically for English and Chinese, so performance may be compromised if you use them to embed documents in other languages.
:::
## v0.20.4
Released on August 27, 2025.
### Improvements
- Agent component: Completes Chinese localization for the Agent component.
- Introduces the `ENABLE_TIMEOUT_ASSERTION` environment variable to enable or disable timeout assertions for file parsing tasks (see the sketch after this list).
- Dataset:
- Improves Markdown file parsing, with AST support to avoid unintended chunking.
- Enhances HTML parsing, supporting bs4-based HTML tag traversal.
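A minimal sketch of enabling the new variable, assuming it is read from the server's process environment (e.g., set in `docker/.env` or exported before launch); any non-empty value enables the assertions:
```bash
export ENABLE_TIMEOUT_ASSERTION=1
```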
### Added models
- ZHIPU GLM-4.5
### New Agent templates
- Ecommerce Customer Service Workflow: A template designed to handle enquiries about product features and multi-product comparisons using the internal knowledge base, as well as to manage installation appointment bookings.
### Fixed issues
- Dataset:
- Unable to share resources with the team.
- Inappropriate restrictions on the number and size of uploaded files.
- Chat:
- Unable to preview referenced files in responses.
- Unable to send out messages after file uploads.
- An OAuth2 authentication failure.
- A logical error in multi-conditioned metadata searches within a dataset.
- Citations infinitely increased in multi-turn conversations.
## v0.20.3
Released on August 20, 2025.
### Improvements
- Revamps the user interface for the **Datasets**, **Chat**, and **Search** pages.
- Search and Chat: Introduces document-level metadata filtering, allowing automatic or manual filtering during chats or searches.
- Search: Supports creating search apps tailored to various business scenarios.
- Chat: Supports comparing answer performance of up to three chat model settings on a single **Chat** page.
- Agent:
- Implements a toggle in the **Agent** component to enable or disable citation.
- Introduces a drag-and-drop method for creating components.
- Documentation: Corrects inaccuracies in the API reference.
### New Agent templates
- Report Agent: A template for generating summary reports in internal question-answering scenarios, supporting the display of tables and formulae. [#9427](https://github.com/infiniflow/ragflow/pull/9427)
### Fixed issues
- The timeout mechanism introduced in v0.20.0 caused tasks like GraphRAG to halt.
- Predefined opening greeting in the **Agent** component was missing during conversations.
- An automatic line break issue in the prompt editor.
- A memory leak issue caused by PyPDF. [#9469](https://github.com/infiniflow/ragflow/pull/9469)
### API changes
#### Deprecated
[Create session with agent](./references/http_api_reference.md#create-session-with-agent)
## v0.20.1
Released on August 8, 2025.
### New Features
- The **Retrieval** component now supports the dynamic specification of knowledge base names using variables.
- The user interface now includes a French language option.
### Added Models
- GPT-5
- Claude 4.1
### New agent templates (both workflow and agentic)
- SQL Assistant Workflow: Empowers non-technical teams (e.g., operations, product) to independently query business data.
- Choose Your Knowledge Base Workflow: Lets users select a knowledge base to query during conversations. [#9325](https://github.com/infiniflow/ragflow/pull/9325)
- Choose Your Knowledge Base Agent: Delivers higher-quality responses with extended reasoning time, suited for complex queries. [#9325](https://github.com/infiniflow/ragflow/pull/9325)
### Fixed Issues
- The **Agent** component was unable to invoke models installed via vLLM.
- Agents could not be shared with the team.
- Embedding an Agent into a webpage was not functioning properly.
## v0.20.0
Released on August 4, 2025.
### Compatibility changes
From v0.20.0 onwards, Agents are no longer compatible with earlier versions, and all existing Agents from previous versions must be rebuilt following the upgrade.
### New features
- Unified orchestration of both Agents and Workflows.
- A comprehensive refactor of the Agent, greatly enhancing its capabilities and usability, with support for Multi-Agent configurations, planning and reflection, and visual functionalities.
- Fully implemented MCP functionality, allowing for MCP Server import, Agents functioning as MCP Clients, and RAGFlow itself operating as an MCP Server.
- Access to runtime logs for Agents.
- Chat histories with Agents available through the management panel.
- Integration of a new, more robust version of Infinity, enabling the auto-tagging functionality with Infinity as the underlying document engine.
- An OpenAI-compatible API that supports file reference information.
- Support for new models, including Kimi K2, Grok 4, and Voyage embedding.
- RAGFlow's codebase is now mirrored on Gitee.
- Introduction of a new model provider, Gitee AI.
### New agent templates introduced
- Multi-Agent based Deep Research: Collaborative Agent teamwork led by a Lead Agent with multiple Subagents, distinct from traditional workflow orchestration.
- An intelligent Q&A chatbot leveraging internal knowledge bases, designed for customer service and training scenarios.
- A resume analysis template used by the RAGFlow team to screen, analyze, and record candidate information.
- A blog generation workflow that transforms raw ideas into SEO-friendly blog content.
- An intelligent customer service workflow.
- A user feedback analysis template that directs user feedback to appropriate teams through semantic analysis.
- Trip Planner: Uses web search and map MCP servers to assist with travel planning.
- Image Lingo: Translates content from uploaded photos.
- An information search assistant that retrieves answers from both internal knowledge bases and the web.
## v0.19.1
Released on June 23, 2025.
@ -123,7 +246,7 @@ From this release onwards, if you still see RAGFlow's responses being cut short
- Unable to add models via Ollama/Xinference, an issue introduced in v0.17.1.
### Related APIs
### API changes
#### HTTP APIs
@ -184,7 +307,7 @@ The following is a screenshot of a conversation that integrates Deep Research:
![Image](https://github.com/user-attachments/assets/165b88ff-1f5d-4fb8-90e2-c836b25e32e9)
### Related APIs
### API changes
#### HTTP APIs
@ -259,7 +382,7 @@ This release fixes the following issues:
- Using the **Table** parsing method results in information loss.
- Miscellaneous API issues.
### Related APIs
### API changes
#### HTTP APIs
@ -295,7 +418,7 @@ Released on December 18, 2024.
- Upgrades the Document Layout Analysis model in DeepDoc.
- Significantly enhances the retrieval performance when using [Infinity](https://github.com/infiniflow/infinity) as document engine.
### Related APIs
### API changes
#### HTTP APIs
@ -352,7 +475,7 @@ This approach eliminates the need to manually update **service_config.yaml** aft
Ensure that you [upgrade **both** your code **and** Docker image to this release](https://ragflow.io/docs/dev/upgrade_ragflow#upgrade-ragflow-to-the-most-recent-officially-published-release) before trying this new approach.
:::
### Related APIs
### API changes
#### HTTP APIs
@ -511,7 +634,7 @@ While we also test RAGFlow on ARM64 platforms, we do not maintain RAGFlow Docker
If you are on an ARM platform, follow [this guide](./develop/build_docker_image.mdx) to build a RAGFlow Docker image.
:::
### Related APIs
### API changes
#### HTTP API
@ -532,7 +655,7 @@ Released on May 21, 2024.
- Supports monitoring of system components, including Elasticsearch, MySQL, Redis, and MinIO.
- Supports disabling **Layout Recognition** in the GENERAL chunking method to reduce file chunking time.
### Related APIs
### API changes
#### HTTP API

View File

@ -15,6 +15,7 @@
#
import logging
import itertools
import os
import re
from dataclasses import dataclass
from typing import Any, Callable
@ -106,7 +107,8 @@ class EntityResolution(Extractor):
nonlocal remain_candidates_to_resolve, callback
async with semaphore:
try:
with trio.move_on_after(180) as cancel_scope:
enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
with trio.move_on_after(280 if enable_timeout_assertion else 1000000000) as cancel_scope:
await self._resolve_candidate(candidate_batch, result_set, result_lock)
remain_candidates_to_resolve = remain_candidates_to_resolve - len(candidate_batch[1])
callback(msg=f"Resolved {len(candidate_batch[1])} pairs, {remain_candidates_to_resolve} are remained to resolve. ")
@ -169,7 +171,8 @@ class EntityResolution(Extractor):
logging.info(f"Created resolution prompt {len(text)} bytes for {len(candidate_resolution_i[1])} entity pairs of type {candidate_resolution_i[0]}")
async with chat_limiter:
try:
with trio.move_on_after(120) as cancel_scope:
enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
with trio.move_on_after(280 if enable_timeout_assertion else 1000000000) as cancel_scope:
response = await trio.to_thread.run_sync(self._chat, text, [{"role": "user", "content": "Output:"}], {})
if cancel_scope.cancelled_caught:
logging.warning("_resolve_candidate._chat timeout, skipping...")

View File

@ -7,6 +7,7 @@ Reference:
import logging
import json
import os
import re
from typing import Callable
from dataclasses import dataclass
@ -51,6 +52,7 @@ class CommunityReportsExtractor(Extractor):
self._max_report_length = max_report_length or 1500
async def __call__(self, graph: nx.Graph, callback: Callable | None = None):
enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
for node_degree in graph.degree:
graph.nodes[str(node_degree[0])]["rank"] = int(node_degree[1])
@ -92,7 +94,7 @@ class CommunityReportsExtractor(Extractor):
text = perform_variable_replacements(self._extraction_prompt, variables=prompt_variables)
async with chat_limiter:
try:
with trio.move_on_after(80) as cancel_scope:
with trio.move_on_after(180 if enable_timeout_assertion else 1000000000) as cancel_scope:
response = await trio.to_thread.run_sync(self._chat, text, [{"role": "user", "content": "Output:"}], {})
if cancel_scope.cancelled_caught:
logging.warning("extract_community_report._chat timeout, skipping...")

View File

@ -47,7 +47,7 @@ class Extractor:
self._language = language
self._entity_types = entity_types or DEFAULT_ENTITY_TYPES
@timeout(60*3)
@timeout(60*20)
def _chat(self, system, history, gen_conf={}):
hist = deepcopy(history)
conf = deepcopy(gen_conf)
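The `@timeout` decorator here is RAGFlow's own utility (its definition is not part of this diff); the change simply raises `_chat`'s budget from 3 to 20 minutes. For intuition only, a hypothetical single-argument, thread-based decorator with a similar surface might look like this:

```python
import functools
from concurrent.futures import ThreadPoolExecutor

def timeout(seconds: float):
    """Hypothetical sketch, not RAGFlow's implementation: run the wrapped
    function in a worker thread and raise if it exceeds the budget."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            pool = ThreadPoolExecutor(max_workers=1)
            try:
                # result() raises TimeoutError if fn is still running after
                # `seconds`; the worker thread itself may linger, a usual
                # caveat of thread-based timeouts.
                return pool.submit(fn, *args, **kwargs).result(timeout=seconds)
            finally:
                pool.shutdown(wait=False)
        return wrapper
    return decorator

@timeout(60 * 20)  # the new 20-minute budget
def chat(prompt: str) -> str:
    return f"response to {prompt}"
```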

View File

@ -15,6 +15,8 @@
#
import json
import logging
import os
import networkx as nx
import trio
@ -49,6 +51,7 @@ async def run_graphrag(
embedding_model,
callback,
):
enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
start = trio.current_time()
tenant_id, kb_id, doc_id = row["tenant_id"], str(row["kb_id"]), row["doc_id"]
chunks = []
@ -57,20 +60,22 @@ async def run_graphrag(
):
chunks.append(d["content_with_weight"])
subgraph = await generate_subgraph(
LightKGExt
if "method" not in row["kb_parser_config"].get("graphrag", {}) or row["kb_parser_config"]["graphrag"]["method"] != "general"
else GeneralKGExt,
tenant_id,
kb_id,
doc_id,
chunks,
language,
row["kb_parser_config"]["graphrag"].get("entity_types", []),
chat_model,
embedding_model,
callback,
)
with trio.fail_after(max(120, len(chunks)*60*10) if enable_timeout_assertion else 10000000000):
subgraph = await generate_subgraph(
LightKGExt
if "method" not in row["kb_parser_config"].get("graphrag", {}) or row["kb_parser_config"]["graphrag"]["method"] != "general"
else GeneralKGExt,
tenant_id,
kb_id,
doc_id,
chunks,
language,
row["kb_parser_config"]["graphrag"].get("entity_types", []),
chat_model,
embedding_model,
callback,
)
if not subgraph:
return
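Note that subgraph generation is wrapped in `trio.fail_after` rather than the silent `trio.move_on_after` used elsewhere; the two differ in failure semantics. A small self-contained comparison:

```python
import trio

async def main():
    # move_on_after cancels the block silently; inspect cancelled_caught afterwards.
    with trio.move_on_after(0.1) as scope:
        await trio.sleep(1)
    print("cancelled:", scope.cancelled_caught)

    # fail_after raises trio.TooSlowError instead, so the caller must handle it.
    try:
        with trio.fail_after(0.1):
            await trio.sleep(1)
    except trio.TooSlowError:
        print("timed out with an exception")

trio.run(main)
```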
@ -125,7 +130,6 @@ async def run_graphrag(
return
@timeout(60*60, 1)
async def generate_subgraph(
extractor: Extractor,
tenant_id: str,

View File

@ -130,7 +130,36 @@ Output:
PROMPTS[
"entiti_continue_extraction"
] = """MANY entities were missed in the last extraction. Add them below using the same format:
] = """
MANY entities and relationships were missed in the last extraction. Please find only the missing entities and relationships from the previous text.
---Remember Steps---
1. Identify all entities. For each identified entity, extract the following information:
- entity_name: Name of the entity, using the same language as the input text. If English, capitalize the name
- entity_type: One of the following types: [{entity_types}]
- entity_description: Provide a comprehensive description of the entity's attributes and activities *based solely on the information present in the input text*. **Do not infer or hallucinate information not explicitly stated.** If the text provides insufficient information to create a comprehensive description, state "Description not available in text."
Format each entity as ("entity"{tuple_delimiter}<entity_name>{tuple_delimiter}<entity_type>{tuple_delimiter}<entity_description>)
2. From the entities identified in step 1, identify all pairs of (source_entity, target_entity) that are *clearly related* to each other.
For each pair of related entities, extract the following information:
- source_entity: name of the source entity, as identified in step 1
- target_entity: name of the target entity, as identified in step 1
- relationship_description: explanation as to why you think the source entity and the target entity are related to each other
- relationship_strength: a numeric score indicating strength of the relationship between the source entity and target entity
- relationship_keywords: one or more high-level key words that summarize the overarching nature of the relationship, focusing on concepts or themes rather than specific details
Format each relationship as ("relationship"{tuple_delimiter}<source_entity>{tuple_delimiter}<target_entity>{tuple_delimiter}<relationship_description>{tuple_delimiter}<relationship_keywords>{tuple_delimiter}<relationship_strength>)
3. Identify high-level key words that summarize the main concepts, themes, or topics of the entire text. These should capture the overarching ideas present in the document.
Format the content-level key words as ("content_keywords"{tuple_delimiter}<high_level_keywords>)
4. Return output in {language} as a single list of all the entities and relationships identified in steps 1 and 2. Use **{record_delimiter}** as the list delimiter.
5. When finished, output {completion_delimiter}
---Output---
Add new entities and relationships below using the same format; do not include entities and relationships that have been previously extracted:
"""
PROMPTS[
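The rewritten continue-extraction prompt asks the model to emit delimiter-separated tuple records. A minimal sketch of how such output could be parsed is shown below; the delimiter strings are illustrative stand-ins for the values the prompts module actually defines.

```python
# Illustrative stand-ins for {tuple_delimiter}, {record_delimiter}, {completion_delimiter}.
TUPLE_DELIMITER = "<|>"
RECORD_DELIMITER = "##"
COMPLETION_DELIMITER = "<|COMPLETE|>"

def parse_records(output: str) -> list[tuple[str, ...]]:
    """Split model output into ("entity", ...) / ("relationship", ...) tuples."""
    records = []
    for rec in output.replace(COMPLETION_DELIMITER, "").split(RECORD_DELIMITER):
        rec = rec.strip().strip("()")
        if not rec:
            continue
        records.append(tuple(field.strip().strip('"') for field in rec.split(TUPLE_DELIMITER)))
    return records

sample = '("entity"<|>RAGFlow<|>organization<|>Open-source RAG engine)##<|COMPLETE|>'
print(parse_records(sample))
# [('entity', 'RAGFlow', 'organization', 'Open-source RAG engine')]
```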
@ -252,4 +281,4 @@ When handling information with timestamps:
- List up to 5 most important reference sources at the end under "References", clearly indicating whether each source is from Knowledge Graph (KG) or Vector Data (VD)
Format: [KG/VD] Source content
Add sections and commentary to the response as appropriate for the length and format. If the provided information is insufficient to answer the question, clearly state that you don't know or cannot provide an answer in the same language as the user's question."""

View File

@ -307,6 +307,7 @@ def chunk_id(chunk):
async def graph_node_to_chunk(kb_id, embd_mdl, ent_name, meta, chunks):
global chat_limiter
enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
chunk = {
"id": get_uuid(),
"important_kwd": [ent_name],
@ -324,7 +325,7 @@ async def graph_node_to_chunk(kb_id, embd_mdl, ent_name, meta, chunks):
ebd = get_embed_cache(embd_mdl.llm_name, ent_name)
if ebd is None:
async with chat_limiter:
with trio.fail_after(3):
with trio.fail_after(3 if enable_timeout_assertion else 30000000):
ebd, _ = await trio.to_thread.run_sync(lambda: embd_mdl.encode([ent_name]))
ebd = ebd[0]
set_embed_cache(embd_mdl.llm_name, ent_name, ebd)
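These embedding hunks combine three things: a cache lookup (`get_embed_cache`/`set_embed_cache`), offloading the blocking `encode` call to a worker thread with `trio.to_thread.run_sync`, and the same env-gated deadline. A self-contained sketch of that offload-and-cache shape, with a dictionary standing in for RAGFlow's cache helpers and a trivial stand-in encoder:

```python
import trio

_cache: dict[str, list[float]] = {}  # stand-in for get_embed_cache/set_embed_cache

def encode(text: str) -> list[float]:
    return [float(len(text))]  # stand-in for the blocking embd_mdl.encode call

async def embed_cached(text: str) -> list[float]:
    if text in _cache:
        return _cache[text]
    # Offload the blocking call so it does not stall trio's event loop.
    with trio.fail_after(3):
        ebd = await trio.to_thread.run_sync(encode, text)
    _cache[text] = ebd
    return ebd

async def main():
    print(await embed_cached("hello"))

trio.run(main)
```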
@ -362,6 +363,7 @@ def get_relation(tenant_id, kb_id, from_ent_name, to_ent_name, size=1):
async def graph_edge_to_chunk(kb_id, embd_mdl, from_ent_name, to_ent_name, meta, chunks):
enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
chunk = {
"id": get_uuid(),
"from_entity_kwd": from_ent_name,
@ -380,7 +382,7 @@ async def graph_edge_to_chunk(kb_id, embd_mdl, from_ent_name, to_ent_name, meta,
ebd = get_embed_cache(embd_mdl.llm_name, txt)
if ebd is None:
async with chat_limiter:
with trio.fail_after(3):
with trio.fail_after(3 if enable_timeout_assertion else 300000000):
ebd, _ = await trio.to_thread.run_sync(lambda: embd_mdl.encode([txt+f": {meta['description']}"]))
ebd = ebd[0]
set_embed_cache(embd_mdl.llm_name, txt, ebd)
@ -514,9 +516,10 @@ async def set_graph(tenant_id: str, kb_id: str, embd_mdl, graph: nx.Graph, chang
callback(msg=f"set_graph converted graph change to {len(chunks)} chunks in {now - start:.2f}s.")
start = now
enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
es_bulk_size = 4
for b in range(0, len(chunks), es_bulk_size):
with trio.fail_after(3):
with trio.fail_after(3 if enable_timeout_assertion else 30000000):
doc_store_result = await trio.to_thread.run_sync(lambda: settings.docStoreConn.insert(chunks[b:b + es_bulk_size], search.index_name(tenant_id), kb_id))
if b % 100 == es_bulk_size and callback:
callback(msg=f"Insert chunks: {b}/{len(chunks)}")

View File

@ -44,9 +44,21 @@ spec:
checksum/config-es: {{ include (print $.Template.BasePath "/elasticsearch-config.yaml") . | sha256sum }}
checksum/config-env: {{ include (print $.Template.BasePath "/env.yaml") . | sha256sum }}
spec:
{{- if or .Values.imagePullSecrets .Values.elasticsearch.image.pullSecrets }}
imagePullSecrets:
{{- with .Values.imagePullSecrets }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.elasticsearch.image.pullSecrets }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
initContainers:
- name: fix-data-volume-permissions
image: alpine
image: {{ .Values.elasticsearch.initContainers.alpine.repository }}:{{ .Values.elasticsearch.initContainers.alpine.tag }}
{{- with .Values.elasticsearch.initContainers.alpine.pullPolicy }}
imagePullPolicy: {{ . }}
{{- end }}
command:
- sh
- -c
@ -55,14 +67,20 @@ spec:
- mountPath: /usr/share/elasticsearch/data
name: es-data
- name: sysctl
image: busybox
image: {{ .Values.elasticsearch.initContainers.busybox.repository }}:{{ .Values.elasticsearch.initContainers.busybox.tag }}
{{- with .Values.elasticsearch.initContainers.busybox.pullPolicy }}
imagePullPolicy: {{ . }}
{{- end }}
securityContext:
privileged: true
runAsUser: 0
command: ["sysctl", "-w", "vm.max_map_count=262144"]
containers:
- name: elasticsearch
image: elasticsearch:{{ .Values.env.STACK_VERSION }}
image: {{ .Values.elasticsearch.image.repository }}:{{ .Values.elasticsearch.image.tag }}
{{- with .Values.elasticsearch.image.pullPolicy }}
imagePullPolicy: {{ . }}
{{- end }}
envFrom:
- secretRef:
name: {{ include "ragflow.fullname" . }}-env-config

View File

@ -43,9 +43,21 @@ spec:
annotations:
checksum/config: {{ include (print $.Template.BasePath "/env.yaml") . | sha256sum }}
spec:
{{- if or .Values.imagePullSecrets .Values.infinity.image.pullSecrets }}
imagePullSecrets:
{{- with .Values.imagePullSecrets }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.infinity.image.pullSecrets }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
containers:
- name: infinity
image: {{ .Values.infinity.image.repository }}:{{ .Values.infinity.image.tag }}
{{- with .Values.infinity.image.pullPolicy }}
imagePullPolicy: {{ . }}
{{- end }}
envFrom:
- secretRef:
name: {{ include "ragflow.fullname" . }}-env-config

View File

@ -43,9 +43,21 @@ spec:
{{- include "ragflow.labels" . | nindent 8 }}
app.kubernetes.io/component: minio
spec:
{{- if or .Values.imagePullSecrets .Values.minio.image.pullSecrets }}
imagePullSecrets:
{{- with .Values.imagePullSecrets }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.minio.image.pullSecrets }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
containers:
- name: minio
image: {{ .Values.minio.image.repository }}:{{ .Values.minio.image.tag }}
{{- with .Values.minio.image.pullPolicy }}
imagePullPolicy: {{ . }}
{{- end }}
envFrom:
- secretRef:
name: {{ include "ragflow.fullname" . }}-env-config

Some files were not shown because too many files have changed in this diff.