Compare commits

25 Commits

Author SHA1 Message Date
1deb0a2d42 Fix:local variable 'response' referenced before assignment (#9230)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9227

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-05 11:00:06 +08:00
dd055deee9 Docs: Updated tips for max rounds (#9235)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-08-05 10:59:37 +08:00
a249803961 Refa: ensure Redis stream queue could be created properly (#9223)
### What problem does this PR solve?

Ensure the Redis stream queue can be created properly.

### Type of change

- [x] Refactoring
2025-08-05 09:54:31 +08:00
6ec3f18e22 Fix: self-deployed LLM error, (#9217)
### What problem does this PR solve?

Close #9197
Close #9145

### Type of change

- [x] Refactoring
- [x] Bug fixing.
2025-08-05 09:49:47 +08:00
7724acbadb Perf Impr: Decouple reasoning and extraction for faster, more precise logic (#9191)
### What problem does this PR solve?

This commit refactors the core prompts to decouple the high-level
reasoning from the low-level information extraction. By making
REASON_PROMPT a dedicated strategist that only generates search queries
and re-tasking RELEVANT_EXTRACTION_PROMPT to be a specialized tool for
single-fact extraction, we eliminate redundant information
summarization. This clear separation of concerns makes the overall
reasoning process significantly faster and more precise, as each
component now has a single, well-defined responsibility.

### Type of change

- [x] Performance Improvement
2025-08-05 09:36:14 +08:00
a36ba95c1c Fix: Add prompt text to the form in the MCP module (#9222)
### What problem does this PR solve?

Fix: Add prompt text to the form in the MCP module #3221

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 09:26:59 +08:00
30ccc4a66c Fix: correct single base64 image handling in image prompt (#9220)
### What problem does this PR solve?

Correct single base64 image handling in image prompt.


![img_v3_02or_ec4757c2-a9d4-4774-9a76-f7c6be633ebg](https://github.com/user-attachments/assets/872a86bf-e2a8-48d1-9b71-2a0c7a35ba9e)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 09:26:42 +08:00
dda5a0080a Fix: Fixed the issue where the agent's chat box could not automatically scroll to the bottom #3221 (#9219)
### What problem does this PR solve?

Fix: Fixed the issue where the agent's chat box could not automatically
scroll to the bottom #3221

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 09:26:15 +08:00
9db999ccae v0.20.0 release notes (#9218)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-08-04 18:07:53 +08:00
5f5c6a7990 Fix: Fixed the loss of Await Response function on the share page and other style issues #3221 (#9216)
### What problem does this PR solve?

Fix: Fixed the loss of Await Response function on the share page and
other style issues #3221

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 18:06:56 +08:00
53618d13bb Fix: Fixed the issue where the prompt word edit box had no scroll bar #3221 (#9215)
### What problem does this PR solve?
Fix: Fixed the issue where the prompt word edit box had no scroll bar
#3221

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 18:06:19 +08:00
60d652d2e1 Feat: list documents supports range filtering (#9214)
### What problem does this PR solve?

list_document supports range filtering.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-04 16:35:35 +08:00
448bdda73d Fix: Web Server Accepts Invalid Data That Could Cause Problems in uv.lock (#8966)
**Context and Purpose:**

This PR automatically remediates a security vulnerability:
- **Description:** h11: h11 accepts some malformed Chunked-Encoding
bodies
- **Rule ID:** CVE-2025-43859
- **Severity:** CRITICAL
- **File:** uv.lock
- **Lines Affected:** None - None

This change is necessary to protect the application from potential
security risks associated with this vulnerability.

**Solution Implemented:**

The automated remediation process has applied the necessary changes to
the affected code in `uv.lock` to resolve the identified issue.

Please review the changes to ensure they are correct and integrate as
expected.
2025-08-04 16:09:15 +08:00
26b85a10d1 Feat: New Agent startup parameters add knowledge base parameter #9194 (#9210)
### What problem does this PR solve?

Feat: New Agent startup parameters add knowledge base parameter #9194

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-04 16:08:41 +08:00
cae11201ef fix "out of memory" if slide.get_thumbnail() to a huge image (#9211)
### What problem does this PR solve?

Fix "out of memory" when slide.get_thumbnail() renders a huge image.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 16:08:24 +08:00
6ad8b54754 fix "TypeError: '<' not supported between instances of 'Emu' and 'Non… (#9209)
…eType'"

### What problem does this PR solve?

fix "TypeError: '<' not supported between instances of 'Emu' and
'NoneType'"

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 16:07:03 +08:00
83aca2d07b fix #8424 NPE in dify_retrieval.py, add log exception (#9212)
### What problem does this PR solve?

fix #8424 NPE in dify_retrieval.py, add log exception

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 15:36:31 +08:00
34f829e1b1 docs(agent): Correct several spelling errors, such as: Ouline -> Outline (#9188)
### What problem does this PR solve?

Correct several spelling errors, such as: Ouline -> Outline

### Type of change

- [x] Documentation Update
2025-08-04 14:53:32 +08:00
52a349349d Fix: migrate deprecated Langfuse API from v2 to v3 (#9204)
### What problem does this PR solve?

Fix:

```bash
'Langfuse' object has no attribute 'trace'
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 14:45:43 +08:00
45bf294117 Refactor: support config strong test (#9198)
### What problem does this PR solve?


https://github.com/infiniflow/ragflow/issues/9189#issuecomment-3148920950

### Type of change
- [x] Refactoring

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-04 13:54:18 +08:00
667c5812d0 Fix:Repeated images when parsing markdown files with images (#9196)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9149

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 13:35:58 +08:00
30e9212db9 Fix: enlarge the timeout limits. (#9201)
### What problem does this PR solve?

#9189

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 13:34:34 +08:00
e9cbf4611d Fix:Error when parsing files using Gemini: **ERROR**: GENERIC_ERROR - Unknown field for GenerationConfig: max_tokens (#9195)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/9177
The likely cause is that Gemini internally uses a different parameter
name, `max_output_tokens`:
`
        max_output_tokens (int):
            Optional. The maximum number of tokens to include in a
            response candidate.

            Note: The default value varies by model, see the
            ``Model.output_token_limit`` attribute of the ``Model``
            returned from the ``getModel`` function.

            This field is a member of `oneof`_ ``_max_output_tokens``.
`
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 10:06:09 +08:00
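A rough illustration of the rename this commit describes, not the actual patch: the helper name `to_gemini_generation_config` and the plain-dict shape of the config are assumptions made for this sketch. It only shows the idea of mapping the generic `max_tokens` key onto `max_output_tokens` before the config reaches Gemini.

```python
from typing import Any


def to_gemini_generation_config(gen_conf: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of gen_conf with max_tokens renamed to max_output_tokens.

    Hypothetical helper: the name and the dict-based config are assumptions.
    """
    converted = dict(gen_conf)  # copy so the caller's dict is not mutated
    if "max_tokens" in converted:
        max_tokens = converted.pop("max_tokens")
        # Keep an explicitly provided max_output_tokens if both keys are present.
        converted.setdefault("max_output_tokens", max_tokens)
    return converted


if __name__ == "__main__":
    print(to_gemini_generation_config({"max_tokens": 512, "temperature": 0.1}))
```

Copying the dict rather than renaming in place keeps the caller's generic config reusable for other providers that still expect `max_tokens`.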
d4b1d163dd Fix: list tags api by using tenant id instead of user id (#9103)
### What problem does this PR solve?

The index name of the tag chunks is generated by the tenant id of the
knowledge base, so it should use the tenant id instead of the current
user id in the listing tags API.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 09:57:00 +08:00
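To make the reasoning in this commit concrete, here is a minimal sketch of why the user id can point at the wrong index. The `KnowledgeBase` class, the `index_name` function and its naming scheme are all hypothetical and exist only for illustration; the only thing taken from the commit message is that the index name is derived from the knowledge base's tenant id.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeBase:
    id: str
    tenant_id: str  # owning tenant; may differ from the user making the request


def index_name(tenant_id: str) -> str:
    # Hypothetical naming scheme, used only to illustrate the mismatch.
    return f"index_{tenant_id}"


def tag_index_for(kb: KnowledgeBase) -> str:
    # Derive the index from the knowledge base's tenant, not the current user.
    return index_name(kb.tenant_id)


if __name__ == "__main__":
    kb = KnowledgeBase(id="kb1", tenant_id="tenant_a")
    current_user_id = "user_b"  # e.g. a team member who does not own the tenant
    print(tag_index_for(kb))            # index the tag chunks were written to
    print(index_name(current_user_id))  # index a user-id lookup would target
```

When the requesting user is also the tenant owner the two names coincide, which may be why the problem only shows up for shared knowledge bases.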
fca94509e8 Feat: Add the migration script and its doc, added backup as default… (#8245)
### What problem does this PR solve?

This PR adds a data backup and migration solution for RAGFlow Docker
Compose deployments. Currently, users lack a standardized way to back up
and restore RAGFlow data volumes (MySQL, MinIO, Redis, Elasticsearch),
which is essential for data safety and environment migration.

**Solution:**
- **Migration Script** (`docker/migration.sh`) - Automates
backup/restore operations for all RAGFlow data volumes
- **Documentation**
(`docs/guides/migration/migrate_from_docker_compose.md`) - Usage guide
and best practices
- **Safety Features** - Container conflict detection and user
confirmations to prevent data loss

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update

Co-authored-by: treedy <treedy2022@icloud.com>
2025-08-04 09:43:43 +08:00
53 changed files with 1201 additions and 333 deletions

2
.gitignore vendored
View File

@ -193,3 +193,5 @@ dist
# SvelteKit build / generate output
.svelte-kit
# Default backup dir
backup

15
.trivyignore Normal file
View File

@ -0,0 +1,15 @@
**/*.md
**/*.min.js
**/*.min.css
**/*.svg
**/*.png
**/*.jpg
**/*.jpeg
**/*.gif
**/*.woff
**/*.woff2
**/*.map
**/*.webp
**/*.ico
**/*.ttf
**/*.eot

View File

@ -170,7 +170,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role": "user"
}
],
@ -250,7 +250,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role": "user"
}
],
@ -602,7 +602,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role": "user"
}
],
@ -715,7 +715,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role": "user"
}
],

View File

@ -169,7 +169,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role": "user"
}
],
@ -249,7 +249,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role": "user"
}
],
@ -601,7 +601,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role": "user"
}
],
@ -714,7 +714,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role": "user"
}
],
@ -912,4 +912,4 @@
"retrieval": []
},
"avatar": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/4gHYSUNDX1BST0ZJTEUAAQEAAAHIAAAAAAQwAABtbnRyUkdCIFhZWiAH4AABAAEAAAAAAABhY3NwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAA9tYAAQAAAADTLQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAlkZXNjAAAA8AAAACRyWFlaAAABFAAAABRnWFlaAAABKAAAABRiWFlaAAABPAAAABR3dHB0AAABUAAAABRyVFJDAAABZAAAAChnVFJDAAABZAAAAChiVFJDAAABZAAAAChjcHJ0AAABjAAAADxtbHVjAAAAAAAAAAEAAAAMZW5VUwAAAAgAAAAcAHMAUgBHAEJYWVogAAAAAAAAb6IAADj1AAADkFhZWiAAAAAAAABimQAAt4UAABjaWFlaIAAAAAAAACSgAAAPhAAAts9YWVogAAAAAAAA9tYAAQAAAADTLXBhcmEAAAAAAAQAAAACZmYAAPKnAAANWQAAE9AAAApbAAAAAAAAAABtbHVjAAAAAAAAAAEAAAAMZW5VUwAAACAAAAAcAEcAbwBvAGcAbABlACAASQBuAGMALgAgADIAMAAxADb/2wBDAAEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/2wBDAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/wAARCAAwADADASIAAhEBAxEB/8QAGQAAAwEBAQAAAAAAAAAAAAAABgkKBwUI/8QAMBAAAAYCAQIEBQQCAwAAAAAAAQIDBAUGBxEhCAkAEjFBFFFhcaETFiKRFyOx8PH/xAAaAQACAwEBAAAAAAAAAAAAAAACAwABBgQF/8QALBEAAgIBAgUCBAcAAAAAAAAAAQIDBBEFEgATITFRIkEGIzJhFBUWgaGx8P/aAAwDAQACEQMRAD8AfF2hez9089t7pvxgQMa1Gb6qZ6oQE9m/NEvCIStyPfJSOF/M1epzMugo/qtMqbiRc1mJjoJKCLMNIxKcsLJedfO1Ct9cI63x9fx6CA/19t+oh4LFA5HfuAgP/A8eOIsnsTBrkBHXA7+v53+Q+ficTgJft9gIgA+/P9/1r342O/YA8A8k3/if+IbAN7+2/f8AAiI6H19PGoPyESTMZQPKUAHkQEN+3r9dh78/YPGUTk2wb/qAZZIugH1OHH5DjkdfbnWw2DsOxPj+xjrnx2H39unBopJGBn9s+PHv1HXjPJtH+J+B40O9a16h/wB/92j/ALrPa/wR104UyAobHlXhuo2HrEtK4qy3CwjKOuJLRHJLSkXWrFKs/gVrJVrE8TUiH8bPrP20UEu8m4hNpMJJuTOfnbUw/kUqyZgMHGjAO9+mtDsQ53sdcB6eMhnpEjhNQxRKICAgHy5+/roOdjr7c+J6O4x07dx484/n7nzw1gexBGfIPkZ/3t39uGpqc6+fP5/Ht8vGFZCzJjWpWuBxvO2yPjrtclUUK7BqmUI4fuASeyhG5FzFI0Bw4aQ0iZNoDgzvRW4qtyFkI4XmwyEk2YNnDp0sVBu3IUyy5iqH8gqKERSIRNIii67hddRJs1at01Xbx2sgzZoLu10UFJR+4V1A5cxF3FqNcLvjwcno43uuLrOxZYjujaClcb4QQfxEizpFiQyM9olcueRnjC2ZMt9iY06zL0qytrMSqSOVGsfHMaGhZ3l4lSRI2MqE74zJvRTveNFWWIh3RWw+XCAM5icKQLrCH57T17FhErSlRXnWvyZXKQwWJ3eraD14p5YuZCFgacskK2oGkVuKO5GYTHzf7DaD12cBD3DgPOIDrWw9Pn
rXPgDkpVsUDGMG+DD6E9gHXIjrYjwUPQTCXYgHPhIV974+F6E1hpC14Yzmzj56YaQEeZhXsayD1zLPW7pygxaMf81Nzu1iJsnIuDIKnaJAkPldqrHaoORZ73tMVEbFdSXT9nVgRQgnBq6j8e/HCIEATpAnH5KlmRVkFRFJwks/bqImSXJ5VFyA3N6Ikh3bCW3YHp5cowOmCfTgA+xJCnrjtwHKcLvJj2ZGcTRFj19kEhckdzgEjKnABGSSzdc1Fe5byXXGNjKdvRcw5NxvLidNZFFCxUa62KrzMaChw8hhYScFJtROAgmuLByq1MsgkZYPaVVuDe0wraRaqAdJwgRQo+YR8xTlAQNx6b49w41vXiJpCalLh1jZhyrTqRM4+jstdRmYryNkydLQRWg1LNGcWd5jIFFvCythlIySa0mNu74sKRQtaWsTmupqPItw0lE52ufpyYzrSkx6cw5bLmBEpkTsz+dt8P5QFuCRtAIkBH9MuwKHICIaDQhnojMs9mKaeGcrMxXlQtAYkdVljimRrE5MqI4zL8oSqQ6wxjodBqK05qdK3Vo3aCSVkBW7bjuC1NFJJBPaqyx6fp6pWkliYLXK2XrukkRu2CCVoSWMgsdMyySKwoLFcIGWSTUMg4IBgTcICoBhRcplMcpFkhIqQp1ClMBTmA0Zfe1zpjvHfXff65bZlzXpB3jjGTgiirmPjAfs16PHqHeQ75Wbj3xxZpOEkV3LRJJSPdomUBZISJLncV2k+8D07dxXp7xsYuTapA9UkJUYWIzNhadnWEZeCXGLQQiJi1ViHfhHL2unWh+mlORsrW0JFpEFnGVfm1mU4kq0FY3eD6corJncv6dr5NLSMNXVaTUksjTiMnaq8uFfSVuDyiJ1iZpy0LOJtpa3YfkcQ5fdozyxI2m5qqcrHN61YYmHsh6v3o9ParYmYJEtlhIx6+gUbjgD23M6oqg92YL0JyF6Bps+qDValVA9h9Lj5SZI3SHXdEQlj1wiQtLLIe6pGzjO3BlBkK1hxpblLVH5wdW0BcFKf/JwRtjsot2z8omaSdxbzzk1iEjsE0AM9rrRZNRIrVyo7dGO6E+oh8axLlJ5H5VaJKx7ePRGFbW6vUeFfHQIWPTI9Tm7HHfuhqY7E6C7JFqUzM6iZXIoncNxX7+bIVdJnTT48x3OQU1krIDW3UeixVhyISzYz6cadY5Xph6TseRNTRsTElzzBn9Vlly0TAERsdgnMYyLROjyFbg5R4ZlsGaMT4yNi2Zlq1GwjZB3jq0PsaJfA3t0jL0W0Y9xf1V41lpWckXMLaZiwxuKYPqc6LlHdkeRF+Qxswx5ASDqBVrsL+2A/N6SiCbYymV2BywJiMZj3GRRMTnL+lVyHCll3R7Szv0vqXMtQ74T+HijljIScLaEpkKCB3rqMBIi0jPs5JeOKTZMZEi5VVnouzy0k3jXjWSMlY6UcVGDxlKMVDqx91SILWSi3D2KdgYy3kP8E9X/AE1SnRXBNdNRMlefT6g7aY6giK+cPLGNg0bY68rcnpsNh9PqIBve/EcPQ3WIq2dR93xpSgk5SAZ9R6MLAOZFUkpLSUDXp6/KPpGUkmTdswlnKnwbl5ITMdGwcXJi7LKsqzUmT5tWYmkXuF9wjBvb76b7dHheazJ9RElUJOCxViuMlUJC0Gtz6PKyjLBY4qMWUe12r1xZ6lOyT6XPEBKN2CkTDOlZd02TBdTMt7Upx2knrkdCv1UKjDKn1A7XBYH6SCOOrWn5Oi/DtRiu+GleRthDL8rXdVjZlcfWrSIxVlGGGCOnH//Z"
}
}

View File

@ -169,7 +169,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role": "user"
}
],
@ -249,7 +249,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role": "user"
}
],
@ -601,7 +601,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\n\n\nThe Outline agent output is {Agent:BetterSitesSend@content}",
"role": "user"
}
],
@ -714,7 +714,7 @@
"presence_penalty": 0.5,
"prompts": [
{
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Ouline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"content": "The parse and keyword agent output is {Agent:ClearRabbitsScream@content}\n\nThe Outline agent output is {Agent:BetterSitesSend@content}\n\nThe Body agent output is {Agent:EagerNailsRemain@content}",
"role": "user"
}
],
@ -912,4 +912,4 @@
"retrieval": []
},
"avatar": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/4gHYSUNDX1BST0ZJTEUAAQEAAAHIAAAAAAQwAABtbnRyUkdCIFhZWiAH4AABAAEAAAAAAABhY3NwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAA9tYAAQAAAADTLQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAlkZXNjAAAA8AAAACRyWFlaAAABFAAAABRnWFlaAAABKAAAABRiWFlaAAABPAAAABR3dHB0AAABUAAAABRyVFJDAAABZAAAAChnVFJDAAABZAAAAChiVFJDAAABZAAAAChjcHJ0AAABjAAAADxtbHVjAAAAAAAAAAEAAAAMZW5VUwAAAAgAAAAcAHMAUgBHAEJYWVogAAAAAAAAb6IAADj1AAADkFhZWiAAAAAAAABimQAAt4UAABjaWFlaIAAAAAAAACSgAAAPhAAAts9YWVogAAAAAAAA9tYAAQAAAADTLXBhcmEAAAAAAAQAAAACZmYAAPKnAAANWQAAE9AAAApbAAAAAAAAAABtbHVjAAAAAAAAAAEAAAAMZW5VUwAAACAAAAAcAEcAbwBvAGcAbABlACAASQBuAGMALgAgADIAMAAxADb/2wBDAAEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/2wBDAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/wAARCAAwADADASIAAhEBAxEB/8QAGQAAAwEBAQAAAAAAAAAAAAAABgkKBwUI/8QAMBAAAAYCAQIEBQQCAwAAAAAAAQIDBAUGBxEhCAkAEjFBFFFhcaETFiKRFyOx8PH/xAAaAQACAwEBAAAAAAAAAAAAAAACAwABBgQF/8QALBEAAgIBAgUCBAcAAAAAAAAAAQIDBBEFEgATITFRIkEGIzJhFBUWgaGx8P/aAAwDAQACEQMRAD8AfF2hez9089t7pvxgQMa1Gb6qZ6oQE9m/NEvCIStyPfJSOF/M1epzMugo/qtMqbiRc1mJjoJKCLMNIxKcsLJedfO1Ct9cI63x9fx6CA/19t+oh4LFA5HfuAgP/A8eOIsnsTBrkBHXA7+v53+Q+ficTgJft9gIgA+/P9/1r342O/YA8A8k3/if+IbAN7+2/f8AAiI6H19PGoPyESTMZQPKUAHkQEN+3r9dh78/YPGUTk2wb/qAZZIugH1OHH5DjkdfbnWw2DsOxPj+xjrnx2H39unBopJGBn9s+PHv1HXjPJtH+J+B40O9a16h/wB/92j/ALrPa/wR104UyAobHlXhuo2HrEtK4qy3CwjKOuJLRHJLSkXWrFKs/gVrJVrE8TUiH8bPrP20UEu8m4hNpMJJuTOfnbUw/kUqyZgMHGjAO9+mtDsQ53sdcB6eMhnpEjhNQxRKICAgHy5+/roOdjr7c+J6O4x07dx484/n7nzw1gexBGfIPkZ/3t39uGpqc6+fP5/Ht8vGFZCzJjWpWuBxvO2yPjrtclUUK7BqmUI4fuASeyhG5FzFI0Bw4aQ0iZNoDgzvRW4qtyFkI4XmwyEk2YNnDp0sVBu3IUyy5iqH8gqKERSIRNIii67hddRJs1at01Xbx2sgzZoLu10UFJR+4V1A5cxF3FqNcLvjwcno43uuLrOxZYjujaClcb4QQfxEizpFiQyM9olcueRnjC2ZMt9iY06zL0qytrMSqSOVGsfHMaGhZ3l4lSRI2MqE74zJvRTveNFWWIh3RWw+XCAM5icKQLrCH57T17FhErSlRXnWvyZXKQwWJ3eraD14p5YuZCFgacskK2oGkVuKO5GYTHzf7DaD12cBD3DgPOIDrWw9Pn
rXPgDkpVsUDGMG+DD6E9gHXIjrYjwUPQTCXYgHPhIV974+F6E1hpC14Yzmzj56YaQEeZhXsayD1zLPW7pygxaMf81Nzu1iJsnIuDIKnaJAkPldqrHaoORZ73tMVEbFdSXT9nVgRQgnBq6j8e/HCIEATpAnH5KlmRVkFRFJwks/bqImSXJ5VFyA3N6Ikh3bCW3YHp5cowOmCfTgA+xJCnrjtwHKcLvJj2ZGcTRFj19kEhckdzgEjKnABGSSzdc1Fe5byXXGNjKdvRcw5NxvLidNZFFCxUa62KrzMaChw8hhYScFJtROAgmuLByq1MsgkZYPaVVuDe0wraRaqAdJwgRQo+YR8xTlAQNx6b49w41vXiJpCalLh1jZhyrTqRM4+jstdRmYryNkydLQRWg1LNGcWd5jIFFvCythlIySa0mNu74sKRQtaWsTmupqPItw0lE52ufpyYzrSkx6cw5bLmBEpkTsz+dt8P5QFuCRtAIkBH9MuwKHICIaDQhnojMs9mKaeGcrMxXlQtAYkdVljimRrE5MqI4zL8oSqQ6wxjodBqK05qdK3Vo3aCSVkBW7bjuC1NFJJBPaqyx6fp6pWkliYLXK2XrukkRu2CCVoSWMgsdMyySKwoLFcIGWSTUMg4IBgTcICoBhRcplMcpFkhIqQp1ClMBTmA0Zfe1zpjvHfXff65bZlzXpB3jjGTgiirmPjAfs16PHqHeQ75Wbj3xxZpOEkV3LRJJSPdomUBZISJLncV2k+8D07dxXp7xsYuTapA9UkJUYWIzNhadnWEZeCXGLQQiJi1ViHfhHL2unWh+mlORsrW0JFpEFnGVfm1mU4kq0FY3eD6corJncv6dr5NLSMNXVaTUksjTiMnaq8uFfSVuDyiJ1iZpy0LOJtpa3YfkcQ5fdozyxI2m5qqcrHN61YYmHsh6v3o9ParYmYJEtlhIx6+gUbjgD23M6oqg92YL0JyF6Bps+qDValVA9h9Lj5SZI3SHXdEQlj1wiQtLLIe6pGzjO3BlBkK1hxpblLVH5wdW0BcFKf/JwRtjsot2z8omaSdxbzzk1iEjsE0AM9rrRZNRIrVyo7dGO6E+oh8axLlJ5H5VaJKx7ePRGFbW6vUeFfHQIWPTI9Tm7HHfuhqY7E6C7JFqUzM6iZXIoncNxX7+bIVdJnTT48x3OQU1krIDW3UeixVhyISzYz6cadY5Xph6TseRNTRsTElzzBn9Vlly0TAERsdgnMYyLROjyFbg5R4ZlsGaMT4yNi2Zlq1GwjZB3jq0PsaJfA3t0jL0W0Y9xf1V41lpWckXMLaZiwxuKYPqc6LlHdkeRF+Qxswx5ASDqBVrsL+2A/N6SiCbYymV2BywJiMZj3GRRMTnL+lVyHCll3R7Szv0vqXMtQ74T+HijljIScLaEpkKCB3rqMBIi0jPs5JeOKTZMZEi5VVnouzy0k3jXjWSMlY6UcVGDxlKMVDqx91SILWSi3D2KdgYy3kP8E9X/AE1SnRXBNdNRMlefT6g7aY6giK+cPLGNg0bY68rcnpsNh9PqIBve/EcPQ3WIq2dR93xpSgk5SAZ9R6MLAOZFUkpLSUDXp6/KPpGUkmTdswlnKnwbl5ITMdGwcXJi7LKsqzUmT5tWYmkXuF9wjBvb76b7dHheazJ9RElUJOCxViuMlUJC0Gtz6PKyjLBY4qMWUe12r1xZ6lOyT6XPEBKN2CkTDOlZd02TBdTMt7Upx2knrkdCv1UKjDKn1A7XBYH6SCOOrWn5Oi/DtRiu+GleRthDL8rXdVjZlcfWrSIxVlGGGCOnH//Z"
}
}

View File

@ -20,94 +20,128 @@ BEGIN_SEARCH_RESULT = "<|begin_search_result|>"
END_SEARCH_RESULT = "<|end_search_result|>"
MAX_SEARCH_LIMIT = 6
REASON_PROMPT = (
"You are a reasoning assistant with the ability to perform dataset searches to help "
"you answer the user's question accurately. You have special tools:\n\n"
f"- To perform a search: write {BEGIN_SEARCH_QUERY} your query here {END_SEARCH_QUERY}.\n"
f"Then, the system will search and analyze relevant content, then provide you with helpful information in the format {BEGIN_SEARCH_RESULT} ...search results... {END_SEARCH_RESULT}.\n\n"
f"You can repeat the search process multiple times if necessary. The maximum number of search attempts is limited to {MAX_SEARCH_LIMIT}.\n\n"
"Once you have all the information you need, continue your reasoning.\n\n"
"-- Example 1 --\n" ########################################
"Question: \"Are both the directors of Jaws and Casino Royale from the same country?\"\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Who is the director of Jaws?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nThe director of Jaws is Steven Spielberg...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information.\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Where is Steven Spielberg from?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nSteven Allan Spielberg is an American filmmaker...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information...\n\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Who is the director of Casino Royale?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nCasino Royale is a 2006 spy film directed by Martin Campbell...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information...\n\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Where is Martin Campbell from?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nMartin Campbell (born 24 October 1943) is a New Zealand film and television director...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information...\n\n"
"Assistant:\nIt's enough to answer the question\n"
"-- Example 2 --\n" #########################################
"Question: \"When was the founder of craigslist born?\"\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Who was the founder of craigslist?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nCraigslist was founded by Craig Newmark...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information.\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY} When was Craig Newmark born?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nCraig Newmark was born on December 6, 1952...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information...\n\n"
"Assistant:\nIt's enough to answer the question\n"
"**Remember**:\n"
f"- You have a dataset to search, so you just provide a proper search query.\n"
f"- Use {BEGIN_SEARCH_QUERY} to request a dataset search and end with {END_SEARCH_QUERY}.\n"
"- The language of query MUST be as the same as 'Question' or 'search result'.\n"
"- If no helpful information can be found, rewrite the search query to be less and precise keywords.\n"
"- When done searching, continue your reasoning.\n\n"
'Please answer the following question. You should think step by step to solve it.\n\n'
)
RELEVANT_EXTRACTION_PROMPT = """**Task Instruction:**
You are tasked with reading and analyzing web pages based on the following inputs: **Previous Reasoning Steps**, **Current Search Query**, and **Searched Web Pages**. Your objective is to extract relevant and helpful information for **Current Search Query** from the **Searched Web Pages** and seamlessly integrate this information into the **Previous Reasoning Steps** to continue reasoning for the original question.
**Guidelines:**
1. **Analyze the Searched Web Pages:**
- Carefully review the content of each searched web page.
- Identify factual information that is relevant to the **Current Search Query** and can aid in the reasoning process for the original question.
2. **Extract Relevant Information:**
- Select the information from the Searched Web Pages that directly contributes to advancing the **Previous Reasoning Steps**.
- Ensure that the extracted information is accurate and relevant.
3. **Output Format:**
- **If the web pages provide helpful information for current search query:** Present the information beginning with `**Final Information**` as shown below.
- The language of query **MUST BE** as the same as 'Search Query' or 'Web Pages'.\n"
**Final Information**
[Helpful information]
- **If the web pages do not provide any helpful information for current search query:** Output the following text.
**Final Information**
No helpful information found.
**Inputs:**
- **Previous Reasoning Steps:**
{prev_reasoning}
- **Current Search Query:**
{search_query}
- **Searched Web Pages:**
{document}
"""

REASON_PROMPT = f"""You are an advanced reasoning agent. Your goal is to answer the user's question by breaking it down into a series of verifiable steps.
You have access to a powerful search tool to find information.
**Your Task:**
1. Analyze the user's question.
2. If you need information, issue a search query to find a specific fact.
3. Review the search results.
4. Repeat the search process until you have all the facts needed to answer the question.
5. Once you have gathered sufficient information, synthesize the facts and provide the final answer directly.
**Tool Usage:**
- To search, you MUST write your query between the special tokens: {BEGIN_SEARCH_QUERY}your query{END_SEARCH_QUERY}.
- The system will provide results between {BEGIN_SEARCH_RESULT}search results{END_SEARCH_RESULT}.
- You have a maximum of {MAX_SEARCH_LIMIT} search attempts.
---
**Example 1: Multi-hop Question**
**Question:** "Are both the directors of Jaws and Casino Royale from the same country?"
**Your Thought Process & Actions:**
First, I need to identify the director of Jaws.
{BEGIN_SEARCH_QUERY}who is the director of Jaws?{END_SEARCH_QUERY}
[System returns search results]
{BEGIN_SEARCH_RESULT}
Jaws is a 1975 American thriller film directed by Steven Spielberg.
{END_SEARCH_RESULT}
Okay, the director of Jaws is Steven Spielberg. Now I need to find out his nationality.
{BEGIN_SEARCH_QUERY}where is Steven Spielberg from?{END_SEARCH_QUERY}
[System returns search results]
{BEGIN_SEARCH_RESULT}
Steven Allan Spielberg is an American filmmaker. Born in Cincinnati, Ohio...
{END_SEARCH_RESULT}
So, Steven Spielberg is from the USA. Next, I need to find the director of Casino Royale.
{BEGIN_SEARCH_QUERY}who is the director of Casino Royale 2006?{END_SEARCH_QUERY}
[System returns search results]
{BEGIN_SEARCH_RESULT}
Casino Royale is a 2006 spy film directed by Martin Campbell.
{END_SEARCH_RESULT}
The director of Casino Royale is Martin Campbell. Now I need his nationality.
{BEGIN_SEARCH_QUERY}where is Martin Campbell from?{END_SEARCH_QUERY}
[System returns search results]
{BEGIN_SEARCH_RESULT}
Martin Campbell (born 24 October 1943) is a New Zealand film and television director.
{END_SEARCH_RESULT}
I have all the information. Steven Spielberg is from the USA, and Martin Campbell is from New Zealand. They are not from the same country.
Final Answer: No, the directors of Jaws and Casino Royale are not from the same country. Steven Spielberg is from the USA, and Martin Campbell is from New Zealand.
---
**Example 2: Simple Fact Retrieval**
**Question:** "When was the founder of craigslist born?"
**Your Thought Process & Actions:**
First, I need to know who founded craigslist.
{BEGIN_SEARCH_QUERY}who founded craigslist?{END_SEARCH_QUERY}
[System returns search results]
{BEGIN_SEARCH_RESULT}
Craigslist was founded in 1995 by Craig Newmark.
{END_SEARCH_RESULT}
The founder is Craig Newmark. Now I need his birth date.
{BEGIN_SEARCH_QUERY}when was Craig Newmark born?{END_SEARCH_QUERY}
[System returns search results]
{BEGIN_SEARCH_RESULT}
Craig Newmark was born on December 6, 1952.
{END_SEARCH_RESULT}
I have found the answer.
Final Answer: The founder of craigslist, Craig Newmark, was born on December 6, 1952.
---
**Important Rules:**
- **One Fact at a Time:** Decompose the problem and issue one search query at a time to find a single, specific piece of information.
- **Be Precise:** Formulate clear and precise search queries. If a search fails, rephrase it.
- **Synthesize at the End:** Do not provide the final answer until you have completed all necessary searches.
- **Language Consistency:** Your search queries should be in the same language as the user's question.
Now, begin your work. Please answer the following question by thinking step-by-step.
"""

RELEVANT_EXTRACTION_PROMPT = """You are a highly efficient information extraction module. Your sole purpose is to extract the single most relevant piece of information from the provided `Searched Web Pages` that directly answers the `Current Search Query`.
**Your Task:**
1. Read the `Current Search Query` to understand what specific information is needed.
2. Scan the `Searched Web Pages` to find the answer to that query.
3. Extract only the essential, factual information that answers the query. Be concise.
**Context (For Your Information Only):**
The `Previous Reasoning Steps` are provided to give you context on the overall goal, but your primary focus MUST be on answering the `Current Search Query`. Do not use information from the previous steps in your output.
**Output Format:**
Your response must follow one of two formats precisely.
1. **If a direct and relevant answer is found:**
- Start your response immediately with `Final Information`.
- Provide only the extracted fact(s). Do not add any extra conversational text.
*Example:*
`Current Search Query`: Where is Martin Campbell from?
`Searched Web Pages`: [Long article snippet about Martin Campbell's career, which includes the sentence "Martin Campbell (born 24 October 1943) is a New Zealand film and television director..."]
*Your Output:*
Final Information
Martin Campbell is a New Zealand film and television director.
2. **If no relevant answer that directly addresses the query is found in the web pages:**
- Start your response immediately with `Final Information`.
- Write the exact phrase: `No helpful information found.`
---
**BEGIN TASK**
**Inputs:**
- **Previous Reasoning Steps:**
{prev_reasoning}
- **Current Search Query:**
{search_query}
- **Searched Web Pages:**
{document}
"""

View File

@ -206,6 +206,8 @@ def list_docs():
desc = False
else:
desc = True
create_time_from = int(request.args.get("create_time_from", 0))
create_time_to = int(request.args.get("create_time_to", 0))
req = request.get_json()
@ -226,6 +228,14 @@ def list_docs():
try:
docs, tol = DocumentService.get_by_kb_id(kb_id, page_number, items_per_page, orderby, desc, keywords, run_status, types, suffix)
if create_time_from or create_time_to:
filtered_docs = []
for doc in docs:
doc_create_time = doc.get("create_time", 0)
if (create_time_from == 0 or doc_create_time >= create_time_from) and (create_time_to == 0 or doc_create_time <= create_time_to):
filtered_docs.append(doc)
docs = filtered_docs
for doc_item in docs:
if doc_item["thumbnail"] and not doc_item["thumbnail"].startswith(IMG_BASE64_PREFIX):
doc_item["thumbnail"] = f"/v1/document/image/{kb_id}-{doc_item['thumbnail']}"

View File

@ -247,7 +247,10 @@ def list_tags(kb_id):
code=settings.RetCode.AUTHENTICATION_ERROR
)
tags = settings.retrievaler.all_tags(current_user.id, [kb_id])
tenants = UserTenantService.get_tenants_by_user_id(current_user.id)
tags = []
for tenant in tenants:
tags += settings.retrievaler.all_tags(tenant["tenant_id"], [kb_id])
return get_json_result(data=tags)
@ -263,7 +266,10 @@ def list_tags_from_kbs():
code=settings.RetCode.AUTHENTICATION_ERROR
)
tags = settings.retrievaler.all_tags(current_user.id, kb_ids)
tenants = UserTenantService.get_tenants_by_user_id(current_user.id)
tags = []
for tenant in tenants:
tags += settings.retrievaler.all_tags(tenant["tenant_id"], kb_ids)
return get_json_result(data=tags)

View File

@ -1,4 +1,4 @@
#
#
# Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
@ -13,6 +13,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
import logging
from flask import request, jsonify
from api.db import LLMType
@ -73,11 +75,13 @@ def retrieval(tenant_id):
for c in ranks["chunks"]:
e, doc = DocumentService.get_by_id( c["doc_id"])
c.pop("vector", None)
meta = getattr(doc, 'meta_fields', {})
meta["doc_id"] = c["doc_id"]
records.append({
"content": c["content_with_weight"],
"score": c["similarity"],
"title": c["docnm_kwd"],
"metadata": doc.meta_fields
"metadata": meta
})
return jsonify({"records": records})
@ -87,4 +91,5 @@ def retrieval(tenant_id):
message='No chunk found! Check the chunk status please!',
code=settings.RetCode.NOT_FOUND
)
logging.exception(e)
return build_error_result(message=str(e), code=settings.RetCode.SERVER_ERROR)

View File

@ -38,7 +38,7 @@ from api.utils.api_utils import check_duplicate_ids, construct_json_result, get_
from rag.app.qa import beAdoc, rmPrefix
from rag.app.tag import label_question
from rag.nlp import rag_tokenizer, search
from rag.prompts import keyword_extraction, cross_languages
from rag.prompts import cross_languages, keyword_extraction
from rag.utils import rmSpace
from rag.utils.storage_factory import STORAGE_IMPL
@ -456,6 +456,18 @@ def list_docs(dataset_id, tenant_id):
required: false
default: true
description: Order in descending.
- in: query
name: create_time_from
type: integer
required: false
default: 0
description: Unix timestamp; only documents created at or after this time are returned. 0 means no filter.
- in: query
name: create_time_to
type: integer
required: false
default: 0
description: Unix timestamp; only documents created at or before this time are returned. 0 means no filter.
- in: header
name: Authorization
type: string
@ -517,6 +529,17 @@ def list_docs(dataset_id, tenant_id):
desc = True
docs, tol = DocumentService.get_list(dataset_id, page, page_size, orderby, desc, keywords, id, name)
create_time_from = int(request.args.get("create_time_from", 0))
create_time_to = int(request.args.get("create_time_to", 0))
if create_time_from or create_time_to:
filtered_docs = []
for doc in docs:
doc_create_time = doc.get("create_time", 0)
if (create_time_from == 0 or doc_create_time >= create_time_from) and (create_time_to == 0 or doc_create_time <= create_time_to):
filtered_docs.append(doc)
docs = filtered_docs
# rename key's name
renamed_doc_list = []
key_mapping = {

View File

@ -173,7 +173,8 @@ def completion(tenant_id, agent_id, session_id=None, **kwargs):
conv.message.append({"role": "assistant", "content": txt, "created_at": time.time(), "id": message_id})
conv.reference = canvas.get_reference()
conv.errors = canvas.error
API4ConversationService.append_message(conv.id, conv.to_dict())
conv = conv.to_dict()
API4ConversationService.append_message(conv["id"], conv)
def completionOpenAI(tenant_id, agent_id, question, session_id=None, stream=True, **kwargs):

View File

@ -208,12 +208,14 @@ def chat(dialog, messages, stream=True, **kwargs):
check_llm_ts = timer()
langfuse_tracer = None
trace_context = {}
langfuse_keys = TenantLangfuseService.filter_by_tenant(tenant_id=dialog.tenant_id)
if langfuse_keys:
langfuse = Langfuse(public_key=langfuse_keys.public_key, secret_key=langfuse_keys.secret_key, host=langfuse_keys.host)
if langfuse.auth_check():
langfuse_tracer = langfuse
langfuse.trace = langfuse_tracer.trace(name=f"{dialog.name}-{llm_model_config['llm_name']}")
trace_id = langfuse_tracer.create_trace_id()
trace_context = {"trace_id": trace_id}
check_langfuse_tracer_ts = timer()
kbs, embd_mdl, rerank_mdl, chat_mdl, tts_mdl = get_models(dialog)
@ -400,17 +402,19 @@ def chat(dialog, messages, stream=True, **kwargs):
f" - Token speed: {int(tk_num / (generate_result_time_cost / 1000.0))}/s"
)
langfuse_output = "\n" + re.sub(r"^.*?(### Query:.*)", r"\1", prompt, flags=re.DOTALL)
langfuse_output = {"time_elapsed:": re.sub(r"\n", " \n", langfuse_output), "created_at": time.time()}
# Add a condition check to call the end method only if langfuse_tracer exists
if langfuse_tracer and "langfuse_generation" in locals():
langfuse_generation.end(output=langfuse_output)
langfuse_output = "\n" + re.sub(r"^.*?(### Query:.*)", r"\1", prompt, flags=re.DOTALL)
langfuse_output = {"time_elapsed:": re.sub(r"\n", " \n", langfuse_output), "created_at": time.time()}
langfuse_generation.update(output=langfuse_output)
langfuse_generation.end()
return {"answer": think + answer, "reference": refs, "prompt": re.sub(r"\n", " \n", prompt), "created_at": time.time()}
if langfuse_tracer:
langfuse_generation = langfuse_tracer.trace.generation(name="chat", model=llm_model_config["llm_name"], input={"prompt": prompt, "prompt4citation": prompt4citation, "messages": msg})
langfuse_generation = langfuse_tracer.start_generation(
trace_context=trace_context, name="chat", model=llm_model_config["llm_name"], input={"prompt": prompt, "prompt4citation": prompt4citation, "messages": msg}
)
if stream:
last_ans = ""

View File

@ -217,7 +217,7 @@ class TenantLLMService(CommonService):
return list(objs)
@staticmethod
def llm_id2llm_type(llm_id: str) ->str|None:
def llm_id2llm_type(llm_id: str) -> str | None:
llm_id, *_ = TenantLLMService.split_model_name_and_factory(llm_id)
llm_factories = settings.FACTORY_LLM_INFOS
for llm_factory in llm_factories:
@ -225,6 +225,9 @@ class TenantLLMService(CommonService):
if llm_id == llm["llm_name"]:
return llm["model_type"].split(",")[-1]
for llm in LLMService.query(llm_name=llm_id):
return llm.model_type
class LLMBundle:
def __init__(self, tenant_id, llm_type, llm_name=None, lang="Chinese", **kwargs):
@ -240,13 +243,13 @@ class LLMBundle:
self.verbose_tool_use = kwargs.get("verbose_tool_use")
langfuse_keys = TenantLangfuseService.filter_by_tenant(tenant_id=tenant_id)
self.langfuse = None
if langfuse_keys:
langfuse = Langfuse(public_key=langfuse_keys.public_key, secret_key=langfuse_keys.secret_key, host=langfuse_keys.host)
if langfuse.auth_check():
self.langfuse = langfuse
self.trace = self.langfuse.trace(name=f"{self.llm_type}-{self.llm_name}")
else:
self.langfuse = None
trace_id = self.langfuse.create_trace_id()
self.trace_context = {"trace_id": trace_id}
def bind_tools(self, toolcall_session, tools):
if not self.is_tools:
@ -256,7 +259,7 @@ class LLMBundle:
def encode(self, texts: list):
if self.langfuse:
generation = self.trace.generation(name="encode", model=self.llm_name, input={"texts": texts})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="encode", model=self.llm_name, input={"texts": texts})
embeddings, used_tokens = self.mdl.encode(texts)
llm_name = getattr(self, "llm_name", None)
@ -264,13 +267,14 @@ class LLMBundle:
logging.error("LLMBundle.encode can't update token usage for {}/EMBEDDING used_tokens: {}".format(self.tenant_id, used_tokens))
if self.langfuse:
generation.end(usage_details={"total_tokens": used_tokens})
generation.update(usage_details={"total_tokens": used_tokens})
generation.end()
return embeddings, used_tokens
def encode_queries(self, query: str):
if self.langfuse:
generation = self.trace.generation(name="encode_queries", model=self.llm_name, input={"query": query})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="encode_queries", model=self.llm_name, input={"query": query})
emd, used_tokens = self.mdl.encode_queries(query)
llm_name = getattr(self, "llm_name", None)
@ -278,65 +282,70 @@ class LLMBundle:
logging.error("LLMBundle.encode_queries can't update token usage for {}/EMBEDDING used_tokens: {}".format(self.tenant_id, used_tokens))
if self.langfuse:
generation.end(usage_details={"total_tokens": used_tokens})
generation.update(usage_details={"total_tokens": used_tokens})
generation.end()
return emd, used_tokens
def similarity(self, query: str, texts: list):
if self.langfuse:
generation = self.trace.generation(name="similarity", model=self.llm_name, input={"query": query, "texts": texts})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="similarity", model=self.llm_name, input={"query": query, "texts": texts})
sim, used_tokens = self.mdl.similarity(query, texts)
if not TenantLLMService.increase_usage(self.tenant_id, self.llm_type, used_tokens):
logging.error("LLMBundle.similarity can't update token usage for {}/RERANK used_tokens: {}".format(self.tenant_id, used_tokens))
if self.langfuse:
generation.end(usage_details={"total_tokens": used_tokens})
generation.update(usage_details={"total_tokens": used_tokens})
generation.end()
return sim, used_tokens
def describe(self, image, max_tokens=300):
if self.langfuse:
generation = self.trace.generation(name="describe", metadata={"model": self.llm_name})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="describe", metadata={"model": self.llm_name})
txt, used_tokens = self.mdl.describe(image)
if not TenantLLMService.increase_usage(self.tenant_id, self.llm_type, used_tokens):
logging.error("LLMBundle.describe can't update token usage for {}/IMAGE2TEXT used_tokens: {}".format(self.tenant_id, used_tokens))
if self.langfuse:
generation.end(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.update(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.end()
return txt
def describe_with_prompt(self, image, prompt):
if self.langfuse:
generation = self.trace.generation(name="describe_with_prompt", metadata={"model": self.llm_name, "prompt": prompt})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="describe_with_prompt", metadata={"model": self.llm_name, "prompt": prompt})
txt, used_tokens = self.mdl.describe_with_prompt(image, prompt)
if not TenantLLMService.increase_usage(self.tenant_id, self.llm_type, used_tokens):
logging.error("LLMBundle.describe can't update token usage for {}/IMAGE2TEXT used_tokens: {}".format(self.tenant_id, used_tokens))
if self.langfuse:
generation.end(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.update(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.end()
return txt
def transcription(self, audio):
if self.langfuse:
generation = self.trace.generation(name="transcription", metadata={"model": self.llm_name})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="transcription", metadata={"model": self.llm_name})
txt, used_tokens = self.mdl.transcription(audio)
if not TenantLLMService.increase_usage(self.tenant_id, self.llm_type, used_tokens):
logging.error("LLMBundle.transcription can't update token usage for {}/SEQUENCE2TXT used_tokens: {}".format(self.tenant_id, used_tokens))
if self.langfuse:
generation.end(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.update(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.end()
return txt
def tts(self, text: str) -> Generator[bytes, None, None]:
if self.langfuse:
span = self.trace.span(name="tts", input={"text": text})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="tts", input={"text": text})
for chunk in self.mdl.tts(text):
if isinstance(chunk, int):
@ -346,7 +355,7 @@ class LLMBundle:
yield chunk
if self.langfuse:
span.end()
generation.end()
def _remove_reasoning_content(self, txt: str) -> str:
first_think_start = txt.find("<think>")
@ -362,9 +371,9 @@ class LLMBundle:
return txt[last_think_end + len("</think>") :]
def chat(self, system: str, history: list, gen_conf: dict={}, **kwargs) -> str:
def chat(self, system: str, history: list, gen_conf: dict = {}, **kwargs) -> str:
if self.langfuse:
generation = self.trace.generation(name="chat", model=self.llm_name, input={"system": system, "history": history})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="chat", model=self.llm_name, input={"system": system, "history": history})
chat_partial = partial(self.mdl.chat, system, history, gen_conf)
if self.is_tools and self.mdl.is_tools:
@ -380,13 +389,14 @@ class LLMBundle:
logging.error("LLMBundle.chat can't update token usage for {}/CHAT llm_name: {}, used_tokens: {}".format(self.tenant_id, self.llm_name, used_tokens))
if self.langfuse:
generation.end(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.update(output={"output": txt}, usage_details={"total_tokens": used_tokens})
generation.end()
return txt
def chat_streamly(self, system: str, history: list, gen_conf: dict={}, **kwargs):
def chat_streamly(self, system: str, history: list, gen_conf: dict = {}, **kwargs):
if self.langfuse:
generation = self.trace.generation(name="chat_streamly", model=self.llm_name, input={"system": system, "history": history})
generation = self.langfuse.start_generation(trace_context=self.trace_context, name="chat_streamly", model=self.llm_name, input={"system": system, "history": history})
ans = ""
chat_partial = partial(self.mdl.chat_streamly, system, history, gen_conf)
@ -398,7 +408,8 @@ class LLMBundle:
if isinstance(txt, int):
total_tokens = txt
if self.langfuse:
generation.end(output={"output": ans})
generation.update(output={"output": ans})
generation.end()
break
if txt.endswith("</think>"):

View File

@ -70,6 +70,7 @@ REGISTER_ENABLED = 1
# sandbox-executor-manager
SANDBOX_ENABLED = 0
SANDBOX_HOST = None
STRONG_TEST_COUNT = int(os.environ.get("STRONG_TEST_COUNT", "32"))
BUILTIN_EMBEDDING_MODELS = ["BAAI/bge-large-zh-v1.5@BAAI", "maidalun1020/bce-embedding-base_v1@Youdao"]

View File

@ -687,7 +687,13 @@ def timeout(seconds: float | int = None, attempts: int = 2, *, exception: Option
async def is_strong_enough(chat_model, embedding_model):
@timeout(30, 2)
count = settings.STRONG_TEST_COUNT
if not chat_model or not embedding_model:
return
if isinstance(count, int) and count <= 0:
return
@timeout(60, 2)
async def _is_strong_enough():
nonlocal chat_model, embedding_model
if embedding_model:
@ -701,5 +707,5 @@ async def is_strong_enough(chat_model, embedding_model):
# Pressure test for GraphRAG task
async with trio.open_nursery() as nursery:
for _ in range(32):
for _ in range(count):
nursery.start_soon(_is_strong_enough)

View File

@ -87,7 +87,7 @@ class RAGFlowPptParser:
break
texts = []
for shape in sorted(
slide.shapes, key=lambda x: ((x.top if x.top is not None else 0) // 10, x.left)):
slide.shapes, key=lambda x: ((x.top if x.top is not None else 0) // 10, x.left if x.left is not None else 0)):
try:
txt = self.__extract(shape)
if txt:
@ -96,4 +96,4 @@ class RAGFlowPptParser:
logging.exception(e)
txts.append("\n".join(texts))
return txts
return txts

298
docker/migration.sh Normal file
View File

@ -0,0 +1,298 @@
#!/bin/bash
# RAGFlow Data Migration Script
# Usage: ./migration.sh [backup|restore] [backup_folder]
#
# This script helps you backup and restore RAGFlow Docker volumes
# including MySQL, MinIO, Redis, and Elasticsearch data.
set -e # Exit on any error
# Instead, we'll handle errors manually for better debugging experience
# Default values
DEFAULT_BACKUP_FOLDER="backup"
VOLUMES=("docker_mysql_data" "docker_minio_data" "docker_redis_data" "docker_esdata01")
BACKUP_FILES=("mysql_backup.tar.gz" "minio_backup.tar.gz" "redis_backup.tar.gz" "es_backup.tar.gz")
# Function to display help information
show_help() {
echo "RAGFlow Data Migration Tool"
echo ""
echo "USAGE:"
echo " $0 <operation> [backup_folder]"
echo ""
echo "OPERATIONS:"
echo " backup - Create backup of all RAGFlow data volumes"
echo " restore - Restore RAGFlow data volumes from backup"
echo " help - Show this help message"
echo ""
echo "PARAMETERS:"
echo " backup_folder - Name of backup folder (default: '$DEFAULT_BACKUP_FOLDER')"
echo ""
echo "EXAMPLES:"
echo " $0 backup # Backup to './backup' folder"
echo " $0 backup my_backup # Backup to './my_backup' folder"
echo " $0 restore # Restore from './backup' folder"
echo " $0 restore my_backup # Restore from './my_backup' folder"
echo ""
echo "DOCKER VOLUMES:"
echo " - docker_mysql_data (MySQL database)"
echo " - docker_minio_data (MinIO object storage)"
echo " - docker_redis_data (Redis cache)"
echo " - docker_esdata01 (Elasticsearch indices)"
}
# Function to check if Docker is running
check_docker() {
if ! docker info >/dev/null 2>&1; then
echo "❌ Error: Docker is not running or not accessible"
echo "Please start Docker and try again"
exit 1
fi
}
# Function to check if volume exists
volume_exists() {
local volume_name=$1
docker volume inspect "$volume_name" >/dev/null 2>&1
}
# Function to check if any containers are using the target volumes
check_containers_using_volumes() {
echo "🔍 Checking for running containers that might be using target volumes..."
# Get all running containers
local running_containers=$(docker ps --format "{{.Names}}")
if [ -z "$running_containers" ]; then
echo "✅ No running containers found"
return 0
fi
# Check each running container for volume usage
local containers_using_volumes=()
local volume_usage_details=()
for container in $running_containers; do
# Get container's mount information
local mounts=$(docker inspect "$container" --format '{{range .Mounts}}{{.Source}}{{"|"}}{{end}}' 2>/dev/null || echo "")
# Check if any of our target volumes are used by this container
for volume in "${VOLUMES[@]}"; do
if echo "$mounts" | grep -q "$volume"; then
containers_using_volumes+=("$container")
volume_usage_details+=("$container -> $volume")
break
fi
done
done
# If any containers are using our volumes, show error and exit
if [ ${#containers_using_volumes[@]} -gt 0 ]; then
echo ""
echo "❌ ERROR: Found running containers using target volumes!"
echo ""
echo "📋 Running containers status:"
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Image}}"
echo ""
echo "🔗 Volume usage details:"
for detail in "${volume_usage_details[@]}"; do
echo " - $detail"
done
echo ""
echo "🛑 SOLUTION: Stop the containers before performing backup/restore operations:"
echo " docker-compose -f docker/<your-docker-compose-file>.yml down"
echo ""
echo "💡 After backup/restore, you can restart with:"
echo " docker-compose -f docker/<your-docker-compose-file>.yml up -d"
echo ""
exit 1
fi
echo "✅ No containers are using target volumes, safe to proceed"
return 0
}
# Function to confirm user action
confirm_action() {
local message=$1
echo -n "$message (y/N): "
read -r response
case "$response" in
[yY]|[yY][eE][sS]) return 0 ;;
*) return 1 ;;
esac
}
# Function to perform backup
perform_backup() {
local backup_folder=$1
echo "🚀 Starting RAGFlow data backup..."
echo "📁 Backup folder: $backup_folder"
echo ""
# Check if any containers are using the volumes
check_containers_using_volumes
# Create backup folder if it doesn't exist
mkdir -p "$backup_folder"
# Backup each volume
for i in "${!VOLUMES[@]}"; do
local volume="${VOLUMES[$i]}"
local backup_file="${BACKUP_FILES[$i]}"
local step=$((i + 1))
echo "📦 Step $step/4: Backing up $volume..."
if volume_exists "$volume"; then
docker run --rm \
-v "$volume":/source \
-v "$(pwd)/$backup_folder":/backup \
alpine tar czf "/backup/$backup_file" -C /source .
echo "✅ Successfully backed up $volume to $backup_folder/$backup_file"
else
echo "⚠️ Warning: Volume $volume does not exist, skipping..."
fi
echo ""
done
echo "🎉 Backup completed successfully!"
echo "📍 Backup location: $(pwd)/$backup_folder"
# List backup files with sizes
echo ""
echo "📋 Backup files created:"
for backup_file in "${BACKUP_FILES[@]}"; do
if [ -f "$backup_folder/$backup_file" ]; then
local size=$(ls -lh "$backup_folder/$backup_file" | awk '{print $5}')
echo " - $backup_file ($size)"
fi
done
}
# Function to perform restore
perform_restore() {
local backup_folder=$1
echo "🔄 Starting RAGFlow data restore..."
echo "📁 Backup folder: $backup_folder"
echo ""
# Check if any containers are using the volumes
check_containers_using_volumes
# Check if backup folder exists
if [ ! -d "$backup_folder" ]; then
echo "❌ Error: Backup folder '$backup_folder' does not exist"
exit 1
fi
# Check if all backup files exist
local missing_files=()
for backup_file in "${BACKUP_FILES[@]}"; do
if [ ! -f "$backup_folder/$backup_file" ]; then
missing_files+=("$backup_file")
fi
done
if [ ${#missing_files[@]} -gt 0 ]; then
echo "❌ Error: Missing backup files:"
for file in "${missing_files[@]}"; do
echo " - $file"
done
echo "Please ensure all backup files are present in '$backup_folder'"
exit 1
fi
# Check for existing volumes and warn user
local existing_volumes=()
for volume in "${VOLUMES[@]}"; do
if volume_exists "$volume"; then
existing_volumes+=("$volume")
fi
done
if [ ${#existing_volumes[@]} -gt 0 ]; then
echo "⚠️ WARNING: The following Docker volumes already exist:"
for volume in "${existing_volumes[@]}"; do
echo " - $volume"
done
echo ""
echo "🔴 IMPORTANT: Restoring will OVERWRITE existing data!"
echo "💡 Recommendation: Create a backup of your current data first:"
echo " $0 backup current_backup_$(date +%Y%m%d_%H%M%S)"
echo ""
if ! confirm_action "Do you want to continue with the restore operation?"; then
echo "❌ Restore operation cancelled by user"
exit 0
fi
fi
# Create volumes and restore data
for i in "${!VOLUMES[@]}"; do
local volume="${VOLUMES[$i]}"
local backup_file="${BACKUP_FILES[$i]}"
local step=$((i + 1))
echo "🔧 Step $step/4: Restoring $volume..."
# Create volume if it doesn't exist
if ! volume_exists "$volume"; then
echo " 📋 Creating Docker volume: $volume"
docker volume create "$volume"
else
echo " 📋 Using existing Docker volume: $volume"
fi
# Restore data
echo " 📥 Restoring data from $backup_file..."
docker run --rm \
-v "$volume":/target \
-v "$(pwd)/$backup_folder":/backup \
alpine tar xzf "/backup/$backup_file" -C /target
echo "✅ Successfully restored $volume"
echo ""
done
echo "🎉 Restore completed successfully!"
echo "💡 You can now start your RAGFlow services"
}
# Main script logic
main() {
# Check if Docker is available
check_docker
# Parse command line arguments
local operation=${1:-}
local backup_folder=${2:-$DEFAULT_BACKUP_FOLDER}
# Handle help or no arguments
if [ -z "$operation" ] || [ "$operation" = "help" ] || [ "$operation" = "-h" ] || [ "$operation" = "--help" ]; then
show_help
exit 0
fi
# Validate operation
case "$operation" in
backup)
perform_backup "$backup_folder"
;;
restore)
perform_restore "$backup_folder"
;;
*)
echo "❌ Error: Invalid operation '$operation'"
echo ""
show_help
exit 1
;;
esac
}
# Run main function with all arguments
main "$@"

View File

@ -82,7 +82,7 @@ An integer specifying the number of previous dialogue rounds to input into the L
This feature is used for multi-turn dialogue *only*.
:::
### Max retrieves
### Max retries
Defines the maximum number of attempts the agent will make to retry a failed task or operation before stopping or reporting failure.
@ -94,6 +94,10 @@ The waiting period in seconds that the agent observes before retrying a failed t
Defines the maximum number of reflection rounds for the selected chat model. Defaults to 5 rounds.
:::tip NOTE
You can set the value to 1 to shorten your agent's response time.
:::
### Output
The global variable name for the output of the **Agent** component, which can be referenced by other components in the workflow.

View File

@ -0,0 +1,108 @@
# Data Migration Guide
A common scenario is processing large datasets on a powerful instance (e.g., with a GPU) and then migrating the entire RAGFlow service to a different production environment (e.g., a CPU-only server). This guide explains how to safely back up and restore your data using our provided migration script.
## Identifying Your Data
By default, RAGFlow uses Docker volumes to store all persistent data, including your database, uploaded files, and search indexes. You can see these volumes by running:
```bash
docker volume ls
```
The output will look similar to this:
```text
DRIVER VOLUME NAME
local docker_esdata01
local docker_minio_data
local docker_mysql_data
local docker_redis_data
```
These volumes contain all the data you need to migrate.
## Step 1: Stop RAGFlow Services
Before starting the migration, you must stop all running RAGFlow services on the **source machine**. Navigate to the project's root directory and run:
```bash
docker-compose -f docker/docker-compose.yml down
```
**Important:** Do **not** use the `-v` flag (e.g., `docker-compose down -v`), as this will delete all your data volumes. The migration script checks for running containers that use these volumes and exits with an error if it finds any.
## Step 2: Back Up Your Data
We provide a convenient script to package all your data volumes into a single backup folder.
For a quick reference of the script's commands and options, you can run:
```bash
bash docker/migration.sh help
```
To create a backup, run the following command from the project's root directory:
```bash
bash docker/migration.sh backup
```
This will create a `backup/` folder in your project root containing compressed archives of your data volumes.
You can also specify a custom name for your backup folder:
```bash
bash docker/migration.sh backup my_ragflow_backup
```
This will create a folder named `my_ragflow_backup/` instead.
## Step 3: Transfer the Backup Folder
Copy the entire backup folder (e.g., `backup/` or `my_ragflow_backup/`) from your source machine to the RAGFlow project directory on your **target machine**. You can use tools like `scp`, `rsync`, or a physical drive for the transfer.
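Before and after the transfer, it can be worth confirming that the folder is complete, because the backup step skips volumes that do not exist while the restore step refuses to run if any archive is missing. The sketch below is one way to check this; it assumes the four archive names defined in `docker/migration.sh`, and `missing_archives` is a hypothetical helper, not part of the script.

```python
from pathlib import Path

# Archive names as listed in BACKUP_FILES in docker/migration.sh.
EXPECTED_ARCHIVES = (
    "mysql_backup.tar.gz",
    "minio_backup.tar.gz",
    "redis_backup.tar.gz",
    "es_backup.tar.gz",
)


def missing_archives(backup_folder):
    """Return the expected archive names that are absent from backup_folder."""
    folder = Path(backup_folder)
    return [name for name in EXPECTED_ARCHIVES if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_archives("backup")
    print("All archives present" if not missing else f"Missing: {missing}")
```

An empty result means the folder has everything the restore step looks for.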
## Step 4: Restore Your Data
On the **target machine**, ensure that RAGFlow services are not running. Then, use the migration script to restore your data from the backup folder.
If your backup folder is named `backup/`, run:
```bash
bash docker/migration.sh restore
```
If you used a custom name, specify it in the command:
```bash
bash docker/migration.sh restore my_ragflow_backup
```
The script will automatically create the necessary Docker volumes and unpack the data.
**Note:** If the script detects that Docker volumes with the same names already exist on the target machine, it will warn you that restoring will overwrite the existing data and ask for confirmation before proceeding.
## Step 5: Start RAGFlow Services
Once the restore process is complete, you can start the RAGFlow services on your new machine:
```bash
docker-compose -f docker/docker-compose.yml up -d
```
**Note:** If you have already run RAGFlow with docker-compose on the target machine, you may need to back up that machine's existing data as described in this guide, and then run:
```bash
# Back up first with `bash docker/migration.sh backup backup_dir_name` before running the next line.
# WARNING: the -v flag deletes the existing Docker volumes.
docker-compose -f docker/docker-compose.yml down -v
docker-compose -f docker/docker-compose.yml up -d
```
Your RAGFlow instance is now running with all the data from your original machine.

View File

@ -1118,14 +1118,14 @@ Failure:
### List documents
**GET** `/api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}`
**GET** `/api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}&create_time_from={timestamp}&create_time_to={timestamp}`
Lists documents in a specified dataset.
#### Request
- Method: GET
- URL: `/api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}`
- URL: `/api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}&create_time_from={timestamp}&create_time_to={timestamp}`
- Headers:
- `'content-Type: application/json'`
- `'Authorization: Bearer <YOUR_API_KEY>'`
@ -1134,7 +1134,7 @@ Lists documents in a specified dataset.
```bash
curl --request GET \
--url http://{address}/api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name} \
--url http://{address}/api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}&create_time_from={timestamp}&create_time_to={timestamp} \
--header 'Authorization: Bearer <YOUR_API_KEY>'
```
@ -1156,6 +1156,10 @@ curl --request GET \
Indicates whether the retrieved documents should be sorted in descending order. Defaults to `true`.
- `id`: (*Filter parameter*), `string`
The ID of the document to retrieve.
- `create_time_from`: (*Filter parameter*), `integer`
Unix timestamp; only documents created at or after this time are returned. `0` means no filter. Defaults to `0`.
- `create_time_to`: (*Filter parameter*), `integer`
Unix timestamp; only documents created at or before this time are returned. `0` means no filter. Defaults to `0`.
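As an illustration of how the two time filters combine with the other query parameters, the sketch below builds the request URL with the standard library only. The helper name `list_documents_url` and the sample address are assumptions for this example; the timestamps must use the same unit as the stored `create_time` values.

```python
from urllib.parse import urlencode


def list_documents_url(address, dataset_id, create_time_from=0, create_time_to=0, **params):
    """Build the List documents URL; a bound of 0 is omitted (no filter)."""
    query = dict(params)
    if create_time_from:
        query["create_time_from"] = int(create_time_from)
    if create_time_to:
        query["create_time_to"] = int(create_time_to)
    base = f"http://{address}/api/v1/datasets/{dataset_id}/documents"
    return f"{base}?{urlencode(query)}" if query else base


if __name__ == "__main__":
    # Hypothetical address, dataset ID, and time window.
    print(list_documents_url("localhost:9380", "my_dataset",
                             create_time_from=1, create_time_to=2, page=1))
```

The request still needs the `Authorization: Bearer <YOUR_API_KEY>` header shown in the curl example.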
#### Response

View File

@ -507,7 +507,16 @@ print(doc)
### List documents
```python
Dataset.list_documents(id:str =None, keywords: str=None, page: int=1, page_size:int = 30, order_by:str = "create_time", desc: bool = True) -> list[Document]
Dataset.list_documents(
id: str = None,
keywords: str = None,
page: int = 1,
page_size: int = 30,
order_by: str = "create_time",
desc: bool = True,
create_time_from: int = 0,
create_time_to: int = 0
) -> list[Document]
```
Lists documents in the current dataset.
@ -541,6 +550,12 @@ The field by which documents should be sorted. Available options:
Indicates whether the retrieved documents should be sorted in descending order. Defaults to `True`.
##### create_time_from: `int`
Unix timestamp for filtering documents created after this time. `0` means no filter. Defaults to `0`.
##### create_time_to: `int`
Unix timestamp for filtering documents created before this time. `0` means no filter. Defaults to `0`.
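
Because `0` doubles as the "no filter" value, a caller that works with `datetime` objects needs to map a missing bound to `0`. The sketch below shows one way to do that. `time_window_kwargs` is a hypothetical helper, not part of the SDK, and it assumes the timestamps are expressed in seconds.

```python
from datetime import datetime, timezone
from typing import Optional


def time_window_kwargs(start: Optional[datetime] = None, end: Optional[datetime] = None) -> dict:
    """Translate optional datetimes into create_time_from / create_time_to values."""

    def to_ts(moment: Optional[datetime]) -> int:
        # A missing bound maps to 0, the documented "no filter" value.
        return int(moment.timestamp()) if moment is not None else 0

    return {"create_time_from": to_ts(start), "create_time_to": to_ts(end)}


# Only a lower bound: everything created after 1 August 2025 (UTC).
kwargs = time_window_kwargs(start=datetime(2025, 8, 1, tzinfo=timezone.utc))
print(kwargs)

# With a real dataset object (not constructed here), the call would look like:
# docs = dataset.list_documents(page=1, page_size=30, **kwargs)
```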
#### Returns
- Success: A list of `Document` objects.

@ -22,6 +22,39 @@ The embedding models included in a full edition are:
These two embedding models are optimized specifically for English and Chinese, so performance may be compromised if you use them to embed documents in other languages.
:::
## v0.20.0
Released on August 4, 2025.
### Compatibility changes
From v0.20.0 onwards, Agents are no longer compatible with earlier versions, and all existing Agents from previous versions must be rebuilt following the upgrade.
### New features
- Unified orchestration of both Agents and Workflows.
- A comprehensive refactor of the Agent, greatly enhancing its capabilities and usability, with support for Multi-Agent configurations, planning and reflection, and visual functionalities.
- Fully implemented MCP functionality, allowing for MCP Server import, Agents functioning as MCP Clients, and RAGFlow itself operating as an MCP Server.
- Access to runtime logs for Agents.
- Chat histories with Agents available through the management panel.
- Integration of a new, more robust version of Infinity, enabling the auto-tagging functionality with Infinity as the underlying document engine.
- An OpenAI-compatible API that supports file reference information.
- Support for new embedding models, including Kimi K2, Grok 4, and Voyage.
- RAGFlow's codebase is now mirrored on Gitee.
- Introduction of a new model provider, Gitee AI.
### New agent templates introduced
- Multi-Agent based Deep Research: Collaborative Agent teamwork led by a Lead Agent with multiple Subagents, distinct from traditional workflow orchestration.
- An intelligent Q&A chatbot leveraging internal knowledge bases, designed for customer service and training scenarios.
- A resume analysis template used by the RAGFlow team to screen, analyze, and record candidate information.
- A blog generation workflow that transforms raw ideas into SEO-friendly blog content.
- An intelligent customer service workflow.
- A user feedback analysis template that directs user feedback to appropriate teams through semantic analysis.
- Trip Planner: Uses web search and map MCP servers to assist with travel planning.
- Image Lingo: Translates content from uploaded photos.
- An information search assistant that retrieves answers from both internal knowledge bases and the web.
## v0.19.1
Released on June 23, 2025.

@ -47,7 +47,7 @@ class Extractor:
self._language = language
self._entity_types = entity_types or DEFAULT_ENTITY_TYPES
@timeout(60*3)
@timeout(60*5)
def _chat(self, system, history, gen_conf={}):
hist = deepcopy(history)
conf = deepcopy(gen_conf)

@ -42,7 +42,7 @@ class Ppt(PptParser):
try:
with BytesIO() as buffered:
slide.get_thumbnail(
0.5, 0.5).save(
0.1, 0.1).save(
buffered, drawing.imaging.ImageFormat.jpeg)
buffered.seek(0)
imgs.append(Image.open(buffered).copy())

@ -1075,6 +1075,9 @@ class GeminiChat(Base):
for k in list(gen_conf.keys()):
if k not in ["temperature", "top_p", "max_tokens"]:
del gen_conf[k]
# if max_tokens exists, rename it to max_output_tokens to match Gemini's API
if k == "max_tokens":
gen_conf["max_output_tokens"] = gen_conf.pop("max_tokens")
return gen_conf
def _chat(self, history, gen_conf={}, **kwargs):
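
The intent of the hunk above (keep only the supported generation settings, then rename `max_tokens` to `max_output_tokens`) can be sketched in isolation. This is an illustration under assumptions, not the repository's method: `clean_gen_conf` and `ALLOWED_KEYS` are hypothetical names, and the sketch builds a new dictionary instead of mutating the argument in place.

```python
ALLOWED_KEYS = ("temperature", "top_p", "max_tokens")


def clean_gen_conf(gen_conf: dict) -> dict:
    """Keep only supported keys and rename max_tokens to max_output_tokens."""
    # Filter first, so unsupported keys never reach the rename step.
    cleaned = {k: v for k, v in gen_conf.items() if k in ALLOWED_KEYS}
    # Rename once, outside any loop over the keys.
    if "max_tokens" in cleaned:
        cleaned["max_output_tokens"] = cleaned.pop("max_tokens")
    return cleaned


result = clean_gen_conf({"temperature": 0.2, "max_tokens": 512, "presence_penalty": 0.4})
print(result)
```

Doing the rename after the filtering pass avoids changing the dictionary while its keys are being iterated.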

@ -59,6 +59,10 @@ class Base(ABC):
def _image_prompt(self, text, images):
if not images:
return text
if isinstance(images, str):
images = [images]
pmpt = [{"type": "text", "text": text}]
for img in images:
pmpt.append({
@ -518,6 +522,7 @@ class GeminiCV(Base):
def chat_streamly(self, system, history, gen_conf, images=[]):
from transformers import GenerationConfig
ans = ""
response = None
try:
response = self.model.generate_content(
self._form_history(system, history, images),
@ -533,8 +538,10 @@ class GeminiCV(Base):
except Exception as e:
yield ans + "\n**ERROR**: " + str(e)
yield response._chunks[-1].usage_metadata.total_token_count
if response and hasattr(response, "usage_metadata") and hasattr(response.usage_metadata, "total_token_count"):
yield response.usage_metadata.total_token_count
else:
yield 0
class NvidiaCV(Base):
_FACTORY_NAME = "NVIDIA"
@ -795,4 +802,4 @@ class GoogleCV(AnthropicCV, GeminiCV):
yield ans
else:
for ans in GeminiCV.chat_streamly(self, system, history, gen_conf, images):
yield ans
yield ans
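
The `response = None` guard in `GeminiCV.chat_streamly` addresses a general pattern: a name assigned only inside `try` is unbound if the call raises, so reading it afterwards fails. A minimal sketch of the same pattern follows. `stream_with_usage`, the `chunks` attribute, and the two stand-in callables are hypothetical and do not reflect the real Gemini client.

```python
from types import SimpleNamespace


def stream_with_usage(generate):
    """Yield the growing answer, then a final token count (0 if unavailable)."""
    ans = ""
    response = None  # bound before the try block, so it can always be read later
    try:
        response = generate()
        for chunk in response.chunks:
            ans += chunk
            yield ans
    except Exception as e:
        yield ans + "\n**ERROR**: " + str(e)
    # getattr with a default tolerates both a None response and missing metadata.
    usage = getattr(response, "usage_metadata", None)
    yield getattr(usage, "total_token_count", 0)


def failing_call():
    raise RuntimeError("backend unavailable")


def working_call():
    return SimpleNamespace(
        chunks=["Hel", "lo"],
        usage_metadata=SimpleNamespace(total_token_count=7),
    )


print(list(stream_with_usage(failing_call)))
print(list(stream_with_usage(working_call)))
```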

@ -634,6 +634,17 @@ def concat_img(img1, img2):
return img2
if not img1 and not img2:
return None
if img1 is img2:
return img1
if isinstance(img1, Image.Image) and isinstance(img2, Image.Image):
pixel_data1 = img1.tobytes()
pixel_data2 = img2.tobytes()
if pixel_data1 == pixel_data2:
img2.close()
return img1
width1, height1 = img1.size
width2, height2 = img2.size
@ -643,7 +654,8 @@ def concat_img(img1, img2):
new_image.paste(img1, (0, 0))
new_image.paste(img2, (0, height1))
img1.close()
img2.close()
return new_image

@ -227,9 +227,20 @@ class RedisDB:
"""https://redis.io/docs/latest/commands/xreadgroup/"""
for _ in range(3):
try:
group_info = self.REDIS.xinfo_groups(queue_name)
if not any(gi["name"] == group_name for gi in group_info):
self.REDIS.xgroup_create(queue_name, group_name, id="0", mkstream=True)
try:
group_info = self.REDIS.xinfo_groups(queue_name)
if not any(gi["name"] == group_name for gi in group_info):
self.REDIS.xgroup_create(queue_name, group_name, id="0", mkstream=True)
except redis.exceptions.ResponseError as e:
if "no such key" in str(e).lower():
self.REDIS.xgroup_create(queue_name, group_name, id="0", mkstream=True)
elif "busygroup" in str(e).lower():
logging.warning("Group already exists, continue.")
pass
else:
raise
args = {
"groupname": group_name,
"consumername": consumer_name,
@ -338,8 +349,8 @@ class RedisDB:
logging.warning("RedisDB.delete " + str(key) + " got exception: " + str(e))
self.__open__()
return False
REDIS_CONN = RedisDB()
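
The group-creation logic in the hunk above can be exercised without a Redis server by using a stub. Everything in this sketch is an assumption made for illustration: `FakeRedis` is a hypothetical in-memory stand-in that imitates only the two stream calls involved, the local `ResponseError` stands in for `redis.exceptions.ResponseError`, and `ensure_group` is not a function from the repository.

```python
import logging


class ResponseError(Exception):
    """Stand-in for redis.exceptions.ResponseError so the sketch runs offline."""


class FakeRedis:
    """Hypothetical in-memory stub for the two stream calls used below."""

    def __init__(self):
        self.groups = {}  # stream name -> set of consumer group names

    def xinfo_groups(self, name):
        if name not in self.groups:
            raise ResponseError("ERR no such key")
        return [{"name": g} for g in sorted(self.groups[name])]

    def xgroup_create(self, name, groupname, id="0", mkstream=False):
        if name not in self.groups and not mkstream:
            raise ResponseError("ERR no such key")
        existing = self.groups.setdefault(name, set())
        if groupname in existing:
            raise ResponseError("BUSYGROUP Consumer Group name already exists")
        existing.add(groupname)


def ensure_group(client, queue_name, group_name):
    """Create the stream and consumer group if either is missing."""
    try:
        groups = client.xinfo_groups(queue_name)
        if not any(g["name"] == group_name for g in groups):
            client.xgroup_create(queue_name, group_name, id="0", mkstream=True)
    except ResponseError as e:
        message = str(e).lower()
        if "no such key" in message:
            # The stream does not exist yet; mkstream=True creates it with the group.
            client.xgroup_create(queue_name, group_name, id="0", mkstream=True)
        elif "busygroup" in message:
            logging.warning("Group already exists, continue.")
        else:
            raise


client = FakeRedis()
ensure_group(client, "task_queue", "workers")  # stream missing: created with the group
ensure_group(client, "task_queue", "workers")  # second call changes nothing
print(client.groups)
```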

@ -123,7 +123,7 @@ class RAGFlowS3:
@use_prefix_path
@use_default_bucket
def put(self, bucket, fnm, binary):
def put(self, bucket, fnm, binary, **kwargs):
logging.debug(f"bucket name {bucket}; filename :{fnm}:")
for _ in range(1):
try:
@ -140,7 +140,7 @@ class RAGFlowS3:
@use_prefix_path
@use_default_bucket
def rm(self, bucket, fnm):
def rm(self, bucket, fnm, **kwargs):
try:
self.conn.delete_object(Bucket=bucket, Key=fnm)
except Exception:
@ -148,7 +148,7 @@ class RAGFlowS3:
@use_prefix_path
@use_default_bucket
def get(self, bucket, fnm):
def get(self, bucket, fnm, **kwargs):
for _ in range(1):
try:
r = self.conn.get_object(Bucket=bucket, Key=fnm)
@ -162,7 +162,7 @@ class RAGFlowS3:
@use_prefix_path
@use_default_bucket
def obj_exist(self, bucket, fnm):
def obj_exist(self, bucket, fnm, **kwargs):
try:
if self.conn.head_object(Bucket=bucket, Key=fnm):
return True
@ -174,7 +174,7 @@ class RAGFlowS3:
@use_prefix_path
@use_default_bucket
def get_presigned_url(self, bucket, fnm, expires):
def get_presigned_url(self, bucket, fnm, expires, **kwargs):
for _ in range(10):
try:
r = self.conn.generate_presigned_url('get_object',

@ -63,8 +63,30 @@ class DataSet(Base):
return doc_list
raise Exception(res.get("message"))
def list_documents(self, id: str | None = None, name: str | None = None, keywords: str | None = None, page: int = 1, page_size: int = 30, orderby: str = "create_time", desc: bool = True):
res = self.get(f"/datasets/{self.id}/documents", params={"id": id, "name": name, "keywords": keywords, "page": page, "page_size": page_size, "orderby": orderby, "desc": desc})
def list_documents(
self,
id: str | None = None,
name: str | None = None,
keywords: str | None = None,
page: int = 1,
page_size: int = 30,
orderby: str = "create_time",
desc: bool = True,
create_time_from: int = 0,
create_time_to: int = 0,
):
params = {
"id": id,
"name": name,
"keywords": keywords,
"page": page,
"page_size": page_size,
"orderby": orderby,
"desc": desc,
"create_time_from": create_time_from,
"create_time_to": create_time_to,
}
res = self.get(f"/datasets/{self.id}/documents", params=params)
res = res.json()
documents = []
if res.get("code") == 0:

@ -1,9 +1,13 @@
import { DocumentParserType } from '@/constants/knowledge';
import { useTranslate } from '@/hooks/common-hooks';
import { useFetchKnowledgeList } from '@/hooks/knowledge-hooks';
import { useBuildQueryVariableOptions } from '@/pages/agent/hooks/use-get-begin-query';
import { UserOutlined } from '@ant-design/icons';
import { Avatar as AntAvatar, Form, Select, Space } from 'antd';
import { toLower } from 'lodash';
import { useMemo } from 'react';
import { useFormContext } from 'react-hook-form';
import { useTranslation } from 'react-i18next';
import { RAGFlowAvatar } from './ragflow-avatar';
import { FormControl, FormField, FormItem, FormLabel } from './ui/form';
import { MultiSelect } from './ui/multi-select';
@ -66,9 +70,13 @@ const KnowledgeBaseItem = ({
export default KnowledgeBaseItem;
export function KnowledgeBaseFormField() {
export function KnowledgeBaseFormField({
showVariable = false,
}: {
showVariable?: boolean;
}) {
const form = useFormContext();
const { t } = useTranslate('chat');
const { t } = useTranslation();
const { list: knowledgeList } = useFetchKnowledgeList(true);
@ -76,6 +84,8 @@ export function KnowledgeBaseFormField() {
(x) => x.parser_id !== DocumentParserType.Tag,
);
const nextOptions = useBuildQueryVariableOptions();
const knowledgeOptions = filteredKnowledgeList.map((x) => ({
label: x.name,
value: x.id,
@ -84,18 +94,48 @@ export function KnowledgeBaseFormField() {
),
}));
const options = useMemo(() => {
if (showVariable) {
return [
{
label: t('knowledgeDetails.dataset'),
options: knowledgeOptions,
},
...nextOptions.map((x) => {
return {
...x,
options: x.options
.filter((y) => toLower(y.type).includes('string'))
.map((x) => ({
...x,
icon: () => (
<RAGFlowAvatar
className="size-4 mr-2"
avatar={x.label}
name={x.label}
/>
),
})),
};
}),
];
}
return knowledgeOptions;
}, [knowledgeOptions, nextOptions, showVariable, t]);
return (
<FormField
control={form.control}
name="kb_ids"
render={({ field }) => (
<FormItem>
<FormLabel>{t('knowledgeBases')}</FormLabel>
<FormLabel>{t('chat.knowledgeBases')}</FormLabel>
<FormControl>
<MultiSelect
options={knowledgeOptions}
options={options}
onValueChange={field.onChange}
placeholder={t('knowledgeBasesMessage')}
placeholder={t('chat.knowledgeBasesMessage')}
variant="inverted"
maxCount={100}
defaultValue={field.value}

@ -28,8 +28,11 @@ function AccordionItem({
function AccordionTrigger({
className,
children,
hideDownIcon = false,
...props
}: React.ComponentProps<typeof AccordionPrimitive.Trigger>) {
}: React.ComponentProps<typeof AccordionPrimitive.Trigger> & {
hideDownIcon?: boolean;
}) {
return (
<AccordionPrimitive.Header className="flex">
<AccordionPrimitive.Trigger
@ -41,7 +44,9 @@ function AccordionTrigger({
{...props}
>
{children}
<ChevronDownIcon className="text-muted-foreground pointer-events-none size-4 shrink-0 translate-y-0.5 transition-transform duration-200" />
{!hideDownIcon && (
<ChevronDownIcon className="text-muted-foreground pointer-events-none size-4 shrink-0 translate-y-0.5 transition-transform duration-200" />
)}
</AccordionPrimitive.Trigger>
</AccordionPrimitive.Header>
);

@ -1,3 +1,4 @@
// https://github.com/sersavan/shadcn-multi-select-component
// src/components/multi-select.tsx
import { cva, type VariantProps } from 'class-variance-authority';
@ -29,6 +30,51 @@ import {
import { Separator } from '@/components/ui/separator';
import { cn } from '@/lib/utils';
export type MultiSelectOptionType = {
label: React.ReactNode;
value: string;
disabled?: boolean;
icon?: React.ComponentType<{ className?: string }>;
};
export type MultiSelectGroupOptionType = {
label: React.ReactNode;
options: MultiSelectOptionType[];
};
function MultiCommandItem({
option,
isSelected,
toggleOption,
}: {
option: MultiSelectOptionType;
isSelected: boolean;
toggleOption(value: string): void;
}) {
return (
<CommandItem
key={option.value}
onSelect={() => toggleOption(option.value)}
className="cursor-pointer"
>
<div
className={cn(
'mr-2 flex h-4 w-4 items-center justify-center rounded-sm border border-primary',
isSelected
? 'bg-primary text-primary-foreground'
: 'opacity-50 [&_svg]:invisible',
)}
>
<CheckIcon className="h-4 w-4" />
</div>
{option.icon && (
<option.icon className="mr-2 h-4 w-4 text-muted-foreground" />
)}
<span>{option.label}</span>
</CommandItem>
);
}
/**
* Variants for the multi-select component to handle different styles.
* Uses class-variance-authority (cva) to define different styles based on "variant" prop.
@ -63,14 +109,7 @@ interface MultiSelectProps
* An array of option objects to be displayed in the multi-select component.
* Each option object has a label, value, and an optional icon.
*/
options: {
/** The text to display for the option. */
label: string;
/** The unique value associated with the option. */
value: string;
/** Optional icon component to display alongside the option. */
icon?: React.ComponentType<{ className?: string }>;
}[];
options: (MultiSelectGroupOptionType | MultiSelectOptionType)[];
/**
* Callback function triggered when the selected values change.
@ -144,6 +183,11 @@ export const MultiSelect = React.forwardRef<
const [isPopoverOpen, setIsPopoverOpen] = React.useState(false);
const [isAnimating, setIsAnimating] = React.useState(false);
const flatOptions = React.useMemo(() => {
return options.flatMap((option) =>
'options' in option ? option.options : [option],
);
}, [options]);
const handleInputKeyDown = (
event: React.KeyboardEvent<HTMLInputElement>,
) => {
@ -181,10 +225,10 @@ export const MultiSelect = React.forwardRef<
};
const toggleAll = () => {
if (selectedValues.length === options.length) {
if (selectedValues.length === flatOptions.length) {
handleClear();
} else {
const allValues = options.map((option) => option.value);
const allValues = flatOptions.map((option) => option.value);
setSelectedValues(allValues);
onValueChange(allValues);
}
@ -210,7 +254,7 @@ export const MultiSelect = React.forwardRef<
<div className="flex justify-between items-center w-full">
<div className="flex flex-wrap items-center">
{selectedValues?.slice(0, maxCount)?.map((value) => {
const option = options.find((o) => o.value === value);
const option = flatOptions.find((o) => o.value === value);
const IconComponent = option?.icon;
return (
<Badge
@ -304,7 +348,7 @@ export const MultiSelect = React.forwardRef<
<div
className={cn(
'mr-2 flex h-4 w-4 items-center justify-center rounded-sm border border-primary',
selectedValues.length === options.length
selectedValues.length === flatOptions.length
? 'bg-primary text-primary-foreground'
: 'opacity-50 [&_svg]:invisible',
)}
@ -313,32 +357,38 @@ export const MultiSelect = React.forwardRef<
</div>
<span>(Select All)</span>
</CommandItem>
{options.map((option) => {
const isSelected = selectedValues.includes(option.value);
return (
<CommandItem
key={option.value}
onSelect={() => toggleOption(option.value)}
className="cursor-pointer"
>
<div
className={cn(
'mr-2 flex h-4 w-4 items-center justify-center rounded-sm border border-primary',
isSelected
? 'bg-primary text-primary-foreground'
: 'opacity-50 [&_svg]:invisible',
)}
>
<CheckIcon className="h-4 w-4" />
</div>
{option.icon && (
<option.icon className="mr-2 h-4 w-4 text-muted-foreground" />
)}
<span>{option.label}</span>
</CommandItem>
);
})}
{!options.some((x) => 'options' in x) &&
(options as unknown as MultiSelectOptionType[]).map(
(option) => {
const isSelected = selectedValues.includes(option.value);
return (
<MultiCommandItem
option={option}
key={option.value}
isSelected={isSelected}
toggleOption={toggleOption}
></MultiCommandItem>
);
},
)}
</CommandGroup>
{options.every((x) => 'options' in x) &&
options.map((x, idx) => (
<CommandGroup heading={x.label} key={idx}>
{x.options.map((option) => {
const isSelected = selectedValues.includes(option.value);
return (
<MultiCommandItem
option={option}
key={option.value}
isSelected={isSelected}
toggleOption={toggleOption}
></MultiCommandItem>
);
})}
</CommandGroup>
))}
<CommandSeparator />
<CommandGroup>
<div className="flex items-center justify-between">

@ -353,7 +353,12 @@ export const useHandleMessageInputChange = () => {
export const useSelectDerivedMessages = () => {
const [derivedMessages, setDerivedMessages] = useState<IMessage[]>([]);
const ref = useScrollToBottom(derivedMessages);
const messageContainerRef = useRef<HTMLDivElement>(null);
const { scrollRef, scrollToBottom } = useScrollToBottom(
derivedMessages,
messageContainerRef,
);
const addNewestQuestion = useCallback(
(message: Message, answer: string = '') => {
@ -492,7 +497,8 @@ export const useSelectDerivedMessages = () => {
}, [setDerivedMessages]);
return {
ref,
scrollRef,
messageContainerRef,
derivedMessages,
setDerivedMessages,
addNewestQuestion,
@ -503,6 +509,7 @@ export const useSelectDerivedMessages = () => {
addNewestOneAnswer,
removeMessagesAfterCurrentMessage,
removeAllMessages,
scrollToBottom,
};
};

@ -39,6 +39,13 @@ export default {
nextPage: 'Next',
add: 'Add',
promptPlaceholder: `Please input or use / to quickly insert variables.`,
mcp: {
namePlaceholder: 'My MCP Server',
nameRequired:
'It must be 164 characters long and can only contain letters, numbers, hyphens, and underscores.',
urlPlaceholder: 'https://api.example.com/v1/mcp',
tokenPlaceholder: 'e.g. eyJhbGciOiJIUzI1Ni...',
},
},
login: {
login: 'Sign in',
@ -1322,6 +1329,7 @@ This delimiter is used to split the input text into several text pieces echo of
logTimeline: {
begin: 'Ready to begin',
agent: 'Agent is thinking',
userFillUp: 'Waiting for you',
retrieval: 'Looking up knowledge',
message: 'Agent says',
awaitResponse: 'Waiting for you',

@ -1265,6 +1265,7 @@ General实体和关系提取提示来自 GitHub - microsoft/graphrag基于
subject: '主题',
logTimeline: {
begin: '准备开始',
userFillUp: '等你输入',
agent: '智能体正在思考',
retrieval: '查找知识',
message: '回复',

@ -11,6 +11,11 @@ import {
DropdownMenuLabel,
DropdownMenuTrigger,
} from '@/components/ui/dropdown-menu';
import {
Tooltip,
TooltipContent,
TooltipTrigger,
} from '@/components/ui/tooltip';
import { IModalProps } from '@/interfaces/common';
import { Operator } from '@/pages/agent/constant';
import { AgentInstanceContext, HandleContext } from '@/pages/agent/context';
@ -33,19 +38,26 @@ function OperatorItemList({ operators }: OperatorItemProps) {
<ul className="space-y-2">
{operators.map((x) => {
return (
<DropdownMenuItem
key={x}
className="hover:bg-background-card py-1 px-3 cursor-pointer rounded-sm flex gap-2 items-center justify-start"
onClick={addCanvasNode(x, {
nodeId,
id,
position,
})}
onSelect={() => hideModal?.()}
>
<OperatorIcon name={x}></OperatorIcon>
{t(`flow.${lowerFirst(x)}`)}
</DropdownMenuItem>
<Tooltip key={x}>
<TooltipTrigger asChild>
<DropdownMenuItem
key={x}
className="hover:bg-background-card py-1 px-3 cursor-pointer rounded-sm flex gap-2 items-center justify-start"
onClick={addCanvasNode(x, {
nodeId,
id,
position,
})}
onSelect={() => hideModal?.()}
>
<OperatorIcon name={x}></OperatorIcon>
{t(`flow.${lowerFirst(x)}`)}
</DropdownMenuItem>
</TooltipTrigger>
<TooltipContent side="right">
<p>{t(`flow.${lowerFirst(x)}Description`)}</p>
</TooltipContent>
</Tooltip>
);
})}
</ul>

@ -2,11 +2,11 @@ import { RAGFlowAvatar } from '@/components/ragflow-avatar';
import { useFetchKnowledgeList } from '@/hooks/knowledge-hooks';
import { IRetrievalNode } from '@/interfaces/database/flow';
import { NodeProps, Position } from '@xyflow/react';
import { Flex } from 'antd';
import classNames from 'classnames';
import { get } from 'lodash';
import { memo, useMemo } from 'react';
import { NodeHandleId } from '../../constant';
import { useGetVariableLabelByValue } from '../../hooks/use-get-begin-query';
import { CommonHandle } from './handle';
import { LeftHandleStyle, RightHandleStyle } from './handle-icon';
import styles from './index.less';
@ -21,6 +21,7 @@ function InnerRetrievalNode({
selected,
}: NodeProps<IRetrievalNode>) {
const knowledgeBaseIds: string[] = get(data, 'form.kb_ids', []);
console.log('🚀 ~ InnerRetrievalNode ~ knowledgeBaseIds:', knowledgeBaseIds);
const { list: knowledgeList } = useFetchKnowledgeList(true);
const knowledgeBases = useMemo(() => {
return knowledgeBaseIds.map((x) => {
@ -33,6 +34,8 @@ function InnerRetrievalNode({
});
}, [knowledgeList, knowledgeBaseIds]);
const getLabel = useGetVariableLabelByValue(id);
return (
<ToolBar selected={selected} id={id} label={data.label}>
<NodeWrapper selected={selected}>
@ -63,25 +66,27 @@ function InnerRetrievalNode({
[styles.nodeHeader]: knowledgeBaseIds.length > 0,
})}
></NodeHeader>
<Flex vertical gap={8}>
{knowledgeBases.map((knowledge) => {
<section className="flex flex-col gap-2">
{knowledgeBaseIds.map((id) => {
const item = knowledgeList.find((y) => id === y.id);
const label = getLabel(id);
return (
<div className={styles.nodeText} key={knowledge.id}>
<Flex align={'center'} gap={6}>
<div className={styles.nodeText} key={id}>
<div className="flex items-center gap-1.5">
<RAGFlowAvatar
className="size-6 rounded-lg"
avatar={knowledge.avatar}
name={knowledge.name || 'CN'}
avatar={id}
name={item?.name || (label as string) || 'CN'}
isPerson={true}
/>
<Flex className={styles.knowledgeNodeName} flex={1}>
{knowledge.name}
</Flex>
</Flex>
<div className={'truncate flex-1'}>{label || item?.name}</div>
</div>
</div>
);
})}
</Flex>
</section>
</NodeWrapper>
</ToolBar>
);

@ -13,19 +13,17 @@ import {
useUploadCanvasFileWithProgress,
} from '@/hooks/use-agent-request';
import { useFetchUserInfo } from '@/hooks/user-setting-hooks';
import { Message } from '@/interfaces/database/chat';
import { buildMessageUuidWithRole } from '@/utils/chat';
import { get } from 'lodash';
import { memo, useCallback, useMemo } from 'react';
import { memo, useCallback } from 'react';
import { useParams } from 'umi';
import DebugContent from '../debug-content';
import { BeginQuery } from '../interface';
import { buildBeginQueryWithObject } from '../utils';
import { useAwaitCompentData } from '../hooks/use-chat-logic';
function AgentChatBox() {
const {
value,
ref,
scrollRef,
messageContainerRef,
sendLoading,
derivedMessages,
handleInputChange,
@ -43,33 +41,12 @@ function AgentChatBox() {
const { data: canvasInfo } = useFetchAgent();
const { id: canvasId } = useParams();
const { uploadCanvasFile, loading } = useUploadCanvasFileWithProgress();
const getInputs = useCallback((message: Message) => {
return get(message, 'data.inputs', {}) as Record<string, BeginQuery>;
}, []);
const buildInputList = useCallback(
(message: Message) => {
return Object.entries(getInputs(message)).map(([key, val]) => {
return {
...val,
key,
};
});
},
[getInputs],
);
const handleOk = useCallback(
(message: Message) => (values: BeginQuery[]) => {
const inputs = getInputs(message);
const nextInputs = buildBeginQueryWithObject(inputs, values);
sendFormMessage({
inputs: nextInputs,
id: canvasId,
});
},
[canvasId, getInputs, sendFormMessage],
);
const { buildInputList, handleOk, isWaitting } = useAwaitCompentData({
derivedMessages,
sendFormMessage,
canvasId: canvasId as string,
});
const handleUploadFile: NonNullable<FileUploadProps['onUpload']> =
useCallback(
@ -79,20 +56,11 @@ function AgentChatBox() {
},
[appendUploadResponseList, uploadCanvasFile],
);
const isWaitting = useMemo(() => {
const temp = derivedMessages?.some((message, i) => {
const flag =
message.role === MessageType.Assistant &&
derivedMessages.length - 1 === i &&
message.data;
return flag;
});
return temp;
}, [derivedMessages]);
return (
<>
<section className="flex flex-1 flex-col px-5 h-[90vh]">
<div className="flex-1 overflow-auto">
<div className="flex-1 overflow-auto" ref={messageContainerRef}>
<div>
{/* <Spin spinning={sendLoading}> */}
{derivedMessages?.map((message, i) => {
@ -139,7 +107,7 @@ function AgentChatBox() {
})}
{/* </Spin> */}
</div>
<div ref={ref.scrollRef} />
<div ref={scrollRef} />
</div>
<NextMessageInput
value={value}

@ -198,12 +198,14 @@ export const useSendAgentMessage = (
const prologue = useGetBeginNodePrologue();
const {
derivedMessages,
ref,
scrollRef,
messageContainerRef,
removeLatestMessage,
removeMessageById,
addNewestOneQuestion,
addNewestOneAnswer,
removeAllMessages,
scrollToBottom,
} = useSelectDerivedMessages();
const { addEventList: addEventListFun } = useContext(AgentChatLogContext);
const {
@ -269,7 +271,7 @@ export const useSendAgentMessage = (
const sendFormMessage = useCallback(
(body: { id?: string; inputs: Record<string, BeginQuery> }) => {
send(body);
send({ ...body, session_id: sessionId });
addNewestOneQuestion({
content: Object.entries(body.inputs)
.map(([key, val]) => `${key}: ${val.value}`)
@ -277,7 +279,7 @@ export const useSendAgentMessage = (
role: MessageType.User,
});
},
[addNewestOneQuestion, send],
[addNewestOneQuestion, send, sessionId],
);
// reset session
@ -303,7 +305,18 @@ export const useSendAgentMessage = (
});
}
addNewestOneQuestion({ ...msgBody, files: fileList });
}, [value, done, addNewestOneQuestion, fileList, setValue, sendMessage]);
setTimeout(() => {
scrollToBottom();
}, 100);
}, [
value,
done,
addNewestOneQuestion,
fileList,
setValue,
sendMessage,
scrollToBottom,
]);
useEffect(() => {
const { content, id } = findMessageFromList(answerList);
@ -337,7 +350,8 @@ export const useSendAgentMessage = (
value,
sendLoading: !done,
derivedMessages,
ref,
scrollRef,
messageContainerRef,
handlePressEnter,
handleInputChange,
removeMessageById,

@ -103,9 +103,12 @@ function PromptContent({
</div>
)}
<ContentEditable
className={cn('relative px-2 py-1 focus-visible:outline-none', {
'min-h-40': multiLine,
})}
className={cn(
'relative px-2 py-1 focus-visible:outline-none max-h-[50vh] overflow-auto',
{
'min-h-40': multiLine,
},
)}
onBlur={handleBlur}
onFocus={handleFocus}
/>

@ -97,7 +97,7 @@ function RetrievalForm({ node }: INextOperatorForm) {
<FormWrapper>
<FormContainer>
<QueryVariable></QueryVariable>
<KnowledgeBaseFormField></KnowledgeBaseFormField>
<KnowledgeBaseFormField showVariable></KnowledgeBaseFormField>
</FormContainer>
<Collapse title={<div>Advanced Settings</div>}>
<FormContainer>

@ -43,7 +43,7 @@ const RetrievalForm = () => {
>
<FormContainer>
<DescriptionField></DescriptionField>
<KnowledgeBaseFormField></KnowledgeBaseFormField>
<KnowledgeBaseFormField showVariable></KnowledgeBaseFormField>
</FormContainer>
<Collapse title={<div>Advanced Settings</div>}>
<FormContainer>

@ -0,0 +1,60 @@
import { MessageType } from '@/constants/chat';
import { Message } from '@/interfaces/database/chat';
import { IMessage } from '@/pages/chat/interface';
import { get } from 'lodash';
import { useCallback, useMemo } from 'react';
import { BeginQuery } from '../interface';
import { buildBeginQueryWithObject } from '../utils';
type IAwaitCompentData = {
derivedMessages: IMessage[];
sendFormMessage: (params: {
inputs: Record<string, BeginQuery>;
id: string;
}) => void;
canvasId: string;
};
const useAwaitCompentData = (props: IAwaitCompentData) => {
const { derivedMessages, sendFormMessage, canvasId } = props;
const getInputs = useCallback((message: Message) => {
return get(message, 'data.inputs', {}) as Record<string, BeginQuery>;
}, []);
const buildInputList = useCallback(
(message: Message) => {
return Object.entries(getInputs(message)).map(([key, val]) => {
return {
...val,
key,
};
});
},
[getInputs],
);
const handleOk = useCallback(
(message: Message) => (values: BeginQuery[]) => {
const inputs = getInputs(message);
const nextInputs = buildBeginQueryWithObject(inputs, values);
sendFormMessage({
inputs: nextInputs,
id: canvasId,
});
},
[getInputs, sendFormMessage, canvasId],
);
const isWaitting = useMemo(() => {
const temp = derivedMessages?.some((message, i) => {
const flag =
message.role === MessageType.Assistant &&
derivedMessages.length - 1 === i &&
message.data;
return flag;
});
return temp;
}, [derivedMessages]);
return { getInputs, buildInputList, handleOk, isWaitting };
};
export { useAwaitCompentData };

@ -14,24 +14,37 @@ import {
import { cn } from '@/lib/utils';
import { isEmpty } from 'lodash';
import { Operator } from '../constant';
import OperatorIcon from '../operator-icon';
import OperatorIcon, { SVGIconMap } from '../operator-icon';
import {
JsonViewer,
toLowerCaseStringAndDeleteChar,
typeMap,
} from './workFlowTimeline';
const capitalizeWords = (str: string, separator: string = '_'): string => {
if (!str) return '';
type IToolIcon =
| Operator.ArXiv
| Operator.GitHub
| Operator.Bing
| Operator.DuckDuckGo
| Operator.Google
| Operator.GoogleScholar
| Operator.PubMed
| Operator.TavilyExtract
| Operator.TavilySearch
| Operator.Wikipedia
| Operator.YahooFinance
| Operator.WenCai
| Operator.Crawler;
return str
.split(separator)
.map((word) => {
return word.charAt(0).toUpperCase() + word.slice(1).toLowerCase();
})
.join(' ');
const capitalizeWords = (str: string, separator: string = '_'): string[] => {
if (!str) return [''];
const resultStrArr = str.split(separator).map((word) => {
return word.charAt(0).toUpperCase() + word.slice(1).toLowerCase();
});
return resultStrArr;
};
const changeToolName = (toolName: any) => {
const name = 'Agent ' + capitalizeWords(toolName);
const name = 'Agent ' + capitalizeWords(toolName).join(' ');
return name;
};
const ToolTimelineItem = ({
@ -61,6 +74,8 @@ const ToolTimelineItem = ({
return (
<>
{filteredTools?.map((tool, idx) => {
const toolName = capitalizeWords(tool.tool_name, '_').join('');
return (
<TimelineItem
key={'tool_' + idx}
@ -105,7 +120,11 @@ const ToolTimelineItem = ({
<div className="size-6 flex items-center justify-center">
<OperatorIcon
className="size-4"
name={'Agent' as Operator}
name={
(SVGIconMap[toolName as IToolIcon]
? toolName
: 'Agent') as Operator
}
></OperatorIcon>
</div>
</div>
@ -119,12 +138,14 @@ const ToolTimelineItem = ({
className="bg-background-card px-3"
>
<AccordionItem value={idx.toString()}>
<AccordionTrigger>
<AccordionTrigger
hideDownIcon={isShare && isEmpty(tool.arguments)}
>
<div className="flex gap-2 items-center">
{!isShare && (
<span>
{parentName(tool.path) + ' '}
{capitalizeWords(tool.tool_name, '_')}
{capitalizeWords(tool.tool_name, '_').join(' ')}
</span>
)}
{isShare && (
@ -142,7 +163,7 @@ const ToolTimelineItem = ({
</span>
<span
className={cn(
'border-background -end-1 -top-1 size-2 rounded-full border-2 bg-dot-green',
'border-background -end-1 -top-1 size-2 rounded-full bg-dot-green',
)}
>
<span className="sr-only">Online</span>
@ -161,7 +182,7 @@ const ToolTimelineItem = ({
)}
{isShare && !isEmpty(tool.arguments) && (
<AccordionContent>
<div className="space-y-2">
<div className="space-y-2 bg-muted p-2">
{tool &&
tool.arguments &&
Object.entries(tool.arguments).length &&
@ -171,8 +192,8 @@ const ToolTimelineItem = ({
<div className="text-sm font-medium leading-none">
{key}
</div>
<div className="text-sm text-muted-foreground">
{val || ''}
<div className="text-sm text-muted-foreground mt-1">
{val as string}
</div>
</div>
);

@ -51,7 +51,7 @@ export function JsonViewer({
src={data}
displaySize
collapseStringsAfterLength={100000000000}
className="w-full h-[200px] break-words overflow-auto scrollbar-auto p-2 bg-slate-800"
className="w-full h-[200px] break-words overflow-auto scrollbar-auto p-2 bg-muted"
/>
</section>
);
@ -81,11 +81,21 @@ export const typeMap = {
httpRequest: t('flow.logTimeline.httpRequest'),
wenCai: t('flow.logTimeline.wenCai'),
yahooFinance: t('flow.logTimeline.yahooFinance'),
userFillUp: t('flow.logTimeline.userFillUp'),
};
export const toLowerCaseStringAndDeleteChar = (
str: string,
char: string = '_',
) => str.toLowerCase().replace(/ /g, '').replaceAll(char, '');
+// Convert all keys in typeMap to lowercase and output the new typeMap
+export const typeMapLowerCase = Object.fromEntries(
+Object.entries(typeMap).map(([key, value]) => [
+toLowerCaseStringAndDeleteChar(key),
+value,
+]),
+);
function getInputsOrOutputs(
nodeEventList: INodeData[],
field: 'inputs' | 'outputs',
@@ -247,16 +257,19 @@ export const WorkFlowTimeline = ({
className="bg-background-card px-3"
>
<AccordionItem value={idx.toString()}>
-<AccordionTrigger>
+<AccordionTrigger
+hideDownIcon={isShare && !x.data?.thoughts}
+>
<div className="flex gap-2 items-center">
<span>
{!isShare && getNodeName(x.data?.component_name)}
{isShare &&
-typeMap[
+(typeMapLowerCase[
toLowerCaseStringAndDeleteChar(
nodeLabel,
) as keyof typeof typeMap
-]}
+] ??
+nodeLabel)}
</span>
<span className="text-text-sub-title text-xs">
{x.data.elapsed_time?.toString().slice(0, 6)}
@@ -294,7 +307,7 @@ export const WorkFlowTimeline = ({
{isShare && x.data?.thoughts && (
<AccordionContent>
<div className="space-y-2">
-<div className="w-full h-[200px] break-words overflow-auto scrollbar-auto p-2 bg-slate-800">
+<div className="w-full h-[200px] break-words overflow-auto scrollbar-auto p-2 bg-muted">
<HightLightMarkdown>
{x.data.thoughts || ''}
</HightLightMarkdown>
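
The lookup change in this file normalizes both the map keys and the incoming node label before comparing them, and falls back to the raw label when nothing matches. The sketch below illustrates that pattern in isolation. The `labels` table, `normalizeKey`, and `resolveLabel` are hypothetical names with placeholder strings (the real `typeMap` holds translated text), and `split`/`join` stands in for `replaceAll` only so the snippet runs on older compile targets.

```typescript
// Hypothetical label table; the real typeMap holds translated strings.
const labels: Record<string, string> = {
  httpRequest: 'HTTP Request',
  yahooFinance: 'Yahoo Finance',
  userFillUp: 'User Fill Up',
};

// Same normalization as in the diff: lowercase, drop spaces, drop `char`.
const normalizeKey = (str: string, char: string = '_'): string =>
  str.toLowerCase().replace(/ /g, '').split(char).join('');

// Re-key the table by normalized key so lookups ignore case, spaces, and underscores.
const labelsByNormalizedKey: Record<string, string> = {};
for (const key of Object.keys(labels)) {
  labelsByNormalizedKey[normalizeKey(key)] = labels[key];
}

// Fall back to the raw label when no entry matches.
const resolveLabel = (nodeLabel: string): string =>
  labelsByNormalizedKey[normalizeKey(nodeLabel)] ?? nodeLabel;

console.log(resolveLabel('http_request')); // prints "HTTP Request"
console.log(resolveLabel('Yahoo Finance')); // prints "Yahoo Finance"
console.log(resolveLabel('CustomNode')); // prints "CustomNode"
```

The `??` fallback is what keeps an unknown component from rendering as an empty label in the shared view.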


@@ -38,7 +38,7 @@ export const OperatorIconMap = {
[Operator.Email]: 'sendemail-0',
};
-const SVGIconMap = {
+export const SVGIconMap = {
[Operator.ArXiv]: ArxivIcon,
[Operator.GitHub]: GithubIcon,
[Operator.Bing]: BingIcon,


@@ -231,7 +231,7 @@ const AgentLogPage: React.FC = () => {
<div className="flex justify-between items-center">
<h1 className="text-2xl font-bold mb-4">Log</h1>
-<div className="flex justify-end space-x-2 mb-4">
+<div className="flex justify-end space-x-2 mb-4 text-foreground">
<div className="flex items-center space-x-2">
<span>ID/Title</span>
<SearchInput


@@ -1,7 +1,6 @@
import MessageItem from '@/components/message-item';
import { MessageType } from '@/constants/chat';
import { Flex, Spin } from 'antd';
-import { useRef } from 'react';
import {
useCreateConversationBeforeUploadDocument,
useGetFileIcon,
@@ -19,7 +18,6 @@ import {
useFetchNextDialog,
useGetChatSearchParams,
} from '@/hooks/chat-hooks';
-import { useScrollToBottom } from '@/hooks/logic-hooks';
import { useFetchUserInfo } from '@/hooks/user-setting-hooks';
import { buildMessageUuidWithRole } from '@/utils/chat';
import { memo } from 'react';
@@ -34,10 +32,10 @@ const ChatContainer = ({ controller }: IProps) => {
const { data: conversation } = useFetchNextConversation();
const { data: currentDialog } = useFetchNextDialog();
-const messageContainerRef = useRef<HTMLDivElement>(null);
const {
value,
-ref,
+scrollRef,
+messageContainerRef,
loading,
sendLoading,
derivedMessages,
@@ -47,10 +45,6 @@ const ChatContainer = ({ controller }: IProps) => {
removeMessageById,
stopOutputMessage,
} = useSendNextMessage(controller);
-const { scrollRef, isAtBottom, scrollToBottom } = useScrollToBottom(
-derivedMessages,
-messageContainerRef,
-);
const { visible, hideModal, documentId, selectedChunk, clickDocumentButton } =
useClickDrawer();
@@ -61,11 +55,6 @@ const ChatContainer = ({ controller }: IProps) => {
const { createConversationBeforeUploadDocument } =
useCreateConversationBeforeUploadDocument();
-const handleSend = (msg) => {
-// your send logic
-setTimeout(scrollToBottom, 0);
-};
return (
<>
<Flex flex={1} className={styles.chatContainer} vertical>


@@ -291,7 +291,8 @@ export const useSetConversation = () => {
export const useSelectNextMessages = () => {
const {
-ref,
+scrollRef,
+messageContainerRef,
setDerivedMessages,
derivedMessages,
addNewestAnswer,
@@ -335,7 +336,8 @@ export const useSelectNextMessages = () => {
}, [conversation.message, conversationId, setDerivedMessages, isNew]);
return {
-ref,
+scrollRef,
+messageContainerRef,
derivedMessages,
loading,
addNewestAnswer,
@@ -371,7 +373,8 @@ export const useSendNextMessage = (controller: AbortController) => {
api.completeConversation,
);
const {
-ref,
+scrollRef,
+messageContainerRef,
derivedMessages,
loading,
addNewestAnswer,
@@ -499,7 +502,8 @@ export const useSendNextMessage = (controller: AbortController) => {
regenerateMessage,
sendLoading: !done,
loading,
-ref,
+scrollRef,
+messageContainerRef,
derivedMessages,
removeMessageById,
stopOutputMessage,


@@ -13,7 +13,9 @@ import {
} from '@/hooks/use-agent-request';
import { cn } from '@/lib/utils';
import i18n from '@/locales/config';
+import DebugContent from '@/pages/agent/debug-content';
import { useCacheChatLog } from '@/pages/agent/hooks/use-cache-chat-log';
+import { useAwaitCompentData } from '@/pages/agent/hooks/use-chat-logic';
import { IInputs } from '@/pages/agent/interface';
import { useSendButtonDisabled } from '@/pages/chat/hooks';
import { buildMessageUuidWithRole } from '@/utils/chat';
@@ -48,7 +50,8 @@ const ChatContainer = () => {
handleInputChange,
value,
sendLoading,
-ref,
+scrollRef,
+messageContainerRef,
derivedMessages,
hasError,
stopOutputMessage,
@@ -56,10 +59,15 @@ const ChatContainer = () => {
appendUploadResponseList,
parameterDialogVisible,
showParameterDialog,
+sendFormMessage,
ok,
resetSession,
} = useSendNextSharedMessage(addEventList);
+const { buildInputList, handleOk, isWaitting } = useAwaitCompentData({
+derivedMessages,
+sendFormMessage,
+canvasId: conversationId as string,
+});
const sendDisabled = useSendButtonDisabled(value);
const appConf = useFetchAppConf();
const { data: inputsData } = useFetchExternalAgentInputs();
@@ -142,6 +150,7 @@ const ChatContainer = () => {
className={cn(
'flex flex-1 flex-col overflow-auto scrollbar-auto m-auto w-5/6',
)}
+ref={messageContainerRef}
>
<div>
{derivedMessages?.map((message, i) => {
@@ -171,26 +180,47 @@ const ChatContainer = () => {
showLoudspeaker={false}
showLog={false}
sendLoading={sendLoading}
-></MessageItem>
+>
+{message.role === MessageType.Assistant &&
+derivedMessages.length - 1 === i && (
+<DebugContent
+parameters={buildInputList(message)}
+message={message}
+ok={handleOk(message)}
+isNext={false}
+btnText={'Submit'}
+></DebugContent>
+)}
+{message.role === MessageType.Assistant &&
+derivedMessages.length - 1 !== i && (
+<div>
+<div>{message?.data?.tips}</div>
+<div>
+{buildInputList(message)?.map((item) => item.value)}
+</div>
+</div>
+)}
+</MessageItem>
);
})}
</div>
-<div ref={ref} />
+<div ref={scrollRef} />
</div>
<div className="flex w-full justify-center mb-8">
<div className="w-5/6">
<NextMessageInput
isShared
value={value}
-disabled={hasError}
-sendDisabled={sendDisabled}
+disabled={hasError || isWaitting}
+sendDisabled={sendDisabled || isWaitting}
conversationId={conversationId}
onInputChange={handleInputChange}
onPressEnter={handlePressEnter}
sendLoading={sendLoading}
stopOutputMessage={stopOutputMessage}
onUpload={handleUploadFile}
-isUploading={loading}
+isUploading={loading || isWaitting}
></NextMessageInput>
</div>
</div>


@@ -37,20 +37,23 @@ export function useBuildFormSchema() {
name: z
.string()
.min(1, {
-message: t('common.namePlaceholder'),
+message: t('common.mcp.namePlaceholder'),
})
+.regex(/^[a-zA-Z0-9_-]{1,64}$/, {
+message: t('common.mcp.nameRequired'),
+})
.trim(),
url: z
.string()
.url()
.min(1, {
-message: t('common.namePlaceholder'),
+message: t('common.mcp.urlPlaceholder'),
})
.trim(),
server_type: z
.string()
.min(1, {
-message: t('common.namePlaceholder'),
+message: t('common.pleaseSelect'),
})
.trim(),
authorization_token: z.string().optional(),
@@ -89,7 +92,7 @@ export function EditMcpForm({
<FormLabel>{t('common.name')}</FormLabel>
<FormControl>
<Input
-placeholder={t('common.namePlaceholder')}
+placeholder={t('common.mcp.namePlaceholder')}
{...field}
autoComplete="off"
/>
@@ -106,7 +109,7 @@ export function EditMcpForm({
<FormLabel>{t('mcp.url')}</FormLabel>
<FormControl>
<Input
-placeholder={t('common.namePlaceholder')}
+placeholder={t('common.mcp.urlPlaceholder')}
{...field}
autoComplete="off"
onChange={(e) => {
@@ -148,7 +151,7 @@ export function EditMcpForm({
<FormLabel>Authorization Token</FormLabel>
<FormControl>
<Input
-placeholder={t('common.namePlaceholder')}
+placeholder={t('common.mcp.tokenPlaceholder')}
{...field}
autoComplete="off"
type="password"
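
The `.regex()` check added to the `name` field in this file restricts MCP server names to 1 to 64 ASCII letters, digits, underscores, or hyphens. The sketch below applies the same pattern outside zod to show which names pass; `isValidMcpName` and the sample names are hypothetical and not part of this change.

```typescript
// Pattern copied from the diff: 1-64 characters drawn from letters, digits, `_`, or `-`.
const MCP_NAME_PATTERN = /^[a-zA-Z0-9_-]{1,64}$/;

// Hypothetical helper that mirrors the schema's regex check.
function isValidMcpName(name: string): boolean {
  return MCP_NAME_PATTERN.test(name);
}

console.log(isValidMcpName('my-server_01')); // prints true
console.log(isValidMcpName('my server')); // prints false (contains a space)
console.log(isValidMcpName('')); // prints false (too short)
```

Because `.trim()` is chained after `.regex()` in the schema, a name with leading or trailing whitespace would presumably fail the pattern before any trimming takes effect; that ordering is worth confirming against zod's documentation.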


@@ -246,7 +246,7 @@
@layer utilities {
.scrollbar-auto {
/* hide scrollbar */
-scrollbar-width: thin;
+scrollbar-width: none;
scrollbar-color: transparent transparent;
}