Commit Graph

3109 Commits

Author SHA1 Message Date
7b57ab5dea Fix: retrieval component for shared KB issue. (#7513)
### What problem does this PR solve?

#7483

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-08 09:20:34 +08:00
e300d90c00 Docs: minor format updates (#7514)
### What problem does this PR solve?



### Type of change


- [x] Documentation Update
2025-05-07 19:49:01 +08:00
87317bcfc4 Docs: Initial editorial pass to MCP server (#7359)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-05-07 19:40:45 +08:00
9849230a04 Fix: remove deprecated novitaAI. (#7511)
### What problem does this PR solve?

#7484

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-07 19:36:16 +08:00
fa32a2d0fd Fix: When knowledge bases from multiple tenants are shared with one user, querying the knowledge bases of both tenants only searches the first tenant's knowledge base (#7500)
Fix: When knowledge bases from multiple tenants are shared with one user,
and that user queries the knowledge bases of both tenants, only the first
tenant's knowledge base is queried.

Co-authored-by: 杜有强 <duyq@internal.ths.com.cn>
2025-05-07 16:05:40 +08:00
27ffc0ed74 Feat: Improve 'user_canvan_version' delete and 'document' delete performance (#6553)
### What problem does this PR solve?

1. Add delete_by_ids method
2. Add get_doc_ids_by_doc_names
3. Improve user_canvan_version's logic (avoid O(n) db IO)
4. Improve document delete logic (avoid O(n) db IO)
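
A minimal sketch of the idea behind a batched `delete_by_ids`, assuming a plain SQL store: one `DELETE ... WHERE id IN (...)` statement replaces a loop that issues one statement per row. The `sqlite3` backend, the `document` table, and the function signature are illustrative assumptions, not the project's actual implementation.

```python
import sqlite3


def delete_by_ids(conn, table, ids):
    """Delete every row whose id is in `ids` using a single statement."""
    ids = list(ids)
    if not ids:
        return 0
    # One placeholder per id; `table` is assumed to be a trusted identifier.
    placeholders = ",".join("?" for _ in ids)
    cur = conn.execute(f"DELETE FROM {table} WHERE id IN ({placeholders})", ids)
    conn.commit()
    return cur.rowcount


# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO document VALUES (?, ?)",
    [("d1", "a.pdf"), ("d2", "b.pdf"), ("d3", "c.pdf")],
)
deleted = delete_by_ids(conn, "document", ["d1", "d3"])
remaining = [row[0] for row in conn.execute("SELECT id FROM document ORDER BY id")]
```

The database round trips stay constant regardless of how many ids are passed, which is the point of avoiding per-row IO.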

### Type of change

- [x] Performance Improvement
2025-05-07 10:55:08 +08:00
539876af11 docs: add API key instructions for MCP host mode (#7496)
### What problem does this PR solve?


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: 马继龙 <majilong@ideal.com>
2025-05-07 10:38:21 +08:00
b1c8746984 fix: After the file is deleted, it still remains in the bucket. (#7482)
### What problem does this PR solve?

Fix: After deleting the file from the file management menu, it was not
removed from the MinIO bucket.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
2025-05-06 19:30:42 +08:00
bc3160f75a Feat: Support knowledge base type input in agent flow debugger (#7471)
### What problem does this PR solve?

This is a follow-up to #7088, adding a knowledge base type input to the
`Begin` component, and a knowledge base selector to the agent flow debug
input panel:


![image](https://github.com/user-attachments/assets/e4cd35f1-1c8e-4f69-bed4-5d613b96d148)

then you can select one or more knowledge bases when testing the agent:


![image](https://github.com/user-attachments/assets/724b547e-4790-4cd8-83d3-67e02f2e76d8)

Note: the lines changed in `agent/component/retrieval.py` after line 94
were reformatted by `ruff format` from the `pre-commit` hooks, with no
functional change.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-06 19:30:27 +08:00
75b24ba02a Fix: chat solo issue. (#7479)
### What problem does this PR solve?



### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-06 19:30:00 +08:00
953b3e1b3f Fix: Sometimes VisionFigureParser.figures may be a tuple (#7477)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7466
I think this happens because sometimes the position cannot be obtained.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-06 17:38:22 +08:00
c98933499a refa: Optimize create dataset validation (#7451)
### What problem does this PR solve?

Optimize dataset validation and add function docs

### Type of change

- [x] Refactoring
2025-05-06 17:38:06 +08:00
2f768b96e8 perf: optimize figure parser (#7392)
### What problem does this PR solve?

When parsing documents containing images, the current code uses a
single-threaded approach to call the VL model, resulting in extremely
slow parsing speed (e.g., parsing a Word document with dozens of images
takes over 20 minutes).

By switching to a multithreaded approach to call the VL model, the
parsing speed can be improved to an acceptable level.
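
A minimal sketch of the multithreaded approach described above, assuming the VL model call is I/O-bound. `describe_figure` is a hypothetical stand-in for the real model call, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def describe_figure(image_bytes: bytes) -> str:
    # Hypothetical placeholder for a slow, I/O-bound VL model request.
    return f"figure with {len(image_bytes)} bytes"


def describe_figures(images, max_workers=8):
    """Describe all images concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(describe_figure, images))


descriptions = describe_figures([b"abc", b"defgh"])
```

`pool.map` returns results in the same order as the inputs, so each description can still be matched to its figure.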

### Type of change

- [x] Performance Improvement

---------

Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
2025-05-06 14:39:45 +08:00
d6cc6453d1 fixed error when vars of cnt begin are declared with a key containing "begin" (#7457)
### What problem does this PR solve?
fixed error when vars of cnt begin are declared with a key containing "begin"


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-06 14:39:22 +08:00
45dfaf230c fix(deps): incorrect nltk download dir (#7447)
### What problem does this PR solve?

Fix https://github.com/infiniflow/ragflow/issues/7224 and
https://github.com/infiniflow/ragflow/issues/6793

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-06 14:39:05 +08:00
65537b8200 Fix:Set CUDA_VISIBLE_DEVICES In DefaultEmbedding (#7465)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7420

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-06 14:38:36 +08:00
60787f8d5d Fix Ollama instructions (#7478)
Fix instructions for Ollama

### What problem does this PR solve?


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-06 13:57:39 +08:00
c4b3d3af95 Fix instructions for Ollama (#7468)
1. Use `host.docker.internal` as base URL
2. Fix numbers in list
3. Make clear what is the console input and what is the output

### What problem does this PR solve?


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-06 09:47:19 +08:00
f29a5de9f5 Fix: filed_map was incorrectly persisted (#7443)
### What problem does this PR solve?

Fix `filed_map` was incorrectly persisted. #7412 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-06 09:44:38 +08:00
cb37f00a8f Feat: Modify the style of the dataset page #3221 (#7446)
### What problem does this PR solve?

Feat:  Modify the style of the dataset page #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-02 21:27:21 +08:00
fc379e90d1 Fix: change create dataset HTTP API delimiter default value to r'\n' (#7434)
### What problem does this PR solve?

change create dataset delimiter default value to r'\n'

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-04-30 17:43:42 +08:00
fea9d970ec Feat: Modify the dataset list page style #3221 (#7437)
### What problem does this PR solve?

Feat: Modify the dataset list page style #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-30 15:37:16 +08:00
6e7dd54a50 Feat: Support passing knowledge base id as variable in retrieval component (#7088)
### What problem does this PR solve?

Fix #6600

Hello, I have the same business requirement as #6600. My use case is: 

We have many departments (more than 20 and growing), and each department
has its own knowledge base. Because the agent workflow is the same for
all of them, I want to switch the knowledge base on the fly instead of
creating an agent for every department.

It now looks like this:


![屏幕截图_20250416_212622](https://github.com/user-attachments/assets/5cb3dade-d4fb-4591-ade3-4b9c54387911)

Knowledge bases can be selected from the dropdown, and passed through
the variables in the table. All selected knowledge bases are used for
retrieval.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-04-30 15:32:14 +08:00
f56b651acb Built-in reranker models have been removed from official deliveries. (#7439)
### What problem does this PR solve?

### Type of change


- [x] Documentation Update
2025-04-30 15:28:03 +08:00
2dbcc0a1bf Fix: Try to fix the fid mismatch in some cases (#7426)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/7407

Based on this context, something appears to cause some LLMs to end up
with a mismatch (the wrong "@xxx" is added). So when an LLM cannot be
fetched by fid, falling back to fetching it by name alone should still
find it.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-30 14:55:21 +08:00
1f82889001 Fix: create dataset remove unnecessary parameter constraints (#7432)
### What problem does this PR solve?

Remove unnecessary parameter restrictions in dataset creation API

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-30 14:50:23 +08:00
e6c824e606 Test: Update tests to use new fixture instead of deprecated one (#7431)
### What problem does this PR solve?

Deprecate get_dataset_id_and_document_id fixture, use add_document
instead

### Type of change

- [x] Update test cases
2025-04-30 14:49:26 +08:00
e2b0bceb1b Feat: filter list as user changes input (#7389)
### What problem does this PR solve?

Filter the list as the user changes the input.

![Recording2025-04-28163440-ezgif
com-video-to-gif-converter](https://github.com/user-attachments/assets/6ff2cfea-dea9-4293-b9a6-b4c61ab9a549)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-04-30 14:48:41 +08:00
713c055e04 DOC: Added a UI tip for document parsing (#7430)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-04-30 13:10:13 +08:00
1fc52033ba Feat: Using IconFont as an additional icon library #3221 (#7427)
### What problem does this PR solve?
Feat: Using IconFont as an additional icon library #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-30 13:09:42 +08:00
ab27609a64 Fix: whole knowledge graph lost after removing any document in the knowledge base (#7151)
### What problem does this PR solve?

When any document is removed from a knowledge base that uses a knowledge
graph, the graph's `removed_kwd` is set to "Y".
However, in the function `graphrag.utils.get_graph`, the `rebuild_graph`
call is skipped and `None` is returned directly while `removed_kwd=Y`,
so the remaining part of the graph is abandoned (although the old entity
data still exists in the db).

Besides, the Infinity instance actually skips deleting the `source_id` of
graph components when a document is removed, which may produce a wrong
graph after a rebuild.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-30 09:43:17 +08:00
538a408608 Feat: Modify background color of Card #3221 (#7421)
### What problem does this PR solve?

Feat: Modify background color of Card #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-30 09:12:28 +08:00
093d280528 Feat: add Qwen3 and OpenAI o series (#7415)
### What problem does this PR solve?

Qwen3 and more LLMs.

Close #7296

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-04-29 18:26:29 +08:00
de166d0ff2 Feat: Add a language switch drop-down box to the top navigation bar #3221 (#7416)
### What problem does this PR solve?

Feat: Add a language switch drop-down box to the top navigation bar
#3221
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-29 18:20:46 +08:00
942b94fc3c feat: dataset filter by parsing status (#7404)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/5931

### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-04-29 17:29:58 +08:00
77bb7750e9 Feat: Modify the segmented component style #3221 (#7409)
### What problem does this PR solve?

Feat: Modify the segmented component style #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-29 17:05:23 +08:00
78380fa181 Refa: http API create dataset and test cases (#7393)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the create dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
2025-04-29 16:53:57 +08:00
c88e4b3fc0 Fix: improve recover_pending_tasks timeout (#7408)
### What problem does this PR solve?

Fix the Redis lock always timing out (reorder the logic so that the lock
is released first)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-29 16:50:39 +08:00
552475dd5c Feat: Adjust the style of the home page #3221 (#7405)
### What problem does this PR solve?

Feat: Adjust the style of the home page #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-29 15:32:50 +08:00
c69fbca24f fixed missing list input ref in query (#7375)
### What problem does this PR solve?

fixed missing list input ref in query

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-29 13:03:36 +08:00
5bb1c383ac Feat: Bind data to the agent module of the home page #3221 (#7385)
### What problem does this PR solve?

Feat: Bind data to the agent module of the home page #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-29 09:50:54 +08:00
c7310f7fb2 Refa: similarity calculations. (#7381)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2025-04-28 19:17:11 +08:00
3a43043c8a Feat: Add support for OAuth2 and OpenID Connect (OIDC) authentication (#7379)
### What problem does this PR solve?

Add support for OAuth2 and OpenID Connect (OIDC) authentication,
allowing OAuth/OIDC authentication using the specified routes:
- `/login/<channel>`: Initiates the OAuth flow for the specified channel
- `/oauth/callback/<channel>`: Handles the OAuth callback after
successful authentication

The callback URL should be configured in your OAuth provider as:
```
https://your-app.com/oauth/callback/<channel>
```

For detailed instructions on configuring **service_conf.yaml.template**,
see: `./api/apps/auth/README.md#usage`.
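
A rough sketch of how the two route shapes above map a request path to a channel name, using only the standard library. The matcher, the handler labels, and the `github` channel are illustrative assumptions and do not reflect the project's actual routing code.

```python
import re

# Patterns mirror the two route shapes listed above.
ROUTES = [
    (re.compile(r"^/login/(?P<channel>[^/]+)$"), "login"),
    (re.compile(r"^/oauth/callback/(?P<channel>[^/]+)$"), "callback"),
]


def match_route(path):
    """Return (handler_label, channel) for a known path, else None."""
    for pattern, label in ROUTES:
        m = pattern.match(path)
        if m:
            return label, m.group("channel")
    return None


# "github" is a hypothetical channel name.
login = match_route("/login/github")
callback = match_route("/oauth/callback/github")
unknown = match_route("/oauth/github")
```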

- Related issues
#3495  

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2025-04-28 16:15:52 +08:00
dbfa859ca3 Knowledge graph no longer exists as a chunking method (#7382)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-04-28 15:58:20 +08:00
Qi
53c59c47a1 Fix:Update chat assistant with an empty dataset (#7354)
### What problem does this PR solve?

When updating a chat assistant through the API, if the dataset attached
to the current chat assistant is not empty, setting the dataset to empty
("dataset_ids":[]) causes the update to fail with: 'dataset_ids' can't be
empty

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-28 15:19:21 +08:00
af393b0003 Feat: Add AsyncTreeSelect component #3221 (#7377)
### What problem does this PR solve?

Feat: Add AsyncTreeSelect component #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-28 14:58:33 +08:00
1a5608d0f8 Fix: Add title_tks for Pictures (#7365)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7362

append title_tks
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-28 13:35:34 +08:00
23dcbc94ef feat: replace models of novita (#7360)
### What problem does this PR solve?

Replace models of novita

### Type of change

- [x] Other (please describe): Replace models of novita
2025-04-28 13:35:09 +08:00
af770c5ced perf: Optimize GraphRAG’s LOOP_PROMPT (#7356)
### What problem does this PR solve?

Currently, GraphRAG’s LOOP_PROMPT causes the model to keep appending
entities and relationships even after outputting “Y,” which wastes time.
Referring to the latest code in
[graphRAG](https://github.com/microsoft/graphrag/blob/main/graphrag/prompts/index/extract_graph.py),
I modified the LOOP_PROMPT, and after verification the updated prompt
reliably outputs “Y” and stops.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
2025-04-28 13:31:04 +08:00
8ce5e69b2f Feat: Preview the file #3221 (#7355)
### What problem does this PR solve?

Feat: Preview the file #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-27 18:50:24 +08:00