Commit Graph

2915 Commits

Author SHA1 Message Date
1657755b5d Feat: Adjust the operation cell of the table on the file management page and dataset page #3221. (#7526)
### What problem does this PR solve?

Feat: Adjust the operation cell of the table on the file management page
and dataset page #3221.
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-08 15:25:26 +08:00
9d3dd13fef Refa: text order be robuster. (#7525)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2025-05-08 12:58:10 +08:00
3827c47515 Feat: Add API to support get chunk by id (#7522)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7519
### Type of change
- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-05-08 12:24:38 +08:00
e9053b6ed4 fix bug #7309 deepseek-ai/deepseek-vl2 model can not be select as a VL model to parse pdf image (#7312)
### What problem does this PR solve?
fix deepseek-ai/deepseek-vl2 model can not be select as a VL model to
parse pdf image . And add other vl models config from siliconflow
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: unknown <taoshi.ln@chinatelecom.cn>
2025-05-08 11:24:39 +08:00
e349635a3d Feat: Add /login/channels route and improve auth logic for frontend third-party login integration (#7521)
### What problem does this PR solve?

Add `/login/channels` route and improve auth logic to support frontend
integration with third-party login providers:

- Add `/login/channels` route to provide authentication channel list
with `display_name` and `icon`
- Optimize user info parsing logic by prioritizing `avatar_url` and
falling back to `picture`
- Simplify OIDC token validation by removing unnecessary `kid` checks
- Ensure `client_id` is safely cast to string during `audience`
validation
- Fix typo

---
- Related pull request: #7379 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2025-05-08 10:23:19 +08:00
014a1535f2 Docs: correct wrong URL for related_questions HTTP API (#7507)
### What problem does this PR solve?

Correct wrong URL for related_questions HTTP API. #7282

### Type of change

- [x] Documentation Update
2025-05-08 09:32:21 +08:00
7b57ab5dea Fix: retrieval component for shared KB issue. (#7513)
### What problem does this PR solve?

#7483

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-08 09:20:34 +08:00
e300d90c00 Docs: minor format updates (#7514)
### What problem does this PR solve?



### Type of change


- [x] Documentation Update
2025-05-07 19:49:01 +08:00
87317bcfc4 Docs: Initial editorial pass to MCP server (#7359)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-05-07 19:40:45 +08:00
9849230a04 Fix: remove deprecated novitaAI. (#7511)
### What problem does this PR solve?

#7484

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-07 19:36:16 +08:00
fa32a2d0fd Fix:When sharing the knowledge base of multiple tenants with one person, when this person queries the knowledge base of both tenants, they will only query the question of the first person's knowledge base (#7500)
Fix:When sharing the knowledge base of multiple tenants with one person,
when this person queries the knowledge base of both tenants, they will
only query the question of the first person's knowledge base

Co-authored-by: 杜有强 <duyq@internal.ths.com.cn>
2025-05-07 16:05:40 +08:00
27ffc0ed74 Feat: Improve 'user_canvan_version' delete and 'document' delete performance (#6553)
### What problem does this PR solve?

1.  Add delete_by_ids method
2. Add get_doc_ids_by_doc_names
3. Improve user_canvan_version's logic (avoid O(n) db IO)
4. Improve document delete logic (avoid O(n) db IO)

### Type of change

- [x] Performance Improvement
2025-05-07 10:55:08 +08:00
539876af11 docs: add API key instructions for MCP host mode (#7496)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: 马继龙 <majilong@ideal.com>
2025-05-07 10:38:21 +08:00
b1c8746984 fix: After the file is deleted, it still remains in the bucket. (#7482)
### What problem does this PR solve?

Fix: After deleting the file from the file management menu, it was not
removed from the MinIO bucket.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
2025-05-06 19:30:42 +08:00
bc3160f75a Feat: Support knowledge base type input in agent flow debugger (#7471)
### What problem does this PR solve?

This is a follow-up of #7088 , adding a knowledge base type input to the
`Begin` component, and a knowledge base selector to the agent flow debug
input panel:


![image](https://github.com/user-attachments/assets/e4cd35f1-1c8e-4f69-bed4-5d613b96d148)

then you can select one or more knowledge bases when testing the agent:


![image](https://github.com/user-attachments/assets/724b547e-4790-4cd8-83d3-67e02f2e76d8)

Note: the lines changed in `agent/component/retrieval.py` after line 94
are modified by `ruff format` from the `pre-commit` hooks, no functional
change.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-06 19:30:27 +08:00
75b24ba02a Fix: chat solo issue. (#7479)
### What problem does this PR solve?



### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-06 19:30:00 +08:00
953b3e1b3f Fix: Sometimes VisionFigureParser.figures may is tuple (#7477)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7466
I think due to some times we can not get position 

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-06 17:38:22 +08:00
c98933499a refa: Optimize create dataset validation (#7451)
### What problem does this PR solve?

Optimize dataset validation and add function docs

### Type of change

- [x] Refactoring
2025-05-06 17:38:06 +08:00
2f768b96e8 perf: optimze figure parser (#7392)
### What problem does this PR solve?

When parsing documents containing images, the current code uses a
single-threaded approach to call the VL model, resulting in extremely
slow parsing speed (e.g., parsing a Word document with dozens of images
takes over 20 minutes).

By switching to a multithreaded approach to call the VL model, the
parsing speed can be improved to an acceptable level.

### Type of change

- [x] Performance Improvement

---------

Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
2025-05-06 14:39:45 +08:00
d6cc6453d1 fixed errror when vars of cnt begin declare with key contain "begin" (#7457)
### What problem does this PR solve?
fixed errror when vars of cnt begin  declare with key contain "begin"

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-06 14:39:22 +08:00
45dfaf230c fix(deps): incorrect nltk download dir (#7447)
### What problem does this PR solve?

Fix https://github.com/infiniflow/ragflow/issues/7224 and
https://github.com/infiniflow/ragflow/issues/6793

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)a
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-06 14:39:05 +08:00
65537b8200 Fix:Set CUDA_VISIBLE_DEVICES In DefaultEmbedding (#7465)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7420

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-06 14:38:36 +08:00
60787f8d5d Fix Ollama instructions (#7478)
Fix instructions for Ollama

### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-06 13:57:39 +08:00
c4b3d3af95 Fix instructions for Ollama (#7468)
1. Use `host.docker.internal` as base URL
2. Fix numbers in list
3. Make clear what is the console input and what is the output

### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-06 09:47:19 +08:00
f29a5de9f5 Fix: filed_map was incorrectly persisted (#7443)
### What problem does this PR solve?

Fix `filed_map` was incorrectly persisted. #7412 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-06 09:44:38 +08:00
cb37f00a8f Feat: Modify the style of the dataset page #3221 (#7446)
### What problem does this PR solve?

Feat:  Modify the style of the dataset page #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-02 21:27:21 +08:00
fc379e90d1 Fix: change create dataset htto api delimiter default value to r'\n' (#7434)
### What problem does this PR solve?

change create dataset delimiter default value to r'\n'

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-04-30 17:43:42 +08:00
fea9d970ec Feat: Modify the dataset list page style #3221 (#7437)
### What problem does this PR solve?

Feat: Modify the dataset list page style #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-30 15:37:16 +08:00
6e7dd54a50 Feat: Support passing knowledge base id as variable in retrieval component (#7088)
### What problem does this PR solve?

Fix #6600

Hello, I have the same business requirement as #6600. My use case is: 

We have many departments (> 20 now and increasing), and each department
has its own knowledge base. Because the agent workflow is the same, so I
want to change the knowledge base on the fly, instead of creating agents
for every department.

It now looks like this:


![屏幕截图_20250416_212622](https://github.com/user-attachments/assets/5cb3dade-d4fb-4591-ade3-4b9c54387911)

Knowledge bases can be selected from the dropdown, and passed through
the variables in the table. All selected knowledge bases are used for
retrieval.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-04-30 15:32:14 +08:00
f56b651acb Built-in reranker models have been removed from official deliveries. (#7439)
### What problem does this PR solve?

### Type of change


- [x] Documentation Update
2025-04-30 15:28:03 +08:00
2dbcc0a1bf Fix: Tried to fix the fid mis match under some cases (#7426)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/7407

Based on this context, I think there should be some reasons that let
some LLMs have a mismatch (add the wrong "@xxx"),
So I think when use fid can not fetch llm then tried to just use name
should can fetch it.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-30 14:55:21 +08:00
1f82889001 Fix: create dataset remove unnecessary parameter constraints (#7432)
### What problem does this PR solve?

Remove unnecessary parameter restrictions in dataset creation API

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-30 14:50:23 +08:00
e6c824e606 Test: Update tests to use new fixture instead of deprecated one (#7431)
### What problem does this PR solve?

Deprecate get_dataset_id_and_document_id fixture, use add_document
instead

### Type of change

- [x] Update test cases
2025-04-30 14:49:26 +08:00
e2b0bceb1b Feat: filler list by user change input (#7389)
### What problem does this PR solve?

filler list by user change input

![Recording2025-04-28163440-ezgif
com-video-to-gif-converter](https://github.com/user-attachments/assets/6ff2cfea-dea9-4293-b9a6-b4c61ab9a549)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-04-30 14:48:41 +08:00
713c055e04 DOC: Added a UI tip for document parsing (#7430)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-04-30 13:10:13 +08:00
1fc52033ba Feat: Using IconFont as an additional icon library #3221 (#7427)
### What problem does this PR solve?
Feat: Using IconFont as an additional icon library #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-30 13:09:42 +08:00
ab27609a64 Fix: whole knowledge graph lost after removing any document in the knowledge base (#7151)
### What problem does this PR solve?

When you removed any document in a knowledge base using knowledge graph,
the graph's `removed_kwd` is set to "Y".
However, in the function `graphrag.utils.get_gaph`, `rebuild_graph`
method is passed and directly return `None` while `removed_kwd=Y`,
making residual part of the graph abandoned (but old entity data still
exist in db).

Besides, infinity instance actually pass deleting graph components'
`source_id` when removing document. It may cause wrong graph after
rebuild.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-30 09:43:17 +08:00
538a408608 Feat: Modify background color of Card #3221 (#7421)
### What problem does this PR solve?

Feat: Modify background color of Card #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-30 09:12:28 +08:00
093d280528 Feat: add Qwen3 and OpenAI o series (#7415)
### What problem does this PR solve?

Qwen3 and more LLMs.

Close #7296

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-04-29 18:26:29 +08:00
de166d0ff2 Feat: Add a language switch drop-down box to the top navigation bar #3221 (#7416)
### What problem does this PR solve?

Feat: Add a language switch drop-down box to the top navigation bar
#3221
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-29 18:20:46 +08:00
942b94fc3c feat: dataset filter by parsing status (#7404)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/5931

### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-04-29 17:29:58 +08:00
77bb7750e9 Feat: Modify the segmented component style #3221 (#7409)
### What problem does this PR solve?

Feat: Modify the segmented component style #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-29 17:05:23 +08:00
78380fa181 Refa: http API create dataset and test cases (#7393)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the create dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. ​​Error Handling
3. Test Updates
4. Documentation

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
2025-04-29 16:53:57 +08:00
c88e4b3fc0 Fix: improve recover_pending_tasks timeout (#7408)
### What problem does this PR solve?

Fix the redis lock will always timeout (change the logic order release
lock first)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-29 16:50:39 +08:00
552475dd5c Feat: Adjust the style of the home page #3221 (#7405)
### What problem does this PR solve?

Feat: Adjust the style of the home page #3321

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-29 15:32:50 +08:00
c69fbca24f fixed missing list input ref in query (#7375)
### What problem does this PR solve?

fixed missing list input ref in query

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-29 13:03:36 +08:00
5bb1c383ac Feat: Bind data to the agent module of the home page #3221 (#7385)
### What problem does this PR solve?

Feat: Bind data to the agent module of the home page #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-29 09:50:54 +08:00
c7310f7fb2 Refa: similarity calculations. (#7381)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2025-04-28 19:17:11 +08:00
3a43043c8a Feat: Add support for OAuth2 and OpenID Connect (OIDC) authentication (#7379)
### What problem does this PR solve?

Add support for OAuth2 and OpenID Connect (OIDC) authentication,
allowing OAuth/OIDC authentication using the specified routes:
- `/login/<channel>`: Initiates the OAuth flow for the specified channel
- `/oauth/callback/<channel>`: Handles the OAuth callback after
successful authentication

The callback URL should be configured in your OAuth provider as:
```
https://your-app.com/oauth/callback/<channel>
```

For detailed instructions on configuring **service_conf.yaml.template**,
see: `./api/apps/auth/README.md#usage`.

- Related issues
#3495  

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2025-04-28 16:15:52 +08:00
dbfa859ca3 Knowledge graph no longer exists as a chunking method (#7382)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-04-28 15:58:20 +08:00