1250 Commits

Author SHA1 Message Date
977962fdfe Fix: loopitem None issue. (#12166)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-24 17:22:31 +08:00
44671ea413 Fix: type check for chunks (#12164)
### What problem does this PR solve?

Fix: type check for chunks

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-24 16:36:00 +08:00
c81421d340 Feat: add document metadata setting (#12156)
### What problem does this PR solve?

Add document metadata setting.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2025-12-24 16:13:50 +08:00
5776fa73a7 refactor: improve memory service date time consistency (#12144)
### What problem does this PR solve?

Improve memory service date-time consistency.

### Type of change

- [x] Refactoring
2025-12-24 11:00:31 +08:00
c987d33649 Feat: deduplicate metadata lists during updates (#12125)
### What problem does this PR solve?

Deduplicate metadata lists during updates.
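
A minimal sketch of what deduplicating list values in a metadata update could look like. The function name `dedupe_metadata_lists` is hypothetical and not taken from the PR; the sketch assumes list elements are hashable (for example strings) and that first-seen order should be kept.

```python
def dedupe_metadata_lists(update: dict) -> dict:
    """Return a copy of `update` with duplicates removed from list values.

    Hypothetical helper for illustration only; assumes hashable list items.
    """
    cleaned = {}
    for key, value in update.items():
        if isinstance(value, list):
            # dict.fromkeys keeps the first occurrence of each item, in order.
            cleaned[key] = list(dict.fromkeys(value))
        else:
            # Scalar values are passed through unchanged.
            cleaned[key] = value
    return cleaned


result = dedupe_metadata_lists({"tags": ["a", "b", "a", "c", "b"], "author": "x"})
```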

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-24 09:32:55 +08:00
c33134ea2c Fix: table tag on chunks. (#12126)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-24 09:32:19 +08:00
17b8bb62b6 Feat: message manage (#12083)
### What problem does this PR solve?

Message CRUD.

Issue #4213 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-23 21:16:25 +08:00
bab6a4a219 Fix: /kb/update does not update FileService (#12121)
### What problem does this PR solve?

Fix: /kb/update does not update FileService

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-23 19:56:38 +08:00
00bb6fbd28 Fix: metadata issue & graphrag speeding up. (#12113)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Liu An <asiro@qq.com>
2025-12-23 15:57:27 +08:00
1444de981c Feat: enhance webhook response to include status and success fields and simplify ReAct agent (#12091)
### What problem does this PR solve?

change:
enhance webhook response to include status and success fields and
simplify ReAct agent

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-23 09:36:08 +08:00
bd76b8ff1a Fix: Tika server upgrades. (#12073)
### What problem does this PR solve?

#12037

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-23 09:35:52 +08:00
e5f3d5ae26 Refactor add_llm and add speech to text (#12089)
### What problem does this PR solve?

1. Refactor implementation of add_llm
2. Add speech to text model.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-22 19:27:26 +08:00
993bf7c2c8 Fix IDE warnings (#12085)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-22 16:47:21 +08:00
f911aa2997 Fix: list MCP tools may block (#12067)
### What problem does this PR solve?

Listing MCP tools may block. #12043

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-22 13:08:44 +08:00
42f9ac997f Remove Chinese comments and fix function arguments errors (#12052)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-22 12:59:37 +08:00
3ee47e4af7 Feat: document list and filter supports metadata filtering (#12053)
### What problem does this PR solve?

Document list and filter supports metadata filtering.

**OR within the same field, AND across different fields**

Example 1 (multi-field AND):

```markdown
Doc1 metadata: { "a": "b", "as": ["a", "b", "c"] }
Doc2 metadata: { "a": "x", "as": ["d"] }

Query:

metadata = {
  "a": ["b"],
  "as": ["d"]
}

Result:

Doc1 matches a=b but not as=d → excluded
Doc2 matches as=d but not a=b → excluded

Final result: empty
```

Example 2 (same field OR):

```markdown
Doc1 metadata: { "as": ["a", "b", "c"] }
Doc2 metadata: { "as": ["d"] }

Query:

metadata = {
  "as": ["a", "d"]
}
Result:

Doc1 matches as=a → included
Doc2 matches as=d → included

Final result: Doc1 + Doc2
```
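
The rule above (OR within a field, AND across fields) can be sketched as a small predicate. This is an illustration of the stated semantics only, not the PR's implementation; `matches` and `filter_docs` are hypothetical names.

```python
from typing import Any


def _as_list(value: Any) -> list:
    # Treat a scalar metadata value as a one-element list.
    return value if isinstance(value, list) else [value]


def matches(doc_meta: dict, query: dict) -> bool:
    """True if the document satisfies every field in the query (AND),
    where a field is satisfied by any one of its wanted values (OR)."""
    for field, wanted in query.items():
        if field not in doc_meta:
            return False
        have = _as_list(doc_meta[field])
        if not any(value in have for value in wanted):
            return False
    return True


def filter_docs(docs: dict, query: dict) -> list:
    return [name for name, meta in docs.items() if matches(meta, query)]


docs = {
    "Doc1": {"a": "b", "as": ["a", "b", "c"]},
    "Doc2": {"a": "x", "as": ["d"]},
}

example_1 = filter_docs(docs, {"a": ["b"], "as": ["d"]})
example_2 = filter_docs(docs, {"as": ["a", "d"]})
```

Under these assumptions, `example_1` is empty and `example_2` contains both documents, matching the two examples above.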

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-22 09:35:11 +08:00
55c0468ac9 Include document_id in knowledgebase info retrieval (#12041)
### What problem does this PR solve?
After a file in the file list is associated with a knowledge base, the
corresponding knowledge base document ID is returned.


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-19 19:32:24 +08:00
6cd1824a77 Feat: chats completions API supports metadata filtering (#12023)
### What problem does this PR solve?

Chats completions API supports metadata filtering.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-19 11:36:35 +08:00
f8fd1ea7e1 Feat: Further update Bedrock model configs (#12029)
### What problem does this PR solve?

Feat: Further update Bedrock model configs #12020 #12008

<img width="700" alt="2b4f0f7fab803a2a2d5f345c756a2c69"
src="https://github.com/user-attachments/assets/e1b9eaad-5c60-47bd-a6f4-88a104ce0c63"
/>
<img width="700" alt="afe88ec3c58f745f85c5c507b040c250"
src="https://github.com/user-attachments/assets/9de39745-395d-4145-930b-96eb452ad6ef"
/>
<img width="700" alt="1a21bb2b7cd8003dce1e5207f27efc69"
src="https://github.com/user-attachments/assets/ddba1682-6654-4954-aa71-41b8ebc04ac0"
/>

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-19 11:32:20 +08:00
57edc215d7 Feat:update webhook component (#11739)
### What problem does this PR solve?
issue:
https://github.com/infiniflow/ragflow/issues/10427

https://github.com/infiniflow/ragflow/issues/8115

change:

- Support for Multiple HTTP Methods (POST / GET / PUT / PATCH / DELETE /
HEAD)
- Security Validation
  1. max_body_size
  2. IP whitelist
  3. rate limit
  4. token / basic / jwt authentication
- File Upload Support
- Unified Content-Type Handling
- Full Schema-Based Extraction & Type Validation
- Two Execution Modes: Immediately / Streaming
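
A rough sketch of how some of the security checks listed above might be combined: body size limit, IP whitelist, and token comparison. The function `validate_request` and its parameters are hypothetical and not the component's actual interface; rate limiting, basic, and JWT authentication are left out.

```python
import hmac
import ipaddress


def validate_request(client_ip, body, token, *, max_body_size,
                     ip_whitelist, expected_token):
    """Return (ok, reason). Hypothetical helper for illustration only."""
    # 1. Reject oversized payloads before doing any other work.
    if len(body) > max_body_size:
        return False, "body too large"

    # 2. An empty whitelist means "allow all"; otherwise the client address
    #    must fall inside at least one listed network.
    addr = ipaddress.ip_address(client_ip)
    if ip_whitelist and not any(
        addr in ipaddress.ip_network(net) for net in ip_whitelist
    ):
        return False, "ip not allowed"

    # 3. Compare tokens in constant time to avoid timing side channels.
    if not hmac.compare_digest(token.encode(), expected_token.encode()):
        return False, "invalid token"

    return True, "ok"


settings = dict(max_body_size=1024, ip_whitelist=["10.0.0.0/8"],
                expected_token="s3cret")
accepted = validate_request("10.1.2.3", b"{}", "s3cret", **settings)
rejected = validate_request("192.168.1.1", b"{}", "s3cret", **settings)
```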


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-18 19:34:39 +08:00
151480dc85 Feat: trace information can be returned by the agent completion API (#12019)
### What problem does this PR solve?

Trace information can be returned by the agent completion API.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-18 15:52:11 +08:00
5cd1a678c8 Fix: image edit in edit_chunk (#12009)
### What problem does this PR solve?

Fix: image edit in edit_chunk #11971

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-18 11:35:01 +08:00
1a4822d6be Refactor: Improve the timestamp consistency (#11942)
### What problem does this PR solve?

Improve the timestamp consistency 

### Type of change
- [x] Refactoring
2025-12-18 09:40:33 +08:00
672958a192 Fix: model not authorized (#12001)
### What problem does this PR solve?

Fix model not authorized. #11973.


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-17 19:48:24 +08:00
8e4d011b15 Fix: parent-children chunking method. (#11997)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2025-12-17 16:50:36 +08:00
7baa67dfe8 Feat: Reject default admin account log in to normal services (#11994)
### What problem does this PR solve?

Feat: Reject default admin account log in to normal services
#11854
#11673

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-17 16:29:20 +08:00
4fd4a41e7c Fix: add multimodel models in chat api (#11986)
…tant, but model is available via UI

Fix: add multimodel models in chat api
Fixes #8549

### What problem does this PR solve?

Add a `model_type` parameter to the chat API.


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
2025-12-17 15:46:43 +08:00
30019dab9f Change knowledge base to dataset (#11976)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-17 10:03:33 +08:00
344a106eba Feat: Enable image edit in edit_chunk (#11971)
### What problem does this PR solve?

Feat: Enable image edit in edit_chunk

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-16 17:57:00 +08:00
5bba562048 Feature/excel export fix (#11914)
### PR details 
feat: Add Excel export support and fix variable reference regex
Changes:
- Add Excel export output format option to Message component
- Apply nest_asyncio patch to handle nested event loops
- Fix async generator iteration in canvas_app.py debug endpoint
- Add underscore support in variable reference regex pattern


### What problem does this PR solve?



### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Shivam Johri <shivamjohri@Shivams-MacBook-Air.local>
2025-12-16 13:15:52 +08:00
49c74d08e8 Feature/mineru improvements (#11938)
I have repeated the explanation in Chinese in the comments below.

### What problem does this PR solve?

## Summary
This PR enhances the MinerU document parser with additional
configuration options, giving users more control over PDF parsing
behavior and improving support for multilingual documents.

## Changes

### Backend (`deepdoc/parser/mineru_parser.py`)
- Added configurable parsing options:
  - **Parse Method**: `auto`, `txt`, or `ocr`; lets users choose the
    extraction strategy
  - **Formula Recognition**: Toggle for enabling/disabling formula
    extraction (useful to disable for Cyrillic documents where it may cause
    issues)
  - **Table Recognition**: Toggle for enabling/disabling table extraction
- Added language code mapping (`LANGUAGE_TO_MINERU_MAP`) to translate
RAGFlow language settings to MinerU-compatible language codes for better
OCR accuracy
- Improved parser configuration handling to pass these options through
the processing pipeline
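
A minimal sketch of how these options could be modeled. Only the parse methods (`auto`, `txt`, `ocr`) and the name `LANGUAGE_TO_MINERU_MAP` come from the description above; the class `MinerUOptions`, its field names, and the two map entries are illustrative assumptions, not the parser's actual contents.

```python
from dataclasses import dataclass

# Parse methods named in the PR description.
PARSE_METHODS = ("auto", "txt", "ocr")

# Illustrative placeholder entries; the real mapping is not shown in the PR text.
LANGUAGE_TO_MINERU_MAP = {"English": "en", "Chinese": "ch"}


@dataclass
class MinerUOptions:
    """Hypothetical container for the configurable parsing options."""
    parse_method: str = "auto"
    formula_enable: bool = True   # assumed field name
    table_enable: bool = True     # assumed field name
    language: str = "English"

    def __post_init__(self):
        if self.parse_method not in PARSE_METHODS:
            raise ValueError(f"unknown parse method: {self.parse_method!r}")

    def mineru_lang(self, default: str = "en") -> str:
        # Fall back to a default code when the language is not mapped.
        return LANGUAGE_TO_MINERU_MAP.get(self.language, default)


# Example: OCR parsing with formula recognition switched off.
opts = MinerUOptions(parse_method="ocr", formula_enable=False, language="Chinese")
lang_code = opts.mineru_lang()
```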

### Frontend (`web/`)
- Created new `MinerUOptionsFormField` component that conditionally
renders when MinerU is selected as the layout recognition engine
- Added UI controls for:
  - Parse method selection (dropdown)
  - Formula recognition toggle (switch)
  - Table recognition toggle (switch)
- Added i18n translations for English and Chinese
- Integrated the options into both the dataset creation dialog and
dataset settings page

### Integration
- Updated `rag/app/naive.py` to forward MinerU options to the parser
- Updated task service to handle the new configuration parameters

## Why
MinerU is a powerful document parser, but the default settings don't
work well for all document types. This PR allows users to:
1. Choose the best parsing method for their documents
2. Disable formula recognition for Cyrillic/non-Latin scripts where it
causes issues
3. Control table extraction based on document needs
4. Benefit from automatic language detection for better OCR results

## Testing
- [x] Tested MinerU parsing with different parse methods
- [x] Verified UI renders correctly when MinerU is selected/deselected
- [x] Confirmed settings persist correctly in dataset configuration

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: user210 <user210@rt>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-16 13:15:25 +08:00
44dec89f1f Fix: aspose-slide issue. (#11935)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-12 20:16:18 +08:00
0f0fb53256 Refa: refactor metadata filter (#11907)
### What problem does this PR solve?

Refactor metadata filter.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-12 17:12:38 +08:00
50715ba332 Fix: forget-reset password (#11927)
### What problem does this PR solve?

Fix: forget-reset password

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-12 16:16:17 +08:00
6560388f2b Fix: correct metadata update behavior (#11919)
### What problem does this PR solve?

Correct metadata update behavior. #11912

When update `value` is omitted, the corresponding keys are updated to
`"value"` regardless of their current values.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-12 12:50:17 +08:00
7db9045b74 Feat: Add box connector (#11845)
### What problem does this PR solve?

Feat: Add box connector

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-12 10:23:40 +08:00
ea4a5cd665 Fix: tokenizer issue. (#11902)
#11786
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-11 17:38:17 +08:00
e9710b7aa9 Refa: treat MinerU as an OCR model 2 (#11905)
### What problem does this PR solve?

Treat MinerU as an OCR model 2. #11903

### Type of change

- [x] Refactoring
2025-12-11 17:33:12 +08:00
bd0eff2954 Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code (#11898)
### What problem does this PR solve?

Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-11 13:55:01 +08:00
e3cfe8e848 Fix:async issue and sensitive logging (#11895)
### What problem does this PR solve?

change:
async issue and sensitive logging

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-11 13:54:47 +08:00
c610bb605a Added semi-automatic mode to the metadata filter (#11886)
### What problem does this PR solve?

Retrieval metadata filtering adds a semi-automatic mode in which users
manually select the metadata keys that the LLM may use to generate
filter conditions.
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-11 10:45:21 +08:00
8370bc61b7 Feat: enhance metadata operation (#11874)
### What problem does this PR solve?

Add metadata condition in document list.
Add metadata bulk update.
Add metadata summary.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2025-12-11 09:59:15 +08:00
74eb894453 Fix RuntimeError: asyncio.run() cannot be called from a running event loop when calling mindmap endpoint. (#11880)
### What problem does this PR solve?

Fix RuntimeError when calling mindmap endpoint by converting
`gen_mindmap()` to async function and using `await` instead of
`asyncio.run()`.
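
The failure and the fix can be reproduced in a few lines. This is a sketch only: the body of `gen_mindmap` here is a placeholder, and the two endpoint functions are hypothetical stand-ins for the real handler.

```python
import asyncio


async def gen_mindmap(text: str) -> dict:
    # Placeholder body; the real function builds a mind map.
    await asyncio.sleep(0)
    return {"root": text}


async def broken_endpoint(text: str):
    # Calling asyncio.run() while a loop is already running raises RuntimeError.
    coro = gen_mindmap(text)
    try:
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"RuntimeError: {exc}"


async def fixed_endpoint(text: str):
    # Inside a coroutine, simply await the async function.
    return await gen_mindmap(text)


broken_result = asyncio.run(broken_endpoint("topic"))
fixed_result = asyncio.run(fixed_endpoint("topic"))
```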

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-11 09:47:44 +08:00
3cb72377d7 Refa:remove sensitive information (#11873)
### What problem does this PR solve?

change:
remove sensitive information

### Type of change

- [x] Refactoring
2025-12-10 19:08:45 +08:00
a1164b9c89 Feat/memory (#11812)
### What problem does this PR solve?

Manage and display memory datasets.

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-10 13:34:08 +08:00
65a5a56d95 Refa:replace trio with asyncio (#11831)
### What problem does this PR solve?

change:
replace trio with asyncio
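
As a rough illustration of this kind of migration, a trio nursery with a capacity limiter maps onto `asyncio.gather` with a semaphore. The functions `embed_chunk` and `embed_all` are hypothetical and do not come from the PR; the trio calls appear only in comments for comparison.

```python
import asyncio


async def embed_chunk(chunk: str) -> int:
    # Placeholder for real async work.
    await asyncio.sleep(0)
    return len(chunk)


async def embed_all(chunks, limit: int = 2):
    # asyncio.Semaphore plays the role of trio.CapacityLimiter here.
    sem = asyncio.Semaphore(limit)

    async def worker(chunk: str) -> int:
        async with sem:
            return await embed_chunk(chunk)

    # trio equivalent (for comparison):
    #   async with trio.open_nursery() as nursery:
    #       nursery.start_soon(worker, chunk)
    # gather returns results in the order the awaitables were passed.
    return await asyncio.gather(*(worker(c) for c in chunks))


lengths = asyncio.run(embed_all(["a", "bb", "ccc"]))
```

Unlike a nursery, `asyncio.gather` returns the results directly, so no shared result list is needed.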

### Type of change
- [x] Refactoring
2025-12-09 19:23:14 +08:00
a94b3b9df2 Refa: treat MinerU as an OCR model (#11849)
### What problem does this PR solve?

 Treat MinerU as an OCR model.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2025-12-09 18:54:14 +08:00
43f51baa96 Fix errors (#11804)
### What problem does this PR solve?

1. typos
2. grammar errors.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-08 12:21:18 +08:00
51ec708c58 Refa: cleanup synchronous functions in chat_model and implement synchronization for conversation and dialog chats (#11779)
### What problem does this PR solve?

Cleanup synchronous functions in chat_model and implement
synchronization for conversation and dialog chats.

### Type of change

- [x] Refactoring
- [x] Performance Improvement
2025-12-08 09:43:03 +08:00
8de6b97806 Feature (canvas): Add Api for download "message" component output's file (#11772)
### What problem does this PR solve?

- Add an API for downloading the output file of the "message" component.
- Change the attachment output type check from tuple to dictionary,
because the attachment is not an instance of tuple.
- Update the message type to `message_end` to avoid the problem where
the content does not send an error message when the message type is
`ans["data"]["content"]`.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2025-12-05 19:42:35 +08:00