Commit Graph

542 Commits

Author SHA1 Message Date
f72a35188d refactor: remove debug print statements (#12598)
### What problem does this PR solve?

This PR eliminates unnecessary debug print statements that were left in
hot paths of the codebase.

### Type of change

- [x] Refactoring
2026-01-14 10:05:34 +08:00
2e09db02f3 feat: add paddleocr parser (#12513)
### What problem does this PR solve?

Add PaddleOCR as a new PDF parser.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-01-09 17:48:45 +08:00
64b1e0b4c3 Feat: The translation model type options should be consistent with the model's labels. #1036 (#12537)
### What problem does this PR solve?

Feat: The translation model type options should be consistent with the
model's labels. #1036

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2026-01-09 17:39:40 +08:00
9562762af2 docs: fix embedding model switching tooltip (#12517)
### What problem does this PR solve?
After version 0.22.1, the embedding model supports switching; the
corresponding tooltip needs to be updated.

### Type of change

- [x] Documentation Update
2026-01-09 10:19:40 +08:00
2a4627d9a0 Fix: Issues and style fixes related to the 'Memory' page (#12469)
### What problem does this PR solve?

Fix:  Some bugs
- Issues and style fixes related to the 'Memory' page
- Data source icon replacement
- Build optimization

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-01-07 10:03:54 +08:00
7d4d687dde Feat: Bitbucket connector (#12332)
### What problem does this PR solve?

Feat: Bitbucket connector NOT READY TO MERGE

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-31 17:18:30 +08:00
750335978c Fix: Batch parsing problem (#12358)
### What problem does this PR solve?

Fix: Batch parsing problem

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-31 13:52:50 +08:00
ae4692a845 Fix: Bug fixed (#12345)
### What problem does this PR solve?

Fix: Bug fixed
- Memory type multilingual display
- Name modification is prohibited in Data source
- Jump directly to Metadata settings

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-31 10:43:57 +08:00
a7e466142d Fix: Dataset parse logic (#12330)
### What problem does this PR solve?

Fix: Dataset  logic of parser

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-30 19:53:00 +08:00
bffdb5fb11 Feat: add IMAP data source integration with configuration and sync capabilities (#12316)
### What problem does this PR solve?
issue:
#12217 [#12313](https://github.com/infiniflow/ragflow/issues/12313)
change:
add IMAP data source integration with configuration and sync
capabilities

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-30 17:09:13 +08:00
5903d1c8f1 Feat: GitHub connector (#12314)
### What problem does this PR solve?

Feat: GitHub connector

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-30 15:09:52 +08:00
5402666b19 docs: fix typos (#12301)
### What problem does this PR solve?

fix typos

### Type of change

- [x] Documentation Update
2025-12-30 09:39:28 +08:00
c3ae1aaecd Feat: Gitlab connector (#12248)
### What problem does this PR solve?

Feat: Gitlab connector
Fix: submit button in darkmode

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-29 17:05:20 +08:00
a764f0a5b2 Feat: Add Asana data source integration and configuration options (#12239)
### What problem does this PR solve?

change: Add Asana data source integration and configuration options

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-29 13:28:37 +08:00
613d2c5790 Fix: Memory sava issue (#12243)
### What problem does this PR solve?

Fix: Memory sava issue

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-26 18:56:28 +08:00
c4a66204f0 Fix: Memory-related bug fixes (#12238)
### What problem does this PR solve?

Fix: Memory-related bug fixes
- Forget memory button text
- Adjust memory storage interface
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-26 15:56:41 +08:00
894bf995bb Fix: Memory-related bug fixes (#12226)
### What problem does this PR solve?

Fix: bugs fix
- table -> Table
- memory delete fail
- memory copywriting modified

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-26 12:24:05 +08:00
cbcbbc41af Feat: The agent can only retrieve content from the knowledge base or memory. #4213 (#12224)
### What problem does this PR solve?

Feat: The agent can only retrieve content from the knowledge base or
memory. #4213

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-26 12:10:13 +08:00
6044314811 Fix text issue (#12221)
### What problem does this PR solve?

Fix several text issues.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-26 11:18:08 +08:00
1812491679 Feat: add Airtable connector and integration for data synchronization (#12211)
### What problem does this PR solve?
change:
add Airtable connector and integration for data synchronization
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-25 17:50:41 +08:00
2817be14d5 Fix: Metadata tips info (#12209)
### What problem does this PR solve?

Fix: Metadata tips info

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-25 15:55:06 +08:00
a3ceb7a944 Update german language file (resubmission) (#12208)
### What problem does this PR solve?

Resubmission of updated German translation.

### Type of change

- [x] Other (please describe):

Contribution by RAGcon GmbH, visit us at https://www.ragcon.ai
2025-12-25 15:40:16 +08:00
89ea760e67 Fix: Add a no-data filter condition to MetaData (#12189)
### What problem does this PR solve?

Fix: Add a no-data filter condition to MetaData

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-25 12:13:18 +08:00
884aabd130 Fix: Fixed the issue of incorrect agent translation text. #10427 (#12172)
### What problem does this PR solve?

Fix: Fixed the issue of incorrect agent translation text. #10427

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-25 12:12:49 +08:00
4a2978150c Fix:Metadata saving, copywriting and other related issues (#12169)
### What problem does this PR solve?

Fix:Bugs Fixed
- Text overflow issues that caused rendering problems
- Metadata saving, copywriting and other related issues

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-25 12:12:32 +08:00
9a5c5c46f2 Fix: Add prompts when merging or deleting metadata. (#12138)
### What problem does this PR solve?

Fix: Add prompts when merging or deleting metadata.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-25 11:53:06 +08:00
6c93157b14 Refa: image table context window (#12132)
### What problem does this PR solve?

Image table context window

### Type of change

- [x] Refactoring
2025-12-23 19:51:01 +08:00
a958ddb27a refactor: reword locale translations (#12118)
### What problem does this PR solve?

Reword (in locales/en) "Image context window" to "Image & table context
window", etc.

### Type of change

- [x] Refactoring
2025-12-23 17:34:21 +08:00
b47f1afa35 fix: transformer toc prompt text incorrect (#12116)
### What problem does this PR solve?

Fix incorrect prompt texts in **Agent** canvas > **Transformer** >
**Result destination: Table of contents**

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-23 15:59:09 +08:00
b824185a3a Feat: Translate the text of the webhook debugging interface. #10427 (#12115)
### What problem does this PR solve?

Feat: Translate the text of the webhook debugging interface. #10427

### Type of change


- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: balibabu <assassin_cike@163.com>
2025-12-23 15:25:38 +08:00
8e6ddd7c1b Fix: Metadata bugs. (#12111)
### What problem does this PR solve?

Fix: Metadata bugs.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-23 14:16:57 +08:00
9e31631d8f Feat: Add memory multi-select dropdown to recall and message operator forms. #4213 (#12106)
### What problem does this PR solve?

Feat: Add memory multi-select dropdown to recall and message operator
forms. #4213

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-23 11:54:32 +08:00
bd4eb19393 Fix:Bugs fix (Reduce metadata saving steps ...) (#12095)
### What problem does this PR solve?

Fix:Bugs fix
- Configure memory and metadata (in Chinese)
- Add indexing modal
- Reduce metadata saving steps

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-23 11:50:35 +08:00
02efab7c11 Feat: Hide part of the message field in webhook mode #10427 (#12100)
### What problem does this PR solve?

Feat: Hide part of the message field in webhook mode  #10427

### Type of change


- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: balibabu <assassin_cike@163.com>
2025-12-23 10:45:05 +08:00
38ac6a7c27 feat: add image context window in dataset config (#12094)
### What problem does this PR solve?

Add image context window configuration in **Dataset** >
**Configduration** and **Dataset** > **Files** > **Parse** > **Ingestion
Pipeline** (**Chunk Method** modal)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-22 19:51:23 +08:00
b42b5fcf65 feat: display chunk type in chunk editor and dialog (#12086)
### What problem does this PR solve?

Display chunk type in chunk editor and dialog, may be one of below:
- Image
- Table
- Text

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-22 16:45:47 +08:00
2ddfcc7cf6 Images that appear consecutively in the dialogue are displayed using a carousel. #12076 (#12077)
### What problem does this PR solve?

Images that appear consecutively in the dialogue are displayed using a
carousel. #12076

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-22 14:41:02 +08:00
5aea82d9c4 Feat: Separate connectors from s3 (#12045)
### What problem does this PR solve?

Feat: Separate connectors from s3 #12008 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Overview:
<img width="1500" alt="image"
src="https://github.com/user-attachments/assets/d54fea7a-7294-4ec0-ab6c-9753b3f03a72"
/>

Oracle: 
<img width="350" alt="image"
src="https://github.com/user-attachments/assets/bca140c1-33d8-4950-afdc-153407eedc46"
/>
2025-12-22 09:36:16 +08:00
eeb36a5ce7 Feature: Implement metadata functionality (#12049)
### What problem does this PR solve?

Feature: Implement metadata functionality

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-19 19:13:33 +08:00
2844700dc4 Refa: better UX for adding OCR model (#12034)
### What problem does this PR solve?

Better UX for adding OCR model.

### Type of change

- [x] Refactoring
2025-12-19 11:34:21 +08:00
f8fd1ea7e1 Feat: Further update Bedrock model configs (#12029)
### What problem does this PR solve?

Feat: Further update Bedrock model configs #12020 #12008

<img width="700" alt="2b4f0f7fab803a2a2d5f345c756a2c69"
src="https://github.com/user-attachments/assets/e1b9eaad-5c60-47bd-a6f4-88a104ce0c63"
/>
<img width="700" alt="afe88ec3c58f745f85c5c507b040c250"
src="https://github.com/user-attachments/assets/9de39745-395d-4145-930b-96eb452ad6ef"
/>
<img width="700" alt="1a21bb2b7cd8003dce1e5207f27efc69"
src="https://github.com/user-attachments/assets/ddba1682-6654-4954-aa71-41b8ebc04ac0"
/>

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-19 11:32:20 +08:00
e84d5412bc Feat: bedrock iam authentication (#12020)
### What problem does this PR solve?

Feat: bedrock iam authentication #12008 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-18 17:13:09 +08:00
ce161f09cc feat: add image uploader in edit chunk dialog (#12003)
### What problem does this PR solve?

Add image uploader in edit chunk dialog for replacing image chunk

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-18 09:33:52 +08:00
e58271ef76 feat: add toc option in transformer node in ingestion pipeline (#11992)
### What problem does this PR solve?

Add TOC (Table of contents) option in Ingestion Pipeline canvas >
Transformer node

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-17 15:51:55 +08:00
5e05f43c3d Update default prompt (#11984)
### What problem does this PR solve?

New default prompt:

```
You are an intelligent assistant. Your primary function is to answer questions based strictly on the provided knowledge base.

**Essential Rules:**
- Your answer must be derived **solely** from this knowledge base: `{knowledge}`.
- **When information is available**: Summarize the content to give a detailed answer.
- **When information is unavailable**: Your response must contain this exact sentence: "The answer you are looking for is not found in the knowledge base!"
- **Always consider** the entire conversation history.
```

Also fix some grammar errors.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-17 12:57:24 +08:00
205a6483f5 Feature:memory function complete (#11982)
### What problem does this PR solve?

memory function complete

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-17 12:35:26 +08:00
2595644dfd feat: add ingestion pipeline children delimiters configs (#11979)
### What problem does this PR solve?

Add children delimiters for Ingestion pipeline config

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-17 11:18:54 +08:00
30019dab9f Change knowledge base to dataset (#11976)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-17 10:03:33 +08:00
49c74d08e8 Feature/mineru improvements (#11938)
我已在下面的评论中用中文重复说明。

### What problem does this PR solve?

## Summary
This PR enhances the MinerU document parser with additional
configuration options, giving users more control over PDF parsing
behavior and improving support for multilingual documents.

## Changes

### Backend (`deepdoc/parser/mineru_parser.py`)
- Added configurable parsing options:
- **Parse Method**: `auto`, `txt`, or `ocr` — allows users to choose the
extraction strategy
- **Formula Recognition**: Toggle for enabling/disabling formula
extraction (useful to disable for Cyrillic documents where it may cause
issues)
- **Table Recognition**: Toggle for enabling/disabling table extraction
- Added language code mapping (`LANGUAGE_TO_MINERU_MAP`) to translate
RAGFlow language settings to MinerU-compatible language codes for better
OCR accuracy
- Improved parser configuration handling to pass these options through
the processing pipeline

### Frontend (`web/`)
- Created new `MinerUOptionsFormField` component that conditionally
renders when MinerU is selected as the layout recognition engine
- Added UI controls for:
  - Parse method selection (dropdown)
  - Formula recognition toggle (switch)
  - Table recognition toggle (switch)
- Added i18n translations for English and Chinese
- Integrated the options into both the dataset creation dialog and
dataset settings page

### Integration
- Updated `rag/app/naive.py` to forward MinerU options to the parser
- Updated task service to handle the new configuration parameters

## Why
MinerU is a powerful document parser, but the default settings don't
work well for all document types. This PR allows users to:
1. Choose the best parsing method for their documents
2. Disable formula recognition for Cyrillic/non-Latin scripts where it
causes issues
3. Control table extraction based on document needs
4. Benefit from automatic language detection for better OCR results

## Testing
- [x] Tested MinerU parsing with different parse methods
- [x] Verified UI renders correctly when MinerU is selected/deselected
- [x] Confirmed settings persist correctly in dataset configuration

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: user210 <user210@rt>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-16 13:15:25 +08:00
a98887d4ca Fix: Bug fixes (#11960)
### What problem does this PR solve?

Fix: Bug fixes

New search popup style modification
Fixed multilingual settings not updating immediately on personal center
page
Changed overlapped percent to percentage format, with maximum value of
30%
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-16 09:44:06 +08:00