### What problem does this PR solve?
Fix: Some bugs
- Issues and style fixes related to the 'Memory' page
- Data source icon replacement
- Build optimization
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Some bugs
- In a production environment, a second-level page refresh results in a
white screen.
- The knowledge graph cannot be opened.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Refactor: UmiJs -> Vite+React
### Type of change
- [x] Refactoring
---------
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
Feat: Refactoring the documentation page using shadcn. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Adapt the theme of the documentation page. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: On the agent page and chat page, you can only select knowledge
bases that use the same embedding model. #12320
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: S3 parameter error
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: If there is an error message on the chat page, the subsequent
message references will not display correctly. #12252
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Frontend cannot sync document window context.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Liu An <asiro@qq.com>
### What problem does this PR solve?
Improve image table context.
Current strategy in attach_media_context:
- Order by position when possible: if any chunk has page/position info,
sort by (page, top, left), otherwise keep original order.
- Apply only to media chunks: images use image_context_size, tables use
table_context_size.
- Primary matching: on the same page, choose a text chunk whose vertical
span overlaps the media, then pick the one with the closest vertical
midpoint.
- Fallback matching: if no overlap on that page, choose the nearest text
chunk on the same page (page-head uses the next text; page-tail uses the
previous text).
- Context extraction: inside the chosen text chunk, find a mid-sentence
boundary near the text midpoint, then take context_size tokens split
before/after (total budget).
- No multi-chunk stitching: context comes from a single text chunk to
avoid mixing unrelated segments.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Memory-related bug fixes
- Forget memory button text
- Adjust memory storage interface
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
As title
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Feat: Preview the image at the bottom of the message #12076
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Can not select LLM in memory page
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: An image carousel is displayed at the bottom of the agent's chat
messages. #12076
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Images referenced in chat messages are displayed as a carousel at
the bottom of the message. #12076
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Add a no-data filter condition to MetaData
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix:Bugs Fixed
- Text overflow issues that caused rendering problems
- Metadata saving, copywriting and other related issues
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Add prompts when merging or deleting metadata.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: The form waiting for input is not displayed in the dialog message.
#12129
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Metadata bugs.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Add memory multi-select dropdown to recall and message operator
forms. #4213
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix:Bugs fix
- Configure memory and metadata (in Chinese)
- Add indexing modal
- Reduce metadata saving steps
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add image context window configuration in **Dataset** >
**Configduration** and **Dataset** > **Files** > **Parse** > **Ingestion
Pipeline** (**Chunk Method** modal)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Hide drop-zone upload button when picked an image in chunk editor dialog
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
feature: Complete metadata functionality
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Display chunk type in chunk editor and dialog, may be one of below:
- Image
- Table
- Text
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Images that appear consecutively in the dialogue are displayed using a
carousel. #12076
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: When the webhook returns a field in streaming format, the message
displays the status field. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: balibabu <assassin_cike@163.com>
### What problem does this PR solve?
Add optional cache busting for image
#12003
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add support for multiple Retrieval tools under an agent
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feature: Implement metadata functionality
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Images appearing consecutively in the dialogue are merged and
displayed in a carousel. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Remove HMAC from the webhook #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add image uploader in edit chunk dialog for replacing image chunk
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
…ildren delimiter
### What problem does this PR solve?
Fix the issue of unable to save **Files > Ingestion Pipeline (Modal)**
config without modifying children delimiter
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Only support MinerU-API now, still need to complete frontend for
pipeline to allow the configuration of MinerU options.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add children delimiters for Ingestion pipeline config
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
我已在下面的评论中用中文重复说明。
### What problem does this PR solve?
## Summary
This PR enhances the MinerU document parser with additional
configuration options, giving users more control over PDF parsing
behavior and improving support for multilingual documents.
## Changes
### Backend (`deepdoc/parser/mineru_parser.py`)
- Added configurable parsing options:
- **Parse Method**: `auto`, `txt`, or `ocr` — allows users to choose the
extraction strategy
- **Formula Recognition**: Toggle for enabling/disabling formula
extraction (useful to disable for Cyrillic documents where it may cause
issues)
- **Table Recognition**: Toggle for enabling/disabling table extraction
- Added language code mapping (`LANGUAGE_TO_MINERU_MAP`) to translate
RAGFlow language settings to MinerU-compatible language codes for better
OCR accuracy
- Improved parser configuration handling to pass these options through
the processing pipeline
### Frontend (`web/`)
- Created new `MinerUOptionsFormField` component that conditionally
renders when MinerU is selected as the layout recognition engine
- Added UI controls for:
- Parse method selection (dropdown)
- Formula recognition toggle (switch)
- Table recognition toggle (switch)
- Added i18n translations for English and Chinese
- Integrated the options into both the dataset creation dialog and
dataset settings page
### Integration
- Updated `rag/app/naive.py` to forward MinerU options to the parser
- Updated task service to handle the new configuration parameters
## Why
MinerU is a powerful document parser, but the default settings don't
work well for all document types. This PR allows users to:
1. Choose the best parsing method for their documents
2. Disable formula recognition for Cyrillic/non-Latin scripts where it
causes issues
3. Control table extraction based on document needs
4. Benefit from automatic language detection for better OCR results
## Testing
- [x] Tested MinerU parsing with different parse methods
- [x] Verified UI renders correctly when MinerU is selected/deselected
- [x] Confirmed settings persist correctly in dataset configuration
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: user210 <user210@rt>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Bug fixes
New search popup style modification
Fixed multilingual settings not updating immediately on personal center
page
Changed overlapped percent to percentage format, with maximum value of
30%
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)