### What problem does this PR solve?
Fix: Add prompts when merging or deleting metadata.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Metadata bugs.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix:Bugs fix
- Configure memory and metadata (in Chinese)
- Add indexing modal
- Reduce metadata saving steps
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add image context window configuration in **Dataset** >
**Configduration** and **Dataset** > **Files** > **Parse** > **Ingestion
Pipeline** (**Chunk Method** modal)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
feature: Complete metadata functionality
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feature: Implement metadata functionality
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
…ildren delimiter
### What problem does this PR solve?
Fix the issue of unable to save **Files > Ingestion Pipeline (Modal)**
config without modifying children delimiter
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Only support MinerU-API now, still need to complete frontend for
pipeline to allow the configuration of MinerU options.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add children delimiters for Ingestion pipeline config
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
我已在下面的评论中用中文重复说明。
### What problem does this PR solve?
## Summary
This PR enhances the MinerU document parser with additional
configuration options, giving users more control over PDF parsing
behavior and improving support for multilingual documents.
## Changes
### Backend (`deepdoc/parser/mineru_parser.py`)
- Added configurable parsing options:
- **Parse Method**: `auto`, `txt`, or `ocr` — allows users to choose the
extraction strategy
- **Formula Recognition**: Toggle for enabling/disabling formula
extraction (useful to disable for Cyrillic documents where it may cause
issues)
- **Table Recognition**: Toggle for enabling/disabling table extraction
- Added language code mapping (`LANGUAGE_TO_MINERU_MAP`) to translate
RAGFlow language settings to MinerU-compatible language codes for better
OCR accuracy
- Improved parser configuration handling to pass these options through
the processing pipeline
### Frontend (`web/`)
- Created new `MinerUOptionsFormField` component that conditionally
renders when MinerU is selected as the layout recognition engine
- Added UI controls for:
- Parse method selection (dropdown)
- Formula recognition toggle (switch)
- Table recognition toggle (switch)
- Added i18n translations for English and Chinese
- Integrated the options into both the dataset creation dialog and
dataset settings page
### Integration
- Updated `rag/app/naive.py` to forward MinerU options to the parser
- Updated task service to handle the new configuration parameters
## Why
MinerU is a powerful document parser, but the default settings don't
work well for all document types. This PR allows users to:
1. Choose the best parsing method for their documents
2. Disable formula recognition for Cyrillic/non-Latin scripts where it
causes issues
3. Control table extraction based on document needs
4. Benefit from automatic language detection for better OCR results
## Testing
- [x] Tested MinerU parsing with different parse methods
- [x] Verified UI renders correctly when MinerU is selected/deselected
- [x] Confirmed settings persist correctly in dataset configuration
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: user210 <user210@rt>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Bug fixes
New search popup style modification
Fixed multilingual settings not updating immediately on personal center
page
Changed overlapped percent to percentage format, with maximum value of
30%
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Refactor the order of the dataset config items.
2. Refactor the text of retrieval test.
3. Refactor typos
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Retrieval metadata filtering adds semi-automatic mode, and users can
manually check the metadata key that participates in LLM to generate
filter conditions.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix:Modify the name of the Overlapped percent field
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feature: Memory interface integration testing
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Features: Memory page rendering and other bug fixes
- Rendering of the Memory list page
- Rendering of the message list page in Memory
- Fixed an issue where the empty state was incorrectly displayed when
search criteria were applied
- Added a web link for the API-Key
- modifying the index_mode attribute of the Confluence data source.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feature:Add voice dialogue functionality to the agent application
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Delete useless request hooks. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Added styles for empty states on the page.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Delete useless knowledge base, chat, and search files. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Refactoring and enhancing the functionality of the delete
confirmation dialog component
- Refactoring and enhancing the functionality of the delete confirmation
dialog component
- Modifying the style of the user center
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Modify the style of the user center
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes: Added session variable types and modified configuration
- Added more types of session variables
- Modified the embedding model switching logic in the knowledge base
configuration
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes: Fixed model provider issues and improved some features
- Removed the old login page
- Updated model provider icons
- Added RAPTOR modification range parameter
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fixes: Bugs fixed
- Removed invalid code,
- Modified the user center style,
- Added an automatic data source parsing switch.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Added some prompts and polling functionality to retrieve data
source logs.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Improve some functional issues with the data source. #10703
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feature: Added data source functionality
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Refactor the similarity slider component and modify the style of
the dataset-test page
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue where dataset log avatars were displayed
incorrectly #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Resolved the issue where the Generate button must be refreshed
after generating chunk to take effect
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Fixed the issue that filename is not displayed on the overview
page; and added the processing logic of the generate button when chunk=0
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
fix(edit-tag): Fix the bug that the edit-tag tag cannot be deleted #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Optimize code and fix ts type errors #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Restore the sidebar description of DP slicing method #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Bug fixes#9869
- Added the disabled attribute to control the modal confirmation button
state
- Conditionally rendered the catalog enhancement toggle component
- Replaced the selector component and removed unused imports
- Removed redundant catalog enhancement text in the Chinese language
pack
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix (dataset setting): Remove the introduction and use of TagItems in
the configuration. #9869
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Switch the default theme from light mode to dark mode and improve
some styles #9869
-Update UI component styles such as input boxes, tables, and prompt
boxes
-Optimize login page layout and style details
-Revise some of the wording, such as uniformly changing "data flow" to
"pipeline"
-Adjust the parser to support the markdown type
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Update the parsing editor to support dynamic field names and
optimize UI styles #9869
-Modify the default background color of UI components such as Input and
Select to 'bg bg base'`
-Remove TagItems component reference from naive configuration page
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Optimized the login page and fixed some known issues. #9869
- Added the FlipCard3D component to implement a 3D flip effect on the
login/registration forms.
- Adjusted the Spotlight component to support custom positioning and
color configurations.
- Updated the route to point to the new login page /login-next.
- Added a cancel interface to the auto-generate function.
- Fixed scroll bar issues in PDF preview.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Added table of contents extraction functionality and optimized form
item layout #9869
- Added `EnableTocToggle` component to toggle table of contents
extraction on and off
- Added multiple parser configuration components (such as naive, book,
laws, etc.), displaying different parser components based on built-in
slicing methods
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: Bug fixes#9869
- Adjusted the breadcrumb display logic on the data flow results page
- Added the default display of "Local Upload" to the Source field in the
dataset overview table
- Replaced the original Mindmap Task field with the GraphRAG Task field
on the dataset settings page
- Optimized the build button status check criteria and adjusted the
progress information display logic
- Introduced a Tooltip in the parsing status cell component and removed
redundant Button components
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The dataset uses the new id to obtain the knowledge graph #10333
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)