### What problem does this PR solve?
Deduplicate metadata lists during updates.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: /kb/update does not update FileService
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
change:
enhance webhook response to include status and success fields and
simplify ReAct agent
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Refactor implementation of add_llm
2. Add speech to text model.
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Document list and filter supports metadata filtering.
**OR within the same field, AND across different fields**
Example 1 (multi-field AND):
```markdown
Doc1 metadata: { "a": "b", "as": ["a", "b", "c"] }
Doc2 metadata: { "a": "x", "as": ["d"] }
Query:
metadata = {
"a": ["b"],
"as": ["d"]
}
Result:
Doc1 matches a=b but not as=d → excluded
Doc2 matches as=d but not a=b → excluded
Final result: empty
```
Example 2 (same field OR):
```markdown
Doc1 metadata: { "as": ["a", "b", "c"] }
Doc2 metadata: { "as": ["d"] }
Query:
metadata = {
"as": ["a", "d"]
}
Result:
Doc1 matches as=a → included
Doc2 matches as=d → included
Final result: Doc1 + Doc2
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
After a file in the file list is associated with a knowledge base, the
knowledge base document ID is returned
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Chats completions API supports metadata filtering.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:
https://github.com/infiniflow/ragflow/issues/10427https://github.com/infiniflow/ragflow/issues/8115
change:
- Support for Multiple HTTP Methods (POST / GET / PUT / PATCH / DELETE /
HEAD)
- Security Validation
1. max_body_size
2. IP whitelist
3. rate limit
4. token / basic / jwt authentication
- File Upload Support
- Unified Content-Type Handling
- Full Schema-Based Extraction & Type Validation
- Two Execution Modes: Immediately / Streaming
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Trace information can be returned by the agent completion API.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Reject default admin account log in to normal services
#11854#11673
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
…tant, but model is available via UI
Fix: add multimodel models in chat api
Fixes#8549
### What problem does this PR solve?
Add a parameter model_type in chat api.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Feat: Enable image edit in edit_chunk
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### PR details
feat: Add Excel export support and fix variable reference regex
Changes:
- Add Excel export output format option to Message component
- Apply nest_asyncio patch to handle nested event loops
- Fix async generator iteration in canvas_app.py debug endpoint
- Add underscore support in variable reference regex pattern
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Shivam Johri <shivamjohri@Shivams-MacBook-Air.local>
我已在下面的评论中用中文重复说明。
### What problem does this PR solve?
## Summary
This PR enhances the MinerU document parser with additional
configuration options, giving users more control over PDF parsing
behavior and improving support for multilingual documents.
## Changes
### Backend (`deepdoc/parser/mineru_parser.py`)
- Added configurable parsing options:
- **Parse Method**: `auto`, `txt`, or `ocr` — allows users to choose the
extraction strategy
- **Formula Recognition**: Toggle for enabling/disabling formula
extraction (useful to disable for Cyrillic documents where it may cause
issues)
- **Table Recognition**: Toggle for enabling/disabling table extraction
- Added language code mapping (`LANGUAGE_TO_MINERU_MAP`) to translate
RAGFlow language settings to MinerU-compatible language codes for better
OCR accuracy
- Improved parser configuration handling to pass these options through
the processing pipeline
### Frontend (`web/`)
- Created new `MinerUOptionsFormField` component that conditionally
renders when MinerU is selected as the layout recognition engine
- Added UI controls for:
- Parse method selection (dropdown)
- Formula recognition toggle (switch)
- Table recognition toggle (switch)
- Added i18n translations for English and Chinese
- Integrated the options into both the dataset creation dialog and
dataset settings page
### Integration
- Updated `rag/app/naive.py` to forward MinerU options to the parser
- Updated task service to handle the new configuration parameters
## Why
MinerU is a powerful document parser, but the default settings don't
work well for all document types. This PR allows users to:
1. Choose the best parsing method for their documents
2. Disable formula recognition for Cyrillic/non-Latin scripts where it
causes issues
3. Control table extraction based on document needs
4. Benefit from automatic language detection for better OCR results
## Testing
- [x] Tested MinerU parsing with different parse methods
- [x] Verified UI renders correctly when MinerU is selected/deselected
- [x] Confirmed settings persist correctly in dataset configuration
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: user210 <user210@rt>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Refactor metadata filter.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Correct metadata update behavior. #11912
When update `value` is omitted, the corresponding keys are updated to
`"value"` regardless of their current values.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
change:
async issue and sensitive logging
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Retrieval metadata filtering adds semi-automatic mode, and users can
manually check the metadata key that participates in LLM to generate
filter conditions.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add metadata condition in document list.
Add metadata bulk update.
Add metadata summary.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Fix RuntimeError when calling mindmap endpoint by converting
`gen_mindmap()` to async function and using `await` instead of
`asyncio.run()`.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Manage and display memory datasets.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Treat MinerU as an OCR model.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
### What problem does this PR solve?
Cleanup synchronous functions in chat_model and implement
synchronization for conversation and dialog chats.
### Type of change
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
-Add Api for download "message" component output's file
-Change the attachment output type check from tuple to
dictionary,because 'attachement' is not instance of tuple
-Update the message type to message_end to avoid the problem that
content does not send an error message when the message type is ans
["data"] ["content"]
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
## Problem
The SDK API endpoint `DELETE /datasets/{dataset_id}/chunks` only updates
database status but does not send cancellation signal via Redis, causing
background parsing tasks to continue and eventually complete (status
becomes DONE instead of CANCEL).
## Root Cause
The SDK endpoint was missing the `cancel_all_task_of(id)` call that the
web API
([api/apps/document_app.py](cci:7://file:///d:/workspace1/ragflow-admin/api/apps/document_app.py:0:0-0:0))
uses to properly stop background tasks.
## Solution
Added `cancel_all_task_of(id)` call in the
[stop_parsing](cci:1://file:///d:/workspace1/ragflow/api/apps/sdk/doc.py:785:0-855:23)
function to send cancellation signal via Redis, consistent with the web
API behavior.
## Related Issue
Fixes#11745
Co-authored-by: tedhappy <tedhappy@users.noreply.github.com>
### What problem does this PR solve?
Fixes the `AttributeError: 'bool' object has no attribute 'items'` error
when updating the `enabled` parameter of a document via the Python SDK
(Issue #11721).
Background: When calling `Document.update({"enabled": True/False})`
through the SDK, the server-side API returned a boolean `data=True` in
the response (instead of a dictionary). The SDK's `_update_from_dict`
method (in `base.py`) expects a dictionary to iterate over with
`.items()`, leading to an immediate AttributeError during response
parsing. This prevented successful synchronization of the updated
`enabled` status to the local SDK object, even if the server-side
database/update index operations succeeded.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Additional Context (optional, for clarity)
- **Root Cause**: Server returned `data=True` (boolean) for `enabled`
parameter updates, violating the SDK's expectation of a dictionary-type
`data` field.
- **Fix Logic**:
1. Removed the separate `return get_result(data=True)` in the `enabled`
update branch to unify response flow.
2.
- **Backward Compatibility**: No breaking changes—other update scenarios
(e.g., renaming documents, modifying chunk methods) remain unaffected,
and the response format stays consistent.
Co-authored-by: shirukai <shirukai@hollysysdigital.com>
### What problem does this PR solve?
Update some docs and comments, since 'File manager' is rename to 'File'
### Type of change
- [x] Documentation Update
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>