### What problem does this PR solve?
Feature: Implement metadata functionality
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Images appearing consecutively in the dialogue are merged and
displayed in a carousel. #10427
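As a rough illustration of the merging behavior described above, the following sketch groups adjacent image items into one carousel entry. The item shape (`type`, `url`, `text`) and the function name are assumptions made for this example and are not taken from the PR.

```python
from itertools import groupby


def group_consecutive_images(items):
    """Merge runs of two or more adjacent image items into one carousel entry.

    `items` is assumed to be a list of dicts with a "type" field; this shape is
    hypothetical and only serves the illustration.
    """
    grouped = []
    for is_image, run in groupby(items, key=lambda item: item["type"] == "image"):
        run = list(run)
        if is_image and len(run) > 1:
            # Several images in a row become a single carousel item.
            grouped.append({"type": "carousel", "images": [i["url"] for i in run]})
        else:
            # Text items and isolated images are kept unchanged.
            grouped.extend(run)
    return grouped
```

A single image surrounded by text stays a plain image; only runs of two or more are merged.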
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display error messages from intermediate nodes. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Remove HMAC from the webhook #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: bedrock iam authentication #12008
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add image uploader in edit chunk dialog for replacing image chunk
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add TOC (Table of contents) option in Ingestion Pipeline canvas >
Transformer node
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue of empty memory parameters
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix the issue where the **Files > Ingestion Pipeline (Modal)** config
could not be saved without modifying the children delimiter
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Only MinerU-API is supported for now; the pipeline frontend still needs
to be completed to allow configuration of MinerU options.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
New default prompt:
```
You are an intelligent assistant. Your primary function is to answer questions based strictly on the provided knowledge base.
**Essential Rules:**
- Your answer must be derived **solely** from this knowledge base: `{knowledge}`.
- **When information is available**: Summarize the content to give a detailed answer.
- **When information is unavailable**: Your response must contain this exact sentence: "The answer you are looking for is not found in the knowledge base!"
- **Always consider** the entire conversation history.
```
Also fix some grammar errors.
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Add children delimiters for Ingestion pipeline config
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### PR details
feat: Add Excel export support and fix variable reference regex
Changes:
- Add Excel export output format option to Message component
- Apply nest_asyncio patch to handle nested event loops
- Fix async generator iteration in canvas_app.py debug endpoint
- Add underscore support in variable reference regex pattern
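To make the last item concrete, here is a minimal sketch of a variable reference pattern that accepts underscores. The `{component_id@variable_name}` syntax and the pattern itself are assumptions for illustration; the actual regex changed in this PR is not shown in the description.

```python
import re

# Hypothetical pattern for references such as "{begin_0@user_name}".
# The underscore appears in both character classes, so identifiers
# containing "_" are matched as a whole instead of being cut short.
VARIABLE_REF = re.compile(r"\{([A-Za-z0-9_:]+@[A-Za-z0-9_.]+)\}")


def find_variable_refs(text: str) -> list[str]:
    """Return every variable reference found in `text`, without the braces."""
    return VARIABLE_REF.findall(text)
```

Without `_` in the character classes, a name like `user_name` would fail to match, which is the kind of bug the item describes.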
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Shivam Johri <shivamjohri@Shivams-MacBook-Air.local>
I have repeated the explanation in Chinese in the comments below.
### What problem does this PR solve?
## Summary
This PR enhances the MinerU document parser with additional
configuration options, giving users more control over PDF parsing
behavior and improving support for multilingual documents.
## Changes
### Backend (`deepdoc/parser/mineru_parser.py`)
- Added configurable parsing options:
  - **Parse Method**: `auto`, `txt`, or `ocr`, which lets users choose the
    extraction strategy
  - **Formula Recognition**: Toggle for enabling/disabling formula
    extraction (useful to disable for Cyrillic documents where it may cause
    issues)
  - **Table Recognition**: Toggle for enabling/disabling table extraction
- Added language code mapping (`LANGUAGE_TO_MINERU_MAP`) to translate
RAGFlow language settings to MinerU-compatible language codes for better
OCR accuracy
- Improved parser configuration handling to pass these options through
the processing pipeline
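The sketch below shows one way the options listed above might be assembled before being passed to the parser. Only the name `LANGUAGE_TO_MINERU_MAP` and the three parse methods come from this description; the map entries, the option key names, and the helper `build_mineru_options` are hypothetical placeholders.

```python
# Illustrative entries only; the real LANGUAGE_TO_MINERU_MAP may differ.
LANGUAGE_TO_MINERU_MAP = {
    "English": "en",
    "Chinese": "ch",
    "Russian": "cyrillic",
}

PARSE_METHODS = {"auto", "txt", "ocr"}


def build_mineru_options(language, parse_method="auto",
                         formula_enable=True, table_enable=True):
    """Collect the parser options into one dict (key names are assumed)."""
    if parse_method not in PARSE_METHODS:
        raise ValueError(f"unsupported parse method: {parse_method!r}")
    return {
        "parse_method": parse_method,
        "formula_enable": formula_enable,
        "table_enable": table_enable,
        # Unmapped languages yield None so the caller can pick a default.
        "lang": LANGUAGE_TO_MINERU_MAP.get(language),
    }
```

For example, a Cyrillic document could be parsed with `build_mineru_options("Russian", parse_method="ocr", formula_enable=False)`.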
### Frontend (`web/`)
- Created new `MinerUOptionsFormField` component that conditionally
renders when MinerU is selected as the layout recognition engine
- Added UI controls for:
  - Parse method selection (dropdown)
  - Formula recognition toggle (switch)
  - Table recognition toggle (switch)
- Added i18n translations for English and Chinese
- Integrated the options into both the dataset creation dialog and
dataset settings page
### Integration
- Updated `rag/app/naive.py` to forward MinerU options to the parser
- Updated task service to handle the new configuration parameters
## Why
MinerU is a powerful document parser, but the default settings don't
work well for all document types. This PR allows users to:
1. Choose the best parsing method for their documents
2. Disable formula recognition for Cyrillic/non-Latin scripts where it
causes issues
3. Control table extraction based on document needs
4. Benefit from automatic language detection for better OCR results
## Testing
- [x] Tested MinerU parsing with different parse methods
- [x] Verified UI renders correctly when MinerU is selected/deselected
- [x] Confirmed settings persist correctly in dataset configuration
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: user210 <user210@rt>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Bug fixes
- Modified the style of the new search popup
- Fixed multilingual settings not updating immediately on the personal
center page
- Changed overlapped percent to a percentage format, with a maximum value
of 30%
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Refactor the order of the dataset config items.
2. Refactor the text of retrieval test.
3. Fix typos.
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- Change the message format from 'key: value' to 'name: value' when user
submits the fillup form in agent chat.
- This resolves #11865
- I think this change makes sense, better aligning the form and the
replied message.
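A minimal sketch of the format change, assuming each form field carries an internal `key`, a user-facing `name`, and a `value` (this field shape and the function name are hypothetical):

```python
def format_form_submission(fields):
    """Render submitted form fields as 'name: value' lines.

    Each field is assumed to be a dict with "key", "name", and "value";
    the display name is used so the message matches the labels in the form.
    """
    return "\n".join(f"{field['name']}: {field['value']}" for field in fields)
```

With a field `{"key": "q1", "name": "City", "value": "Rome"}`, the message reads `City: Rome` rather than `q1: Rome`.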
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Set the return value of the webhook to a string. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Refactor metadata filter.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Displaying the file option in the webhook's request body #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR introduces a new Docs Generator agent component for producing
downloadable PDF, DOCX, or TXT files from Markdown content generated
within a RAGFlow workflow.
### **Key Features**
**Backend**
- New component: DocsGenerator (agent/component/docs_generator.py)
- Markdown → PDF/DOCX/TXT conversion
- Supports tables, lists, code blocks, headings, and rich formatting
- Configurable document style (fonts, margins, colors, page size,
orientation)
- Optional header logo and footer with page numbers/timestamps
**Frontend**
- New configuration UI for the Docs Generator
- Download button integrated into the chat interface
- Output wired to the Message component
- Full i18n support
**Documentation**
Added component guide:
docs/guides/agent/agent_component_reference/docs_generator.md
**Usage**
Add the Docs Generator to a workflow, connect Markdown output from an
upstream component, configure metadata/style, and feed its output into
the Message component. Users will see a document download button
directly in the chat.
**Contributor Note**
We have been following RAGFlow for more than a year and a half now and
have worked extensively on personalizing the framework and integrating
it into several of our internal systems. Over the past year and a half,
we have built multiple platforms that rely on RAGFlow as a core
component, which has given us a strong appreciation for how flexible and
powerful the project is.
We also previously contributed the full Italian translation, and we were
glad to see it accepted. This new Docs Generator component was created
for our own production needs, and we believe that it may be useful for
many others in the community as well.
We want to sincerely thank the entire RAGFlow team for the remarkable
work you have done and continue to do. If there are opportunities to
contribute further, we would be glad to help whenever we have time
available. It would be a pleasure to support the project in any way we
can.
If appropriate, we would be glad to be listed among the project’s
contributors, but in any case we look forward to continuing to support
and contribute to the project.
PentaFrame Development Team
---------
Co-authored-by: PentaFrame <info@pentaframe.it>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Flatten the request schema of the webhook #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add mineru as a model manufacturer to the system. #10621
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: balibabu <assassin_cike@163.com>
### What problem does this PR solve?
Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add a semi-automatic mode to retrieval metadata filtering: users can
manually select which metadata keys the LLM uses to generate filter
conditions.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add configuration for webhook to the begin node. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Modify the name of the Overlapped percent field
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: The variables in the message node are not displaying correctly.
#11839
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Add complete Italian translation file with all UI sections
- Register Italian in LanguageAbbreviation enum and language maps
- Configure Italian translation in i18n config
- Add Italiano to language selector dropdown
### Type of change
- [x] Other (please describe):
## What
Added complete Italian language translation support to RAGFlow
## Changes
- Added comprehensive Italian translation file
([it.ts](ragflow/web/src/locales/it.ts)) with all UI sections
(1239 lines)
- Registered Italian in `LanguageAbbreviation` enum and all language
maps
- Configured Italian translation in i18n configuration
- Added "Italiano" to language selector dropdown
## Impact
- Italian users can now use RAGFlow in their native language
- All major UI components are translated, including:
  - Login/registration screens
  - Knowledge base management
  - Chat interface
  - Settings and configuration
  - Admin console
  - Error messages and notifications
## Testing
- Verified all translation keys are present
- Confirmed language selector shows "Italiano" correctly
- Tested that no translation keys are missing
- All UI sections properly translated
Co-authored-by: PentaFrame <info@pentaframe.it>
### What problem does this PR solve?
Feature: Memory interface integration testing
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Changed 'HightLightMarkdown' to 'HighLightMarkdown', and replaced
the private component with a public component.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
This commit resolves an incorrect import path for `ToastProps` and
`ToastActionElement` types within the `use-toast.tsx` hook.
The current path, `@/registry/default/ui/toast`, does not reflect the
actual file location in this repository.
The import in `src/components/hooks/use-toast.tsx` has been updated from
`@/registry/default/ui/toast` to the correct alias path:
`@/components/ui/toast`.
This ensures the types are resolved correctly and the codebase remains
clean and functional.
### What problem does this PR solve?
Features: Memory page rendering and other bug fixes
- Rendering of the Memory list page
- Rendering of the message list page in Memory
- Fixed an issue where the empty state was incorrectly displayed when
search criteria were applied
- Added a web link for the API-Key
- Modified the `index_mode` attribute of the Confluence data source
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Newly added models to OpenAI-API-Compatible are not displayed in
the LLM dropdown menu in a timely manner. #11774
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Users can chat directly without first creating a conversation.
#11768
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display the ID of the code image in the dialog. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: update front end for confluence connector
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: add more attributes for the Confluence connector.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## What problem does this PR solve?
Feat: detect docx support via header-byte inspection, a further
optimization based on #11684
Not all files with a .doc extension are truly legacy .doc formats, and
some are internally valid .docx documents.
The previous implementation relied on URL suffix checks, which
misclassified these cases and was therefore not reliable.
A .doc file that can be previewed:
[en2zh.doc](https://github.com/user-attachments/files/23921131/en2zh.doc)
A .doc file that cannot be previewed:
[file-sample_100kB.doc](https://github.com/user-attachments/files/23921134/file-sample_100kB.doc)
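The idea of header-byte inspection can be sketched as follows. This is not the PR's implementation (which is not shown here), only an illustration in Python using two well-known file signatures: ZIP-based formats such as .docx start with `PK\x03\x04`, while legacy OLE2 files such as .doc start with `D0 CF 11 E0 A1 B1 1A E1`. The function name is hypothetical.

```python
ZIP_MAGIC = b"PK\x03\x04"                         # ZIP container (docx, xlsx, ...)
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")    # OLE2 compound file (legacy doc)


def sniff_word_format(header: bytes) -> str:
    """Classify a file by its first bytes instead of its extension.

    Returns "docx" for a ZIP-based container, "doc" for an OLE2 compound
    file, and "unknown" otherwise. A ZIP signature only shows the container
    type; a stricter check would also look inside the archive.
    """
    if header.startswith(ZIP_MAGIC):
        return "docx"
    if header.startswith(OLE2_MAGIC):
        return "doc"
    return "unknown"
```

Reading only the first eight bytes of a file is enough for this check, so a file named `report.doc` that is internally a .docx can still be routed to the docx previewer.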
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feature: Add a loading status to the agent canvas page.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
# PR Description: Add Space Key Configuration for Confluence Data Source
### What problem does this PR solve?
This PR addresses issue #11638 where users requested the ability to
specify Confluence Space Keys when configuring a Confluence data source
connector.
**Problem:**
Currently, the RAGFlow UI for Confluence data sources only provides
fields for:
- Username
- Access Token
- Wiki Base URL
- Is Cloud checkbox
There is no way to specify which Confluence space(s) to sync, causing
RAGFlow to attempt syncing all accessible spaces. This is problematic
for users who:
- Only want to index specific spaces (e.g., only the HR or Documentation
space)
- Have access to many spaces but only need a subset
- Want to avoid unnecessary data transfer and processing
**Solution:**
The backend `ConfluenceConnector` class already supports a `space`
parameter in its `__init__()` method (line 1282 in
`common/data_source/confluence_connector.py`), but this parameter was
never exposed in the UI. This PR adds the missing UI field to allow
users to configure space filtering.
**User Impact:**
Users can now:
- Leave the field empty to sync all accessible spaces (default behavior)
- Specify a single space key (e.g., `DEV`)
- Specify multiple space keys separated by commas (e.g., `DEV,DOCS,HR`)
This gives users fine-grained control over which Confluence content gets
indexed into their RAGFlow knowledge base.
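As an illustration of the three input cases above, the sketch below normalizes the raw field value into a list of space keys. The helper name is hypothetical, and how the backend actually interprets a comma-separated value is not shown in this description.

```python
def parse_space_keys(raw: str) -> list[str]:
    """Split the "Space Key" field into individual keys.

    An empty or whitespace-only value yields an empty list, which stands for
    "sync all accessible spaces" in this sketch.
    """
    return [key.strip() for key in raw.split(",") if key.strip()]
```

Here `parse_space_keys("DEV, DOCS,HR")` gives three keys, and an empty field gives none.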
Fixes #11638
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---
## Implementation Details
### Changes Made
**1. Frontend UI (`web/src/pages/user-setting/data-source/contant.tsx`)**
- Added "Space Key" text input field to Confluence configuration form
- Field is optional (not required)
- Positioned after "Is Cloud" checkbox for logical grouping
- Added to initial values with empty string default
**2. Internationalization (`web/src/locales/*.ts`)**
- **English (`en.ts`)**: Added `confluenceSpaceKeyTip` with clear
instructions and examples
- **Chinese (`zh.ts`)**: Added Chinese translation for the tooltip
- **Russian (`ru.ts`)**: Added Russian translation for the tooltip
- **Bonus Fix**: Removed duplicate `deleteModal` object in `zh.ts` that
was causing TypeScript lint errors
### Backend Compatibility
No backend changes were needed! The `ConfluenceConnector` class already
supports the `space` parameter:
```python
def __init__(
    self,
    wiki_base: str,
    is_cloud: bool,
    space: str = "",  # ← Already supported!
    page_id: str = "",
    index_recursively: bool = False,
    cql_query: str | None = None,
    ...
)
```
The connector uses this parameter to filter the CQL query (lines
1328-1330):
```python
elif space:
    uri_safe_space = quote(space)
    base_cql_page_query += f" and space='{uri_safe_space}'"
```
### User Experience
**Before:**
- Users could only sync ALL accessible spaces
- No UI option to limit scope
**After:**
- Users see "Space Key" field with helpful tooltip
- Tooltip explains:
  - Optional field (leave empty for all spaces)
  - Single space example: `DEV`
  - Multiple spaces example: `DEV,DOCS,HR`
- Available in English, Chinese, and Russian
### Future Enhancements
Potential improvements for future PRs:
- Add validation to check if space key exists before saving
- Add autocomplete/dropdown to show available spaces
- Add UI hints about space key format requirements
- Support for page_id filtering (already supported in backend)
---
## Related Issues
- Fixes #11638 - [Confluence] How to specify Space Key when adding
Confluence data source?