ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-02-07 02:55:08 +08:00

Author	SHA1	Message	Date
chanx	eeb36a5ce7	Feature: Implement metadata functionality (#12049 ) ### What problem does this PR solve? Feature: Implement metadata functionality ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-19 19:13:33 +08:00
balibabu	aceca266ff	Feat: Images appearing consecutively in the dialogue are merged and displayed in a carousel. #10427 (#12051 ) ### What problem does this PR solve? Feat: Images appearing consecutively in the dialogue are merged and displayed in a carousel. #10427 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-19 19:13:18 +08:00
balibabu	0494b92371	Feat: Display error messages from intermediate nodes. #10427 (#12038 ) ### What problem does this PR solve? Feat: Display error messages from intermediate nodes. #10427 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-19 17:44:45 +08:00
balibabu	4cbe470089	Feat: Display error messages from intermediate nodes of the webhook. #10427 (#11954 ) ### What problem does this PR solve? Feat: Remove HMAC from the webhook #10427 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-19 12:56:56 +08:00
Yongteng Lei	2844700dc4	Refa: better UX for adding OCR model (#12034 ) ### What problem does this PR solve? Better UX for adding OCR model. ### Type of change - [x] Refactoring	2025-12-19 11:34:21 +08:00
Magicbook1108	f8fd1ea7e1	Feat: Further update Bedrock model configs (#12029 ) ### What problem does this PR solve? Feat: Further update Bedrock model configs #12020 #12008 <img width="700" alt="2b4f0f7fab803a2a2d5f345c756a2c69" src="https://github.com/user-attachments/assets/e1b9eaad-5c60-47bd-a6f4-88a104ce0c63" /> <img width="700" alt="afe88ec3c58f745f85c5c507b040c250" src="https://github.com/user-attachments/assets/9de39745-395d-4145-930b-96eb452ad6ef" /> <img width="700" alt="1a21bb2b7cd8003dce1e5207f27efc69" src="https://github.com/user-attachments/assets/ddba1682-6654-4954-aa71-41b8ebc04ac0" /> ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-19 11:32:20 +08:00
Magicbook1108	e84d5412bc	Feat: bedrock iam authentication (#12020 ) ### What problem does this PR solve? Feat: bedrock iam authentication #12008 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-18 17:13:09 +08:00
Jimmy Ben Klieve	ce161f09cc	feat: add image uploader in edit chunk dialog (#12003 ) ### What problem does this PR solve? Add image uploader in edit chunk dialog for replacing image chunk ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-18 09:33:52 +08:00
Yongteng Lei	3820de916c	Fix: duplicated PDF parser (#12000 ) ### What problem does this PR solve? Fix duplicated PDF parser. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-17 19:48:10 +08:00
Jimmy Ben Klieve	e58271ef76	feat: add toc option in transformer node in ingestion pipeline (#11992 ) ### What problem does this PR solve? Add TOC (Table of contents) option in Ingestion Pipeline canvas > Transformer node ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-17 15:51:55 +08:00
chanx	d16643a53d	Fix: Fixed the issue of empty memory parameters (#11988 ) ### What problem does this PR solve? Fix: Fixed the issue of empty memory parameters ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-17 15:42:29 +08:00
Jimmy Ben Klieve	4046bffaf1	fix: unable to save ingestion pipeline config without modifying children delimiter (#11991 ) …ildren delimiter ### What problem does this PR solve? Fix the issue of unable to save Files > Ingestion Pipeline (Modal) config without modifying children delimiter ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-17 15:37:28 +08:00
Yongteng Lei	03f9be7cbb	Refa: only support MinerU-API now (#11977 ) ### What problem does this PR solve? Only support MinerU-API now, still need to complete frontend for pipeline to allow the configuration of MinerU options. ### Type of change - [x] Refactoring	2025-12-17 12:58:48 +08:00
Jin Hai	5e05f43c3d	Update default prompt (#11984 ) ### What problem does this PR solve? New default prompt: ``` You are an intelligent assistant. Your primary function is to answer questions based strictly on the provided knowledge base. Essential Rules: - Your answer must be derived solely from this knowledge base: `{knowledge}`. - When information is available: Summarize the content to give a detailed answer. - When information is unavailable: Your response must contain this exact sentence: "The answer you are looking for is not found in the knowledge base!" - Always consider the entire conversation history. ``` Also fix some grammar errors. ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-17 12:57:24 +08:00
chanx	205a6483f5	Feature：memory function complete (#11982 ) ### What problem does this PR solve? memory function complete ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-17 12:35:26 +08:00
Jimmy Ben Klieve	2595644dfd	feat: add ingestion pipeline children delimiters configs (#11979 ) ### What problem does this PR solve? Add children delimiters for Ingestion pipeline config ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-17 11:18:54 +08:00
Jin Hai	30019dab9f	Change knowledge base to dataset (#11976 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-17 10:03:33 +08:00
shivam johri	5bba562048	Feature/excel export fix (#11914 ) ### PR details feat: Add Excel export support and fix variable reference regex Changes: - Add Excel export output format option to Message component - Apply nest_asyncio patch to handle nested event loops - Fix async generator iteration in canvas_app.py debug endpoint - Add underscore support in variable reference regex pattern ### What problem does this PR solve? ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Shivam Johri <shivamjohri@Shivams-MacBook-Air.local>	2025-12-16 13:15:52 +08:00
concertdictate	49c74d08e8	Feature/mineru improvements (#11938 ) I have repeated the explanation in Chinese in the comments below. ### What problem does this PR solve? ## Summary This PR enhances the MinerU document parser with additional configuration options, giving users more control over PDF parsing behavior and improving support for multilingual documents. ## Changes ### Backend (`deepdoc/parser/mineru_parser.py`) - Added configurable parsing options: - Parse Method: `auto`, `txt`, or `ocr`, which lets users choose the extraction strategy - Formula Recognition: Toggle for enabling/disabling formula extraction (useful to disable for Cyrillic documents where it may cause issues) - Table Recognition: Toggle for enabling/disabling table extraction - Added language code mapping (`LANGUAGE_TO_MINERU_MAP`) to translate RAGFlow language settings to MinerU-compatible language codes for better OCR accuracy - Improved parser configuration handling to pass these options through the processing pipeline ### Frontend (`web/`) - Created new `MinerUOptionsFormField` component that conditionally renders when MinerU is selected as the layout recognition engine - Added UI controls for: - Parse method selection (dropdown) - Formula recognition toggle (switch) - Table recognition toggle (switch) - Added i18n translations for English and Chinese - Integrated the options into both the dataset creation dialog and dataset settings page ### Integration - Updated `rag/app/naive.py` to forward MinerU options to the parser - Updated task service to handle the new configuration parameters ## Why MinerU is a powerful document parser, but the default settings don't work well for all document types. This PR allows users to: 1. Choose the best parsing method for their documents 2. Disable formula recognition for Cyrillic/non-Latin scripts where it causes issues 3. Control table extraction based on document needs 4. Benefit from automatic language detection for better OCR results ## Testing - [x] Tested MinerU parsing with different parse methods - [x] Verified UI renders correctly when MinerU is selected/deselected - [x] Confirmed settings persist correctly in dataset configuration ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [x] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): --------- Co-authored-by: user210 <user210@rt> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-12-16 13:15:25 +08:00
chanx	a98887d4ca	Fix: Bug fixes (#11960 ) ### What problem does this PR solve? Fix: Bug fixes New search popup style modification Fixed multilingual settings not updating immediately on personal center page Changed overlapped percent to percentage format, with maximum value of 30% ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-16 09:44:06 +08:00
Jin Hai	7ca3e11566	Update dataset config and retrieval testing (#11958 ) ### What problem does this PR solve? 1. Refactor the order of the dataset config items. 2. Refactor the text of retrieval test. 3. Refactor typos ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-15 19:56:28 +08:00
lenghanz	a2e080c2d3	feat: display name instead of key in user fillup form submission (#11931 ) ### What problem does this PR solve? - Change the message format from 'key: value' to 'name: value' when user submits the fillup form in agent chat. - This resolves #11865 - I think this change makes sense, better aligning the form and the replied message. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-15 19:12:01 +08:00
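
The change of message format described in the commit above can be illustrated with a small sketch. It is not the code from #11931: the function name `format_fillup_message` and the field layout (dicts with `key`, `name`, and `value`) are assumptions made only for this example.

```python
from typing import Dict, List


def format_fillup_message(fields: List[Dict[str, str]]) -> str:
    """Render submitted form fields as 'name: value' lines.

    Hypothetical helper: each field is assumed to carry an internal
    `key`, a human-readable `name`, and the submitted `value`.
    The display name is preferred; the key is only a fallback.
    """
    lines = []
    for field in fields:
        label = field.get("name") or field["key"]
        lines.append(f"{label}: {field['value']}")
    return "\n".join(lines)
```

Falling back to the key when no display name is set keeps the message readable for forms that define only keys.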
balibabu	1ddd11f045	Feat: Set the return value of the webhook to a string. #10427 (#11945 ) ### What problem does this PR solve? Feat: Set the return value of the webhook to a string. #10427 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-15 11:09:08 +08:00
Yongteng Lei	0f0fb53256	Refa: refactor metadata filter (#11907 ) ### What problem does this PR solve? Refactor metadata filter. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-12-12 17:12:38 +08:00
balibabu	0fcb1680fd	Feat: Displaying the file option in the webhook's request body #10427 (#11928 ) ### What problem does this PR solve? Feat: Displaying the file option in the webhook's request body #10427 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-12 16:16:34 +08:00
Magicbook1108	50715ba332	Fix: forget-reset password (#11927 ) ### What problem does this PR solve? Fix: forget-reset password ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-12 16:16:17 +08:00
PentaFDevs	f9510edbbc	Feature/docs generator (#11858 ) ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### What problem does this PR solve? This PR introduces a new Docs Generator agent component for producing downloadable PDF, DOCX, or TXT files from Markdown content generated within a RAGFlow workflow. ### Key Features Backend - New component: DocsGenerator (agent/component/docs_generator.py) - - Markdown → PDF/DOCX/TXT conversion - - Supports tables, lists, code blocks, headings, and rich formatting - - Configurable document style (fonts, margins, colors, page size, orientation) - - Optional header logo and footer with page numbers/timestamps - Frontend - New configuration UI for the Docs Generator - - Download button integrated into the chat interface - - Output wired to the Message component - - Full i18n support Documentation Added component guide: docs/guides/agent/agent_component_reference/docs_generator.md Usage Add the Docs Generator to a workflow, connect Markdown output from an upstream component, configure metadata/style, and feed its output into the Message component. Users will see a document download button directly in the chat. Contributor Note We have been following RAGFlow for more than a year and a half now and have worked extensively on personalizing the framework and integrating it into several of our internal systems. Over that time, we have built multiple platforms that rely on RAGFlow as a core component, which has given us a strong appreciation for how flexible and powerful the project is. We also previously contributed the full Italian translation, and we were glad to see it accepted. This new Docs Generator component was created for our own production needs, and we believe that it may be useful for many others in the community as well. We want to sincerely thank the entire RAGFlow team for the remarkable work you have done and continue to do. If there are opportunities to contribute further, we would be glad to help whenever we have time available. It would be a pleasure to support the project in any way we can. If appropriate, we would be glad to be listed among the project’s contributors, but in any case we look forward to continuing to support and contribute to the project. PentaFrame Development Team --------- Co-authored-by: PentaFrame <info@pentaframe.it> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-12-12 14:59:43 +08:00
Magicbook1108	7db9045b74	Feat: Add box connector (#11845 ) ### What problem does this PR solve? Feat: Add box connector ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-12 10:23:40 +08:00
balibabu	a6bd765a02	Feat: Flatten the request schema of the webhook #10427 (#11917 ) ### What problem does this PR solve? Feat: Flatten the request schema of the webhook #10427 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-12 09:59:54 +08:00
balibabu	22a51a3868	Feat: Add mineru as a model manufacturer to the system. #10621 (#11903 ) ### What problem does this PR solve? Feat: Add mineru as a model manufacturer to the system. #10621 ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: balibabu <assassin_cike@163.com>	2025-12-11 17:37:10 +08:00
TeslaZY	bd0eff2954	Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code (#11898 ) ### What problem does this PR solve? Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-11 13:55:01 +08:00
TeslaZY	c610bb605a	Added semi-automatic mode to the metadata filter (#11886 ) ### What problem does this PR solve? Retrieval metadata filtering adds semi-automatic mode, and users can manually check the metadata key that participates in LLM to generate filter conditions. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-11 10:45:21 +08:00
balibabu	34d29d7e8b	Feat: Add configuration for webhook to the begin node. #10427 (#11875 ) ### What problem does this PR solve? Feat: Add configuration for webhook to the begin node. #10427 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-10 19:13:57 +08:00
buua436	ab4b62031f	Fix:csv parse in Table (#11870 ) ### What problem does this PR solve? change: csv parse in Table ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-10 16:44:06 +08:00
chanx	80f3ccf1ac	Fix:Modify the name of the Overlapped percent field (#11866 ) ### What problem does this PR solve? Fix:Modify the name of the Overlapped percent field ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-10 13:38:24 +08:00
balibabu	30377319d8	Fix: The variables in the message node are not displaying correctly. #11839 (#11841 ) ### What problem does this PR solve? Fix: The variables in the message node are not displaying correctly. #11839 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-09 17:59:49 +08:00
PentaFDevs	07dca37ef0	feat: add Italian language translation support (#11844 ) ### What problem does this PR solve? - Add complete Italian translation file with all UI sections - Register Italian in LanguageAbbreviation enum and language maps - Configure Italian translation in i18n config - Add Italiano to language selector dropdown ### Type of change - [x] Other (please describe): ## What Added complete Italian language translation support to RAGFlow ## Changes - Added comprehensive Italian translation file ([it.ts](ragflow/web/src/locales/it.ts:0:0-0:0)) with all UI sections (1239 lines) - Registered Italian in `LanguageAbbreviation` enum and all language maps - Configured Italian translation in i18n configuration - Added "Italiano" to language selector dropdown ## Impact - Italian users can now use RAGFlow in their native language - All major UI components are translated including: - Login/registration screens - Knowledge base management - Chat interface - Settings and configuration - Admin console - Error messages and notifications ## Testing - Verified all translation keys are present - Confirmed language selector shows "Italiano" correctly - Tested that no translation keys are missing - All UI sections properly translated Co-authored-by: PentaFrame <info@pentaframe.it>	2025-12-09 17:59:21 +08:00
chanx	28bc87c5e2	Feature: Memory interface integration testing (#11833 ) ### What problem does this PR solve? Feature: Memory interface integration testing ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-09 14:52:58 +08:00
Jin Hai	43f51baa96	Fix errors (#11804 ) ### What problem does this PR solve? 1. typos 2. grammar errors. ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-08 12:21:18 +08:00
chanx	5a2011e687	Fix: Changed 'HightLightMarkdown' to 'HighLightMarkdown' (#11803 ) ### What problem does this PR solve? Fix: Changed 'HightLightMarkdown' to 'HighLightMarkdown', and replaced the private component with a public component. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-08 11:11:48 +08:00
Rohit	4d7934061e	fix: Correct toast type import path in use-toast hook (#11791 ) This commit resolves an incorrect import path for `ToastProps` and `ToastActionElement` types within the `use-toast.tsx` hook. The current path, `@/registry/default/ui/toast`, does not reflect the actual file location in this repository. The import in `src/components/hooks/use-toast.tsx` has been updated from `@/registry/default/ui/toast` to the correct alias path: `@/components/ui/toast`. This ensures the types are resolved correctly and the codebase remains clean and functional.	2025-12-08 10:18:20 +08:00
chanx	660fa8888b	Features: Memory page rendering and other bug fixes (#11784 ) ### What problem does this PR solve? Features: Memory page rendering and other bug fixes - Rendering of the Memory list page - Rendering of the message list page in Memory - Fixed an issue where the empty state was incorrectly displayed when search criteria were applied - Added a web link for the API-Key - modifying the index_mode attribute of the Confluence data source. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2025-12-08 10:17:56 +08:00
balibabu	5b5f19cbc1	Fix: Newly added models to OpenAI-API-Compatible are not displayed in the LLM dropdown menu in a timely manner. #11774 (#11775 ) ### What problem does this PR solve? Fix: Newly added models to OpenAI-API-Compatible are not displayed in the LLM dropdown menu in a timely manner. #11774 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-05 18:04:49 +08:00
balibabu	ea38e12d42	Feat: Users can chat directly without first creating a conversation. #11768 (#11769 ) ### What problem does this PR solve? Feat: Users can chat directly without first creating a conversation. #11768 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-05 17:34:41 +08:00
balibabu	468e4042c2	Feat: Display the ID of the code image in the dialog. #10427 (#11746 ) ### What problem does this PR solve? Feat: Display the ID of the code image in the dialog. #10427 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-04 18:49:55 +08:00
Magicbook1108	4012d65b3c	Feat: update front end for confluence connector (#11747 ) ### What problem does this PR solve? Feat: update front end for confluence connector ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-04 18:49:13 +08:00
Magicbook1108	e2bc1a3478	Feat: add more attribute for confluence connector. (#11743 ) ### What problem does this PR solve? Feat: add more attribute for confluence connector. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-04 17:28:03 +08:00
Magicbook1108	b4e06237ef	Feat: detect docx support via header-byte inspection (#11731 ) ## What problem does this PR solve? Feat: detect docx support via header-byte inspection, a further optimize based on #11684 Not all files with a .doc extension are truly legacy .doc formats, and some are internally valid .docx documents. The previous implementation relied on URL suffix checks, which misclassified these cases and was therefore not reliable. Doc file could be previewed: [en2zh.doc](https://github.com/user-attachments/files/23921131/en2zh.doc) Doc file could not be previewed: [file-sample_100kB.doc](https://github.com/user-attachments/files/23921134/file-sample_100kB.doc) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-04 13:41:18 +08:00
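
The header-byte inspection mentioned in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from #11731: the function name `sniff_word_format` is hypothetical. It relies only on two well-known file signatures: a .docx file is a ZIP archive (it starts with `PK\x03\x04`), and a legacy .doc file is an OLE2 compound file (it starts with `D0 CF 11 E0 A1 B1 1A E1`).

```python
# Signatures of the two container formats used by Word documents.
ZIP_MAGIC = b"PK\x03\x04"                          # .docx (any ZIP archive)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # legacy .doc (OLE2)


def sniff_word_format(data: bytes) -> str:
    """Classify a file by its leading bytes instead of its extension.

    Hypothetical helper. Returns "docx", "doc", or "unknown".
    Note that the ZIP signature alone does not prove the archive is a
    Word document; a stricter check would also look inside the archive.
    """
    if data.startswith(ZIP_MAGIC):
        return "docx"
    if data.startswith(OLE2_MAGIC):
        return "doc"
    return "unknown"
```

A file named `report.doc` whose bytes start with the ZIP signature would be classified as "docx" here, which is the misnamed-extension case the commit describes.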
chanx	751a13fb64	Feature：Add a loading status to the agent canvas page. (#11733 ) ### What problem does this PR solve? Feature：Add a loading status to the agent canvas page. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-04 13:40:49 +08:00
hsparks-codes	a3c9402218	Feat: confluence space key (#11706 ) # PR Description: Add Space Key Configuration for Confluence Data Source ### What problem does this PR solve? This PR addresses issue #11638 where users requested the ability to specify Confluence Space Keys when configuring a Confluence data source connector. Problem: Currently, the RAGFlow UI for Confluence data sources only provides fields for: - Username - Access Token - Wiki Base URL - Is Cloud checkbox There is no way to specify which Confluence space(s) to sync, causing RAGFlow to attempt syncing all accessible spaces. This is problematic for users who: - Only want to index specific spaces (e.g., only the HR or Documentation space) - Have access to many spaces but only need a subset - Want to avoid unnecessary data transfer and processing Solution: The backend `ConfluenceConnector` class already supports a `space` parameter in its `__init__()` method (line 1282 in `common/data_source/confluence_connector.py`), but this parameter was never exposed in the UI. This PR adds the missing UI field to allow users to configure space filtering. User Impact: Users can now: - Leave the field empty to sync all accessible spaces (default behavior) - Specify a single space key (e.g., `DEV`) - Specify multiple space keys separated by commas (e.g., `DEV,DOCS,HR`) This gives users fine-grained control over which Confluence content gets indexed into their RAGFlow knowledge base. Fixes #11638 ### Type of change - [x] New Feature (non-breaking change which adds functionality) --- ## Implementation Details ### Changes Made 1. Frontend UI (`web/src/pages/user-setting/data-source/contant.tsx`) - Added "Space Key" text input field to Confluence configuration form - Field is optional (not required) - Positioned after "Is Cloud" checkbox for logical grouping - Added to initial values with empty string default 2. Internationalization (`web/src/locales/.ts`) - English (`en.ts`): Added `confluenceSpaceKeyTip` with clear instructions and examples - Chinese (`zh.ts`): Added Chinese translation for the tooltip - Russian (`ru.ts`): Added Russian translation for the tooltip - Bonus Fix: Removed duplicate `deleteModal` object in `zh.ts` that was causing TypeScript lint errors ### Backend Compatibility No backend changes were needed! The `ConfluenceConnector` class already supports the `space` parameter: ```python def __init__( self, wiki_base: str, is_cloud: bool, space: str = "", # ← Already supported! page_id: str = "", index_recursively: bool = False, cql_query: str | None = None, ... ) ``` The connector uses this parameter to filter the CQL query (line 1328-1330): ```python elif space: uri_safe_space = quote(space) base_cql_page_query += f" and space='{uri_safe_space}'" ``` ### User Experience Before: - Users could only sync ALL accessible spaces - No UI option to limit scope After: - Users see "Space Key" field with helpful tooltip - Tooltip explains: - Optional field (leave empty for all spaces) - Single space example: `DEV` - Multiple spaces example: `DEV,DOCS,HR` - Available in English, Chinese, and Russian ### Future Enhancements Potential improvements for future PRs: - Add validation to check if space key exists before saving - Add autocomplete/dropdown to show available spaces - Add UI hints about space key format requirements - Support for page_id filtering (already supported in backend) --- ## Related Issues - Fixes #11638 - [Confluence] How to specify Space Key when adding Confluence data source?	2025-12-03 19:17:47 +08:00
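
The commit above says that several space keys may be entered separated by commas, while the backend snippet it quotes builds an equality clause for a single key. As a rough sketch of how a comma-separated value could be expanded into a CQL filter, the helper below uses CQL's `in` operator. `build_space_clause` is a hypothetical name and is not part of `ConfluenceConnector`. The sketch also assumes the keys are simple identifiers such as `DEV` and does no escaping.

```python
def build_space_clause(space_keys: str) -> str:
    """Turn 'DEV,DOCS,HR' into a CQL fragment filtering by space.

    Hypothetical helper for illustration only. An empty input yields
    an empty fragment, which means "all accessible spaces".
    """
    # Split on commas, trim whitespace, and drop empty entries.
    keys = [key.strip() for key in space_keys.split(",") if key.strip()]
    if not keys:
        return ""
    if len(keys) == 1:
        return f" and space='{keys[0]}'"
    quoted = ", ".join(f"'{key}'" for key in keys)
    return f" and space in ({quoted})"
```

The fragment is meant to be appended to a base CQL query string, in the same way the quoted snippet appends its single-key clause.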

1521 Commits