ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-02-07 02:55:08 +08:00

Author	SHA1	Message	Date
Stephen Hu	2a0f835ffe	Refactor: Improve the logic to calculate embedding total token count (#11943 ) ### What problem does this PR solve? Improve the logic to calculate embedding total token count ### Type of change - [x] Refactoring	2025-12-15 11:33:57 +08:00
Yongteng Lei	13d8241eee	Doc: executor manager updated docker version (#11946 ) ### What problem does this PR solve? Add documentation for #11806. ### Type of change - [x] Documentation Update	2025-12-15 11:13:51 +08:00
balibabu	1ddd11f045	Feat: Set the return value of the webhook to a string. #10427 (#11945 ) ### What problem does this PR solve? Feat: Set the return value of the webhook to a string. #10427 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-15 11:09:08 +08:00
YngvarHuang	81eb03d230	Support uploading encrypted files to object storage (#11837 ) (#11838 ) ### What problem does this PR solve? Support uploading encrypted files to object storage. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: virgilwong <hyhvirgil@gmail.com>	2025-12-15 09:45:18 +08:00
Magicbook1108	7d23c3aed0	Fix: presentation parsing & Embedding encode exception handling (#11933 ) ### What problem does this PR solve? Fix: presentation parsing #11920 Fix: Embedding encode exception handling ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-13 11:37:42 +08:00
Yongteng Lei	6be0338aa0	Fix: Azure-OpenAI resource not found (#11934 ) ### What problem does this PR solve? Azure-OpenAI resource not found. #11750 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-13 11:32:46 +08:00
Kevin Hu	44dec89f1f	Fix: aspose-slide issue. (#11935 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-12 20:16:18 +08:00
Yongteng Lei	2b260901df	Fix: raptor doesn't have attribute chat (#11936 ) ### What problem does this PR solve? Raptor doesn't have attribute chat. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-12 20:08:18 +08:00
Magicbook1108	948bc93786	Feat: Add GPT-5.2 & pro (#11929 ) ### What problem does this PR solve? Feat: Add GPT-5.2 & pro ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-12 17:35:08 +08:00
Yongteng Lei	0f0fb53256	Refa: refactor metadata filter (#11907 ) ### What problem does this PR solve? Refactor metadata filter. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-12-12 17:12:38 +08:00
balibabu	0fcb1680fd	Feat: Displaying the file option in the webhook's request body #10427 (#11928 ) ### What problem does this PR solve? Feat: Displaying the file option in the webhook's request body #10427 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-12 16:16:34 +08:00
Magicbook1108	50715ba332	Fix: forget-reset password (#11927 ) ### What problem does this PR solve? Fix: forget-reset password ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-12 16:16:17 +08:00
PentaFDevs	f9510edbbc	Feature/docs generator (#11858 ) ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### What problem does this PR solve? This PR introduces a new Docs Generator agent component for producing downloadable PDF, DOCX, or TXT files from Markdown content generated within a RAGFlow workflow. ### Key Features Backend - New component: DocsGenerator (agent/component/docs_generator.py) - - Markdown → PDF/DOCX/TXT conversion - - Supports tables, lists, code blocks, headings, and rich formatting - - Configurable document style (fonts, margins, colors, page size, orientation) - - Optional header logo and footer with page numbers/timestamps - Frontend - New configuration UI for the Docs Generator - - Download button integrated into the chat interface - - Output wired to the Message component - - Full i18n support Documentation Added component guide: docs/guides/agent/agent_component_reference/docs_generator.md Usage Add the Docs Generator to a workflow, connect Markdown output from an upstream component, configure metadata/style, and feed its output into the Message component. Users will see a document download button directly in the chat. Contributor Note We have been following RAGFlow since more than a year and half now and have worked extensively on personalizing the framework and integrating it into several of our internal systems. Over the past year and a half, we have built multiple platforms that rely on RAGFlow as a core component, which has given us a strong appreciation for how flexible and powerful the project is. We also previously contributed the full Italian translation, and we were glad to see it accepted. This new Docs Generator component was created for our own production needs, and we believe that it may be useful for many others in the community as well. We want to sincerely thank the entire RAGFlow team for the remarkable work you have done and continue to do. 
If there are opportunities to contribute further, we would be glad to help whenever we have time available. It would be a pleasure to support the project in any way we can. If appropriate, we would be glad to be listed among the project’s contributors, but in any case we look forward to continuing to support and contribute to the project. PentaFrame Development Team --------- Co-authored-by: PentaFrame <info@pentaframe.it> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-12-12 14:59:43 +08:00
Yongteng Lei	6560388f2b	Fix: correct metadata update behavior (#11919 ) ### What problem does this PR solve? Correct metadata update behavior. #11912 When update `value` is omitted, the corresponding keys are updated to `"value"` regardless of their current values. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-12 12:50:17 +08:00
writinwaters	e37aea5f81	Docs: How to use RESTful API to update or delete metadata (#11912 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-12-12 12:04:47 +08:00
Magicbook1108	7db9045b74	Feat: Add box connector (#11845 ) ### What problem does this PR solve? Feat: Add box connector ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-12 10:23:40 +08:00
balibabu	a6bd765a02	Feat: Flatten the request schema of the webhook #10427 (#11917 ) ### What problem does this PR solve? Feat: Flatten the request schema of the webhook #10427 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-12 09:59:54 +08:00
Andrea Bugeja	74afb8d710	feat: Add Single Bucket Mode for MinIO/S3 (#11416 ) ## Overview This PR adds support for Single Bucket Mode in RAGFlow, allowing users to configure MinIO/S3 to use a single bucket with a directory structure instead of creating multiple buckets per Knowledge Base and user folder. ## Problem Statement The current implementation creates one bucket per Knowledge Base and one bucket per user folder, which can be problematic when: - Cloud providers charge per bucket - IAM policies restrict bucket creation - Organizations want centralized data management in a single bucket ## Solution Added a `prefix_path` configuration option to the MinIO connector that enables: - Using a single bucket with directory-based organization - Backward compatibility with existing multi-bucket deployments - Support for MinIO, AWS S3, and other S3-compatible storage backends ## Changes - `rag/utils/minio_conn.py`: Enhanced MinIO connector to support single bucket mode with prefix paths - `conf/service_conf.yaml`: Added new configuration options (`bucket` and `prefix_path`) - `docker/service_conf.yaml.template`: Updated template with single bucket configuration examples - `docker/.env.single-bucket-example`: Added example environment variables for single bucket setup - `docs/single-bucket-mode.md`: Comprehensive documentation covering usage, migration, and troubleshooting ## Configuration Example ```yaml minio: user: "access-key" password: "secret-key" host: "minio.example.com:443" bucket: "ragflow-bucket" # Single bucket name prefix_path: "ragflow" # Optional prefix path ``` ## Backward Compatibility ✅ Fully backward compatible - existing deployments continue to work without any changes - If `bucket` is not configured, uses default multi-bucket behavior - If `bucket` is configured without `prefix_path`, uses bucket root - If both are configured, uses `bucket/prefix_path/` structure ## Testing - Tested with MinIO (local and cloud) - Verified backward compatibility with 
existing multi-bucket mode - Validated IAM policy restrictions work correctly ## Documentation Included comprehensive documentation in `docs/single-bucket-mode.md` covering: - Configuration examples - Migration guide from multi-bucket to single-bucket mode - IAM policy examples - Troubleshooting guide --- Related Issue: Addresses use cases where bucket creation is restricted or costly	2025-12-11 19:22:47 +08:00
Kevin Hu	ea4a5cd665	Fix: tokenizer issue. (#11902 ) #11786 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-11 17:38:17 +08:00
balibabu	22a51a3868	Feat: Add mineru as a model manufacturer to the system. #10621 (#11903 ) ### What problem does this PR solve? Feat: Add mineru as a model manufacturer to the system. #10621 ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: balibabu <assassin_cike@163.com>	2025-12-11 17:37:10 +08:00
Yongteng Lei	e9710b7aa9	Refa: treat MinerU as an OCR model 2 (#11905 ) ### What problem does this PR solve? Treat MinerU as an OCR model 2. #11903 ### Type of change - [x] Refactoring	2025-12-11 17:33:12 +08:00
TeslaZY	bd0eff2954	Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code (#11898 ) ### What problem does this PR solve? Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-11 13:55:01 +08:00
buua436	e3cfe8e848	Fix: async issue and sensitive logging (#11895 ) ### What problem does this PR solve? change: async issue and sensitive logging ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-11 13:54:47 +08:00
TeslaZY	c610bb605a	Added semi-automatic mode to the metadata filter (#11886 ) ### What problem does this PR solve? Retrieval metadata filtering adds a semi-automatic mode, in which users manually select the metadata keys that the LLM uses to generate filter conditions. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-11 10:45:21 +08:00
David López Carrascal	a6afb7dfe2	Fix data_sync startup crash by properly invoking async main (#11879 ) ### What problem does this PR solve? This PR fixes a startup crash in the data_sync_0 service caused by an incorrect asyncio.run call. The main coroutine was being passed as a function reference instead of being invoked, which raised: `ValueError: a coroutine was expected, got <function main ...> ` What I changed - Updated the entrypoint in sync_data_source.py to correctly invoke the coroutine with `asyncio.run(main())`. Testing - Not tested. Related Issue Fixes https://github.com/infiniflow/ragflow/issues/11878 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-11 10:09:16 +08:00
TeslaZY	7b96113d4c	MinerU support for the new backend vlm-mlx-engine (#11864 ) ### What problem does this PR solve? The new MinerU version supports the new backend vlm-mlx-engine: https://github.com/opendatalab/MinerU . ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-11 09:59:38 +08:00
Yongteng Lei	8370bc61b7	Feat: enhance metadata operation (#11874 ) ### What problem does this PR solve? Add metadata condition in document list. Add metadata bulk update. Add metadata summary. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update	2025-12-11 09:59:15 +08:00
N0bodycan	74eb894453	Fix `RuntimeError: asyncio.run() cannot be called from a running event loop` when calling mindmap endpoint. (#11880 ) ### What problem does this PR solve? Fix RuntimeError when calling mindmap endpoint by converting `gen_mindmap()` to async function and using `await` instead of `asyncio.run()`. ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-12-11 09:47:44 +08:00
balibabu	34d29d7e8b	Feat: Add configuration for webhook to the begin node. #10427 (#11875 ) ### What problem does this PR solve? Feat: Add configuration for webhook to the begin node. #10427 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-10 19:13:57 +08:00
He Wang	badf33e3b9	feat: enhance OBConnection.search (#11876 ) ### What problem does this PR solve? Enhance OBConnection.search for better performance. Main changes: 1. Use string type of vector array in distance func for better parsing performance. 2. Manually set max_connections as pool size instead of using default value. 3. Set 'fulltext_search_columns' when starting. 4. Cache the results of the table existence check (we will never drop the table). 5. Remove unused 'group_results' logic. 6. Add the `USE_FULLTEXT_FIRST_FUSION_SEARCH` flag, and the corresponding fusion search SQL when it's false. ### Type of change - [x] Performance Improvement	2025-12-10 19:13:37 +08:00
buua436	3cb72377d7	Refa: remove sensitive information (#11873 ) ### What problem does this PR solve? change: remove sensitive information ### Type of change - [x] Refactoring	2025-12-10 19:08:45 +08:00
buua436	ab4b62031f	Fix: csv parse in Table (#11870 ) ### What problem does this PR solve? change: csv parse in Table ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-10 16:44:06 +08:00
chanx	80f3ccf1ac	Fix: Modify the name of the Overlapped percent field (#11866 ) ### What problem does this PR solve? Fix: Modify the name of the Overlapped percent field ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-10 13:38:24 +08:00
Lynn	a1164b9c89	Feat/memory (#11812 ) ### What problem does this PR solve? Manage and display memory datasets. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-10 13:34:08 +08:00
Russell Valentine	fd7e55b23d	executor_manager updated docker version (#11806 ) ### What problem does this PR solve? The Docker version (24.0.7) installed in the executor manager image is incompatible with the latest stable Docker (29.1.3). The minimum API version that 29.1.3 can use is 1.44, but 24.0.7 uses API version 1.43. ### Type of change - [X] Other (please describe): This could break things for people who still have an old docker installed on their system. A better approach could be a setting to share	2025-12-10 11:08:11 +08:00
Zhichang Yu	f128a1fa9e	Bump python to >=3.12 (#11846 ) ### What problem does this PR solve? Bump python to >=3.12 ### Type of change - [x] Refactoring	2025-12-09 19:55:25 +08:00
buua436	65a5a56d95	Refa: replace trio with asyncio (#11831 ) ### What problem does this PR solve? change: replace trio with asyncio ### Type of change - [x] Refactoring	2025-12-09 19:23:14 +08:00
Magicbook1108	ca2d6f3301	Fix: duplicate output by async_chat_streamly (#11842 ) ### What problem does this PR solve? Fix: duplicate output by async_chat_streamly Refact: revert manual modification ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-12-09 19:21:52 +08:00
Yongteng Lei	a94b3b9df2	Refa: treat MinerU as an OCR model (#11849 ) ### What problem does this PR solve? Treat MinerU as an OCR model. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2025-12-09 18:54:14 +08:00
balibabu	30377319d8	Fix: The variables in the message node are not displaying correctly. #11839 (#11841 ) ### What problem does this PR solve? Fix: The variables in the message node are not displaying correctly. #11839 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-09 17:59:49 +08:00
PentaFDevs	07dca37ef0	feat: add Italian language translation support (#11844 ) ### What problem does this PR solve? - Add complete Italian translation file with all UI sections - Register Italian in LanguageAbbreviation enum and language maps - Configure Italian translation in i18n config - Add Italiano to language selector dropdown ### Type of change - [x] Other (please describe): ## What Added complete Italian language translation support to RAGFlow ## Changes - Added comprehensive Italian translation file ([it.ts](ragflow/web/src/locales/it.ts:0:0-0:0)) with all UI sections (1239 lines) - Registered Italian in `LanguageAbbreviation` enum and all language maps - Configured Italian translation in i18n configuration - Added "Italiano" to language selector dropdown ## Impact - Italian users can now use RAGFlow in their native language - All major UI components are translated including: - Login/registration screens - Knowledge base management - Chat interface - Settings and configuration - Admin console - Error messages and notifications ## Testing - Verified all translation keys are present - Confirmed language selector shows "Italiano" correctly - Tested that no translation keys are missing - All UI sections properly translated Co-authored-by: PentaFrame <info@pentaframe.it>	2025-12-09 17:59:21 +08:00
changkeke	036b29f084	Docs: Enhance API reference for file management (#11827 ) ### What problem does this PR solve? The SDK documentation is lacking in file management sections. ### Type of change - [x] Documentation Update	2025-12-09 17:30:53 +08:00
N0bodycan	9863862348	fix: prevent redundant retries in async_chat_streamly upon success (#11832 ) ## What changes were proposed in this pull request? Added a return statement after the successful completion of the async for loop in async_chat_streamly. ## Why are the changes needed? Previously, the code lacked a break/return mechanism inside the try block. This caused the retry loop (for attempt in range...) to continue executing even after the LLM response was successfully generated and yielded, resulting in duplicate requests (up to max_retries times). ## Does this PR introduce any user-facing change? No (it fixes an internal logic bug).	2025-12-09 17:14:30 +08:00
Zhichang Yu	bb6022477e	Bump infinity to v0.6.11. Requires python>=3.11 (#11814 ) ### What problem does this PR solve? Bump infinity to v0.6.11. Requires python>=3.11 ### Type of change - [x] Refactoring	2025-12-09 16:23:37 +08:00
chanx	28bc87c5e2	Feature: Memory interface integration testing (#11833 ) ### What problem does this PR solve? Feature: Memory interface integration testing ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-09 14:52:58 +08:00
Yongteng Lei	c51e6b2a58	Refa: migrate CV model chat to Async (#11828 ) ### What problem does this PR solve? Migrate CV model chat to Async. #11750 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2025-12-09 13:08:37 +08:00
Stephen Hu	481192300d	Fix: [ERROR][Exception]: list index out of range (#11826 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/11821 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-09 09:58:34 +08:00
sjIlll	1777620ea5	fix: set default embedding model for TEI profile in Docker deployment (#11824 ) ## What's changed fix: unify embedding model fallback logic for both TEI and non-TEI Docker deployments > This fix targets Docker / `docker-compose` deployments, ensuring a valid default embedding model is always set—regardless of the compose profile used. ## Changes \| Scenario \| New Behavior \| \|--------\|--------------\| \| Non-`tei-` profile (e.g., default deployment) \| `EMBEDDING_MDL` is now correctly initialized from `EMBEDDING_CFG` (derived from `user_default_llm`), ensuring custom defaults like `bge-m3@Ollama` are properly applied to new tenants. \| \| `tei-` profile (`COMPOSE_PROFILES` contains `tei-`) \| Still respects the `TEI_MODEL` environment variable. If unset, falls back to `EMBEDDING_CFG`. Only when both are empty does it use the built-in default (`BAAI/bge-small-en-v1.5`), preventing an empty embedding model. \| ## Why This Change? - In non-TEI mode: The previous logic would reset `EMBEDDING_MDL` to an empty string, causing pre-configured defaults (e.g., `bge-m3@Ollama` in the Docker image) to be ignored—leading to tenant initialization failures or silent misconfigurations. - In TEI mode: Users need the ability to override the model via `TEI_MODEL`, but without a safe fallback, missing configuration could break the system. The new logic adopts a “config-first, env-var-override” strategy for robustness in containerized environments. ## Implementation - Updated the assignment logic for `EMBEDDING_MDL` in `rag/common/settings.py` to follow a unified fallback chain: EMBEDDING_CFG → TEI_MODEL (if tei- profile active) → built-in default ## Testing Verified in Docker deployments: 1. `COMPOSE_PROFILES=` (no TEI) → New tenants get `bge-m3@Ollama` as the default embedding model 2. `COMPOSE_PROFILES=tei-gpu` with no `TEI_MODEL` set → Falls back to `BAAI/bge-small-en-v1.5` 3. 
`COMPOSE_PROFILES=tei-gpu` with `TEI_MODEL=my-model` → New tenants use `my-model` as the embedding model Closes #8916 fix #11522 fix #11306	2025-12-09 09:38:44 +08:00
Levi	f3a03b06b2	fix: align http client proxy kwarg (#11818 ) ### What problem does this PR solve? Our HTTP wrapper still passed proxies to httpx.Client/AsyncClient, which expect proxy. As a result, configured proxies were ignored and calls could fail with ValueError("Failed to fetch OIDC metadata: Client.__init__() got an unexpected keyword argument 'proxies'"). This PR switches to the correct proxy kwarg so proxies are honored and the runtime error is resolved. ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue) --- Contribution during my time at RAGcon GmbH.	2025-12-09 09:35:03 +08:00
buua436	dd046be976	Fix: parent-child chunking method (#11810 ) ### What problem does this PR solve? change: parent-child chunking method ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-09 09:34:01 +08:00
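
Commit 9863862348 (#11832) above describes a retry loop that kept running after a successful stream because nothing left the loop on success. The following is a minimal sketch of that pattern, not the actual RAGFlow code: `chat_streamly_with_retry`, `fake_stream`, `collect`, and the choice of `ConnectionError` as the retryable error are all hypothetical names and assumptions made for illustration.

```python
import asyncio


async def fake_stream(calls):
    """Hypothetical stand-in for an LLM streaming call; records each invocation."""
    calls.append(1)
    for chunk in ("Hello", ", ", "world"):
        yield chunk


async def chat_streamly_with_retry(stream_factory, max_retries=3):
    """Yield chunks from stream_factory(), retrying only when the stream fails."""
    for attempt in range(max_retries):
        try:
            async for chunk in stream_factory():
                yield chunk
            # Without this return, the for loop would move on to the next
            # attempt and issue the same request again after a success.
            return
        except ConnectionError:
            # Assumed retryable error; re-raise once attempts are exhausted.
            if attempt == max_retries - 1:
                raise
            await asyncio.sleep(0)


async def collect(stream_factory, max_retries=3):
    """Gather all chunks of the retried stream into a list."""
    return [c async for c in chat_streamly_with_retry(stream_factory, max_retries)]
```

Calling `asyncio.run(collect(lambda: fake_stream(calls)))` with an empty `calls` list should invoke the underlying stream once; removing the `return` makes it run `max_retries` times and duplicates the output.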

4764 Commits