ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2025-12-24 23:46:52 +08:00

Author	SHA1	Message	Date
Stephen Hu	5776fa73a7	refactor: improve memory service date time consistency (#12144 ) ### What problem does this PR solve? improve memory service date time consistency ### Type of change - [x] Refactoring	2025-12-24 11:00:31 +08:00
Yongteng Lei	c987d33649	Feat: deduplicate metadata lists during updates (#12125 ) ### What problem does this PR solve? Deduplicate metadata lists during updates. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-24 09:32:55 +08:00
Kevin Hu	c33134ea2c	Fix: table tag on chunks. (#12126 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-24 09:32:19 +08:00
Lynn	17b8bb62b6	Feat: message manage (#12083 ) ### What problem does this PR solve? Message CRUD. Issue #4213 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-23 21:16:25 +08:00
Magicbook1108	bab6a4a219	Fix: /kb/update does not update FileService (#12121 ) ### What problem does this PR solve? Fix: /kb/update does not update FileService ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-23 19:56:38 +08:00
Kevin Hu	00bb6fbd28	Fix: metadata issue & graphrag speeding up. (#12113 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Liu An <asiro@qq.com>	2025-12-23 15:57:27 +08:00
buua436	1444de981c	Feat: enhance webhook response to include status and success fields and simplify ReAct agent (#12091 ) ### What problem does this PR solve? change: enhance webhook response to include status and success fields and simplify ReAct agent ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-23 09:36:08 +08:00
Kevin Hu	bd76b8ff1a	Fix: Tika server upgrades. (#12073 ) ### What problem does this PR solve? #12037 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-23 09:35:52 +08:00
Jin Hai	e5f3d5ae26	Refactor add_llm and add speech to text (#12089 ) ### What problem does this PR solve? 1. Refactor implementation of add_llm 2. Add speech to text model. ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-22 19:27:26 +08:00
Jin Hai	993bf7c2c8	Fix IDE warnings (#12085 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-22 16:47:21 +08:00
Yongteng Lei	f911aa2997	Fix: list MCP tools may block (#12067 ) ### What problem does this PR solve? List MCP tools may block. #12043 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-22 13:08:44 +08:00
Jin Hai	42f9ac997f	Remove Chinese comments and fix function arguments errors (#12052 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-22 12:59:37 +08:00
Yongteng Lei	3ee47e4af7	Feat: document list and filter supports metadata filtering (#12053 ) ### What problem does this PR solve? Document list and filter supports metadata filtering. OR within the same field, AND across different fields Example 1 (multi-field AND): ```markdown Doc1 metadata: { "a": "b", "as": ["a", "b", "c"] } Doc2 metadata: { "a": "x", "as": ["d"] } Query: metadata = { "a": ["b"], "as": ["d"] } Result: Doc1 matches a=b but not as=d → excluded Doc2 matches as=d but not a=b → excluded Final result: empty ``` Example 2 (same field OR): ```markdown Doc1 metadata: { "as": ["a", "b", "c"] } Doc2 metadata: { "as": ["d"] } Query: metadata = { "as": ["a", "d"] } Result: Doc1 matches as=a → included Doc2 matches as=d → included Final result: Doc1 + Doc2 ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-22 09:35:11 +08:00
wenjuhao	55c0468ac9	Include document_id in knowledgebase info retrieval (#12041 ) ### What problem does this PR solve? After a file in the file list is associated with a knowledge base, the knowledge base document ID is returned ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-19 19:32:24 +08:00
Yongteng Lei	6cd1824a77	Feat: chats completions API supports metadata filtering (#12023 ) ### What problem does this PR solve? Chats completions API supports metadata filtering. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-19 11:36:35 +08:00
Magicbook1108	f8fd1ea7e1	Feat: Further update Bedrock model configs (#12029 ) ### What problem does this PR solve? Feat: Further update Bedrock model configs #12020 #12008 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-19 11:32:20 +08:00
buua436	57edc215d7	Feat:update webhook component (#11739 ) ### What problem does this PR solve? issue: https://github.com/infiniflow/ragflow/issues/10427 https://github.com/infiniflow/ragflow/issues/8115 change: - Support for Multiple HTTP Methods (POST / GET / PUT / PATCH / DELETE / HEAD) - Security Validation 1. max_body_size 2. IP whitelist 3. rate limit 4. token / basic / jwt authentication - File Upload Support - Unified Content-Type Handling - Full Schema-Based Extraction & Type Validation - Two Execution Modes: Immediately / Streaming ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-18 19:34:39 +08:00
Yongteng Lei	151480dc85	Feat: trace information can be returned by the agent completion API (#12019 ) ### What problem does this PR solve? Trace information can be returned by the agent completion API. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-18 15:52:11 +08:00
Magicbook1108	5cd1a678c8	Fix: image edit in edit_chunk (#12009 ) ### What problem does this PR solve? Fix: image edit in edit_chunk #11971 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-18 11:35:01 +08:00
Stephen Hu	1a4822d6be	Refactor: Improve the timestamp consistency (#11942 ) ### What problem does this PR solve? Improve the timestamp consistency ### Type of change - [x] Refactoring	2025-12-18 09:40:33 +08:00
Yongteng Lei	672958a192	Fix: model not authorized (#12001 ) ### What problem does this PR solve? Fix model not authorized. #11973. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-17 19:48:24 +08:00
Kevin Hu	8e4d011b15	Fix: parent-children chunking method. (#11997 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2025-12-17 16:50:36 +08:00
Magicbook1108	7baa67dfe8	Feat: Reject default admin account log in to normal services (#11994 ) ### What problem does this PR solve? Feat: Reject default admin account log in to normal services #11854 #11673 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-17 16:29:20 +08:00
Magicbook1108	4fd4a41e7c	Fix: add multimodel models in chat api (#11986 ) …tant, but model is available via UI Fix: add multimodel models in chat api Fixes #8549 ### What problem does this PR solve? Add a parameter model_type in chat api. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>	2025-12-17 15:46:43 +08:00
Jin Hai	30019dab9f	Change knowledge base to dataset (#11976 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-17 10:03:33 +08:00
Magicbook1108	344a106eba	Feat: Enable image edit in edit_chunk (#11971 ) ### What problem does this PR solve? Feat: Enable image edit in edit_chunk ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-16 17:57:00 +08:00
shivam johri	5bba562048	Feature/excel export fix (#11914 ) ### PR details feat: Add Excel export support and fix variable reference regex Changes: - Add Excel export output format option to Message component - Apply nest_asyncio patch to handle nested event loops - Fix async generator iteration in canvas_app.py debug endpoint - Add underscore support in variable reference regex pattern ### What problem does this PR solve? ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Shivam Johri <shivamjohri@Shivams-MacBook-Air.local>	2025-12-16 13:15:52 +08:00
concertdictate	49c74d08e8	Feature/mineru improvements (#11938 ) I have repeated this explanation in Chinese in the comments below. ### What problem does this PR solve? ## Summary This PR enhances the MinerU document parser with additional configuration options, giving users more control over PDF parsing behavior and improving support for multilingual documents. ## Changes ### Backend (`deepdoc/parser/mineru_parser.py`) - Added configurable parsing options: - Parse Method: `auto`, `txt`, or `ocr`, which allows users to choose the extraction strategy - Formula Recognition: Toggle for enabling/disabling formula extraction (useful to disable for Cyrillic documents where it may cause issues) - Table Recognition: Toggle for enabling/disabling table extraction - Added language code mapping (`LANGUAGE_TO_MINERU_MAP`) to translate RAGFlow language settings to MinerU-compatible language codes for better OCR accuracy - Improved parser configuration handling to pass these options through the processing pipeline ### Frontend (`web/`) - Created new `MinerUOptionsFormField` component that conditionally renders when MinerU is selected as the layout recognition engine - Added UI controls for: - Parse method selection (dropdown) - Formula recognition toggle (switch) - Table recognition toggle (switch) - Added i18n translations for English and Chinese - Integrated the options into both the dataset creation dialog and dataset settings page ### Integration - Updated `rag/app/naive.py` to forward MinerU options to the parser - Updated task service to handle the new configuration parameters ## Why MinerU is a powerful document parser, but the default settings don't work well for all document types. This PR allows users to: 1. Choose the best parsing method for their documents 2. Disable formula recognition for Cyrillic/non-Latin scripts where it causes issues 3. Control table extraction based on document needs 4. Benefit from automatic language detection for better OCR results ## Testing - [x] Tested MinerU parsing with different parse methods - [x] Verified UI renders correctly when MinerU is selected/deselected - [x] Confirmed settings persist correctly in dataset configuration ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [x] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): --------- Co-authored-by: user210 <user210@rt> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-12-16 13:15:25 +08:00
Kevin Hu	44dec89f1f	Fix: aspose-slide issue. (#11935 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-12 20:16:18 +08:00
Yongteng Lei	0f0fb53256	Refa: refactor metadata filter (#11907 ) ### What problem does this PR solve? Refactor metadata filter. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-12-12 17:12:38 +08:00
Magicbook1108	50715ba332	Fix: forget-reset password (#11927 ) ### What problem does this PR solve? Fix: forget-reset password ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-12 16:16:17 +08:00
Yongteng Lei	6560388f2b	Fix: correct metadata update behavior (#11919 ) ### What problem does this PR solve? Correct metadata update behavior. #11912 When update `value` is omitted, the corresponding keys are updated to `"value"` regardless of their current values. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-12 12:50:17 +08:00
Magicbook1108	7db9045b74	Feat: Add box connector (#11845 ) ### What problem does this PR solve? Feat: Add box connector ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-12 10:23:40 +08:00
Kevin Hu	ea4a5cd665	Fix: tokenizer issue. (#11902 ) #11786 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-11 17:38:17 +08:00
Yongteng Lei	e9710b7aa9	Refa: treat MinerU as an OCR model 2 (#11905 ) ### What problem does this PR solve? Treat MinerU as an OCR model 2. #11903 ### Type of change - [x] Refactoring	2025-12-11 17:33:12 +08:00
TeslaZY	bd0eff2954	Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code (#11898 ) ### What problem does this PR solve? Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-11 13:55:01 +08:00
buua436	e3cfe8e848	Fix:async issue and sensitive logging (#11895 ) ### What problem does this PR solve? change: async issue and sensitive logging ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-11 13:54:47 +08:00
TeslaZY	c610bb605a	Added semi-automatic mode to the metadata filter (#11886 ) ### What problem does this PR solve? Retrieval metadata filtering adds semi-automatic mode, and users can manually check the metadata key that participates in LLM to generate filter conditions. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-11 10:45:21 +08:00
Yongteng Lei	8370bc61b7	Feat: enhance metadata operation (#11874 ) ### What problem does this PR solve? Add metadata condition in document list. Add metadata bulk update. Add metadata summary. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update	2025-12-11 09:59:15 +08:00
N0bodycan	74eb894453	Fix `RuntimeError: asyncio.run() cannot be called from a running event loop` when calling mindmap endpoint. (#11880 ) ### What problem does this PR solve? Fix RuntimeError when calling mindmap endpoint by converting `gen_mindmap()` to async function and using `await` instead of `asyncio.run()`. ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-12-11 09:47:44 +08:00
buua436	3cb72377d7	Refa:remove sensitive information (#11873 ) ### What problem does this PR solve? change: remove sensitive information ### Type of change - [x] Refactoring	2025-12-10 19:08:45 +08:00
Lynn	a1164b9c89	Feat/memory (#11812 ) ### What problem does this PR solve? Manage and display memory datasets. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-10 13:34:08 +08:00
buua436	65a5a56d95	Refa:replace trio with asyncio (#11831 ) ### What problem does this PR solve? change: replace trio with asyncio ### Type of change - [x] Refactoring	2025-12-09 19:23:14 +08:00
Yongteng Lei	a94b3b9df2	Refa: treat MinerU as an OCR model (#11849 ) ### What problem does this PR solve? Treat MinerU as an OCR model. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring	2025-12-09 18:54:14 +08:00
Jin Hai	43f51baa96	Fix errors (#11804 ) ### What problem does this PR solve? 1. typos 2. grammar errors. ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-08 12:21:18 +08:00
Yongteng Lei	51ec708c58	Refa: cleanup synchronous functions in chat_model and implement synchronization for conversation and dialog chats (#11779 ) ### What problem does this PR solve? Cleanup synchronous functions in chat_model and implement synchronization for conversation and dialog chats. ### Type of change - [x] Refactoring - [x] Performance Improvement	2025-12-08 09:43:03 +08:00
天海蒼灆	8de6b97806	Feature (canvas): Add Api for download "message" component output's file (#11772 ) ### What problem does this PR solve? - Add an API for downloading the "message" component output's file - Change the attachment output type check from tuple to dictionary, because 'attachement' is not an instance of tuple - Update the message type to message_end to avoid the problem that content does not send an error message when the message type is ans ["data"] ["content"] ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2025-12-05 19:42:35 +08:00
Ted	ad03ede7cd	fix(sdk): add cancel_all_task_of call in stop_parsing endpoint (#11748 ) ## Problem The SDK API endpoint `DELETE /datasets/{dataset_id}/chunks` only updates database status but does not send cancellation signal via Redis, causing background parsing tasks to continue and eventually complete (status becomes DONE instead of CANCEL). ## Root Cause The SDK endpoint was missing the `cancel_all_task_of(id)` call that the web API (`api/apps/document_app.py`) uses to properly stop background tasks. ## Solution Added `cancel_all_task_of(id)` call in the `stop_parsing` function to send cancellation signal via Redis, consistent with the web API behavior. ## Related Issue Fixes #11745 Co-authored-by: tedhappy <tedhappy@users.noreply.github.com>	2025-12-04 19:29:06 +08:00
shirukai	fa7b857aa9	fix: resolve "'bool' object has no attribute 'items'" in SDK enabled … (#11725 ) ### What problem does this PR solve? Fixes the `AttributeError: 'bool' object has no attribute 'items'` error when updating the `enabled` parameter of a document via the Python SDK (Issue #11721). Background: When calling `Document.update({"enabled": True/False})` through the SDK, the server-side API returned a boolean `data=True` in the response (instead of a dictionary). The SDK's `_update_from_dict` method (in `base.py`) expects a dictionary to iterate over with `.items()`, leading to an immediate AttributeError during response parsing. This prevented successful synchronization of the updated `enabled` status to the local SDK object, even if the server-side database/update index operations succeeded. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Additional Context (optional, for clarity) - Root Cause: Server returned `data=True` (boolean) for `enabled` parameter updates, violating the SDK's expectation of a dictionary-type `data` field. - Fix Logic: 1. Removed the separate `return get_result(data=True)` in the `enabled` update branch to unify response flow. 2. - Backward Compatibility: No breaking changes—other update scenarios (e.g., renaming documents, modifying chunk methods) remain unaffected, and the response format stays consistent. Co-authored-by: shirukai <shirukai@hollysysdigital.com>	2025-12-04 11:24:01 +08:00
Jin Hai	a7d40e9132	Update since 'File manager' is renamed to 'File' (#11698 ) ### What problem does this PR solve? Update some docs and comments, since 'File manager' is renamed to 'File' ### Type of change - [x] Documentation Update - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>	2025-12-03 18:32:15 +08:00
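
Note on 3ee47e4af7 (#12053): the commit message describes the metadata filter as "OR within the same field, AND across different fields". The sketch below is one way to read that rule, not the code from the PR. The helper names `matches` and `filter_docs` and the in-memory `DOCS` dictionary are hypothetical, chosen only to reproduce the two examples quoted in the commit message.

```python
from typing import Any


def _as_set(value: Any) -> set:
    """Normalize a scalar or a list of values into a set."""
    if isinstance(value, (list, tuple, set)):
        return set(value)
    return {value}


def matches(doc_meta: dict, query: dict) -> bool:
    """Hypothetical matcher: every queried field must match (AND),
    and a field matches if any accepted value is present (OR)."""
    for field, accepted in query.items():
        if field not in doc_meta:
            return False
        # Non-empty intersection means at least one accepted value is present.
        if not (_as_set(doc_meta[field]) & _as_set(accepted)):
            return False
    return True


def filter_docs(docs: dict, query: dict) -> list:
    """Return the names of the documents whose metadata satisfies the query."""
    return [name for name, meta in docs.items() if matches(meta, query)]


# Document metadata taken from Example 1 in the commit message.
DOCS = {
    "Doc1": {"a": "b", "as": ["a", "b", "c"]},
    "Doc2": {"a": "x", "as": ["d"]},
}
```

With this reading, `filter_docs(DOCS, {"a": ["b"], "as": ["d"]})` is empty, because neither document satisfies both fields, while `filter_docs(DOCS, {"as": ["a", "d"]})` keeps both documents.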

1247 Commits
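
Note on 74eb894453 (#11880): the commit message says the mindmap `RuntimeError` was fixed by making `gen_mindmap()` async and using `await` instead of `asyncio.run()`. The sketch below reproduces that pattern in isolation. Only the name `gen_mindmap` comes from the commit message; its body and the two handler functions are hypothetical stand-ins, not RAGFlow code.

```python
import asyncio


async def gen_mindmap(text: str) -> dict:
    """Hypothetical stand-in for the real mindmap generator."""
    await asyncio.sleep(0)  # simulate asynchronous work
    return {"root": text}


async def handler_broken(text: str):
    """Calls asyncio.run() while an event loop is already running."""
    coro = gen_mindmap(text)
    try:
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"RuntimeError: {exc}"


async def handler_fixed(text: str):
    """Awaits the coroutine on the loop that is already running."""
    return await gen_mindmap(text)
```

Running `asyncio.run(handler_fixed("topic"))` from synchronous code returns the mindmap dictionary, whereas `handler_broken` ends up in the `except` branch because `asyncio.run()` refuses to start a second loop inside a running one.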