ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-01-31 23:55:06 +08:00

Author	SHA1	Message	Date
huansinho	56e6f37ffa	Update Chrome download URL in use_china_mirrors configuration (#8628 ) ### What problem does this PR solve? Update Chrome download URL in use_china_mirrors configuration ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: lqh <liqunhuan@foreveross.com>	2025-07-02 18:34:38 +08:00
balibabu	040e4ad8a5	Feat: Convert the arguments parameter of the code operator to a dictionary #3221 (#8623 ) ### What problem does this PR solve? Feat: Convert the arguments parameter of the code operator to a dictionary #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-02 18:34:21 +08:00
He Wang	695bfe34a2	fix opendal config 'oss_table' and 'max_allowed_packet' (#8611 ) ### What problem does this PR solve? Fix the config option name of the opendal table name and setting of 'max_allowed_packet'. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Signed-off-by: He Wang <wanghechn@qq.com>	2025-07-02 16:45:01 +08:00
Tuan Le	d343cb4deb	Add Google Cloud Vision API Integration (Image2Text) (#8608 ) ### What problem does this PR solve? This PR introduces Google Cloud Vision API integration to enhance image understanding capabilities in the application. It addresses the need for advanced image description and chat functionalities by implementing a new `GoogleCV` class to handle API interactions and updating relevant configurations. This enables users to leverage Google Cloud Vision for image-to-text tasks, improving the application's ability to process and interpret visual data. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-02 10:02:01 +08:00
Scott Davidson	9dd3dfaab0	Add service_conf and llm_factories options to Helm chart (#8607 ) ### What problem does this PR solve? ### Type of change - [X] New Feature (non-breaking change which adds functionality)	2025-07-02 09:58:17 +08:00
balibabu	212d5ce7ff	Feat: Construct the to field of the classification operator when saving data #3221 (#8610 ) ### What problem does this PR solve? Feat: Construct the to field of the classification operator when saving data #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-02 09:49:42 +08:00
Liu An	0b40eb3e90	Test: Add tests for chunk API endpoints (#8616 ) ### What problem does this PR solve? - Add comprehensive test suite for chunk operations including: - Test files for create, list, retrieve, update, and delete chunks - Authorization tests - Batch operations tests - Update test configurations and common utilities - Validate `important_kwd` and `question_kwd` fields are lists in chunk_app.py - Reorganize imports and clean up duplicate code ### Type of change - [x] Add test cases	2025-07-02 09:49:08 +08:00
wenxuan.zhang	f586dd0a96	Fix: docx parse error. (#8600 ) ### What problem does this PR solve? Some docx files raise an error when parsed with the naive method: `block.style.name` in the function `__get_nearest_title` can be None in some cases. ![image](https://github.com/user-attachments/assets/efbe6d1b-10c8-415e-b693-a86f73e1ffa6) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: wenxuan.zhang <wenxuan.zhang@chinacreator.com>	2025-07-01 17:38:11 +08:00
balibabu	93a8f4a4c8	Fix: Fixed the issue that the global variables of the code operator cannot be selected #3221 (#8605 ) ### What problem does this PR solve? Fix: Fixed the issue that the global variables of the code operator cannot be selected #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-01 17:31:56 +08:00
balibabu	6b04b07eb4	Fixed the issue where variables were not displayed in the switch operator #3221 (#8601 ) ### What problem does this PR solve? Feat: Fixed the issue where variables were not displayed in the switch operator #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-01 15:52:14 +08:00
Tuan Le	1c77b4ed9b	fix: Correctly format message parts in GoogleChat (#8596 ) ### What problem does this PR solve? This PR addresses an incompatibility issue with the Google Chat API by correcting the message content format in the `GoogleChat` class. Previously, the content was directly assigned to the "parts" field, which did not align with the API's expected format. This change ensures that messages are properly formatted with a "text" key within a dictionary, as required by the API. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-01 14:06:07 +08:00
Kevin Hu	e3edcc3064	Trivals. (#8597 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-01 14:05:18 +08:00
balibabu	103027580e	Feat: Add agent advanced settings form #3221 (#8592 ) ### What problem does this PR solve? Feat: Add agent advanced settings form #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-01 10:52:48 +08:00
symvation	32f8b3ad77	Fix: the output log is incorrect (#8577 ) ### What problem does this PR solve? Fix: the output log is incorrect ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: liang <xiaofeng.liang@landstech.com.cn>	2025-07-01 10:49:43 +08:00
天海蒼灆	d4da6dce6e	Feat: Add file management HTTP_API (#8395 ) ### What problem does this PR solve? Add file management HTTP_API for operating files ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-01 09:51:53 +08:00
Tuan Le	7f19f604a9	Pass Form Instance to GoogleModal Form Component (#8586 ) ### What problem does this PR solve? This PR enables the `Form` component within the `GoogleModal` to directly access and manipulate the form state by passing the form instance from the parent component. This enhances form control and data manipulation capabilities within the modal, improving the component's functionality and integration with the parent form. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-01 09:48:36 +08:00
Stephen Hu	4a1680a799	doc: change to chunk_token num (#8590 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8556 ### Type of change - [x] Documentation Update	2025-07-01 09:47:23 +08:00
Yongteng Lei	8801de2772	Refa: change mcp_client module to rag/utils/conn (#8578 ) ### What problem does this PR solve? Change mcp_client module to rag/utils/conn. ### Type of change - [x] Refactoring	2025-07-01 09:29:19 +08:00
balibabu	d620432e3b	Feat: In a dialog message, users can enter different types of data #3221 (#8583 ) ### What problem does this PR solve? Feat: In a dialog message, users can enter different types of data #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-30 19:32:40 +08:00
RafaelFFAumo	cf8c063a69	Adding semaphore usage on the '/run' endpoint (#8526 ) ### What problem does this PR solve? Switching threading.Lock() to asyncio.Lock(), since threading.Lock() is blocking. ### Type of change - [x] Performance Improvement	2025-06-30 15:40:23 +08:00
balibabu	40b1684c1e	Feat: Fixed the issue that the top toolbar disappears when opening the agent operator form #3221 (#8579 ) ### What problem does this PR solve? Feat: Fixed the issue that the top toolbar disappears when opening the agent operator form #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-30 15:39:38 +08:00
Kevin Hu	d46c24045f	Feat: add GiteeAI as a llm provider. (#8572 ) ### What problem does this PR solve? #1853 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-30 11:22:11 +08:00
balibabu	10f12fa149	Feat: Support GiteeAI model #1853 (#8573 ) ### What problem does this PR solve? Feat: Support GiteeAI model #1853 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-30 11:21:51 +08:00
balibabu	356d1f3485	Feat: Allow users to enter text in the middle of a chat #3221 (#8569 ) ### What problem does this PR solve? Feat: Allow users to enter text in the middle of a chat #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-30 10:36:52 +08:00
Kevin Hu	aafeffa292	Feat: add gitee as LLM provider. (#8545 ) ### What problem does this PR solve? ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-30 09:22:31 +08:00
Kevin Hu	e441c17c2c	Refa: limit embedding concurrency and fix `chat_with_tool` (#8543 ) ### What problem does this PR solve? #8538 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2025-06-27 19:28:41 +08:00
balibabu	8e1f8a0c48	Feat: Fixed the issue where the begin operator parameters could not be submitted during debugging #3221 (#8539 ) ### What problem does this PR solve? Feat: Fixed the issue where the begin operator parameters could not be submitted during debugging #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-27 18:53:13 +08:00
balibabu	0f7c955634	Feat: Display sub-agents in agent form #3221 (#8536 ) ### What problem does this PR solve? Feat: Display sub-agents in agent form #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-27 15:45:53 +08:00
balibabu	5a2099a1c7	Feat: Fixed the issue where the prompt menu content was hidden #3221 (#8530 ) ### What problem does this PR solve? Feat: Fixed the issue where the prompt menu content was hidden #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-27 12:11:29 +08:00
Kevin Hu	a10f05f4d7	Fix: chat with tools bug. (#8528 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-27 12:10:53 +08:00
Yongteng Lei	0478f36e36	Feat: allow users to choose which MCP tools are enabled (#8519 ) ### What problem does this PR solve? Allow users to choose which MCP tools are enabled. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-27 10:23:34 +08:00
Tuan Le	303c6dd1a8	Fix memory leaks in PIL image and BytesIO handling during chunk processing (#8522 ) ### What problem does this PR solve? This PR addresses critical memory leaks in the task executor's image processing pipeline. The current implementation fails to properly dispose of PIL Image objects and BytesIO buffers during chunk processing, leading to progressive memory accumulation that can cause the task executor to consume excessive memory over time. ### Background context - The `upload_to_minio` function processes images from document chunks and converts them to JPEG format for storage. - PIL Image objects hold significant memory resources that must be explicitly closed to prevent memory leaks. - BytesIO objects also consume memory and should be properly disposed of after use. - In high-throughput scenarios with many image-containing documents, these memory leaks can lead to out-of-memory errors and degraded performance. ### Specific issues fixed - PIL Image objects were not being explicitly closed after processing. - BytesIO buffers lacked proper cleanup in all code paths. - Converted images (RGBA/P to RGB) were not disposing of the original image object. - Memory references to large image data were not being cleared promptly. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Performance Improvement ### Changes made - Added explicit `d["image"].close()` calls after image processing operations. - Implemented proper cleanup of converted images when changing formats from RGBA/P to RGB. - Enhanced BytesIO cleanup with `try/finally` blocks to ensure disposal in all code paths. - Added explicit `del d["image"]` to clear memory references after processing. This fix ensures stable memory usage during long-running document processing tasks and prevents potential out-of-memory conditions in production environments.	2025-06-27 10:23:21 +08:00
Stephen Hu	7dbe06f7d8	Refactor: remove useless initialize logic in list_doc (#8523 ) ### What problem does this PR solve? Remove useless logic in a loop for list_doc ### Type of change - [x] Refactoring - [x] Performance Improvement	2025-06-27 10:23:08 +08:00
Stephen Hu	be712714af	Refactor: improve the logic to check cancel (#8524 ) ### What problem does this PR solve? Improve the logic that checks for cancellation. ### Type of change - [x] Refactoring Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-06-27 10:22:53 +08:00
Stephen Hu	938d8dd878	Fix: user_default_llm configuration doesn't work for OpenAI API compatible LLM factory (#8502 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8467 When an LLM is added, its llm_name looks like "llm1___OpenAI-API" (see `f09ca8e795/api/apps/llm_app.py (L173)`), so the query should not use the bare name llm1. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-27 09:41:12 +08:00
zhanglei	daf6c82066	fix: list index out of range (#8518 ) ### What problem does this PR solve? Stack trace: ``` 2025-06-26 17:22:24,739 ERROR 1609 list index out of range Traceback (most recent call last): File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request rv = self.dispatch_request() File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request return self.ensure_sync(self.view_functions[rule.endpoint])(*view_args) # type: ignore[no-any-return] File "/ragflow/api/utils/api_utils.py", line 298, in decorated_function return func(args, **kwargs) File "/ragflow/api/apps/sdk/session.py", line 472, in list_session print(conv["reference"][message_num]) IndexError: list index out of range ``` ![image](https://github.com/user-attachments/assets/93fe90a8-0434-4842-ba9f-bb5a995b498a) ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2025-06-27 09:38:33 +08:00
balibabu	f7b6c4ca99	Feat: Add StringTransform operator #3221 (#8520 ) ### What problem does this PR solve? Feat: Add StringTransform operator #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-27 09:27:28 +08:00
FatMii	2990779d59	fix(prompt-editor): resolve initial cursor position and auto-newline … (#8511 ) ### What problem does this PR solve? In the web folder's prompt-editor component, the cursor position is wrong when content is entered for the first time, and the text wraps to a new line automatically. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: leonlai <owllai123456>	2025-06-26 19:28:46 +08:00
Yongteng Lei	d768130204	Fix: chunk number error after re-parsing (#8513 ) ### What problem does this PR solve? Fix chunk number error after re-parsing. #8503. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-26 17:46:53 +08:00
balibabu	05bf01b058	Feat: Displays the output variable type selected by the loop operator #3221 (#8515 ) ### What problem does this PR solve? Feat: Displays the output variable type selected by the loop operator #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-26 17:46:37 +08:00
Liu An	d11cfd4e45	Fix: Add input validation to chunk creation endpoint (#8516 ) ### What problem does this PR solve? - Include optional `tag_feas` field if present in request - Add input validation for `important_kwd` and `question_kwd` to ensure they are lists - #8462 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-26 17:46:00 +08:00
balibabu	32a7ad3cba	Feat: Customize the output variable name of the loop operator #3221 (#8514 ) ### What problem does this PR solve? Feat: Customize the output variable name of the loop operator #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-26 16:43:06 +08:00
balibabu	42a570a64d	Feat: Add UserFillUpForm component #3221 (#8508 ) ### What problem does this PR solve? Feat: Add UserFillUpForm component #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-26 14:55:51 +08:00
Kevin Hu	6d256ff0f5	Perf: ignore concate between rows. (#8507 ) ### What problem does this PR solve? ### Type of change - [x] Performance Improvement	2025-06-26 14:55:37 +08:00
Yongteng Lei	0eb90e73a5	Feat: add MCP dashboard functionalities list_tools and test_tool (#8505 ) ### What problem does this PR solve? Add MCP dashboard functionalities list_tools and test_tool. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-26 13:52:01 +08:00
Tuan Le	6b1221d2f6	Fix parser_config access for layout_recognize in presentation.py (#8492 ) ### What problem does this PR solve? This PR addresses an issue in the presentation parser where the `layout_recognize` configuration was incorrectly retrieved from `kwargs.get("layout_recognize", "DeepDOC")`. Instead, it should be sourced from the `parser_config` parameter, specifically `parser_config.get("layout_recognize", "DeepDOC")`. This mismatch could cause the parser to default to the "DeepDOC" layout recognizer, ignoring any alternative recognition method specified in the parser configuration. As a result, PDF document parsing might use an incorrect recognition engine. The fix ensures the presentation parser consistently uses the `layout_recognize` setting from `parser_config`, aligning with the configuration access patterns used elsewhere in the codebase. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-26 11:54:43 +08:00
balibabu	f09ca8e795	Feat: Allow operators inside the loop operator to reference the output parameters of external operators #3221 (#8498 ) ### What problem does this PR solve? Feat: Allow operators inside the loop operator to reference the output parameters of external operators #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-26 09:34:38 +08:00
balibabu	c4bfd9fa2c	Feat: Add retrieval tool #3221 (#8491 ) ### What problem does this PR solve? Feat: Add retrieval tool #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-25 18:32:56 +08:00
Tuan Le	7353070f49	Adds retrieval result fields to Chunk (#8478 ) ### What problem does this PR solve? This PR adds fields to the `Chunk` class to store retrieval results like similarity scores, term similarity, vector similarity, positions, and document type. This allows the chunk object to hold all the information needed when returning search results from the vector database. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-25 16:53:15 +08:00
Liu An	dac5bcdf17	Fix: Enforce default embedding model in create_dataset / update_dataset (#8486 ) ### What problem does this PR solve? Previous: - Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI' - Did not respect user-configured default embedding_model Now: - Correctly prioritizes user-configured default embedding_model Other: - Make embedding_model optional in CreateDatasetReq with proper None handling - Add default embedding model fallback in dataset update when empty - Enhance validation utils to handle None values and string normalization - Update SDK default embedding model to None to match API changes - Adjust related test cases to reflect new validation rules ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-25 16:41:32 +08:00
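
The `layout_recognize` fix described in commit 6b1221d2f6 above can be illustrated with a minimal sketch. This is not the actual code from presentation.py: the two function names and the value "Plain Text" are hypothetical, and only the key `layout_recognize`, the default "DeepDOC", and the `kwargs` / `parser_config` distinction come from the commit message.

```python
def pick_layout_recognizer_before(parser_config=None, **kwargs):
    """Hypothetical stand-in for the pre-fix lookup described in the commit."""
    # The setting is read from kwargs, so a value stored in parser_config
    # is never seen and the default is returned instead.
    return kwargs.get("layout_recognize", "DeepDOC")


def pick_layout_recognizer_after(parser_config=None, **kwargs):
    """Hypothetical stand-in for the post-fix lookup described in the commit."""
    # Treat a missing config as empty so .get() is always safe to call.
    parser_config = parser_config or {}
    return parser_config.get("layout_recognize", "DeepDOC")


# "Plain Text" is an illustrative value, not one confirmed by the commit.
config = {"layout_recognize": "Plain Text"}
print(pick_layout_recognizer_before(parser_config=config))
print(pick_layout_recognizer_after(parser_config=config))
```

With the same configuration, the first function falls back to the default while the second returns the configured value, which is the mismatch the commit message describes.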


3475 Commits