Commit Graph

773 Commits

f569401398 Fix: better handle different types (#8775)
### What problem does this PR solve?


https://github.com/infiniflow/ragflow/issues/8719#issuecomment-3055883271

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-11 18:21:39 +08:00
2b7adbd2d1 Fix: Improve Memory Usage For Presentation (#8792)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8791


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-11 11:35:25 +08:00
07208e519b Fix: wrong input type for Gemini (#8783)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8763#issuecomment-3055317110

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-11 11:34:04 +08:00
1895667573 Feat: add xAI provider (#8781)
### What problem does this PR solve?

Add xAI provider (experimental feature, requires user feedback).

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-07-11 10:35:23 +08:00
8281ceb406 Refa: refine retry gap. (#8773)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
- [x] Performance Improvement
2025-07-10 14:28:57 +08:00
8d027813f5 Refactor: Improve How To Handle QWenEmbed (#8765)
### What problem does this PR solve?

Based on https://github.com/infiniflow/ragflow/issues/8740:
1. Handle the `'NoneType' object is not subscriptable` error more gracefully.
2. Add some logs to surface the provider's internal message.
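
A minimal sketch of such a guard, assuming a dashscope-style response dict with `code`/`message` fields (the PR's actual helper names are not shown here):

```python
import logging

def extract_embeddings(resp: dict):
    """Guard against resp["output"] being None instead of letting the
    subscript fail with "'NoneType' object is not subscriptable"."""
    output = resp.get("output")
    if output is None:
        # Log the provider's internal message rather than a bare TypeError.
        logging.error("QWen embedding failed: code=%s message=%s",
                      resp.get("code"), resp.get("message"))
        raise ValueError(f"QWen embedding returned no output: {resp.get('message')}")
    return output["embeddings"]
```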

### Type of change

- [x] Refactoring
2025-07-10 10:30:18 +08:00
19419281c3 Fix: Change Ollama Embedding Keep Alive (#8734)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8733
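
The PR body only links the issue, but Ollama's embeddings endpoint accepts a `keep_alive` option that controls how long the model stays loaded after a request. A hedged sketch (model name and duration are illustrative, not the PR's values):

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={
        "model": "nomic-embed-text",   # illustrative model
        "prompt": "hello world",
        "keep_alive": "5m",            # keep the model loaded for 5 minutes
    },
)
print(len(resp.json()["embedding"]))
```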

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-09 12:17:26 +08:00
00c954755e Fix: use the same logic to handle pos in tokenize_chunks_with_images (#8732)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8719

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-09 09:31:40 +08:00
8af0d04ad0 Refactor: improve the logic in search.py (#8716)
### What problem does this PR solve?

1. Remove the useless pop logic, since the value has already been checked in
the preceding if condition.
2. Merge the logging logic.

### Type of change

- [x] Refactoring
2025-07-08 12:32:01 +08:00
e60ec0a31b Fix: disallowed special token while embedding (#8692)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8567
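
The "disallowed special token" error typically comes from tiktoken, which by default raises a ValueError when the input contains special tokens such as `<|endoftext|>`. A common remedy (whether the PR applies exactly this is an assumption) is to disable the check:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "chunk containing a special token: <|endoftext|>"

# enc.encode(text) would raise ValueError: special tokens are disallowed.
tokens = enc.encode(text, disallowed_special=())  # treat them as plain text
print(len(tokens))
```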

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-07 14:13:37 +08:00
9580e99650 fix: retry embedding with Qwen family models when limits temporarily reached. (#8690)
Qwen-family model APIs are rate limited. When a limit is hit, the "output"
attribute of the response ("resp") is None, which in turn causes a TypeError
when "embeddings" is retrieved. Since these limits are almost always
temporary, this adds a simple retry mechanism. If retry_max is reached, the
error is raised early instead of being hidden behind the TypeError.

### What problem does this PR solve?

Sometimes Qwen rejects calls because of rate limits, which stops the whole
parsing procedure when creating a knowledge base. In this situation,
resp["output"] is None, so resp["output"]["embeddings"] raises a TypeError.
Since the limits are temporary, a simple retry mechanism solves it.
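
A minimal sketch of such a retry loop (function names, delays, and the retry_max default are illustrative, not the PR's exact values):

```python
import time

def embed_with_retry(call_api, texts, retry_max=5, delay=1.0):
    """Retry while the rate-limited API returns resp["output"] == None."""
    for attempt in range(retry_max):
        resp = call_api(texts)
        if resp.get("output") is not None:
            return resp["output"]["embeddings"]
        time.sleep(delay * (attempt + 1))  # back off while the limit clears
    # Raise early with a clear error instead of a TypeError downstream.
    raise RuntimeError("Qwen embedding rate limit persisted after retry_max attempts")
```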

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-07-07 12:15:52 +08:00
1e6bda735a Fix: add ES re-connect once request timeout. (#8678)
### What problem does this PR solve?

#8669

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-07 09:22:25 +08:00
8a3b5d1d76 Fix a small typo in count of used fragments (#8673)
### What problem does this PR solve?

Fix a small typo in the count of used fragments.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-07-04 19:46:31 +08:00
a306a6f158 Refa: refactor prompts into markdown-style structure using Jinja2 (#8667)
### What problem does this PR solve?

Refactor prompts into markdown-style structure using Jinja2.
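
For illustration, a sketch of rendering a markdown-style prompt with Jinja2 (the template text is invented, not one of the project's actual prompts):

```python
from jinja2 import Template

PROMPT = Template(
    "## Role\n"
    "You are a {{ role }}.\n\n"
    "## Context\n"
    "{{ context }}\n\n"
    "## Question\n"
    "{{ question }}\n"
)

print(PROMPT.render(role="helpful assistant",
                    context="(retrieved chunks)",
                    question="What is RAGFlow?"))
```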

### Type of change

- [x] Refactoring
2025-07-04 15:59:41 +08:00
d5f6335f99 Fix: The data set created by API call failed to parse after uploading the file. (#8657)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8656

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-04 12:41:28 +08:00
f8a6987f1e Refa: automatic LLMs registration (#8651)
### What problem does this PR solve?

Support automatic LLM registration.

### Type of change

- [x] Refactoring
2025-07-03 19:05:31 +08:00
62b63acbb5 Refa: more robust mcp tool call (#8631)
### What problem does this PR solve?

Make the MCP tool-call connection more robust.

### Type of change

- [x] Refactoring
2025-07-02 18:37:54 +08:00
fffb7c0bba Fix: anthropic llm issue. (#8633)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-02 18:37:34 +08:00
898da23caa make dirs with 'exist_ok=True' (#8629)
### What problem does this PR solve?

The following error occurred during local testing; it is fixed by passing
'exist_ok=True'.

```log
set_progress(7461edc2535c11f0a2aa0242c0a82009), progress: -1, progress_msg: 21:41:41 Page(1~100000001): [ERROR][Errno 17] File exists: '/ragflow/tmp'
```
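
For reference, the idempotent form of directory creation that avoids the Errno 17 race:

```python
import os

# exist_ok=True makes the call a no-op when the directory already exists,
# so concurrent workers can race to create '/ragflow/tmp' safely.
os.makedirs("/ragflow/tmp", exist_ok=True)
```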

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-02 18:35:16 +08:00
695bfe34a2 fix opendal config 'oss_table' and 'max_allowed_packet' (#8611)
### What problem does this PR solve?

Fix the config option name of the opendal table name and setting of
'max_allowed_packet'.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: He Wang <wanghechn@qq.com>
2025-07-02 16:45:01 +08:00
d343cb4deb Add Google Cloud Vision API Integration (Image2Text) (#8608)
### What problem does this PR solve?

This PR introduces Google Cloud Vision API integration to enhance image
understanding capabilities in the application. It addresses the need for
advanced image description and chat functionalities by implementing a
new `GoogleCV` class to handle API interactions and updating relevant
configurations. This enables users to leverage Google Cloud Vision for
image-to-text tasks, improving the application's ability to process and
interpret visual data.
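
The `GoogleCV` wrapper itself is not reproduced here; for orientation, a minimal sketch of calling the Google Cloud Vision API directly (label detection is one of several annotation types the API offers):

```python
from google.cloud import vision

client = vision.ImageAnnotatorClient()  # uses GOOGLE_APPLICATION_CREDENTIALS

with open("figure.png", "rb") as f:
    image = vision.Image(content=f.read())

response = client.label_detection(image=image)
for label in response.label_annotations:
    print(label.description, label.score)
```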

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-07-02 10:02:01 +08:00
f586dd0a96 Fix: docx parse error. (#8600)
### What problem does this PR solve?

Some docx files parsed with the naive parser raise an error:
`block.style.name` in the `__get_nearest_title` function can be None in some
cases.

![image](https://github.com/user-attachments/assets/efbe6d1b-10c8-415e-b693-a86f73e1ffa6)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: wenxuan.zhang <wenxuan.zhang@chinacreator.com>
2025-07-01 17:38:11 +08:00
1c77b4ed9b fix: Correctly format message parts in GoogleChat (#8596)
### What problem does this PR solve?

This PR addresses an incompatibility issue with the Google Chat API by
correcting the message content format in the `GoogleChat` class.
Previously, the content was directly assigned to the "parts" field,
which did not align with the API's expected format. This change ensures
that messages are properly formatted with a "text" key within a
dictionary, as required by the API.
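
A sketch of the reformatting this describes; whether "parts" holds a single dict or a list of dicts is an assumption based on the PR text:

```python
def to_google_message(item: dict) -> dict:
    # Before (rejected): {"role": "user", "parts": "hello"}
    # After (accepted):  {"role": "user", "parts": [{"text": "hello"}]}
    return {"role": item["role"], "parts": [{"text": item["content"]}]}

print(to_google_message({"role": "user", "content": "hello"}))
```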

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-01 14:06:07 +08:00
e3edcc3064 Trivial fixes. (#8597)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-01 14:05:18 +08:00
32f8b3ad77 Fix: the output log is incorrect (#8577)
### What problem does this PR solve?

Fix: the output log is incorrect

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: liang <xiaofeng.liang@landstech.com.cn>
2025-07-01 10:49:43 +08:00
8801de2772 Refa: change mcp_client module to rag/utils/conn (#8578)
### What problem does this PR solve?

Move the mcp_client module to rag/utils/conn.

### Type of change

- [x] Refactoring
2025-07-01 09:29:19 +08:00
d46c24045f Feat: add GiteeAI as a llm provider. (#8572)
### What problem does this PR solve?

#1853

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-30 11:22:11 +08:00
aafeffa292 Feat: add gitee as LLM provider. (#8545)
### What problem does this PR solve?


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-30 09:22:31 +08:00
e441c17c2c Refa: limit embedding concurrency and fix chat_with_tool (#8543)
### What problem does this PR solve?

#8538
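
The linked issue carries the details; a common way to cap embedding concurrency, sketched here with asyncio.Semaphore (an assumption, since the PR's actual mechanism is not shown):

```python
import asyncio

async def embed_all(texts, embed_one, max_concurrency=8):
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(text):
        async with sem:          # at most max_concurrency calls in flight
            return await embed_one(text)

    return await asyncio.gather(*(bounded(t) for t in texts))
```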

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2025-06-27 19:28:41 +08:00
a10f05f4d7 Fix: chat with tools bug. (#8528)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-27 12:10:53 +08:00
303c6dd1a8 Fix memory leaks in PIL image and BytesIO handling during chunk processing (#8522)
### What problem does this PR solve?
This PR addresses critical memory leaks in the task executor's image
processing pipeline. The current implementation fails to properly
dispose of PIL Image objects and BytesIO buffers during chunk
processing, leading to progressive memory accumulation that can cause
the task executor to consume excessive memory over time.

### Background context
- The `upload_to_minio` function processes images from document chunks
and converts them to JPEG format for storage.
- PIL Image objects hold significant memory resources that must be
explicitly closed to prevent memory leaks.
- BytesIO objects also consume memory and should be properly disposed of
after use.
- In high-throughput scenarios with many image-containing documents,
these memory leaks can lead to out-of-memory errors and degraded
performance.

### Specific issues fixed
- PIL Image objects were not being explicitly closed after processing.
- BytesIO buffers lacked proper cleanup in all code paths.
- Converted images (RGBA/P to RGB) were not disposing of the original
image object.
- Memory references to large image data were not being cleared promptly.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement


### Changes made
- Added explicit `d["image"].close()` calls after image processing
operations.
- Implemented proper cleanup of converted images when changing formats
from RGBA/P to RGB.
- Enhanced BytesIO cleanup with `try/finally` blocks to ensure disposal
in all code paths.
- Added explicit `del d["image"]` to clear memory references after
processing.

This fix ensures stable memory usage during long-running document
processing tasks and prevents potential out-of-memory conditions in
production environments.
2025-06-27 10:23:21 +08:00
be712714af Refactor: improve the logic to check cancel (#8524)
### What problem does this PR solve?

Improve the logic that checks for task cancellation.

### Type of change

- [x] Refactoring

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-27 10:22:53 +08:00
6d256ff0f5 Perf: ignore concate between rows. (#8507)
### What problem does this PR solve?


### Type of change

- [x] Performance Improvement
2025-06-26 14:55:37 +08:00
6b1221d2f6 Fix parser_config access for layout_recognize in presentation.py (#8492)
### What problem does this PR solve?
This PR addresses an issue in the presentation parser where the
`layout_recognize` configuration was incorrectly retrieved from
`kwargs.get("layout_recognize", "DeepDOC")`. Instead, it should be
sourced from the `parser_config` parameter, specifically
`parser_config.get("layout_recognize", "DeepDOC")`.

This mismatch could cause the parser to default to the "DeepDOC" layout
recognizer, ignoring any alternative recognition method specified in the
parser configuration. As a result, PDF document parsing might use an
incorrect recognition engine.

The fix ensures the presentation parser consistently uses the
`layout_recognize` setting from `parser_config`, aligning with the
configuration access patterns used elsewhere in the codebase.
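
The one-line shape of the fix, wrapped in a runnable sketch (the function name is illustrative):

```python
def choose_layout_recognizer(parser_config: dict, **kwargs) -> str:
    # Before (buggy): kwargs.get("layout_recognize", "DeepDOC") always fell
    # back to "DeepDOC" because kwargs rarely carries that key.
    # After: read the setting from parser_config, as other parsers do.
    return parser_config.get("layout_recognize", "DeepDOC")

print(choose_layout_recognizer({"layout_recognize": "Plain Text"}))  # Plain Text
```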

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-26 11:54:43 +08:00
340354b79c fix the error 'Unknown field for GenerationConfig: max_tokens' when u… (#8473)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8324

docker image version: v0.19.1

The `_clean_conf` function was not called in the `_chat` and
`chat_streamly` methods of the `GeminiChat` class, causing the error
"Unknown field for GenerationConfig: max_tokens" when the default LLM
config includes the "max_tokens" parameter.

**Buggy code (ragflow/rag/llm/chat_model.py)**
```python
class GeminiChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)

        from google.generativeai import GenerativeModel, client

        client.configure(api_key=key)
        _client = client.get_default_generative_client()
        self.model_name = "models/" + model_name
        self.model = GenerativeModel(model_name=self.model_name)
        self.model._client = _client

    def _clean_conf(self, gen_conf):
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p"]:
                del gen_conf[k]
        return gen_conf

    def _chat(self, history, gen_conf):
        from google.generativeai.types import content_types

        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
        hist = []
        for item in history:
            if item["role"] == "system":
                continue
            hist.append(deepcopy(item))
            item = hist[-1]
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "role" in item and item["role"] == "system":
                item["role"] = "user"
            if "content" in item:
                item["parts"] = item.pop("content")

        if system:
            self.model._system_instruction = content_types.to_content(system)
        response = self.model.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count

    def chat_streamly(self, system, history, gen_conf):
        from google.generativeai.types import content_types

        if system:
            self.model._system_instruction = content_types.to_content(system)
        # Bug: _clean_conf is not called here; the inline filter below keeps "max_tokens"
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p", "max_tokens"]:
                del gen_conf[k]
        for item in history:
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "content" in item:
                item["parts"] = item.pop("content")
        ans = ""
        try:
            response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
            for resp in response:
                ans = resp.text
                yield ans

            yield response._chunks[-1].usage_metadata.total_token_count
        except Exception as e:
            yield ans + "\n**ERROR**: " + str(e)

        yield 0
```
**Fixed code (calling the `_clean_conf` function)**
```python
class GeminiChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)

        from google.generativeai import GenerativeModel, client

        client.configure(api_key=key)
        _client = client.get_default_generative_client()
        self.model_name = "models/" + model_name
        self.model = GenerativeModel(model_name=self.model_name)
        self.model._client = _client

    def _clean_conf(self, gen_conf):
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p"]:
                del gen_conf[k]
        return gen_conf

    def _chat(self, history, gen_conf):
        from google.generativeai.types import content_types
        # call _clean_conf to drop parameters Gemini does not accept
        gen_conf = self._clean_conf(gen_conf)

        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
        hist = []
        for item in history:
            if item["role"] == "system":
                continue
            hist.append(deepcopy(item))
            item = hist[-1]
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "role" in item and item["role"] == "system":
                item["role"] = "user"
            if "content" in item:
                item["parts"] = item.pop("content")

        if system:
            self.model._system_instruction = content_types.to_content(system)
        response = self.model.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count

    def chat_streamly(self, system, history, gen_conf):
        from google.generativeai.types import content_types
        # call _clean_conf to drop parameters Gemini does not accept
        gen_conf = self._clean_conf(gen_conf)

        if system:
            self.model._system_instruction = content_types.to_content(system)
        # The duplicate in-place filtering loop ("for k in list(gen_conf.keys()):") was removed.
        for item in history:
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "content" in item:
                item["parts"] = item.pop("content")
        ans = ""
        try:
            response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
            for resp in response:
                ans = resp.text
                yield ans

            yield response._chunks[-1].usage_metadata.total_token_count
        except Exception as e:
            yield ans + "\n**ERROR**: " + str(e)

        yield 0
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-25 16:23:35 +08:00
b705ff08fe Refa: improve GraphRAG similarity sensitivity to numeric differences (#8479)
### What problem does this PR solve?

Improve GraphRAG similarity sensitivity to numeric differences. #8444.

### Type of change

- [x] Refactoring
2025-06-25 16:20:59 +08:00
5256980ffb Fix: Solve the OOM issue when passing large PDF files while using QA chunking method. (#8464)
### What problem does this PR solve?

Using the QA chunking method with a large PDF (e.g., 300+ pages) may
lead to OOM in the ragflow-worker module.


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-25 10:25:45 +08:00
8d9d2cc0a9 Fix: some cases Task return but not set progress (#8469)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8466
Going through the code, the current logic is: when do_handle_task raises an
exception, handle_task sets the progress, but in some cases do_handle_task
simply returns without setting the right progress. In those cases the Redis
stream message is acked even though the task still appears to be running.
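
A sketch of the failure mode and the fix (all names are illustrative):

```python
def do_handle_task(task: dict, set_progress) -> None:
    if task.get("canceled"):
        # Before the fix this branch returned silently: the Redis stream
        # message was acked, but progress never reached a terminal state.
        set_progress(task["id"], prog=-1, msg="Task has been canceled.")
        return
    set_progress(task["id"], prog=1.0, msg="Done.")

# Usage with a stand-in progress reporter:
do_handle_task({"id": "t1", "canceled": True},
               lambda tid, prog, msg: print(tid, prog, msg))
```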

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-25 09:58:55 +08:00
d6a941ebf5 Fix the bug of long type value overflow (#8313)
### What problem does this PR solve?

This PR fixes #8271 by widening a column's type from int to float when any
value in the column falls outside the long type's range.
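
A sketch of the widening rule (NumPy dtypes are an assumption about the storage layer):

```python
import numpy as np

LONG_MIN, LONG_MAX = -(2**63), 2**63 - 1

def column_array(values):
    """Store a column as float64 when any value exceeds the long range."""
    if any(v < LONG_MIN or v > LONG_MAX for v in values):
        return np.array(values, dtype=np.float64)  # trade precision for range
    return np.array(values, dtype=np.int64)

print(column_array([1, 2, 3]).dtype)    # int64
print(column_array([1, 2**64]).dtype)   # float64
```
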
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-24 18:18:30 +08:00
bc1b837616 Fix: Saving an RGBA image directly as JPEG causes an error. If the… (#8399)
Saving an RGBA image directly as JPEG will cause an error. If the image
is in RGBA mode, convert it to RGB mode before saving it in JPG format.

### What problem does this PR solve?

During document parsing in the knowledge base, we occasionally encounter
the error 'cannot write mode RGBA as JPEG.' This occurs because images
in RGBA mode cannot be directly saved as JPEG. They must be converted
first before saving.
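
The conversion in miniature:

```python
from PIL import Image

def save_as_jpeg(img: Image.Image, path: str) -> None:
    # JPEG has no alpha channel, so RGBA (and palette-mode) images must be
    # converted to RGB first; saving them directly raises
    # "cannot write mode RGBA as JPEG".
    if img.mode in ("RGBA", "P"):
        img = img.convert("RGB")
    img.save(path, format="JPEG")
```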

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-24 18:01:13 +08:00
49d67cbcb7 fix a bug when using huggingface embedding api (#8432)
### What problem does this PR solve?

image_version: v0.19.1
This PR fixes a bug in the HuggingFaceEmbed.encode() method that caused
`AssertionError: assert len(vects) == len(docs)` during the document
embedding process.

#### Problem
The HuggingFaceEmbed.encode() method had an early return statement
inside the for loop, causing it to return after processing only the
first text input instead of processing all texts in the input list.

**Error message**
```python
AssertionError: assert len(vects) == len(docs) # input chunks  != embedded  vectors from embedding api
File "/ragflow/rag/svr/task_executor.py", line 442, in embedding
```



**Buggy code (/ragflow/rag/llm/embedding_model.py)**
```python
class HuggingFaceEmbed(Base):
    def __init__(self, key, model_name, base_url=None):
        if not model_name:
            raise ValueError("Model name cannot be None")
        self.key = key
        self.model_name = model_name.split("___")[0]
        self.base_url = base_url or "http://127.0.0.1:8080"

    def encode(self, texts: list):
        embeddings = []
        for text in texts:
            response = requests.post(...)
            if response.status_code == 200:
                try:
                    embedding = response.json()
                    embeddings.append(embedding[0])
                    # Early return: exits after the first text
                    return np.array(embeddings), sum([num_tokens_from_string(text) for text in texts])
                except Exception as _e:
                    log_exception(_e, response)
            else:
                raise Exception(...)
```
**Fixed code (rolled back to the v0.19.0 version of this function)**
```python
class HuggingFaceEmbed(Base):
    def __init__(self, key, model_name, base_url=None):
        if not model_name:
            raise ValueError("Model name cannot be None")
        self.key = key
        self.model_name = model_name.split("___")[0]
        self.base_url = base_url or "http://127.0.0.1:8080"

    def encode(self, texts: list):
        embeddings = []
        for text in texts:
            response = requests.post(...)
            if response.status_code == 200:
                embedding = response.json()
                embeddings.append(embedding[0])  # Only append, no return
            else:
                raise Exception(...)
        return np.array(embeddings), sum([num_tokens_from_string(text) for text in texts])  # Return after processing all texts
```
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-24 09:35:02 +08:00
fd7ac17605 Feat: Scratch MCP tool calling support. (#8263)
### What problem does this PR solve?

This is a cherry-pick from #7781 as requested.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-23 17:45:35 +08:00
244d8a47b9 Fix: AzureChat model code (#8426)
### What problem does this PR solve?

- Simplify AzureChat constructor by passing base_url directly
- Clean up spacing and formatting in chat_model.py
- Remove redundant parentheses and improve code consistency
- #8423

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-23 15:59:25 +08:00
f0e0783618 Fix: Database Query Vulnerable to Injection Attacks in rag/utils/opendal_conn.py (#8408)
**Context and Purpose:**

This PR automatically remediates a security vulnerability:
- **Description:** Detected possible formatted SQL query. Use
parameterized queries instead.
- **Rule ID:**
python.lang.security.audit.formatted-sql-query.formatted-sql-query
- **Severity:** HIGH
- **File:** rag/utils/opendal_conn.py
- **Lines Affected:** 98 - 98

This change is necessary to protect the application from potential
security risks associated with this vulnerability.

**Solution Implemented:**

The automated remediation process has applied the necessary changes to
the affected code in `rag/utils/opendal_conn.py` to resolve the
identified issue.

Please review the changes to ensure they are correct and integrate as
expected.
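
The remediated line itself is not reproduced above; for reference, the general shape of the change from formatted SQL to a parameterized query (sqlite3 stands in for the actual driver, whose placeholder syntax may differ, e.g. %s for MySQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blobs (key TEXT PRIMARY KEY, value BLOB)")

key = "user-supplied'; DROP TABLE blobs; --"

# Vulnerable: string formatting splices untrusted input into the SQL text.
#   conn.execute(f"SELECT value FROM blobs WHERE key = '{key}'")

# Safe: a placeholder sends the value separately from the statement.
cur = conn.execute("SELECT value FROM blobs WHERE key = ?", (key,))
print(cur.fetchall())
```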
2025-06-23 14:54:25 +08:00
d4e6e2bd21 Fix: doc_aggs issue. (#8418)
### What problem does this PR solve?

#8406

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-23 14:54:01 +08:00
83e23f1e8a Fix: rank feature score should be greater than 0. (#8416)
### What problem does this PR solve?

#8414

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-23 14:10:13 +08:00
794a4102c2 Fix: document parsing via API causes a lot of problems (#8407)
### What problem does this PR solve?
#8391
#8404

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-23 13:08:11 +08:00
ef5e7d8c44 Fix: embedding_model class SILICONFLOWEmbed(Base) function reusing json (#8378)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8360

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-20 11:13:00 +08:00
4784aa5b0b fix: List Chunks API fails to return the correct document status. (#8347)
### What problem does this PR solve?

The existing
/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks endpoint
fails to accurately return a document's chunk status. Even when a chunk
is explicitly marked as unavailable, the API still returns true.

![img_v3_02nc_3458a1b7-609e-4f20-8cb7-2156a489848g](https://github.com/user-attachments/assets/ab3b8f69-1284-49c1-8af3-bdfae3416583)

![img_v3_02nc_82f1d96e-7596-4def-ba75-5a2bd10d56cg](https://github.com/user-attachments/assets/a8a4162b-b50d-4dfc-af72-e1d7812a0a93)

Co-authored-by: zhoudeyong <zhoudeyong@idr.ai>
2025-06-19 11:12:53 +08:00
8f3fe63d73 Fix: duplicated task (#8358)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-19 11:12:29 +08:00