Commit Graph

1009 Commits

Author SHA1 Message Date
d5f6335f99 Fix: The data set created by API call failed to parse after uploading the file. (#8657)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8656

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-04 12:41:28 +08:00
f8a6987f1e Refa: automatic LLMs registration (#8651)
### What problem does this PR solve?

Support automatic LLMs registration.

### Type of change

- [x] Refactoring
2025-07-03 19:05:31 +08:00
62b63acbb5 Refa: more robust mcp tool call (#8631)
### What problem does this PR solve?

More robust MCP tool call conn.

### Type of change

- [x] Refactoring
2025-07-02 18:37:54 +08:00
fffb7c0bba Fix: anthropic llm issue. (#8633)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-02 18:37:34 +08:00
898da23caa make dirs with 'exist_ok=True' (#8629)
### What problem does this PR solve?

The following error occurred during local testing, which should be fixed
by configuring 'exist_ok=True'.

```log
set_progress(7461edc2535c11f0a2aa0242c0a82009), progress: -1, progress_msg: 21:41:41 Page(1~100000001): [ERROR][Errno 17] File exists: '/ragflow/tmp'
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-02 18:35:16 +08:00
695bfe34a2 fix opendal config 'oss_table' and 'max_allowed_packet' (#8611)
### What problem does this PR solve?

Fix the config option name of the opendal table name and setting of
'max_allowed_packet'.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: He Wang <wanghechn@qq.com>
2025-07-02 16:45:01 +08:00
d343cb4deb Add Google Cloud Vision API Integration (Image2Text) (#8608)
### What problem does this PR solve?

This PR introduces Google Cloud Vision API integration to enhance image
understanding capabilities in the application. It addresses the need for
advanced image description and chat functionalities by implementing a
new `GoogleCV` class to handle API interactions and updating relevant
configurations. This enables users to leverage Google Cloud Vision for
image-to-text tasks, improving the application's ability to process and
interpret visual data.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-07-02 10:02:01 +08:00
f586dd0a96 Fix: docx parse error. (#8600)
### What problem does this PR solve?

docx parse error.

![image](https://github.com/user-attachments/assets/efbe6d1b-10c8-415e-b693-a86f73e1ffa6)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### What problem does this PR solve?

Some docx parse with naive cause error. `block.style.name` in Function
`__get_nearest_title` will be None in some case.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: wenxuan.zhang <wenxuan.zhang@chinacreator.com>
2025-07-01 17:38:11 +08:00
1c77b4ed9b fix: Correctly format message parts in GoogleChat (#8596)
### What problem does this PR solve?

This PR addresses an incompatibility issue with the Google Chat API by
correcting the message content format in the `GoogleChat` class.
Previously, the content was directly assigned to the "parts" field,
which did not align with the API's expected format. This change ensures
that messages are properly formatted with a "text" key within a
dictionary, as required by the API.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-01 14:06:07 +08:00
e3edcc3064 Trivals. (#8597)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-01 14:05:18 +08:00
32f8b3ad77 Fix: the output log is incorrect (#8577)
### What problem does this PR solve?

Fix: the output log is incorrect

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: liang <xiaofeng.liang@landstech.com.cn>
2025-07-01 10:49:43 +08:00
8801de2772 Refa: change mcp_client module to rag/utils/conn (#8578)
### What problem does this PR solve?

Change mcp_client module to rag/utils/conn.

### Type of change

- [x] Refactoring
2025-07-01 09:29:19 +08:00
d46c24045f Feat: add GiteeAI as a llm provider. (#8572)
### What problem does this PR solve?

#1853

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-30 11:22:11 +08:00
aafeffa292 Feat: add gitee as LLM provider. (#8545)
### What problem does this PR solve?


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-30 09:22:31 +08:00
e441c17c2c Refa: limit embedding concurrency and fix chat_with_tool (#8543)
### What problem does this PR solve?

#8538

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2025-06-27 19:28:41 +08:00
a10f05f4d7 Fix: chat with tools bug. (#8528)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-27 12:10:53 +08:00
303c6dd1a8 Fix memory leaks in PIL image and BytesIO handling during chunk processing (#8522)
### What problem does this PR solve?
This PR addresses critical memory leaks in the task executor's image
processing pipeline. The current implementation fails to properly
dispose of PIL Image objects and BytesIO buffers during chunk
processing, leading to progressive memory accumulation that can cause
the task executor to consume excessive memory over time.

### Background context
- The `upload_to_minio` function processes images from document chunks
and converts them to JPEG format for storage.
- PIL Image objects hold significant memory resources that must be
explicitly closed to prevent memory leaks.
- BytesIO objects also consume memory and should be properly disposed of
after use.
- In high-throughput scenarios with many image-containing documents,
these memory leaks can lead to out-of-memory errors and degraded
performance.

### Specific issues fixed
- PIL Image objects were not being explicitly closed after processing.
- BytesIO buffers lacked proper cleanup in all code paths.
- Converted images (RGBA/P to RGB) were not disposing of the original
image object.
- Memory references to large image data were not being cleared promptly.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement


### Changes made
- Added explicit `d["image"].close()` calls after image processing
operations.
- Implemented proper cleanup of converted images when changing formats
from RGBA/P to RGB.
- Enhanced BytesIO cleanup with `try/finally` blocks to ensure disposal
in all code paths.
- Added explicit `del d["image"]` to clear memory references after
processing.

This fix ensures stable memory usage during long-running document
processing tasks and prevents potential out-of-memory conditions in
production environments.
2025-06-27 10:23:21 +08:00
be712714af Refactor:improve the logic to check cancel (#8524)
### What problem does this PR solve?

improve the logic to check cancel

### Type of change

- [x] Refactoring

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-27 10:22:53 +08:00
6d256ff0f5 Perf: ignore concate between rows. (#8507)
### What problem does this PR solve?


### Type of change

- [x] Performance Improvement
2025-06-26 14:55:37 +08:00
6b1221d2f6 Fix parser_config access for layout_recognize in presentation.py (#8492)
### What problem does this PR solve?
This PR addresses an issue in the presentation parser where the
`layout_recognize` configuration was incorrectly retrieved from
`kwargs.get("layout_recognize", "DeepDOC")`. Instead, it should be
sourced from the `parser_config` parameter, specifically
`parser_config.get("layout_recognize", "DeepDOC")`.

This mismatch could cause the parser to default to the "DeepDOC" layout
recognizer, ignoring any alternative recognition method specified in the
parser configuration. As a result, PDF document parsing might use an
incorrect recognition engine.

The fix ensures the presentation parser consistently uses the
`layout_recognize` setting from `parser_config`, aligning with the
configuration access patterns used elsewhere in the codebase.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-26 11:54:43 +08:00
340354b79c fix the error 'Unknown field for GenerationConfig: max_tokens' when u… (#8473)
### What problem does this PR solve?
[https://github.com/infiniflow/ragflow/issues/8324](url)

docker image version: v0.19.1

The `_clean_conf` function was not implemented in the `_chat` and
`chat_streamly` methods of the `GeminiChat` class, causing the error
"Unknown field for GenerationConfig: max_tokens" when the default LLM
config includes the "max_tokens" parameter.

**Buggy Code(ragflow/rag/llm/chat_model.py)**
```python
class GeminiChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)

        from google.generativeai import GenerativeModel, client

        client.configure(api_key=key)
        _client = client.get_default_generative_client()
        self.model_name = "models/" + model_name
        self.model = GenerativeModel(model_name=self.model_name)
        self.model._client = _client

    def _clean_conf(self, gen_conf):
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p"]:
                del gen_conf[k]
        return gen_conf

    def _chat(self, history, gen_conf):
        from google.generativeai.types import content_types

        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
        hist = []
        for item in history:
            if item["role"] == "system":
                continue
            hist.append(deepcopy(item))
            item = hist[-1]
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "role" in item and item["role"] == "system":
                item["role"] = "user"
            if "content" in item:
                item["parts"] = item.pop("content")

        if system:
            self.model._system_instruction = content_types.to_content(system)
        response = self.model.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count

    def chat_streamly(self, system, history, gen_conf):
        from google.generativeai.types import content_types

        if system:
            self.model._system_instruction = content_types.to_content(system)
        #_clean_conf was not implemented 
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p", "max_tokens"]:
                del gen_conf[k]
        for item in history:
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "content" in item:
                item["parts"] = item.pop("content")
        ans = ""
        try:
            response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
            for resp in response:
                ans = resp.text
                yield ans

            yield response._chunks[-1].usage_metadata.total_token_count
        except Exception as e:
            yield ans + "\n**ERROR**: " + str(e)

        yield 0
```
**Implement the _clean_conf function**
```python
class GeminiChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)

        from google.generativeai import GenerativeModel, client

        client.configure(api_key=key)
        _client = client.get_default_generative_client()
        self.model_name = "models/" + model_name
        self.model = GenerativeModel(model_name=self.model_name)
        self.model._client = _client

    def _clean_conf(self, gen_conf):
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p"]:
                del gen_conf[k]
        return gen_conf

    def _chat(self, history, gen_conf):
        from google.generativeai.types import content_types
        # implement _clean_conf to remove the wrong parameters
        gen_conf = self._clean_conf(gen_conf)

        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
        hist = []
        for item in history:
            if item["role"] == "system":
                continue
            hist.append(deepcopy(item))
            item = hist[-1]
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "role" in item and item["role"] == "system":
                item["role"] = "user"
            if "content" in item:
                item["parts"] = item.pop("content")

        if system:
            self.model._system_instruction = content_types.to_content(system)
        response = self.model.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count

    def chat_streamly(self, system, history, gen_conf):
        from google.generativeai.types import content_types
        # implement _clean_conf to remove the wrong parameters
        gen_conf = self._clean_conf(gen_conf)

        if system:
            self.model._system_instruction = content_types.to_content(system)
        #Removed duplicate parameter filtering logic "for k in list(gen_conf.keys()):"
        for item in history:
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "content" in item:
                item["parts"] = item.pop("content")
        ans = ""
        try:
            response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
            for resp in response:
                ans = resp.text
                yield ans

            yield response._chunks[-1].usage_metadata.total_token_count
        except Exception as e:
            yield ans + "\n**ERROR**: " + str(e)

        yield 0
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-25 16:23:35 +08:00
b705ff08fe Refa: improve GraphRAG similarity sensitivity to numeric differences (#8479)
### What problem does this PR solve?

Improve GraphRAG similarity sensitivity to numeric differences. #8444.

### Type of change

- [x] Refactoring
2025-06-25 16:20:59 +08:00
5256980ffb Fix: Solve the OOM issue when passing large PDF files while using QA chunking method. (#8464)
### What problem does this PR solve?

Using the QA chunking method with a large PDF (e.g., 300+ pages) may
lead to OOM in the ragflow-worker module.


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-25 10:25:45 +08:00
8d9d2cc0a9 Fix: some cases Task return but not set progress (#8469)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8466
I go through the codes, current logic:
When do_handle_task raises an exception, handle_task will set the
progress, but for some cases do_handle_task internal will just return
but not set the right progress, at this cases the redis stream will been
acked but the task is running.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-25 09:58:55 +08:00
d6a941ebf5 Fix the bug of long type value overflow (#8313)
### What problem does this PR solve?

This PR will fix the #8271 by extending int type to float type when
there is any value out of long type range in a column.
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-24 18:18:30 +08:00
bc1b837616 FIX:Saving an RGBA image directly as JPEG will cause an error. If the… (#8399)
Saving an RGBA image directly as JPEG will cause an error. If the image
is in RGBA mode, convert it to RGB mode before saving it in JPG format.

### What problem does this PR solve?

During document parsing in the knowledge base, we occasionally encounter
the error 'cannot write mode RGBA as JPEG.' This occurs because images
in RGBA mode cannot be directly saved as JPEG. They must be converted
first before saving.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-24 18:01:13 +08:00
49d67cbcb7 fix a bug when using huggingface embedding api (#8432)
### What problem does this PR solve?

image_version: v0.19.1
This PR fixes a bug in the HuggingFaceEmBedding API method that was
causing AssertionError: assert len(vects) == len(docs) during the
document embedding process.

#### Problem
The HuggingFaceEmbed.encode() method had an early return statement
inside the for loop, causing it to return after processing only the
first text input instead of processing all texts in the input list.

**Error Messenge**
```python
AssertionError: assert len(vects) == len(docs) # input chunks  != embedded  vectors from embedding api
File "/ragflow/rag/svr/task_executor.py", line 442, in embedding
```



**Buggy code(/ragflow/rag/llm/embedding_model.py)**
```python
class HuggingFaceEmbed(Base):
    def __init__(self, key, model_name, base_url=None):
        if not model_name:
            raise ValueError("Model name cannot be None")
        self.key = key
        self.model_name = model_name.split("___")[0]
        self.base_url = base_url or "http://127.0.0.1:8080"
        def encode(self, texts: list):
            embeddings = []
            for text in texts:
                response = requests.post(...)
                if response.status_code == 200:
                    try:
                        embedding = response.json()
                        embeddings.append(embedding[0])
                        #  Early return
                        return np.array(embeddings), sum([num_tokens_from_string(text) for text in texts]) 
                    except Exception as _e:
                        log_exception(_e, response)
                else:
                    raise Exception(...)
```
**Fixed Code(I just Rollback this function to the v0.19.0 version)**
```python
Class HuggingFaceEmbed(Base):
    def __init__(self, key, model_name, base_url=None):
        if not model_name:
            raise ValueError("Model name cannot be None")
        self.key = key
        self.model_name = model_name.split("___")[0]
        self.base_url = base_url or "http://127.0.0.1:8080"
        def encode(self, texts: list):
            embeddings = []
            for text in texts:
                response = requests.post(...)
                if response.status_code == 200:
                    embedding = response.json()
                    embeddings.append(embedding[0])  #  Only append, no return
                else:
                    raise Exception(...)
            return np.array(embeddings), sum([num_tokens_from_string(text) for text in texts])  #  Return after processing all
```
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-24 09:35:02 +08:00
fd7ac17605 Feat: Scratch MCP tool calling support. (#8263)
### What problem does this PR solve?

This is a cherry-pick from #7781 as requested.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-23 17:45:35 +08:00
244d8a47b9 Fix: AzureChat model code (#8426)
### What problem does this PR solve?

- Simplify AzureChat constructor by passing base_url directly
- Clean up spacing and formatting in chat_model.py
- Remove redundant parentheses and improve code consistency
- #8423

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-23 15:59:25 +08:00
f0e0783618 Fix: Database Query Vulnerable to Injection Attacks in rag/utils/opendal_conn.py (#8408)
**Context and Purpose:**

This PR automatically remediates a security vulnerability:
- **Description:** Detected possible formatted SQL query. Use
parameterized queries instead.
- **Rule ID:**
python.lang.security.audit.formatted-sql-query.formatted-sql-query
- **Severity:** HIGH
- **File:** rag/utils/opendal_conn.py
- **Lines Affected:** 98 - 98

This change is necessary to protect the application from potential
security risks associated with this vulnerability.

**Solution Implemented:**

The automated remediation process has applied the necessary changes to
the affected code in `rag/utils/opendal_conn.py` to resolve the
identified issue.

Please review the changes to ensure they are correct and integrate as
expected.
2025-06-23 14:54:25 +08:00
d4e6e2bd21 Fix: doc_aggs issue. (#8418)
### What problem does this PR solve?

#8406

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-23 14:54:01 +08:00
83e23f1e8a Fix: rank feature score should be greater than 0. (#8416)
### What problem does this PR solve?

#8414

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-23 14:10:13 +08:00
794a4102c2 Fix: Document parse via API will alot problen (#8407)
### What problem does this PR solve?
#8391
#8404

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-23 13:08:11 +08:00
ef5e7d8c44 Fix:embedding_model class SILICONFLOWEmbed(Base)Function reusing json (#8378)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8360

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-20 11:13:00 +08:00
4784aa5b0b fix: List Chunks API fails to return the correct document status. (#8347)
### What problem does this PR solve?

The existing
/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks endpoint
fails to accurately return a document's chunk status. Even when a chunk
is explicitly marked as unavailable, the API still returns true.

![img_v3_02nc_3458a1b7-609e-4f20-8cb7-2156a489848g](https://github.com/user-attachments/assets/ab3b8f69-1284-49c1-8af3-bdfae3416583)

![img_v3_02nc_82f1d96e-7596-4def-ba75-5a2bd10d56cg](https://github.com/user-attachments/assets/a8a4162b-b50d-4dfc-af72-e1d7812a0a93)

Co-authored-by: zhoudeyong <zhoudeyong@idr.ai>
2025-06-19 11:12:53 +08:00
8f3fe63d73 Fix: duplicated task (#8358)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-19 11:12:29 +08:00
35034fed73 Fix: Raptor: [Bug]: **ERROR**: Unknown field for GenerationConfig: max_tokens (#8331)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8324

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-18 16:40:57 +08:00
4a2ff633e0 Fix typo in code (#8327)
### What problem does this PR solve?

Fix typo in code

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-06-18 09:41:09 +08:00
8f9bcb1c74 Feat: make document parsing and embedding batch sizes configurable via environment variables (#8266)
### Description

This PR introduces two new environment variables, ‎`DOC_BULK_SIZE` and
‎`EMBEDDING_BATCH_SIZE`, to allow flexible tuning of batch sizes for
document parsing and embedding vectorization in RAGFlow. By making these
parameters configurable, users can optimize performance and resource
usage according to their hardware capabilities and workload
requirements.

### What problem does this PR solve?

Previously, the batch sizes for document parsing and embedding were
hardcoded, limiting the ability to adjust throughput and memory
consumption. This PR enables users to set these values via environment
variables (in ‎`.env`, Helm chart, or directly in the deployment
environment), improving flexibility and scalability for both small and
large deployments.

- ‎`DOC_BULK_SIZE`: Controls how many document chunks are processed in a
single batch during document parsing (default: 4).
- ‎`EMBEDDING_BATCH_SIZE`: Controls how many text chunks are processed
in a single batch during embedding vectorization (default: 16).

This change updates the codebase, documentation, and configuration files
to reflect the new options.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):

### Additional context
- Updated ‎`.env`, ‎`helm/values.yaml`, and documentation to describe
the new variables.
- Modified relevant code paths to use the environment variables instead
of hardcoded values.
- Users can now tune these parameters to achieve better throughput or
reduce memory usage as needed.

Before:
Default value:
<img width="643" alt="image"
src="https://github.com/user-attachments/assets/086e1173-18f3-419d-a0f5-68394f63866a"
/>
After:
10x:
<img width="777" alt="image"
src="https://github.com/user-attachments/assets/5722bbc0-0bcb-4536-b928-077031e550f1"
/>
2025-06-16 13:40:47 +08:00
b1117a8717 Fix: base url issue. (#8281)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-16 13:40:25 +08:00
dabbc852c8 Fix: opendal storage health attribute not found & remove duplicate operator scheme initialization (#8265)
### What problem does this PR solve?

This PR fixes two issues in the OpenDAL storage connector:
1. The ‎`health` method was missing, which prevented health checks on
the storage backend.
3. The initialization of the ‎`opendal.Operator` object included a
redundant scheme parameter, causing unnecessary duplication and
potential confusion.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Background
- The absence of a ‎`health` method made it difficult to verify the
availability and reliability of the storage service.
- Initializing ‎`opendal.Operator` with both ‎`self._scheme` and
unpacked ‎`**self._kwargs` could lead to errors or unexpected behavior
if the scheme was already included in the kwargs.

### What is changed and how it works?
- Adds a ‎`health` method that writes a test file to verify storage
availability.
- Removes the duplicate scheme parameter from the ‎`opendal.Operator`
initialization to ensure clarity and prevent conflicts.

before:
<img width="762" alt="企业微信截图_46be646f-2e99-4e5e-be67-b1483426e77c"
src="https://github.com/user-attachments/assets/acecbb8c-4810-457f-8342-6355148551ba"
/>
<img width="767" alt="image"
src="https://github.com/user-attachments/assets/147cd5a2-dde3-466b-a9c1-d1d4f0819e5d"
/>

after:
<img width="1123" alt="企业微信截图_09d62997-8908-4985-b89f-7a78b5da55ac"
src="https://github.com/user-attachments/assets/97dc88c9-0f4e-4d77-88b3-cd818e8da046"
/>
2025-06-16 11:35:51 +08:00
8f9e7a6f6f Refa: revert to original task message collection logic (#8251)
### What problem does this PR solve?

Get rid of 'RedisDB.get_unacked_iterator queue rag_flow_svr_queue_1
doesn't exist'

----

Edit: revert to original message collection logic.

### Type of change

- [x] Refactoring

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-13 16:38:53 +08:00
65d5268439 Feat: implement novitaAI embedding and reranking. (#8250)
### What problem does this PR solve?

Close #8227

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-13 15:42:17 +08:00
6aa0b0819d Fix: unify opendal config key from ‎schema to ‎scheme (#8232)
### What problem does this PR solve?

This PR resolves the inconsistency in the opendal configuration where
both ‎`schema` and ‎`scheme` were used as keys. The code and
configuration file now consistently use ‎`scheme`, which helps prevent
configuration errors and runtime issues. This change improves code
clarity and maintainability.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Additional context
- Updated both ‎`conf/service_conf.yaml` and
‎`rag/utils/opendal_conn.py` to use ‎`scheme` instead of ‎`schema`
- No breaking changes to other configuration fields
2025-06-13 14:56:51 +08:00
3d0b440e9f fix(search.py):remove hard page_size (#8242)
### What problem does this PR solve?

Fix the restriction of forcing similarity_threshold=0 and page_size=30
when doc_ids is not empty

#8228

---------

Co-authored-by: shiqing.wusq <shiqing.wusq@dtzhejiang.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-13 14:56:25 +08:00
24ca4cc6b7 Refa: GraphRAG and explaining GraphRAG stalling behavior on large files (#8223)
### What problem does this PR solve?

This PR investigates the cause of #7957.

TL;DR: Incorrect similarity calculations lead to too many candidates.
Since candidate selection involves interaction with the LLM, this causes
significant delays in the program.

What this PR does:

1. **Fix similarity calculation**:
When processing a 64 pages government document, the corrected similarity
calculation reduces the number of candidates from over 100,000 to around
16,000. With a default batch size of 100 pairs per LLM call, this fix
reduces unnecessary LLM interactions from over 1,000 calls to around
160, a roughly 10x improvement.
2. **Add concurrency and timeout limits**: 
Up to 5 entity types are processed in "parallel", each with a 180-second
timeout. These limits may be configurable in future updates.
3. **Improve logging**:
The candidate resolution process now reports progress in real time.
4. **Mitigates potential concurrency risks**


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2025-06-12 19:09:50 +08:00
d36c8d18b1 Refa: make exception more clear. (#8224)
### What problem does this PR solve?

#8156

### Type of change
- [x] Refactoring
2025-06-12 17:53:59 +08:00
d5236b71f4 Refa: ollama keep alive issue. (#8216)
### What problem does this PR solve?

#8122

### Type of change

- [x] Refactoring
2025-06-12 15:09:40 +08:00
56ee69e9d9 Refa: chat with tools. (#8210)
### What problem does this PR solve?


### Type of change
- [x] Refactoring
2025-06-12 12:31:10 +08:00
44287fb05f Oss support opendal(including mysql) (#8204)
### What problem does this PR solve?

#8074
Oss support opendal(including mysql)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-12 11:37:42 +08:00