Mirror of https://github.com/infiniflow/ragflow.git
Synced 2025-12-08 20:42:30 +08:00
Compare commits
35 Commits
| SHA1 |
|---|
| baf3b9be7c |
| 4df4bf68a2 |
| 471bd92b4c |
| 3af1063737 |
| 9c8060f619 |
| e213873852 |
| 56acb340d2 |
| e05cdc2f9c |
| 3571270191 |
| bd5eb47441 |
| 7cd37c37cd |
| d660f6b9a5 |
| 80389ae61e |
| 6e13922bdc |
| c57f16d16f |
| 3c43a7aee8 |
| dd8779b257 |
| 46bdfb9661 |
| e3ea4b7ec2 |
| 41c67ce8dd |
| 870a6e93da |
| 80f87913bb |
| 45123dcc0a |
| 49d560583f |
| 1c663b32b9 |
| caecaa7562 |
| ed11be23bf |
| 7bd5a52019 |
| 87763ef0a0 |
| 939e668096 |
| 45318e7575 |
| 8250b9f6b0 |
| 1abf03351d |
| 46b95d5cfe |
| 59ba4777ee |
.github/ISSUE_TEMPLATE/bug_report.yml (vendored, 14 changes)

@@ -5,11 +5,17 @@ labels: [bug]
 body:
   - type: checkboxes
     attributes:
-      label: Is there an existing issue for the same bug?
-      description: Please check if an issue already exists for the bug you encountered.
+      label: Self Checks
+      description: "Please check the following in order to be responded in time :)"
       options:
-        - label: I have checked the existing issues.
+        - label: I have searched for existing issues [search for existing issues](https://github.com/infiniflow/ragflow/issues), including closed ones.
           required: true
+        - label: I confirm that I am using English to submit this report ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
+          required: true
+        - label: Non-english title submitions will be closed directly ( 非英文标题的提交将会被直接关闭 ) ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
+          required: true
+        - label: "Please do not modify this template :) and fill in all the required fields."
+          required: true
   - type: markdown
     attributes:
       value: "Please provide the following information to help us understand the issue."
.github/ISSUE_TEMPLATE/feature_request.yml (vendored, 12 changes)

@@ -5,10 +5,16 @@ labels: [feature request]
 body:
   - type: checkboxes
     attributes:
-      label: Is there an existing issue for the same feature request?
-      description: Please check if an issue already exists for the feature you request.
+      label: Self Checks
+      description: "Please check the following in order to be responded in time :)"
       options:
-        - label: I have checked the existing issues.
+        - label: I have searched for existing issues [search for existing issues](https://github.com/infiniflow/ragflow/issues), including closed ones.
           required: true
+        - label: I confirm that I am using English to submit this report ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
+          required: true
+        - label: Non-english title submitions will be closed directly ( 非英文标题的提交将会被直接关闭 ) ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
+          required: true
+        - label: "Please do not modify this template :) and fill in all the required fields."
+          required: true
   - type: textarea
     attributes:
.github/ISSUE_TEMPLATE/question.yml (vendored, 13 changes)

@@ -3,6 +3,19 @@ description: Ask questions on RAGFlow
 title: "[Question]: "
 labels: [question]
 body:
+  - type: checkboxes
+    attributes:
+      label: Self Checks
+      description: "Please check the following in order to be responded in time :)"
+      options:
+        - label: I have searched for existing issues [search for existing issues](https://github.com/infiniflow/ragflow/issues), including closed ones.
+          required: true
+        - label: I confirm that I am using English to submit this report ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
+          required: true
+        - label: Non-english title submitions will be closed directly ( 非英文标题的提交将会被直接关闭 ) ([Language Policy](https://github.com/infiniflow/ragflow/issues/5910)).
+          required: true
+        - label: "Please do not modify this template :) and fill in all the required fields."
+          required: true
   - type: markdown
     attributes:
       value: |
.gitignore (vendored, 1 change)

@@ -41,3 +41,4 @@ nltk_data/
 
 # Exclude hash-like temporary files like 9b5ad71b2ce5302211f9c61530b329a4922fc6a4
 *[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]*
+.lh/
@@ -22,7 +22,7 @@
         <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
     </a>
     <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.1-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.1">
+        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.2-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.2">
     </a>
     <a href="https://github.com/infiniflow/ragflow/releases/latest">
         <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">

@@ -178,7 +178,7 @@ releases! 🌟
 > All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
 > If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.
 
-> The command below downloads the `v0.17.1-slim` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.17.1-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.1` for the full edition `v0.17.1`.
+> The command below downloads the `v0.17.2-slim` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.17.2-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.2` for the full edition `v0.17.2`.
 
 ```bash
 $ cd ragflow/docker

@@ -187,8 +187,8 @@ releases! 🌟
 
 | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
 |-------------------|-----------------|-----------------------|--------------------------|
-| v0.17.1 | ≈9 | :heavy_check_mark: | Stable release |
-| v0.17.1-slim | ≈2 | ❌ | Stable release |
+| v0.17.2 | ≈9 | :heavy_check_mark: | Stable release |
+| v0.17.2-slim | ≈2 | ❌ | Stable release |
 | nightly | ≈9 | :heavy_check_mark: | _Unstable_ nightly build |
 | nightly-slim | ≈2 | ❌ | _Unstable_ nightly build |
@@ -22,7 +22,7 @@
         <img alt="Lencana Daring" src="https://img.shields.io/badge/Online-Demo-4e6b99">
     </a>
     <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.1-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.1">
+        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.2-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.2">
     </a>
     <a href="https://github.com/infiniflow/ragflow/releases/latest">
         <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Rilis%20Terbaru" alt="Rilis Terbaru">

@@ -171,7 +171,7 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
 > Semua gambar Docker dibangun untuk platform x86. Saat ini, kami tidak menawarkan gambar Docker untuk ARM64.
 > Jika Anda menggunakan platform ARM64, [silakan gunakan panduan ini untuk membangun gambar Docker yang kompatibel dengan sistem Anda](https://ragflow.io/docs/dev/build_docker_image).
 
-> Perintah di bawah ini mengunduh edisi v0.17.1-slim dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.17.1-slim, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server. Misalnya, atur RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.1 untuk edisi lengkap v0.17.1.
+> Perintah di bawah ini mengunduh edisi v0.17.2-slim dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.17.2-slim, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server. Misalnya, atur RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.2 untuk edisi lengkap v0.17.2.
 
 ```bash
 $ cd ragflow/docker

@@ -180,8 +180,8 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
 
 | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
 | ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.17.1 | ≈9 | :heavy_check_mark: | Stable release |
-| v0.17.1-slim | ≈2 | ❌ | Stable release |
+| v0.17.2 | ≈9 | :heavy_check_mark: | Stable release |
+| v0.17.2-slim | ≈2 | ❌ | Stable release |
 | nightly | ≈9 | :heavy_check_mark: | _Unstable_ nightly build |
 | nightly-slim | ≈2 | ❌ | _Unstable_ nightly build |
@@ -22,7 +22,7 @@
         <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
     </a>
     <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.1-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.1">
+        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.2-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.2">
     </a>
     <a href="https://github.com/infiniflow/ragflow/releases/latest">
         <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">

@@ -151,7 +151,7 @@
 > 現在、公式に提供されているすべての Docker イメージは x86 アーキテクチャ向けにビルドされており、ARM64 用の Docker イメージは提供されていません。
 > ARM64 アーキテクチャのオペレーティングシステムを使用している場合は、[このドキュメント](https://ragflow.io/docs/dev/build_docker_image)を参照して Docker イメージを自分でビルドしてください。
 
-> 以下のコマンドは、RAGFlow Docker イメージの v0.17.1-slim エディションをダウンロードします。異なる RAGFlow エディションの説明については、以下の表を参照してください。v0.17.1-slim とは異なるエディションをダウンロードするには、docker/.env ファイルの RAGFLOW_IMAGE 変数を適宜更新し、docker compose を使用してサーバーを起動してください。例えば、完全版 v0.17.1 をダウンロードするには、RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.1 と設定します。
+> 以下のコマンドは、RAGFlow Docker イメージの v0.17.2-slim エディションをダウンロードします。異なる RAGFlow エディションの説明については、以下の表を参照してください。v0.17.2-slim とは異なるエディションをダウンロードするには、docker/.env ファイルの RAGFLOW_IMAGE 変数を適宜更新し、docker compose を使用してサーバーを起動してください。例えば、完全版 v0.17.2 をダウンロードするには、RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.2 と設定します。
 
 ```bash
 $ cd ragflow/docker

@@ -160,8 +160,8 @@
 
 | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
 | ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.17.1 | ≈9 | :heavy_check_mark: | Stable release |
-| v0.17.1-slim | ≈2 | ❌ | Stable release |
+| v0.17.2 | ≈9 | :heavy_check_mark: | Stable release |
+| v0.17.2-slim | ≈2 | ❌ | Stable release |
 | nightly | ≈9 | :heavy_check_mark: | _Unstable_ nightly build |
 | nightly-slim | ≈2 | ❌ | _Unstable_ nightly build |
@@ -22,7 +22,7 @@
         <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
     </a>
     <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.1-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.1">
+        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.2-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.2">
     </a>
     <a href="https://github.com/infiniflow/ragflow/releases/latest">
         <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">

@@ -152,7 +152,7 @@
 > 모든 Docker 이미지는 x86 플랫폼을 위해 빌드되었습니다. 우리는 현재 ARM64 플랫폼을 위한 Docker 이미지를 제공하지 않습니다.
 > ARM64 플랫폼을 사용 중이라면, [시스템과 호환되는 Docker 이미지를 빌드하려면 이 가이드를 사용해 주세요](https://ragflow.io/docs/dev/build_docker_image).
 
-> 아래 명령어는 RAGFlow Docker 이미지의 v0.17.1-slim 버전을 다운로드합니다. 다양한 RAGFlow 버전에 대한 설명은 다음 표를 참조하십시오. v0.17.1-slim과 다른 RAGFlow 버전을 다운로드하려면, docker/.env 파일에서 RAGFLOW_IMAGE 변수를 적절히 업데이트한 후 docker compose를 사용하여 서버를 시작하십시오. 예를 들어, 전체 버전인 v0.17.1을 다운로드하려면 RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.1로 설정합니다.
+> 아래 명령어는 RAGFlow Docker 이미지의 v0.17.2-slim 버전을 다운로드합니다. 다양한 RAGFlow 버전에 대한 설명은 다음 표를 참조하십시오. v0.17.2-slim과 다른 RAGFlow 버전을 다운로드하려면, docker/.env 파일에서 RAGFLOW_IMAGE 변수를 적절히 업데이트한 후 docker compose를 사용하여 서버를 시작하십시오. 예를 들어, 전체 버전인 v0.17.2을 다운로드하려면 RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.2로 설정합니다.
 
 ```bash
 $ cd ragflow/docker

@@ -161,8 +161,8 @@
 
 | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
 | ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.17.1 | ≈9 | :heavy_check_mark: | Stable release |
-| v0.17.1-slim | ≈2 | ❌ | Stable release |
+| v0.17.2 | ≈9 | :heavy_check_mark: | Stable release |
+| v0.17.2-slim | ≈2 | ❌ | Stable release |
 | nightly | ≈9 | :heavy_check_mark: | _Unstable_ nightly build |
 | nightly-slim | ≈2 | ❌ | _Unstable_ nightly build |
@@ -22,7 +22,7 @@
         <img alt="Badge Estático" src="https://img.shields.io/badge/Online-Demo-4e6b99">
     </a>
     <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.1-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.1">
+        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.2-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.2">
     </a>
     <a href="https://github.com/infiniflow/ragflow/releases/latest">
         <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Última%20Relese" alt="Última Versão">

@@ -171,7 +171,7 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
 > Todas as imagens Docker são construídas para plataformas x86. Atualmente, não oferecemos imagens Docker para ARM64.
 > Se você estiver usando uma plataforma ARM64, por favor, utilize [este guia](https://ragflow.io/docs/dev/build_docker_image) para construir uma imagem Docker compatível com o seu sistema.
 
-> O comando abaixo baixa a edição `v0.17.1-slim` da imagem Docker do RAGFlow. Consulte a tabela a seguir para descrições de diferentes edições do RAGFlow. Para baixar uma edição do RAGFlow diferente da `v0.17.1-slim`, atualize a variável `RAGFLOW_IMAGE` conforme necessário no **docker/.env** antes de usar `docker compose` para iniciar o servidor. Por exemplo: defina `RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.1` para a edição completa `v0.17.1`.
+> O comando abaixo baixa a edição `v0.17.2-slim` da imagem Docker do RAGFlow. Consulte a tabela a seguir para descrições de diferentes edições do RAGFlow. Para baixar uma edição do RAGFlow diferente da `v0.17.2-slim`, atualize a variável `RAGFLOW_IMAGE` conforme necessário no **docker/.env** antes de usar `docker compose` para iniciar o servidor. Por exemplo: defina `RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.2` para a edição completa `v0.17.2`.
 
 ```bash
 $ cd ragflow/docker

@@ -180,8 +180,8 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
 
 | Tag da imagem RAGFlow | Tamanho da imagem (GB) | Possui modelos de incorporação? | Estável? |
 | --------------------- | ---------------------- | ------------------------------- | ------------------------ |
-| v0.17.1 | ~9 | :heavy_check_mark: | Lançamento estável |
-| v0.17.1-slim | ~2 | ❌ | Lançamento estável |
+| v0.17.2 | ~9 | :heavy_check_mark: | Lançamento estável |
+| v0.17.2-slim | ~2 | ❌ | Lançamento estável |
 | nightly | ~9 | :heavy_check_mark: | _Instável_ build noturno |
 | nightly-slim | ~2 | ❌ | _Instável_ build noturno |
@@ -21,7 +21,7 @@
         <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
     </a>
     <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.1-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.1">
+        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.2-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.2">
     </a>
     <a href="https://github.com/infiniflow/ragflow/releases/latest">
         <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">

@@ -150,7 +150,7 @@
 > 所有 Docker 映像檔都是為 x86 平台建置的。目前,我們不提供 ARM64 平台的 Docker 映像檔。
 > 如果您使用的是 ARM64 平台,請使用 [這份指南](https://ragflow.io/docs/dev/build_docker_image) 來建置適合您系統的 Docker 映像檔。
 
-> 執行以下指令會自動下載 RAGFlow slim Docker 映像 `v0.17.1-slim`。請參考下表查看不同 Docker 發行版的說明。如需下載不同於 `v0.17.1-slim` 的 Docker 映像,請在執行 `docker compose` 啟動服務之前先更新 **docker/.env** 檔案內的 `RAGFLOW_IMAGE` 變數。例如,你可以透過設定 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.1` 來下載 RAGFlow 鏡像的 `v0.17.1` 完整發行版。
+> 執行以下指令會自動下載 RAGFlow slim Docker 映像 `v0.17.2-slim`。請參考下表查看不同 Docker 發行版的說明。如需下載不同於 `v0.17.2-slim` 的 Docker 映像,請在執行 `docker compose` 啟動服務之前先更新 **docker/.env** 檔案內的 `RAGFLOW_IMAGE` 變數。例如,你可以透過設定 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.2` 來下載 RAGFlow 鏡像的 `v0.17.2` 完整發行版。
 
 ```bash
 $ cd ragflow/docker

@@ -159,8 +159,8 @@
 
 | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
 | ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.17.1 | ≈9 | :heavy_check_mark: | Stable release |
-| v0.17.1-slim | ≈2 | ❌ | Stable release |
+| v0.17.2 | ≈9 | :heavy_check_mark: | Stable release |
+| v0.17.2-slim | ≈2 | ❌ | Stable release |
 | nightly | ≈9 | :heavy_check_mark: | _Unstable_ nightly build |
 | nightly-slim | ≈2 | ❌ | _Unstable_ nightly build |
@@ -22,7 +22,7 @@
         <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
     </a>
     <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.1-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.1">
+        <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.17.2-brightgreen" alt="docker pull infiniflow/ragflow:v0.17.2">
     </a>
     <a href="https://github.com/infiniflow/ragflow/releases/latest">
         <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">

@@ -151,7 +151,7 @@
 > 请注意,目前官方提供的所有 Docker 镜像均基于 x86 架构构建,并不提供基于 ARM64 的 Docker 镜像。
 > 如果你的操作系统是 ARM64 架构,请参考[这篇文档](https://ragflow.io/docs/dev/build_docker_image)自行构建 Docker 镜像。
 
-> 运行以下命令会自动下载 RAGFlow slim Docker 镜像 `v0.17.1-slim`。请参考下表查看不同 Docker 发行版的描述。如需下载不同于 `v0.17.1-slim` 的 Docker 镜像,请在运行 `docker compose` 启动服务之前先更新 **docker/.env** 文件内的 `RAGFLOW_IMAGE` 变量。比如,你可以通过设置 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.1` 来下载 RAGFlow 镜像的 `v0.17.1` 完整发行版。
+> 运行以下命令会自动下载 RAGFlow slim Docker 镜像 `v0.17.2-slim`。请参考下表查看不同 Docker 发行版的描述。如需下载不同于 `v0.17.2-slim` 的 Docker 镜像,请在运行 `docker compose` 启动服务之前先更新 **docker/.env** 文件内的 `RAGFLOW_IMAGE` 变量。比如,你可以通过设置 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.2` 来下载 RAGFlow 镜像的 `v0.17.2` 完整发行版。
 
 ```bash
 $ cd ragflow/docker

@@ -160,8 +160,8 @@
 
 | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable? |
 | ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.17.1 | ≈9 | :heavy_check_mark: | Stable release |
-| v0.17.1-slim | ≈2 | ❌ | Stable release |
+| v0.17.2 | ≈9 | :heavy_check_mark: | Stable release |
+| v0.17.2-slim | ≈2 | ❌ | Stable release |
 | nightly | ≈9 | :heavy_check_mark: | _Unstable_ nightly build |
 | nightly-slim | ≈2 | ❌ | _Unstable_ nightly build |
@@ -38,6 +38,10 @@ class IterationItem(ComponentBase, ABC):
         ans = parent.get_input()
         ans = parent._param.delimiter.join(ans["content"]) if "content" in ans else ""
+        ans = [a.strip() for a in ans.split(parent._param.delimiter)]
+        if not ans:
+            self._idx = -1
+            return pd.DataFrame()
 
         df = pd.DataFrame([{"content": ans[self._idx]}])
         self._idx += 1
         if self._idx >= len(ans):
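The added guard reshapes the joined upstream payload back into per-iteration items before indexing. A minimal, self-contained sketch of that split-and-strip step; the values below are made up for illustration and this is not RAGFlow's component API:

```python
# Illustrative only: how the added lines turn a delimiter-joined payload
# back into per-iteration items. `delimiter` and `content` are invented values.
delimiter = ";"
content = "first chunk; second chunk; third chunk"
items = [a.strip() for a in content.split(delimiter)]
assert items == ["first chunk", "second chunk", "third chunk"]
# With nothing left to iterate, the component resets its index to -1 and
# returns an empty DataFrame, per the diff above.
```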
@@ -24,6 +24,7 @@ from api.db.services.llm_service import LLMBundle
 from api import settings
 from agent.component.base import ComponentBase, ComponentParamBase
 from rag.app.tag import label_question
+from rag.utils.tavily_conn import Tavily
 
 
 class RetrievalParam(ComponentParamBase):

@@ -40,6 +41,8 @@ class RetrievalParam(ComponentParamBase):
         self.kb_ids = []
         self.rerank_id = ""
         self.empty_response = ""
+        self.tavily_api_key = ""
+        self.use_kg = False
 
     def check(self):
         self.check_decimal_float(self.similarity_threshold, "[Retrieval] Similarity threshold")

@@ -75,6 +78,20 @@ class Retrieval(ComponentBase, ABC):
                                             self._param.similarity_threshold, 1 - self._param.keywords_similarity_weight,
                                             aggs=False, rerank_mdl=rerank_mdl,
                                             rank_feature=label_question(query, kbs))
+        if self._param.use_kg:
+            ck = settings.kg_retrievaler.retrieval(query,
+                                                   [kbs[0].tenant_id],
+                                                   self._param.kb_ids,
+                                                   embd_mdl,
+                                                   LLMBundle(kbs[0].tenant_id, LLMType.CHAT))
+            if ck["content_with_weight"]:
+                kbinfos["chunks"].insert(0, ck)
+
+        if self._param.tavily_api_key:
+            tav = Tavily(self._param.tavily_api_key)
+            tav_res = tav.retrieve_chunks(query)
+            kbinfos["chunks"].extend(tav_res["chunks"])
+            kbinfos["doc_aggs"].extend(tav_res["doc_aggs"])
 
         if not kbinfos["chunks"]:
             df = Retrieval.be_output("")
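Ordering in the new Retrieval block matters: a knowledge-graph hit is inserted at index 0 so it outranks everything else, while Tavily web results are appended after the knowledge-base chunks. A small sketch of that merge with stand-in dicts, not real retriever output:

```python
# Stand-in data; real chunks come from settings.kg_retrievaler and Tavily.
kbinfos = {"chunks": [{"src": "kb"}], "doc_aggs": []}

kg_chunk = {"src": "kg", "content_with_weight": "graph summary"}
if kg_chunk["content_with_weight"]:
    kbinfos["chunks"].insert(0, kg_chunk)        # KG result ranks first

tav_res = {"chunks": [{"src": "web"}], "doc_aggs": [{"doc_id": "url-1"}]}
kbinfos["chunks"].extend(tav_res["chunks"])      # web results come last
kbinfos["doc_aggs"].extend(tav_res["doc_aggs"])

assert [c["src"] for c in kbinfos["chunks"]] == ["kg", "kb", "web"]
```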
@@ -36,132 +36,188 @@ class DeepResearcher:
         self._kb_retrieve = kb_retrieve
         self._kg_retrieve = kg_retrieve
 
+    @staticmethod
+    def _remove_query_tags(text):
+        """Remove query tags from text"""
+        pattern = re.escape(BEGIN_SEARCH_QUERY) + r"(.*?)" + re.escape(END_SEARCH_QUERY)
+        return re.sub(pattern, "", text)
+
+    @staticmethod
+    def _remove_result_tags(text):
+        """Remove result tags from text"""
+        pattern = re.escape(BEGIN_SEARCH_RESULT) + r"(.*?)" + re.escape(END_SEARCH_RESULT)
+        return re.sub(pattern, "", text)
+
+    def _generate_reasoning(self, msg_history):
+        """Generate reasoning steps"""
+        query_think = ""
+        if msg_history[-1]["role"] != "user":
+            msg_history.append({"role": "user", "content": "Continues reasoning with the new information.\n"})
+        else:
+            msg_history[-1]["content"] += "\n\nContinues reasoning with the new information.\n"
+        for ans in self.chat_mdl.chat_streamly(REASON_PROMPT, msg_history, {"temperature": 0.7}):
+            ans = re.sub(r"<think>.*</think>", "", ans, flags=re.DOTALL)
+            if not ans:
+                continue
+            query_think = ans
+            yield query_think
+        return query_think
+
+    def _extract_search_queries(self, query_think, question, step_index):
+        """Extract search queries from thinking"""
+        queries = extract_between(query_think, BEGIN_SEARCH_QUERY, END_SEARCH_QUERY)
+        if not queries and step_index == 0:
+            # If this is the first step and no queries are found, use the original question as the query
+            queries = [question]
+        return queries
+
+    def _truncate_previous_reasoning(self, all_reasoning_steps):
+        """Truncate previous reasoning steps to maintain a reasonable length"""
+        truncated_prev_reasoning = ""
+        for i, step in enumerate(all_reasoning_steps):
+            truncated_prev_reasoning += f"Step {i + 1}: {step}\n\n"
+
+        prev_steps = truncated_prev_reasoning.split('\n\n')
+        if len(prev_steps) <= 5:
+            truncated_prev_reasoning = '\n\n'.join(prev_steps)
+        else:
+            truncated_prev_reasoning = ''
+            for i, step in enumerate(prev_steps):
+                if i == 0 or i >= len(prev_steps) - 4 or BEGIN_SEARCH_QUERY in step or BEGIN_SEARCH_RESULT in step:
+                    truncated_prev_reasoning += step + '\n\n'
+                else:
+                    if truncated_prev_reasoning[-len('\n\n...\n\n'):] != '\n\n...\n\n':
+                        truncated_prev_reasoning += '...\n\n'
+
+        return truncated_prev_reasoning.strip('\n')
+
+    def _retrieve_information(self, search_query):
+        """Retrieve information from different sources"""
+        # 1. Knowledge base retrieval
+        kbinfos = self._kb_retrieve(question=search_query) if self._kb_retrieve else {"chunks": [], "doc_aggs": []}
+
+        # 2. Web retrieval (if Tavily API is configured)
+        if self.prompt_config.get("tavily_api_key"):
+            tav = Tavily(self.prompt_config["tavily_api_key"])
+            tav_res = tav.retrieve_chunks(search_query)
+            kbinfos["chunks"].extend(tav_res["chunks"])
+            kbinfos["doc_aggs"].extend(tav_res["doc_aggs"])
+
+        # 3. Knowledge graph retrieval (if configured)
+        if self.prompt_config.get("use_kg") and self._kg_retrieve:
+            ck = self._kg_retrieve(question=search_query)
+            if ck["content_with_weight"]:
+                kbinfos["chunks"].insert(0, ck)
+
+        return kbinfos
+
+    def _update_chunk_info(self, chunk_info, kbinfos):
+        """Update chunk information for citations"""
+        if not chunk_info["chunks"]:
+            # If this is the first retrieval, use the retrieval results directly
+            for k in chunk_info.keys():
+                chunk_info[k] = kbinfos[k]
+        else:
+            # Merge newly retrieved information, avoiding duplicates
+            cids = [c["chunk_id"] for c in chunk_info["chunks"]]
+            for c in kbinfos["chunks"]:
+                if c["chunk_id"] not in cids:
+                    chunk_info["chunks"].append(c)
+
+            dids = [d["doc_id"] for d in chunk_info["doc_aggs"]]
+            for d in kbinfos["doc_aggs"]:
+                if d["doc_id"] not in dids:
+                    chunk_info["doc_aggs"].append(d)
+
+    def _extract_relevant_info(self, truncated_prev_reasoning, search_query, kbinfos):
+        """Extract and summarize relevant information"""
+        summary_think = ""
+        for ans in self.chat_mdl.chat_streamly(
+                RELEVANT_EXTRACTION_PROMPT.format(
+                    prev_reasoning=truncated_prev_reasoning,
+                    search_query=search_query,
+                    document="\n".join(kb_prompt(kbinfos, 4096))
+                ),
+                [{"role": "user",
+                  "content": f'Now you should analyze each web page and find helpful information based on the current search query "{search_query}" and previous reasoning steps.'}],
+                {"temperature": 0.7}):
+            ans = re.sub(r"<think>.*</think>", "", ans, flags=re.DOTALL)
+            if not ans:
+                continue
+            summary_think = ans
+            yield summary_think
+
+        return summary_think
+
     def thinking(self, chunk_info: dict, question: str):
-        def rm_query_tags(line):
-            pattern = re.escape(BEGIN_SEARCH_QUERY) + r"(.*?)" + re.escape(END_SEARCH_QUERY)
-            return re.sub(pattern, "", line)
-
-        def rm_result_tags(line):
-            pattern = re.escape(BEGIN_SEARCH_RESULT) + r"(.*?)" + re.escape(END_SEARCH_RESULT)
-            return re.sub(pattern, "", line)
-
         executed_search_queries = []
-        msg_hisotry = [{"role": "user", "content": f'Question:\"{question}\"\n'}]
+        msg_history = [{"role": "user", "content": f'Question:\"{question}\"\n'}]
         all_reasoning_steps = []
         think = "<think>"
-        for ii in range(MAX_SEARCH_LIMIT + 1):
-            if ii == MAX_SEARCH_LIMIT - 1:
+
+        for step_index in range(MAX_SEARCH_LIMIT + 1):
+            # Check if the maximum search limit has been reached
+            if step_index == MAX_SEARCH_LIMIT - 1:
                 summary_think = f"\n{BEGIN_SEARCH_RESULT}\nThe maximum search limit is exceeded. You are not allowed to search.\n{END_SEARCH_RESULT}\n"
                 yield {"answer": think + summary_think + "</think>", "reference": {}, "audio_binary": None}
                 all_reasoning_steps.append(summary_think)
-                msg_hisotry.append({"role": "assistant", "content": summary_think})
+                msg_history.append({"role": "assistant", "content": summary_think})
                 break
 
+            # Step 1: Generate reasoning
             query_think = ""
-            if msg_hisotry[-1]["role"] != "user":
-                msg_hisotry.append({"role": "user", "content": "Continues reasoning with the new information.\n"})
-            else:
-                msg_hisotry[-1]["content"] += "\n\nContinues reasoning with the new information.\n"
-            for ans in self.chat_mdl.chat_streamly(REASON_PROMPT, msg_hisotry, {"temperature": 0.7}):
-                ans = re.sub(r"<think>.*</think>", "", ans, flags=re.DOTALL)
-                if not ans:
-                    continue
+            for ans in self._generate_reasoning(msg_history):
                 query_think = ans
-                yield {"answer": think + rm_query_tags(query_think) + "</think>", "reference": {}, "audio_binary": None}
+                yield {"answer": think + self._remove_query_tags(query_think) + "</think>", "reference": {}, "audio_binary": None}
 
-            think += rm_query_tags(query_think)
+            think += self._remove_query_tags(query_think)
             all_reasoning_steps.append(query_think)
-            queries = extract_between(query_think, BEGIN_SEARCH_QUERY, END_SEARCH_QUERY)
-            if not queries:
-                if ii > 0:
-                    break
-                queries = [question]
+
+            # Step 2: Extract search queries
+            queries = self._extract_search_queries(query_think, question, step_index)
+            if not queries and step_index > 0:
+                # If not the first step and no queries, end the search process
+                break
 
+            # Process each search query
             for search_query in queries:
-                logging.info(f"[THINK]Query: {ii}. {search_query}")
-                msg_hisotry.append({"role": "assistant", "content": search_query})
-                think += f"\n\n> {ii +1}. {search_query}\n\n"
+                logging.info(f"[THINK]Query: {step_index}. {search_query}")
+                msg_history.append({"role": "assistant", "content": search_query})
+                think += f"\n\n> {step_index + 1}. {search_query}\n\n"
                 yield {"answer": think + "</think>", "reference": {}, "audio_binary": None}
 
                 summary_think = ""
-                # The search query has been searched in previous steps.
+                # Check if the query has already been executed
                 if search_query in executed_search_queries:
                     summary_think = f"\n{BEGIN_SEARCH_RESULT}\nYou have searched this query. Please refer to previous results.\n{END_SEARCH_RESULT}\n"
                     yield {"answer": think + summary_think + "</think>", "reference": {}, "audio_binary": None}
                     all_reasoning_steps.append(summary_think)
-                    msg_hisotry.append({"role": "user", "content": summary_think})
+                    msg_history.append({"role": "user", "content": summary_think})
                     think += summary_think
                     continue
 
-                truncated_prev_reasoning = ""
-                for i, step in enumerate(all_reasoning_steps):
-                    truncated_prev_reasoning += f"Step {i + 1}: {step}\n\n"
-
-                prev_steps = truncated_prev_reasoning.split('\n\n')
-                if len(prev_steps) <= 5:
-                    truncated_prev_reasoning = '\n\n'.join(prev_steps)
-                else:
-                    truncated_prev_reasoning = ''
-                    for i, step in enumerate(prev_steps):
-                        if i == 0 or i >= len(prev_steps) - 4 or BEGIN_SEARCH_QUERY in step or BEGIN_SEARCH_RESULT in step:
-                            truncated_prev_reasoning += step + '\n\n'
-                        else:
-                            if truncated_prev_reasoning[-len('\n\n...\n\n'):] != '\n\n...\n\n':
-                                truncated_prev_reasoning += '...\n\n'
-                truncated_prev_reasoning = truncated_prev_reasoning.strip('\n')
-
-                # Retrieval procedure:
-                # 1. KB search
-                # 2. Web search (optional)
-                # 3. KG search (optional)
-                kbinfos = self._kb_retrieve(question=search_query) if self._kb_retrieve else {"chunks": [], "doc_aggs": []}
-
-                if self.prompt_config.get("tavily_api_key"):
-                    tav = Tavily(self.prompt_config["tavily_api_key"])
-                    tav_res = tav.retrieve_chunks(search_query)
-                    kbinfos["chunks"].extend(tav_res["chunks"])
-                    kbinfos["doc_aggs"].extend(tav_res["doc_aggs"])
-                if self.prompt_config.get("use_kg") and self._kg_retrieve:
-                    ck = self._kg_retrieve(question=search_query)
-                    if ck["content_with_weight"]:
-                        kbinfos["chunks"].insert(0, ck)
-
-                # Merge chunk info for citations
-                if not chunk_info["chunks"]:
-                    for k in chunk_info.keys():
-                        chunk_info[k] = kbinfos[k]
-                else:
-                    cids = [c["chunk_id"] for c in chunk_info["chunks"]]
-                    for c in kbinfos["chunks"]:
-                        if c["chunk_id"] in cids:
-                            continue
-                        chunk_info["chunks"].append(c)
-                    dids = [d["doc_id"] for d in chunk_info["doc_aggs"]]
-                    for d in kbinfos["doc_aggs"]:
-                        if d["doc_id"] in dids:
-                            continue
-                        chunk_info["doc_aggs"].append(d)
-
                 executed_search_queries.append(search_query)
+
+                # Step 3: Truncate previous reasoning steps
+                truncated_prev_reasoning = self._truncate_previous_reasoning(all_reasoning_steps)
+
+                # Step 4: Retrieve information
+                kbinfos = self._retrieve_information(search_query)
+
+                # Step 5: Update chunk information
+                self._update_chunk_info(chunk_info, kbinfos)
+
+                # Step 6: Extract relevant information
                 think += "\n\n"
-                for ans in self.chat_mdl.chat_streamly(
-                        RELEVANT_EXTRACTION_PROMPT.format(
-                            prev_reasoning=truncated_prev_reasoning,
-                            search_query=search_query,
-                            document="\n".join(kb_prompt(kbinfos, 4096))
-                        ),
-                        [{"role": "user",
-                          "content": f'Now you should analyze each web page and find helpful information based on the current search query "{search_query}" and previous reasoning steps.'}],
-                        {"temperature": 0.7}):
-                    ans = re.sub(r"<think>.*</think>", "", ans, flags=re.DOTALL)
-                    if not ans:
-                        continue
+                summary_think = ""
+                for ans in self._extract_relevant_info(truncated_prev_reasoning, search_query, kbinfos):
                     summary_think = ans
-                    yield {"answer": think + rm_result_tags(summary_think) + "</think>", "reference": {}, "audio_binary": None}
+                    yield {"answer": think + self._remove_result_tags(summary_think) + "</think>", "reference": {}, "audio_binary": None}
 
                 all_reasoning_steps.append(summary_think)
-                msg_hisotry.append(
+                msg_history.append(
                     {"role": "user", "content": f"\n\n{BEGIN_SEARCH_RESULT}{summary_think}{END_SEARCH_RESULT}\n\n"})
-                think += rm_result_tags(summary_think)
-                logging.info(f"[THINK]Summary: {ii}. {summary_think}")
+                think += self._remove_result_tags(summary_think)
+                logging.info(f"[THINK]Summary: {step_index}. {summary_think}")
 
         yield think + "</think>"
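The two new static helpers strip everything between the search sentinels, inclusive. A self-contained check of the regex they use; the sentinel strings below are stand-ins for the module's real BEGIN/END constants:

```python
import re

# Stand-in sentinels; the real values are defined alongside DeepResearcher.
BEGIN_SEARCH_QUERY = "<|begin_search_query|>"
END_SEARCH_QUERY = "<|end_search_query|>"

def remove_query_tags(text):
    # Same pattern as DeepResearcher._remove_query_tags in the diff above.
    pattern = re.escape(BEGIN_SEARCH_QUERY) + r"(.*?)" + re.escape(END_SEARCH_QUERY)
    return re.sub(pattern, "", text)

text = f"thinking... {BEGIN_SEARCH_QUERY}ragflow arm64 build{END_SEARCH_QUERY} done"
assert remove_query_tags(text) == "thinking...  done"
```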
@@ -68,6 +68,7 @@ REASON_PROMPT = (
     f"- You have a dataset to search, so you just provide a proper search query.\n"
     f"- Use {BEGIN_SEARCH_QUERY} to request a dataset search and end with {END_SEARCH_QUERY}.\n"
     "- The language of query MUST be as the same as 'Question' or 'search result'.\n"
+    "- If no helpful information can be found, rewrite the search query to be less and precise keywords.\n"
     "- When done searching, continue your reasoning.\n\n"
     'Please answer the following question. You should think step by step to solve it.\n\n'
 )
@@ -347,7 +347,7 @@ def rm():
 @manager.route('/run', methods=['POST'])  # noqa: F821
 @login_required
 @validate_request("doc_ids", "run")
-def run():
+def run():
     req = request.json
     for doc_id in req["doc_ids"]:
         if not DocumentService.accessible(doc_id, current_user.id):
@@ -135,7 +135,7 @@ def set_api_key():
 def add_llm():
     req = request.json
     factory = req["llm_factory"]
-    api_key = req.get("api_key", "")
+    api_key = req.get("api_key", "x")
     llm_name = req["llm_name"]
 
 def apikey_json(keys):

@@ -338,8 +338,6 @@ def list_app():
 
     llm_set = set([m["llm_name"] + "@" + m["fid"] for m in llms])
     for o in objs:
-        if not o.api_key:
-            continue
         if o.llm_name + "@" + o.llm_factory in llm_set:
             continue
         llms.append({"llm_name": o.llm_name, "model_type": o.model_type, "fid": o.llm_factory, "available": True})
@@ -40,6 +40,12 @@ def create(tenant_id):
         kb = kbs[0]
         if kb.chunk_num == 0:
             return get_error_data_result(f"The dataset {kb_id} doesn't own parsed file")
+
+        # Check if all documents in the knowledge base have been parsed
+        is_done, error_msg = KnowledgebaseService.is_parsed_done(kb_id)
+        if not is_done:
+            return get_error_data_result(error_msg)
+
     kbs = KnowledgebaseService.get_by_ids(ids) if ids else []
     embd_ids = [TenantLLMService.split_model_name_and_factory(kb.embd_id)[0] for kb in kbs]  # remove vendor suffix for comparison
     embd_count = list(set(embd_ids))

@@ -176,6 +182,12 @@ def update(tenant_id, chat_id):
         kb = kbs[0]
         if kb.chunk_num == 0:
             return get_error_data_result(f"The dataset {kb_id} doesn't own parsed file")
+
+        # Check if all documents in the knowledge base have been parsed
+        is_done, error_msg = KnowledgebaseService.is_parsed_done(kb_id)
+        if not is_done:
+            return get_error_data_result(error_msg)
+
     kbs = KnowledgebaseService.get_by_ids(ids)
     embd_ids = [TenantLLMService.split_model_name_and_factory(kb.embd_id)[0] for kb in kbs]  # remove vendor suffix for comparison
     embd_count = list(set(embd_ids))
@@ -276,7 +276,7 @@ def delete(tenant_id):
     return get_result(code=settings.RetCode.SUCCESS)
 
 
-@manager.route("/datasets/<dataset_id>", methods=["PUT"])  # noqa: F821
+@manager.route("/datasets/<dataset_id>", methods=["PUT"])  # noqa: F821
 @token_required
 def update(tenant_id, dataset_id):
     """

@@ -330,7 +330,7 @@ def update(tenant_id, dataset_id):
         return get_error_data_result(message="You don't own the dataset")
     req = request.json
     e, t = TenantService.get_by_id(tenant_id)
-    invalid_keys = {"id", "embd_id", "chunk_num", "doc_num", "parser_id"}
+    invalid_keys = {"id", "embd_id", "chunk_num", "doc_num", "parser_id", "create_date", "create_time", "created_by", "status","token_num","update_date","update_time"}
     if any(key in req for key in invalid_keys):
         return get_error_data_result(message="The input parameters are invalid.")
     permission = req.get("permission")

@@ -377,7 +377,7 @@ def update(tenant_id, dataset_id):
         if req["document_count"] != kb.doc_num:
             return get_error_data_result(message="Can't change `document_count`.")
         req.pop("document_count")
-    if "chunk_method" in req:
+    if req.get("chunk_method"):
         if kb.chunk_num != 0 and req["chunk_method"] != kb.parser_id:
             return get_error_data_result(
                 message="If `chunk_count` is not 0, `chunk_method` is not changeable."

@@ -439,6 +439,10 @@ def update(tenant_id, dataset_id):
             return get_error_data_result(
                 message="Duplicated dataset name in updating dataset."
             )
+    flds = list(req.keys())
+    for f in flds:
+        if req[f] == "" and f in ["permission", "parser_id", "chunk_method"]:
+            del req[f]
     if not KnowledgebaseService.update_by_id(kb.id, req):
         return get_error_data_result(message="Update dataset error.(Database error)")
     return get_result(code=settings.RetCode.SUCCESS)
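The four lines added at the end of update() drop optional fields that arrive as empty strings, so a blank value in the request no longer overwrites the stored one. A reduced reproduction with a made-up payload:

```python
# Made-up request payload; mirrors the pruning loop added to update().
req = {"name": "my-dataset", "permission": "", "chunk_method": "", "parser_id": "naive"}

for f in list(req.keys()):
    if req[f] == "" and f in ["permission", "parser_id", "chunk_method"]:
        del req[f]   # empty optional fields are ignored, not written through

assert req == {"name": "my-dataset", "parser_id": "naive"}
```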
@@ -259,6 +259,7 @@ def chat_completion_openai_like(tenant_id, chat_id):
     # The choices field on the last chunk will always be an empty array [].
     def streamed_response_generator(chat_id, dia, msg):
         token_used = 0
+        should_split_index = 0
         response = {
             "id": f"chatcmpl-{chat_id}",
             "choices": [

@@ -284,8 +285,13 @@ def chat_completion_openai_like(tenant_id, chat_id):
         try:
             for ans in chat(dia, msg, True):
                 answer = ans["answer"]
-                incremental = answer[token_used:]
+                incremental = answer[should_split_index:]
                 token_used += len(incremental)
+                if incremental.endswith("</think>"):
+                    response_data_len = len(incremental.rstrip("</think>"))
+                else:
+                    response_data_len = len(incremental)
+                should_split_index += response_data_len
                 response["choices"][0]["delta"]["content"] = incremental
                 yield f"data:{json.dumps(response, ensure_ascii=False)}\n\n"
         except Exception as e:
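The fix separates the token counter from the emission cursor: `should_split_index` advances only over content, not over a trailing `</think>` marker, so the marker is re-examined with the next delta instead of being skipped. A reduced model of the slicing outside Flask; it uses `removesuffix` for the suffix trim, whereas the commit itself uses `rstrip`:

```python
# Reduced model of the cursor fix; `snapshots` are the cumulative answers
# that the real chat() generator would yield.
def emit_deltas(snapshots):
    should_split_index = 0
    for answer in snapshots:
        incremental = answer[should_split_index:]
        # Advance the cursor over content only, leaving a trailing marker
        # to be re-examined on the next chunk.
        should_split_index += len(incremental.removesuffix("</think>"))
        yield incremental

chunks = list(emit_deltas(["Hel", "Hello</think>", "Hello</think> world"]))
assert chunks == ["Hel", "lo</think>", "</think> world"]
```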
@@ -86,21 +86,9 @@ def completion(tenant_id, agent_id, question, session_id=None, stream=True, **kw
             "dsl": cvs.dsl
         }
         API4ConversationService.save(**conv)
-        if query:
-            yield "data:" + json.dumps({"code": 0,
-                                        "message": "",
-                                        "data": {
-                                            "session_id": session_id,
-                                            "answer": canvas.get_prologue(),
-                                            "reference": [],
-                                            "param": canvas.get_preset_param()
-                                        }
-                                        },
-                                       ensure_ascii=False) + "\n\n"
-            yield "data:" + json.dumps({"code": 0, "message": "", "data": True}, ensure_ascii=False) + "\n\n"
-            return
-        else:
-            conv = API4Conversation(**conv)
+
+
+        conv = API4Conversation(**conv)
     else:
         e, conv = API4ConversationService.get_by_id(session_id)
         assert e, "Session not found!"

@@ -130,7 +118,7 @@ def completion(tenant_id, agent_id, question, session_id=None, stream=True, **kw
                 continue
             for k in ans.keys():
                 final_ans[k] = ans[k]
-            ans = {"answer": ans["content"], "reference": ans.get("reference", [])}
+            ans = {"answer": ans["content"], "reference": ans.get("reference", []), "param": canvas.get_preset_param()}
             ans = structure_answer(conv, ans, message_id, session_id)
             yield "data:" + json.dumps({"code": 0, "message": "", "data": ans},
                                        ensure_ascii=False) + "\n\n"

@@ -160,8 +148,8 @@ def completion(tenant_id, agent_id, question, session_id=None, stream=True, **kw
         canvas.reference.append(final_ans["reference"])
         conv.dsl = json.loads(str(canvas))
 
-        result = {"answer": final_ans["content"], "reference": final_ans.get("reference", [])}
+        result = {"answer": final_ans["content"], "reference": final_ans.get("reference", []) , "param": canvas.get_preset_param()}
         result = structure_answer(conv, result, message_id, session_id)
         API4ConversationService.append_message(conv.id, conv.to_dict())
         yield result
         break
@@ -30,7 +30,8 @@ from api import settings
 from rag.app.resume import forbidden_select_fields4resume
 from rag.app.tag import label_question
 from rag.nlp.search import index_name
-from rag.prompts import kb_prompt, message_fit_in, llm_id2llm_type, keyword_extraction, full_question, chunks_format
+from rag.prompts import kb_prompt, message_fit_in, llm_id2llm_type, keyword_extraction, full_question, chunks_format, \
+    citation_prompt
 from rag.utils import rmSpace, num_tokens_from_string
 from rag.utils.tavily_conn import Tavily

@@ -235,9 +236,12 @@ def chat(dialog, messages, stream=True, **kwargs):
         gen_conf = dialog.llm_setting
 
     msg = [{"role": "system", "content": prompt_config["system"].format(**kwargs)}]
+    prompt4citation = ""
+    if knowledges and (prompt_config.get("quote", True) and kwargs.get("quote", True)):
+        prompt4citation = citation_prompt()
     msg.extend([{"role": m["role"], "content": re.sub(r"##\d+\$\$", "", m["content"])}
                 for m in messages if m["role"] != "system"])
-    used_token_count, msg = message_fit_in(msg, int(max_tokens * 0.97))
+    used_token_count, msg = message_fit_in(msg, int(max_tokens * 0.95))
     assert len(msg) >= 2, f"message_fit_in has bug: {msg}"
     prompt = msg[0]["content"]

@@ -256,14 +260,23 @@ def chat(dialog, messages, stream=True, **kwargs):
             think = ans[0] + "</think>"
             answer = ans[1]
         if knowledges and (prompt_config.get("quote", True) and kwargs.get("quote", True)):
-            answer, idx = retriever.insert_citations(answer,
-                                                     [ck["content_ltks"]
-                                                      for ck in kbinfos["chunks"]],
-                                                     [ck["vector"]
-                                                      for ck in kbinfos["chunks"]],
-                                                     embd_mdl,
-                                                     tkweight=1 - dialog.vector_similarity_weight,
-                                                     vtweight=dialog.vector_similarity_weight)
+            answer = re.sub(r"##[ij]\$\$", "", answer, flags=re.DOTALL)
+            if not re.search(r"##[0-9]+\$\$", answer):
+                answer, idx = retriever.insert_citations(answer,
+                                                         [ck["content_ltks"]
+                                                          for ck in kbinfos["chunks"]],
+                                                         [ck["vector"]
+                                                          for ck in kbinfos["chunks"]],
+                                                         embd_mdl,
+                                                         tkweight=1 - dialog.vector_similarity_weight,
+                                                         vtweight=dialog.vector_similarity_weight)
+            else:
+                idx = set([])
+                for r in re.finditer(r"##([0-9]+)\$\$", answer):
+                    i = int(r.group(1))
+                    if i < len(kbinfos["chunks"]):
+                        idx.add(i)
+
             idx = set([kbinfos["chunks"][int(i)]["doc_id"] for i in idx])
             recall_docs = [
                 d for d in kbinfos["doc_aggs"] if d["doc_id"] in idx]

@@ -298,7 +311,7 @@ def chat(dialog, messages, stream=True, **kwargs):
     if stream:
         last_ans = ""
         answer = ""
-        for ans in chat_mdl.chat_streamly(prompt, msg[1:], gen_conf):
+        for ans in chat_mdl.chat_streamly(prompt+prompt4citation, msg[1:], gen_conf):
             if thought:
                 ans = re.sub(r"<think>.*</think>", "", ans, flags=re.DOTALL)
             answer = ans

@@ -312,7 +325,7 @@ def chat(dialog, messages, stream=True, **kwargs):
             yield {"answer": thought+answer, "reference": {}, "audio_binary": tts(tts_mdl, delta_ans)}
         yield decorate_answer(thought+answer)
     else:
-        answer = chat_mdl.chat(prompt, msg[1:], gen_conf)
+        answer = chat_mdl.chat(prompt+prompt4citation, msg[1:], gen_conf)
         user_content = msg[-1].get("content", "[content not available]")
         logging.debug("User: {}|Assistant: {}".format(user_content, answer))
         res = decorate_answer(answer)
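With `citation_prompt()` appended to the system prompt, the model is expected to emit inline `##N$$` markers; the new branch harvests those indices instead of recomputing citations with `insert_citations`. A standalone sketch of that marker scan; the sample answer text is invented, and the marker syntax comes from the regexes in chat():

```python
import re

# Invented sample answer containing ##N$$ citation markers.
answer = "RAGFlow ships slim images ##0$$ and full images ##2$$ ##9$$."
num_chunks = 3  # pretend kbinfos["chunks"] has three entries

idx = set()
for r in re.finditer(r"##([0-9]+)\$\$", answer):
    i = int(r.group(1))
    if i < num_chunks:   # out-of-range markers are dropped
        idx.add(i)

assert idx == {0, 2}
```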
@@ -22,6 +22,42 @@ from peewee import fn
 class KnowledgebaseService(CommonService):
     model = Knowledgebase
 
+    @classmethod
+    @DB.connection_context()
+    def is_parsed_done(cls, kb_id):
+        """
+        Check if all documents in the knowledge base have completed parsing
+
+        Args:
+            kb_id: Knowledge base ID
+
+        Returns:
+            If all documents are parsed successfully, returns (True, None)
+            If any document is not fully parsed, returns (False, error_message)
+        """
+        from api.db import TaskStatus
+        from api.db.services.document_service import DocumentService
+
+        # Get knowledge base information
+        kbs = cls.query(id=kb_id)
+        if not kbs:
+            return False, "Knowledge base not found"
+        kb = kbs[0]
+
+        # Get all documents in the knowledge base
+        docs, _ = DocumentService.get_by_kb_id(kb_id, 1, 1000, "create_time", True, "")
+
+        # Check parsing status of each document
+        for doc in docs:
+            # If document is being parsed, don't allow chat creation
+            if doc['run'] == TaskStatus.RUNNING.value or doc['run'] == TaskStatus.CANCEL.value or doc['run'] == TaskStatus.FAIL.value:
+                return False, f"Document '{doc['name']}' in dataset '{kb.name}' is still being parsed. Please wait until all documents are parsed before starting a chat."
+            # If document is not yet parsed and has no chunks, don't allow chat creation
+            if doc['run'] == TaskStatus.UNSTART.value and doc['chunk_num'] == 0:
+                return False, f"Document '{doc['name']}' in dataset '{kb.name}' has not been parsed yet. Please parse all documents before starting a chat."
+
+        return True, None
+
     @classmethod
     @DB.connection_context()
     def list_documents_by_ids(cls,kb_ids):
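A self-contained model of the per-document decision `is_parsed_done()` makes; the run-status codes and document rows below are stand-ins for the real `TaskStatus` enum and database records:

```python
# Stand-in status codes; the real ones come from api.db.TaskStatus.
RUNNING, CANCEL, FAIL, UNSTART = "RUNNING", "CANCEL", "FAIL", "UNSTART"

def all_parsed(docs):
    for doc in docs:
        if doc["run"] in (RUNNING, CANCEL, FAIL):
            return False, f"Document '{doc['name']}' is still being parsed."
        if doc["run"] == UNSTART and doc["chunk_num"] == 0:
            return False, f"Document '{doc['name']}' has not been parsed yet."
    return True, None

ok, msg = all_parsed([{"name": "a.pdf", "run": "DONE", "chunk_num": 12},
                      {"name": "b.pdf", "run": UNSTART, "chunk_num": 0}])
assert not ok and "b.pdf" in msg
```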
@@ -134,6 +134,18 @@
             "max_tokens": 32768,
             "model_type": "chat"
         },
+        {
+            "llm_name": "qwq-32b",
+            "tags": "LLM,CHAT,128k",
+            "max_tokens": 131072,
+            "model_type": "chat"
+        },
+        {
+            "llm_name": "qwq-plus",
+            "tags": "LLM,CHAT,128k",
+            "max_tokens": 131072,
+            "model_type": "chat"
+        },
         {
             "llm_name": "qwen-long",
             "tags": "LLM,CHAT,10000K",

@@ -3259,7 +3271,7 @@
             "tags": "TEXT EMBEDDING,32000",
             "max_tokens": 32000,
             "model_type": "embedding"
-        },
+        },
         {
             "llm_name": "rerank-1",
             "tags": "RE-RANK, 8000",
@@ -12,35 +12,63 @@
 #
 
+import logging
-from openpyxl import load_workbook, Workbook
 import sys
 from io import BytesIO
 
-from rag.nlp import find_codec
+import pandas as pd
+from openpyxl import Workbook, load_workbook
+
+from rag.nlp import find_codec
 
 
 class RAGFlowExcelParser:
 
     @staticmethod
     def _load_excel_to_workbook(file_like_object):
         if isinstance(file_like_object, bytes):
             file_like_object = BytesIO(file_like_object)
 
+        # Read first 4 bytes to determine file type
+        file_like_object.seek(0)
+        file_head = file_like_object.read(4)
+        file_like_object.seek(0)
+
+        if not (file_head.startswith(b'PK\x03\x04') or file_head.startswith(b'\xD0\xCF\x11\xE0')):
+            logging.info("****wxy: Not an Excel file, converting CSV to Excel Workbook")
+
+            try:
+                file_like_object.seek(0)
+                df = pd.read_csv(file_like_object)
+                return RAGFlowExcelParser._dataframe_to_workbook(df)
+
+            except Exception as e_csv:
+                raise Exception(f"****wxy: Failed to parse CSV and convert to Excel Workbook: {e_csv}")
+
         try:
             return load_workbook(file_like_object)
         except Exception as e:
             logging.info(f"****wxy: openpyxl load error: {e}, try pandas instead")
             try:
                 file_like_object.seek(0)
                 df = pd.read_excel(file_like_object)
-                wb = Workbook()
-                ws = wb.active
-                ws.title = "Data"
-                for col_num, column_name in enumerate(df.columns, 1):
-                    ws.cell(row=1, column=col_num, value=column_name)
-                for row_num, row in enumerate(df.values, 2):
-                    for col_num, value in enumerate(row, 1):
-                        ws.cell(row=row_num, column=col_num, value=value)
-                return wb
+                return RAGFlowExcelParser._dataframe_to_workbook(df)
             except Exception as e_pandas:
-                raise Exception(f"****wxy: pandas read error: {e_pandas}, original openpyxl error: {e}")
+                raise Exception(f"****wxy: pandas.read_excel error: {e_pandas}, original openpyxl error: {e}")
+
+    @staticmethod
+    def _dataframe_to_workbook(df):
+        wb = Workbook()
+        ws = wb.active
+        ws.title = "Data"
+
+        for col_num, column_name in enumerate(df.columns, 1):
+            ws.cell(row=1, column=col_num, value=column_name)
+
+        for row_num, row in enumerate(df.values, 2):
+            for col_num, value in enumerate(row, 1):
+                ws.cell(row=row_num, column=col_num, value=value)
+
+        return wb
 
     def html(self, fnm, chunk_rows=256):
         file_like_object = BytesIO(fnm) if not isinstance(fnm, str) else fnm

@@ -62,7 +90,7 @@ class RAGFlowExcelParser:
             tb += f"<table><caption>{sheetname}</caption>"
             tb += tb_rows_0
             for r in list(
-                rows[1 + chunk_i * chunk_rows: 1 + (chunk_i + 1) * chunk_rows]
+                    rows[1 + chunk_i * chunk_rows: 1 + (chunk_i + 1) * chunk_rows]
             ):
                 tb += "<tr>"
                 for i, c in enumerate(r):

@@ -120,4 +148,3 @@ class RAGFlowExcelParser:
 if __name__ == "__main__":
     psr = RAGFlowExcelParser()
     psr(sys.argv[1])
-
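The new detection keys off the container signature in the first four bytes: `.xlsx` files are ZIP archives (`PK\x03\x04`), legacy `.xls` files use the OLE2 header (`D0 CF 11 E0`), and anything else falls back to the CSV path. A standalone check of that sniffing logic:

```python
from io import BytesIO

def sniff(payload: bytes) -> str:
    """Classify a spreadsheet payload the way _load_excel_to_workbook does."""
    head = BytesIO(payload).read(4)
    if head.startswith(b"PK\x03\x04"):        # ZIP container -> .xlsx
        return "xlsx"
    if head.startswith(b"\xD0\xCF\x11\xE0"):  # OLE2 container -> .xls
        return "xls"
    return "csv"                              # anything else -> pandas.read_csv

assert sniff(b"PK\x03\x04rest") == "xlsx"
assert sniff(b"\xD0\xCF\x11\xE0rest") == "xls"
assert sniff(b"col1,col2\n1,2\n") == "csv"
```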
docker/.env (10 changes)
@@ -80,13 +80,13 @@ REDIS_PASSWORD=infini_rag_flow
 SVR_HTTP_PORT=9380
 
 # The RAGFlow Docker image to download.
-# Defaults to the v0.17.1-slim edition, which is the RAGFlow Docker image without embedding models.
-RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.1-slim
+# Defaults to the v0.17.2-slim edition, which is the RAGFlow Docker image without embedding models.
+RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.2-slim
 #
 # To download the RAGFlow Docker image with embedding models, uncomment the following line instead:
-# RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.1
+# RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.2
 #
-# The Docker image of the v0.17.1 edition includes:
+# The Docker image of the v0.17.2 edition includes:
 # - Built-in embedding models:
 #   - BAAI/bge-large-zh-v1.5
 #   - BAAI/bge-reranker-v2-m3

@@ -122,7 +122,7 @@ TIMEZONE='Asia/Shanghai'
 # HF_ENDPOINT=https://hf-mirror.com
 
 # Optimizations for MacOS
-# Uncomment the following line if your OS is MacOS:
+# Uncomment the following line if your operating system is MacOS:
 # MACOS=1
 
 # The maximum file size for each uploaded file, in bytes.
@@ -78,8 +78,8 @@ The [.env](./.env) file contains important environment variables for Docker.
 - `RAGFLOW-IMAGE`
   The Docker image edition. Available editions:
 
-  - `infiniflow/ragflow:v0.17.1-slim` (default): The RAGFlow Docker image without embedding models.
-  - `infiniflow/ragflow:v0.17.1`: The RAGFlow Docker image with embedding models including:
+  - `infiniflow/ragflow:v0.17.2-slim` (default): The RAGFlow Docker image without embedding models.
+  - `infiniflow/ragflow:v0.17.2`: The RAGFlow Docker image with embedding models including:
     - Built-in embedding models:
       - `BAAI/bge-large-zh-v1.5`
       - `BAAI/bge-reranker-v2-m3`

@@ -97,8 +97,8 @@ The [.env](https://github.com/infiniflow/ragflow/blob/main/docker/.env) file con
 - `RAGFLOW-IMAGE`
   The Docker image edition. Available editions:
 
-  - `infiniflow/ragflow:v0.17.1-slim` (default): The RAGFlow Docker image without embedding models.
-  - `infiniflow/ragflow:v0.17.1`: The RAGFlow Docker image with embedding models including:
+  - `infiniflow/ragflow:v0.17.2-slim` (default): The RAGFlow Docker image without embedding models.
+  - `infiniflow/ragflow:v0.17.2`: The RAGFlow Docker image with embedding models including:
     - Built-in embedding models:
       - `BAAI/bge-large-zh-v1.5`
       - `BAAI/bge-reranker-v2-m3`

@@ -77,7 +77,7 @@ After building the infiniflow/ragflow:nightly-slim image, you are ready to launc
 
 1. Edit Docker Compose Configuration
 
-   Open the `docker/.env` file. Find the `RAGFLOW_IMAGE` setting and change the image reference from `infiniflow/ragflow:v0.17.1-slim` to `infiniflow/ragflow:nightly-slim` to use the pre-built image.
+   Open the `docker/.env` file. Find the `RAGFLOW_IMAGE` setting and change the image reference from `infiniflow/ragflow:v0.17.2-slim` to `infiniflow/ragflow:nightly-slim` to use the pre-built image.
 
 2. Launch the Service
docs/faq.md (20 changes)
@@ -37,12 +37,12 @@ If you build RAGFlow from source, the version number is also in the system log:
    / _, _// ___ |/ /_/ // __/ / // /_/ /| |/ |/ /
   /_/ |_|/_/  |_|\____//_/    /_/    \____/ |__/|__/

-2025-02-18 10:10:43,835 INFO     1445658 RAGFlow version: v0.17.1-50-g6daae7f2 full
+2025-02-18 10:10:43,835 INFO     1445658 RAGFlow version: v0.15.0-50-g6daae7f2 full
 ```

 Where:

-- `v0.17.1`: The officially published release.
+- `v0.15.0`: The officially published release.
 - `50`: The number of git commits since the official release.
 - `g6daae7f2`: `g` is the prefix, and `6daae7f2` is the first seven characters of the current commit ID.
 - `full`/`slim`: The RAGFlow edition.

@@ -71,10 +71,10 @@ We officially support x86 CPU and nvidia GPU. While we also test RAGFlow on ARM6

 ### Which embedding models can be deployed locally?

-RAGFlow offers two Docker image editions, `v0.17.1-slim` and `v0.17.1`:
+RAGFlow offers two Docker image editions, `v0.17.2-slim` and `v0.17.2`:

-- `infiniflow/ragflow:v0.17.1-slim` (default): The RAGFlow Docker image without embedding models.
-- `infiniflow/ragflow:v0.17.1`: The RAGFlow Docker image with embedding models including:
+- `infiniflow/ragflow:v0.17.2-slim` (default): The RAGFlow Docker image without embedding models.
+- `infiniflow/ragflow:v0.17.2`: The RAGFlow Docker image with embedding models including:
   - Built-in embedding models:
     - `BAAI/bge-large-zh-v1.5`
    - `BAAI/bge-reranker-v2-m3`

@@ -318,7 +318,7 @@ The status of a Docker container does not necessarily reflect the status
 91220e3285dd   docker.elastic.co/elasticsearch/elasticsearch:8.11.3   "/bin/tini -- /usr/l…"   11 hours ago   Up 11 hours (healthy)   9300/tcp, 0.0.0.0:9200->9200/tcp, :::9200->9200/tcp   ragflow-es-01
 ```

-2. Follow [this document](../guides/run_health_check.md) to check the health status of the Elasticsearch service.
+2. Follow [this document](./guides/run_health_check.md) to check the health status of the Elasticsearch service.

 :::danger IMPORTANT
 The status of a Docker container does not necessarily reflect the status of the service. You may find that your services are unhealthy even when the corresponding Docker containers are up and running. Possible reasons for this include network failures, incorrect port numbers, or DNS issues.

@@ -347,7 +347,7 @@ A correct Ollama IP address and port is crucial to adding models to Ollama:
 - If you are on demo.ragflow.io, ensure that the server hosting Ollama has a publicly accessible IP address. Note that 127.0.0.1 is not a publicly accessible IP address.
 - If you deploy RAGFlow locally, ensure that Ollama and RAGFlow are in the same LAN and can communicate with each other.

-See [Deploy a local LLM](./guides/deploy_local_llm.mdx) for more information.
+See [Deploy a local LLM](./guides/models/deploy_local_llm.mdx) for more information.

 ---

@@ -395,7 +395,7 @@ Ensure that you update the **MAX_CONTENT_LENGTH** environment variable:
 cd29bcb254bc   quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z   "/usr/bin/docker-ent…"   2 weeks ago   Up 11 hours   0.0.0.0:9001->9001/tcp, :::9001->9001/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp   ragflow-minio
 ```

-2. Follow [this document](../guides/run_health_check.md) to check the health status of the Elasticsearch service.
+2. Follow [this document](./guides/run_health_check.md) to check the health status of the Elasticsearch service.

 :::danger IMPORTANT
 The status of a Docker container does not necessarily reflect the status of the service. You may find that your services are unhealthy even when the corresponding Docker containers are up and running. Possible reasons for this include network failures, incorrect port numbers, or DNS issues.

@@ -417,7 +417,7 @@ The status of a Docker container does not necessarily reflect the status

 ### How to run RAGFlow with a locally deployed LLM?

-You can use Ollama or Xinference to deploy local LLM. See [here](../guides/deploy_local_llm.mdx) for more information.
+You can use Ollama or Xinference to deploy local LLM. See [here](./guides/models/deploy_local_llm.mdx) for more information.

 ---

@@ -434,7 +434,7 @@ If your model is not currently supported but has APIs compatible with those of O
 - If RAGFlow is locally deployed, ensure that your RAGFlow and Ollama are in the same LAN.
 - If you are using our online demo, ensure that the IP address of your Ollama server is public and accessible.

-See [here](../guides/deploy_local_llm.mdx) for more information.
+See [here](./guides/models/deploy_local_llm.mdx) for more information.

 ---

@@ -34,6 +34,7 @@ Evaluates whether the output of specific components meets certain conditions, wi

 :::danger IMPORTANT
 When you have added multiple conditions for a specific case, a **Logical operator** field appears, requiring you to set the logical relationship between these conditions as either AND or OR.
+![switch operator](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/logical_operator.jpg)
 :::

 - **Component ID**: The ID of the corresponding component.

@@ -11,7 +11,7 @@ Create a general-purpose chatbot.

 Chatbot is one of the most common AI scenarios. However, effectively understanding user queries and responding appropriately remains a challenge. RAGFlow's general-purpose chatbot agent is our attempt to tackle this longstanding issue.

-This chatbot closely resembles the chatbot introduced in [Start an AI chat](../start_chat.md), but with a key difference - it introduces a reflective mechanism that allows it to improve the retrieval from the target knowledge bases by rewriting the user's query.
+This chatbot closely resembles the chatbot introduced in [Start an AI chat](../chat/start_chat.md), but with a key difference - it introduces a reflective mechanism that allows it to improve the retrieval from the target knowledge bases by rewriting the user's query.

 This document provides guides on creating such a chatbot using our chatbot template.

@@ -9,7 +9,7 @@ Implements deep research for agentic reasoning.

 ---

-From v0.17.1 onward, RAGFlow supports integrating agentic reasoning in an AI chat. The following diagram illustrates the workflow of RAGFlow's deep research:
+From v0.17.0 onward, RAGFlow supports integrating agentic reasoning in an AI chat. The following diagram illustrates the workflow of RAGFlow's deep research:

 ![Image](https://github.com/user-attachments/assets/f65d4759-4f09-4d9d-9549-c0e1fe907525)

@@ -16,4 +16,4 @@ Please note that some of your settings may consume a significant amount of time.
 - On the configuration page of your knowledge base, switch off **Use RAPTOR to enhance retrieval**.
 - Extracting knowledge graph (GraphRAG) is time-consuming.
 - Disable **Auto-keyword** and **Auto-question** on the configuration page of your knowledge base, as both depend on the LLM.
-- **v0.17.1:** If your document is plain text PDF and does not require GPU-intensive processes like OCR (Optical Character Recognition), TSR (Table Structure Recognition), or DLA (Document Layout Analysis), you can choose **Naive** over **DeepDoc** or other time-consuming large model options in the **Document parser** dropdown. This will substantially reduce document parsing time.
+- **v0.17.0:** If your document is plain text PDF and does not require GPU-intensive processes like OCR (Optical Character Recognition), TSR (Table Structure Recognition), or DLA (Document Layout Analysis), you can choose **Naive** over **DeepDoc** or other time-consuming large model options in the **Document parser** dropdown. This will substantially reduce document parsing time.
@@ -39,18 +39,18 @@ This section covers the following topics:

 RAGFlow offers multiple chunking templates to facilitate chunking files of different layouts and ensure semantic integrity. In **Chunk method**, you can choose the default template that suits the layouts and formats of your files. The following table shows the descriptions and the compatible file formats of each supported chunk template:

-| **Template** | Description                                                           | File format                                          |
-|--------------|-----------------------------------------------------------------------|------------------------------------------------------|
-| General      | Files are consecutively chunked based on a preset chunk token number. | DOCX, EXCEL, PPT, PDF, TXT, JPEG, JPG, PNG, TIF, GIF |
-| Q&A          |                                                                       | XLSX, CSV/TXT                                        |
-| Manual       |                                                                       | PDF                                                  |
-| Table        |                                                                       | XLSX, CSV/TXT                                        |
-| Paper        |                                                                       | PDF                                                  |
-| Book         |                                                                       | DOCX, PDF, TXT                                       |
-| Laws         |                                                                       | DOCX, PDF, TXT                                       |
-| Presentation |                                                                       | PDF, PPTX                                            |
-| Picture      |                                                                       | JPEG, JPG, PNG, TIF, GIF                             |
-| One          | The entire document is chunked as one.                                | DOCX, EXCEL, PDF, TXT                                |
+| **Template** | Description                                                           | File format                                                                    |
+|--------------|-----------------------------------------------------------------------|--------------------------------------------------------------------------------|
+| General      | Files are consecutively chunked based on a preset chunk token number. | DOCX, XLSX, XLS (Excel97~2003), PPT, PDF, TXT, JPEG, JPG, PNG, TIF, GIF, CSV   |
+| Q&A          |                                                                       | XLSX, XLS (Excel97~2003), CSV/TXT                                              |
+| Manual       |                                                                       | PDF                                                                            |
+| Table        |                                                                       | XLSX, XLS (Excel97~2003), CSV/TXT                                              |
+| Paper        |                                                                       | PDF                                                                            |
+| Book         |                                                                       | DOCX, PDF, TXT                                                                 |
+| Laws         |                                                                       | DOCX, PDF, TXT                                                                 |
+| Presentation |                                                                       | PDF, PPTX                                                                      |
+| Picture      |                                                                       | JPEG, JPG, PNG, TIF, GIF                                                       |
+| One          | The entire document is chunked as one.                                | DOCX, XLSX, XLS (Excel97~2003), PDF, TXT                                       |

 You can also change a file's chunk method on the **Datasets** page.

@@ -130,7 +130,7 @@ See [Run retrieval test](./run_retrieval_test.md) for details.

 ## Search for knowledge base

-As of RAGFlow v0.17.1, the search feature is still in a rudimentary form, supporting only knowledge base search by name.
+As of RAGFlow v0.17.2, the search feature is still in a rudimentary form, supporting only knowledge base search by name.

 ![search knowledge base](https://github.com/user-attachments/assets/836ae94c-2438-42be-879e-c7ad2a59693e)

@@ -85,4 +85,4 @@ RAGFlow's file management allows you to download an uploaded file:

 ![download_file](https://github.com/infiniflow/ragflow/assets/93570324/cf3b297f-7d9b-4522-bf5f-4f45743e4ed5)

-> As of RAGFlow v0.17.1, bulk download is not supported, nor can you download an entire folder.
+> As of RAGFlow v0.17.2, bulk download is not supported, nor can you download an entire folder.

@@ -62,16 +62,16 @@ To upgrade RAGFlow, you must upgrade **both** your code **and** your Docker imag
    git clone https://github.com/infiniflow/ragflow.git
    ```

-2. Switch to the latest, officially published release, e.g., `v0.17.1`:
+2. Switch to the latest, officially published release, e.g., `v0.17.2`:

    ```bash
-   git checkout -f v0.17.1
+   git checkout -f v0.17.2
    ```

 3. Update **ragflow/docker/.env** as follows:

    ```bash
-   RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.1
+   RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.2
    ```

 4. Update the RAGFlow image and restart RAGFlow:

@@ -39,7 +39,7 @@ This section provides instructions on setting up the RAGFlow server on Linux. If

 `vm.max_map_count`. This value sets the maximum number of memory map areas a process may have. Its default value is 65530. While most applications require fewer than a thousand maps, reducing this value can result in abnormal behaviors, and the system will throw out-of-memory errors when a process reaches the limitation.

-RAGFlow v0.17.1 uses Elasticsearch or [Infinity](https://github.com/infiniflow/infinity) for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.
+RAGFlow v0.17.2 uses Elasticsearch or [Infinity](https://github.com/infiniflow/infinity) for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.

 <Tabs
   defaultValue="linux"
@@ -179,13 +179,13 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
    ```bash
    $ git clone https://github.com/infiniflow/ragflow.git
    $ cd ragflow/docker
-   $ git checkout -f v0.17.1
+   $ git checkout -f v0.17.2
    ```

 3. Use the pre-built Docker images and start up the server:

    :::tip NOTE
-   The command below downloads the `v0.17.1-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.17.1-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.1` for the full edition `v0.17.1`.
+   The command below downloads the `v0.17.2-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.17.2-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.17.2` for the full edition `v0.17.2`.
    :::

    ```bash

@@ -198,8 +198,8 @@ This section provides instructions on setting up the RAGFlow server on Linux. If

 | RAGFlow image tag   | Image size (GB) | Has embedding models and Python packages? | Stable?                  |
 | ------------------- | --------------- | ----------------------------------------- | ------------------------ |
-| `v0.17.1`           | ≈9              | :heavy_check_mark:                         | Stable release           |
-| `v0.17.1-slim`      | ≈2              | ❌                                         | Stable release           |
+| `v0.17.2`           | ≈9              | :heavy_check_mark:                         | Stable release           |
+| `v0.17.2-slim`      | ≈2              | ❌                                         | Stable release           |
 | `nightly`           | ≈9              | :heavy_check_mark:                         | *Unstable* nightly build |
 | `nightly-slim`      | ≈2              | ❌                                         | *Unstable* nightly build |

@@ -7,8 +7,62 @@ slug: /release_notes

 Key features, improvements and bug fixes in the latest releases.

+## v0.17.2
+
+Released on March 13, 2025.
+
+### Improvements
+
+- Adds OpenAI-compatible APIs.
+- Introduces a German user interface.
+- Accelerates knowledge graph extraction.
+- Enables Tavily-based web search in the **Retrieval** agent component.
+- Adds Tongyi-Qianwen QwQ models (OpenAI-compatible).
+- Supports CSV files in the **General** chunk method.
+
+### Fixed issues
+
+- Unable to add models via Ollama/Xinference, an issue introduced in v0.17.1.
+
+### Related APIs
+
+#### HTTP APIs
+
+[Create chat completion](./references/http_api_reference.md#openai-compatible-api)
+
+#### Python APIs
+
+[Create chat completion](./references/python_api_reference.md#openai-compatible-api)
+
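The OpenAI-compatible endpoint above can be exercised with the stock `openai` Python client. A minimal sketch, assuming a locally deployed RAGFlow at `http://localhost:9380`, an existing chat assistant whose ID replaces `<chat_id>`, and an API key issued in the RAGFlow UI; the base-URL path and the placeholder model name are assumptions inferred from the reference links above rather than quoted from the API reference:

```python
# Hedged sketch: calling RAGFlow's OpenAI-compatible "create chat completion" API.
# <chat_id>, the API key, and the base-URL path are placeholders/assumptions.
from openai import OpenAI

client = OpenAI(
    api_key="<RAGFLOW_API_KEY>",
    base_url="http://localhost:9380/api/v1/chats_openai/<chat_id>",
)

completion = client.chat.completions.create(
    model="model",  # placeholder name; routing is per chat assistant (assumption)
    messages=[{"role": "user", "content": "What is RAGFlow?"}],
    stream=False,
)
print(completion.choices[0].message.content)
```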
+## v0.17.1
+
+Released on March 11, 2025.
+
+### Improvements
+
+- Improves English tokenization quality.
+- Improves the table extraction logic in Markdown document parsing.
+- Updates SiliconFlow's model list.
+- Supports parsing XLS files (Excel97~2003) with improved corresponding error handling.
+- Supports Huggingface rerank models.
+- Enables relative time expressions ("now", "yesterday", "last week", "next year", and more) in the **Rewrite** agent component.
+
+### Fixed issues
+
+- A repetitive knowledge graph extraction issue.
+- Issues with API calling.
+- Options in the **Document parser** dropdown are missing.
+- A Tavily web search issue.
+- Unable to preview diagrams or images in an AI chat.
+
+### Documentation
+
+#### Added documents
+
+[Use tag set](./guides/dataset/use_tag_sets.md)
+
 ## v0.17.0

 Released on March 3, 2025.

 ### New features

@@ -20,7 +74,7 @@ Released on March 3, 2025.
 - Dataset: Adds a **Document parser** dropdown menu to dataset configurations. This includes a DeepDoc model option, which is time-consuming, a much faster **naive** option (plain text), which skips DLA (Document Layout Analysis), OCR (Optical Character Recognition), and TSR (Table Structure Recognition) tasks, and several currently *experimental* large model options.
 - Agent component: **(x)** or a forward slash `/` can be used to insert available keys (variables) in the system prompt field of the **Generate** or **Template** component.
 - Object storage: Supports using Aliyun OSS (Object Storage Service) as a file storage option.
-- Models: Updates the supported model list for Tongyi-Qianwen, adding DeepSeek-specific models; adds ModelScope as a model provider.
+- Models: Updates the supported model list for Tongyi-Qianwen (Qwen), adding DeepSeek-specific models; adds ModelScope as a model provider.
 - APIs: Document metadata can be updated through an API.

 The following diagram illustrates the workflow of RAGFlow's Deep Research:

@@ -13,6 +13,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
 import logging
+import itertools
 import re
 import time

@@ -67,7 +68,7 @@ class EntityResolution(Extractor):
         self._resolution_result_delimiter_key = "resolution_result_delimiter"
         self._input_text_key = "input_text"

-    async def __call__(self, graph: nx.Graph, prompt_variables: dict[str, Any] | None = None) -> EntityResolutionResult:
+    async def __call__(self, graph: nx.Graph, prompt_variables: dict[str, Any] | None = None, callback: Callable | None = None) -> EntityResolutionResult:
         """Call method definition."""
         if prompt_variables is None:
             prompt_variables = {}

@@ -93,6 +94,8 @@ class EntityResolution(Extractor):
         candidate_resolution = {entity_type: [] for entity_type in entity_types}
         for k, v in node_clusters.items():
             candidate_resolution[k] = [(a, b) for a, b in itertools.combinations(v, 2) if self.is_similarity(a, b)]
+        num_candidates = sum([len(candidates) for _, candidates in candidate_resolution.items()])
+        callback(msg=f"Identified {num_candidates} candidate pairs")

         resolution_result = set()
         async with trio.open_nursery() as nursery:

@@ -100,48 +103,52 @@ class EntityResolution(Extractor):
                 if not candidate_resolution_i[1]:
                     continue
                 nursery.start_soon(lambda: self._resolve_candidate(candidate_resolution_i, resolution_result))
+        callback(msg=f"Resolved {num_candidates} candidate pairs, {len(resolution_result)} of them are selected to merge.")

         connect_graph = nx.Graph()
         removed_entities = []
         connect_graph.add_edges_from(resolution_result)
         all_entities_data = []
         all_relationships_data = []
+        all_remove_nodes = []

-        for sub_connect_graph in nx.connected_components(connect_graph):
-            sub_connect_graph = connect_graph.subgraph(sub_connect_graph)
-            remove_nodes = list(sub_connect_graph.nodes)
-            keep_node = remove_nodes.pop()
-            await self._merge_nodes(keep_node, self._get_entity_(remove_nodes), all_entities_data)
-            for remove_node in remove_nodes:
-                removed_entities.append(remove_node)
-                remove_node_neighbors = graph[remove_node]
-                remove_node_neighbors = list(remove_node_neighbors)
-                for remove_node_neighbor in remove_node_neighbors:
-                    rel = self._get_relation_(remove_node, remove_node_neighbor)
-                    if graph.has_edge(remove_node, remove_node_neighbor):
-                        graph.remove_edge(remove_node, remove_node_neighbor)
-                    if remove_node_neighbor == keep_node:
-                        if graph.has_edge(keep_node, remove_node):
-                            graph.remove_edge(keep_node, remove_node)
-                        continue
-                    if not rel:
-                        continue
-                    if graph.has_edge(keep_node, remove_node_neighbor):
-                        await self._merge_edges(keep_node, remove_node_neighbor, [rel], all_relationships_data)
-                    else:
-                        pair = sorted([keep_node, remove_node_neighbor])
-                        graph.add_edge(pair[0], pair[1], weight=rel['weight'])
-                        self._set_relation_(pair[0], pair[1],
-                                            dict(
-                                                src_id=pair[0],
-                                                tgt_id=pair[1],
-                                                weight=rel['weight'],
-                                                description=rel['description'],
-                                                keywords=[],
-                                                source_id=rel.get("source_id", ""),
-                                                metadata={"created_at": time.time()}
-                                            ))
-                graph.remove_node(remove_node)
+        async with trio.open_nursery() as nursery:
+            for sub_connect_graph in nx.connected_components(connect_graph):
+                sub_connect_graph = connect_graph.subgraph(sub_connect_graph)
+                remove_nodes = list(sub_connect_graph.nodes)
+                keep_node = remove_nodes.pop()
+                all_remove_nodes.append(remove_nodes)
+                nursery.start_soon(lambda: self._merge_nodes(keep_node, self._get_entity_(remove_nodes), all_entities_data))
+                for remove_node in remove_nodes:
+                    removed_entities.append(remove_node)
+                    remove_node_neighbors = graph[remove_node]
+                    remove_node_neighbors = list(remove_node_neighbors)
+                    for remove_node_neighbor in remove_node_neighbors:
+                        rel = self._get_relation_(remove_node, remove_node_neighbor)
+                        if graph.has_edge(remove_node, remove_node_neighbor):
+                            graph.remove_edge(remove_node, remove_node_neighbor)
+                        if remove_node_neighbor == keep_node:
+                            if graph.has_edge(keep_node, remove_node):
+                                graph.remove_edge(keep_node, remove_node)
+                            continue
+                        if not rel:
+                            continue
+                        if graph.has_edge(keep_node, remove_node_neighbor):
+                            nursery.start_soon(lambda: self._merge_edges(keep_node, remove_node_neighbor, [rel], all_relationships_data))
+                        else:
+                            pair = sorted([keep_node, remove_node_neighbor])
+                            graph.add_edge(pair[0], pair[1], weight=rel['weight'])
+                            self._set_relation_(pair[0], pair[1],
+                                                dict(
+                                                    src_id=pair[0],
+                                                    tgt_id=pair[1],
+                                                    weight=rel['weight'],
+                                                    description=rel['description'],
+                                                    keywords=[],
+                                                    source_id=rel.get("source_id", ""),
+                                                    metadata={"created_at": time.time()}
+                                                ))
+                    graph.remove_node(remove_node)

         return EntityResolutionResult(
             graph=graph,

@@ -164,8 +171,10 @@ class EntityResolution(Extractor):
             self._input_text_key: pair_prompt
         }
         text = perform_variable_replacements(self._resolution_prompt, variables=variables)
+        logging.info(f"Created resolution prompt {len(text)} bytes for {len(candidate_resolution_i[1])} entity pairs of type {candidate_resolution_i[0]}")
         async with chat_limiter:
             response = await trio.to_thread.run_sync(lambda: self._chat(text, [{"role": "user", "content": "Output:"}], gen_conf))
+        logging.debug(f"_resolve_candidate chat prompt: {text}\nchat response: {response}")
         result = self._process_results(len(candidate_resolution_i[1]), response,
                                        self.prompt_variables.get(self._record_delimiter_key,
                                                                  DEFAULT_RECORD_DELIMITER),

@@ -19,7 +19,6 @@ from graphrag.general.leiden import add_community_info2graph
 from rag.llm.chat_model import Base as CompletionLLM
 from graphrag.utils import perform_variable_replacements, dict_has_keys_with_types, chat_limiter
 from rag.utils import num_tokens_from_string
-from timeit import default_timer as timer
 import trio

@@ -62,62 +61,69 @@ class CommunityReportsExtractor(Extractor):
         res_str = []
         res_dict = []
         over, token_count = 0, 0
-        st = timer()
-        for level, comm in communities.items():
-            logging.info(f"Level {level}: Community: {len(comm.keys())}")
-            for cm_id, ents in comm.items():
-                weight = ents["weight"]
-                ents = ents["nodes"]
-                ent_df = pd.DataFrame(self._get_entity_(ents)).dropna()#[{"entity": n, **graph.nodes[n]} for n in ents])
-                if ent_df.empty or "entity_name" not in ent_df.columns:
-                    continue
-                ent_df["entity"] = ent_df["entity_name"]
-                del ent_df["entity_name"]
-                rela_df = pd.DataFrame(self._get_relation_(list(ent_df["entity"]), list(ent_df["entity"]), 10000))
-                if rela_df.empty:
-                    continue
-                rela_df["source"] = rela_df["src_id"]
-                rela_df["target"] = rela_df["tgt_id"]
-                del rela_df["src_id"]
-                del rela_df["tgt_id"]
+        async def extract_community_report(community):
+            nonlocal res_str, res_dict, over, token_count
+            cm_id, ents = community
+            weight = ents["weight"]
+            ents = ents["nodes"]
+            ent_df = pd.DataFrame(self._get_entity_(ents)).dropna()
+            if ent_df.empty or "entity_name" not in ent_df.columns:
+                return
+            ent_df["entity"] = ent_df["entity_name"]
+            del ent_df["entity_name"]
+            rela_df = pd.DataFrame(self._get_relation_(list(ent_df["entity"]), list(ent_df["entity"]), 10000))
+            if rela_df.empty:
+                return
+            rela_df["source"] = rela_df["src_id"]
+            rela_df["target"] = rela_df["tgt_id"]
+            del rela_df["src_id"]
+            del rela_df["tgt_id"]

-                prompt_variables = {
-                    "entity_df": ent_df.to_csv(index_label="id"),
-                    "relation_df": rela_df.to_csv(index_label="id")
-                }
-                text = perform_variable_replacements(self._extraction_prompt, variables=prompt_variables)
-                gen_conf = {"temperature": 0.3}
-                async with chat_limiter:
-                    response = await trio.to_thread.run_sync(lambda: self._chat(text, [{"role": "user", "content": "Output:"}], gen_conf))
-                token_count += num_tokens_from_string(text + response)
-                response = re.sub(r"^[^\{]*", "", response)
-                response = re.sub(r"[^\}]*$", "", response)
-                response = re.sub(r"\{\{", "{", response)
-                response = re.sub(r"\}\}", "}", response)
-                logging.debug(response)
-                try:
-                    response = json.loads(response)
-                except json.JSONDecodeError as e:
-                    logging.error(f"Failed to parse JSON response: {e}")
-                    logging.error(f"Response content: {response}")
-                    continue
-                if not dict_has_keys_with_types(response, [
-                    ("title", str),
-                    ("summary", str),
-                    ("findings", list),
-                    ("rating", float),
-                    ("rating_explanation", str),
-                ]):
-                    continue
-                response["weight"] = weight
-                response["entities"] = ents
-                add_community_info2graph(graph, ents, response["title"])
-                res_str.append(self._get_text_output(response))
-                res_dict.append(response)
-                over += 1
-                if callback:
-                    callback(msg=f"Communities: {over}/{total}, elapsed: {timer() - st}s, used tokens: {token_count}")
+            prompt_variables = {
+                "entity_df": ent_df.to_csv(index_label="id"),
+                "relation_df": rela_df.to_csv(index_label="id")
+            }
+            text = perform_variable_replacements(self._extraction_prompt, variables=prompt_variables)
+            gen_conf = {"temperature": 0.3}
+            async with chat_limiter:
+                response = await trio.to_thread.run_sync(lambda: self._chat(text, [{"role": "user", "content": "Output:"}], gen_conf))
+            token_count += num_tokens_from_string(text + response)
+            response = re.sub(r"^[^\{]*", "", response)
+            response = re.sub(r"[^\}]*$", "", response)
+            response = re.sub(r"\{\{", "{", response)
+            response = re.sub(r"\}\}", "}", response)
+            logging.debug(response)
+            try:
+                response = json.loads(response)
+            except json.JSONDecodeError as e:
+                logging.error(f"Failed to parse JSON response: {e}")
+                logging.error(f"Response content: {response}")
+                return
+            if not dict_has_keys_with_types(response, [
+                ("title", str),
+                ("summary", str),
+                ("findings", list),
+                ("rating", float),
+                ("rating_explanation", str),
+            ]):
+                return
+            response["weight"] = weight
+            response["entities"] = ents
+            add_community_info2graph(graph, ents, response["title"])
+            res_str.append(self._get_text_output(response))
+            res_dict.append(response)
+            over += 1
+            if callback:
+                callback(msg=f"Communities: {over}/{total}, used tokens: {token_count}")

+        st = trio.current_time()
+        async with trio.open_nursery() as nursery:
+            for level, comm in communities.items():
+                logging.info(f"Level {level}: Community: {len(comm.keys())}")
+                for community in comm.items():
+                    nursery.start_soon(lambda: extract_community_report(community))
+        if callback:
+            callback(msg=f"Community reports done in {trio.current_time() - st:.2f}s, used tokens: {token_count}")

         return CommunityReportsResult(
             structured_output=res_dict,

@@ -228,7 +228,7 @@ async def resolve_entities(
         get_relation=partial(get_relation, tenant_id, kb_id),
         set_relation=partial(set_relation, tenant_id, kb_id, embed_bdl),
     )
-    reso = await er(graph)
+    reso = await er(graph, callback=callback)
     graph = reso.graph
     callback(msg=f"Graph resolution removed {len(reso.removed_entities)} nodes.")
     await update_nodes_pagerank_nhop_neighbour(tenant_id, kb_id, graph, 2)

@@ -237,8 +237,33 @@ def is_float_regex(value):
 def chunk_id(chunk):
     return xxhash.xxh64((chunk["content_with_weight"] + chunk["kb_id"]).encode("utf-8")).hexdigest()

+def get_entity_cache(tenant_id, kb_id, ent_name) -> str | list[str]:
+    hasher = xxhash.xxh64()
+    hasher.update(str(tenant_id).encode("utf-8"))
+    hasher.update(str(kb_id).encode("utf-8"))
+    hasher.update(str(ent_name).encode("utf-8"))
+
+    k = hasher.hexdigest()
+    bin = REDIS_CONN.get(k)
+    if not bin:
+        return
+    return json.loads(bin)
+
+
+def set_entity_cache(tenant_id, kb_id, ent_name, content_with_weight):
+    hasher = xxhash.xxh64()
+    hasher.update(str(tenant_id).encode("utf-8"))
+    hasher.update(str(kb_id).encode("utf-8"))
+    hasher.update(str(ent_name).encode("utf-8"))
+
+    k = hasher.hexdigest()
+    REDIS_CONN.set(k, content_with_weight.encode("utf-8"), 3600)
+
+
 def get_entity(tenant_id, kb_id, ent_name):
+    cache = get_entity_cache(tenant_id, kb_id, ent_name)
+    if cache:
+        return cache
     conds = {
         "fields": ["content_with_weight"],
         "entity_kwd": ent_name,

@@ -250,6 +275,7 @@ def get_entity(tenant_id, kb_id, ent_name):
     for id in es_res.ids:
         try:
             if isinstance(ent_name, str):
+                set_entity_cache(tenant_id, kb_id, ent_name, es_res.field[id]["content_with_weight"])
                 return json.loads(es_res.field[id]["content_with_weight"])
             res.append(json.loads(es_res.field[id]["content_with_weight"]))
         except Exception:

@@ -272,6 +298,7 @@ def set_entity(tenant_id, kb_id, embd_mdl, ent_name, meta):
         "available_int": 0
     }
     chunk["content_sm_ltks"] = rag_tokenizer.fine_grained_tokenize(chunk["content_ltks"])
+    set_entity_cache(tenant_id, kb_id, ent_name, chunk["content_with_weight"])
     res = settings.retrievaler.search({"entity_kwd": ent_name, "size": 1, "fields": []},
                                       search.index_name(tenant_id), [kb_id])
     if res.ids:

@@ -489,15 +516,16 @@ async def update_nodes_pagerank_nhop_neighbour(tenant_id, kb_id, graph, n_hop):
         return nbrs

     pr = nx.pagerank(graph)
-    for n, p in pr.items():
-        graph.nodes[n]["pagerank"] = p
-        try:
-            await trio.to_thread.run_sync(lambda: settings.docStoreConn.update({"entity_kwd": n, "kb_id": kb_id},
-                                                                               {"rank_flt": p,
-                                                                                "n_hop_with_weight": json.dumps((n), ensure_ascii=False)},
-                                                                               search.index_name(tenant_id), kb_id))
-        except Exception as e:
-            logging.exception(e)
+    try:
+        async with trio.open_nursery() as nursery:
+            for n, p in pr.items():
+                graph.nodes[n]["pagerank"] = p
+                nursery.start_soon(lambda: trio.to_thread.run_sync(lambda: settings.docStoreConn.update({"entity_kwd": n, "kb_id": kb_id},
+                                                                                                        {"rank_flt": p,
+                                                                                                         "n_hop_with_weight": json.dumps((n), ensure_ascii=False)},
+                                                                                                        search.index_name(tenant_id), kb_id)))
+    except Exception as e:
+        logging.exception(e)

     ty2ents = defaultdict(list)
     for p, r in sorted(pr.items(), key=lambda x: x[1], reverse=True):

@@ -27,13 +27,13 @@ env:
   REDIS_PASSWORD: infini_rag_flow_helm

   # The RAGFlow Docker image to download.
-  # Defaults to the v0.17.1-slim edition, which is the RAGFlow Docker image without embedding models.
-  RAGFLOW_IMAGE: infiniflow/ragflow:v0.17.1-slim
+  # Defaults to the v0.17.2-slim edition, which is the RAGFlow Docker image without embedding models.
+  RAGFLOW_IMAGE: infiniflow/ragflow:v0.17.2-slim
   #
   # To download the RAGFlow Docker image with embedding models, uncomment the following line instead:
-  # RAGFLOW_IMAGE: infiniflow/ragflow:v0.17.1
+  # RAGFLOW_IMAGE: infiniflow/ragflow:v0.17.2
   #
-  # The Docker image of the v0.17.1 edition includes:
+  # The Docker image of the v0.17.2 edition includes:
   # - Built-in embedding models:
   #   - BAAI/bge-large-zh-v1.5
   #   - BAAI/bge-reranker-v2-m3

@@ -1,6 +1,6 @@
 [project]
 name = "ragflow"
-version = "0.17.1"
+version = "0.17.2"
 description = "[RAGFlow](https://ragflow.io/) is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It offers a streamlined RAG workflow for businesses of any scale, combining LLM (Large Language Models) to provide truthful question-answering capabilities, backed by well-founded citations from various complex formatted data."
 authors = [
     { name = "Zhichang Yu", email = "yuzhichang@gmail.com" }

@@ -240,7 +240,7 @@ def chunk(filename, binary=None, from_page=0, to_page=100000,
                               callback=callback)
         res = tokenize_table(tables, doc, is_english)

-    elif re.search(r"\.xlsx?$", filename, re.IGNORECASE):
+    elif re.search(r"\.(csv|xlsx?)$", filename, re.IGNORECASE):
         callback(0.1, "Start to parse.")
         excel_parser = ExcelParser()
         if parser_config.get("html4excel"):
|
||||
if __name__ == "__main__":
|
||||
import sys
|
||||
|
||||
|
||||
def dummy(prog=None, msg=""):
|
||||
pass
|
||||
|
||||
|
||||
chunk(sys.argv[1], from_page=0, to_page=10, callback=dummy)
|
||||
|
||||
@@ -29,8 +29,8 @@ import json
 import requests
 import asyncio

-LENGTH_NOTIFICATION_CN = "······\n由于长度的原因,回答被截断了,要继续吗?"
-LENGTH_NOTIFICATION_EN = "...\nFor the content length reason, it stopped, continue?"
+LENGTH_NOTIFICATION_CN = "······\n由于大模型的上下文窗口大小限制,回答已经被大模型截断。"
+LENGTH_NOTIFICATION_EN = "...\nThe answer is truncated by your chosen LLM due to its limitation on context length."


 class Base(ABC):

@@ -268,13 +268,13 @@ class QWenChat(Base):
         import dashscope
         dashscope.api_key = key
         self.model_name = model_name
-        if model_name.lower().find("deepseek") >= 0:
+        if self.is_reasoning_model(self.model_name):
             super().__init__(key, model_name, "https://dashscope.aliyuncs.com/compatible-mode/v1")

     def chat(self, system, history, gen_conf):
         if "max_tokens" in gen_conf:
             del gen_conf["max_tokens"]
-        if self.model_name.lower().find("deepseek") >= 0:
+        if self.is_reasoning_model(self.model_name):
             return super().chat(system, history, gen_conf)

         stream_flag = str(os.environ.get('QWEN_CHAT_BY_STREAM', 'true')).lower() == 'true'

@@ -348,11 +348,19 @@ class QWenChat(Base):
     def chat_streamly(self, system, history, gen_conf):
         if "max_tokens" in gen_conf:
             del gen_conf["max_tokens"]
-        if self.model_name.lower().find("deepseek") >= 0:
+        if self.is_reasoning_model(self.model_name):
             return super().chat_streamly(system, history, gen_conf)

         return self._chat_streamly(system, history, gen_conf)

+    @staticmethod
+    def is_reasoning_model(model_name: str) -> bool:
+        return any([
+            model_name.lower().find("deepseek") >= 0,
+            model_name.lower().find("qwq") >= 0 and model_name.lower() != 'qwq-32b-preview',
+        ])
+

 class ZhipuChat(Base):
     def __init__(self, key, model_name="glm-3-turbo", **kwargs):
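The new `is_reasoning_model` predicate routes DeepSeek and QwQ models (except `qwq-32b-preview`) through the OpenAI-compatible code path. Its behavior in isolation, copied from the hunk above:

```python
# Standalone copy of QWenChat.is_reasoning_model from the hunk above.
def is_reasoning_model(model_name: str) -> bool:
    return any([
        model_name.lower().find("deepseek") >= 0,
        model_name.lower().find("qwq") >= 0 and model_name.lower() != 'qwq-32b-preview',
    ])

assert is_reasoning_model("deepseek-r1")
assert is_reasoning_model("QwQ-32B")                  # case-insensitive match
assert not is_reasoning_model("qwq-32b-preview")      # explicitly excluded
assert not is_reasoning_model("qwen-max")
```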
@@ -740,7 +748,7 @@ class BedrockChat(Base):
         self.bedrock_sk = json.loads(key).get('bedrock_sk', '')
         self.bedrock_region = json.loads(key).get('bedrock_region', '')
         self.model_name = model_name

         if self.bedrock_ak == '' or self.bedrock_sk == '' or self.bedrock_region == '':
             # Try to create a client using the default credentials (AWS_PROFILE, AWS_DEFAULT_REGION, etc.)
             self.client = boto3.client('bedrock-runtime')

@@ -53,7 +53,8 @@ all_codecs = [
 def find_codec(blob):
     detected = chardet.detect(blob[:1024])
     if detected['confidence'] > 0.5:
+        if detected['encoding'] == "ascii":
+            return "utf-8"
         return detected['encoding']

     for c in all_codecs:
         try:
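chardet labels pure-ASCII input as `ascii`; because ASCII is a strict subset of UTF-8, mapping it to `utf-8` keeps later non-ASCII bytes in the same file decodable. A small illustration (chardet's exact confidence values vary by input):

```python
import chardet

detected = chardet.detect(b"plain ascii header")
print(detected["encoding"], detected["confidence"])  # typically 'ascii' with high confidence

# ASCII is a subset of UTF-8, so decoding ASCII bytes as UTF-8 is always safe:
print(b"plain ascii header".decode("utf-8"))
```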
@@ -108,22 +108,63 @@ def kb_prompt(kbinfos, max_tokens):
     docs = {d.id: d.meta_fields for d in docs}

     doc2chunks = defaultdict(lambda: {"chunks": [], "meta": []})
-    for ck in kbinfos["chunks"][:chunks_num]:
-        doc2chunks[ck["docnm_kwd"]]["chunks"].append((f"URL: {ck['url']}\n" if "url" in ck else "") + ck["content_with_weight"])
+    for i, ck in enumerate(kbinfos["chunks"][:chunks_num]):
+        doc2chunks[ck["docnm_kwd"]]["chunks"].append((f"URL: {ck['url']}\n" if "url" in ck else "") + f"ID: {i}\n" + ck["content_with_weight"])
         doc2chunks[ck["docnm_kwd"]]["meta"] = docs.get(ck["doc_id"], {})

     knowledges = []
     for nm, cks_meta in doc2chunks.items():
-        txt = f"Document: {nm} \n"
+        txt = f"\nDocument: {nm} \n"
         for k, v in cks_meta["meta"].items():
             txt += f"{k}: {v}\n"
         txt += "Relevant fragments as following:\n"
         for i, chunk in enumerate(cks_meta["chunks"], 1):
-            txt += f"{i}. {chunk}\n"
+            txt += f"{chunk}\n"
         knowledges.append(txt)
     return knowledges


+def citation_prompt():
+    return """
+
+# Citation requirements:
+- Inserts CITATIONS in format '##i$$ ##j$$' where i,j are the ID of the content you are citing and encapsulated with '##' and '$$'.
+- Inserts the CITATION symbols at the end of a sentence, AND NO MORE than 4 citations.
+- DO NOT insert CITATION in the answer if the content is not from retrieved chunks.
+
+--- Example START ---
+<SYSTEM>: Here is the knowledge base:
+
+Document: Elon Musk Breaks Silence on Crypto, Warns Against Dogecoin ...
+URL: https://blockworks.co/news/elon-musk-crypto-dogecoin
+ID: 0
+The Tesla co-founder advised against going all-in on dogecoin, but Elon Musk said it’s still his favorite crypto...
+
+Document: Elon Musk's Dogecoin tweet sparks social media frenzy
+ID: 1
+Musk said he is 'willing to serve' D.O.G.E. – shorthand for Dogecoin.
+
+Document: Causal effect of Elon Musk tweets on Dogecoin price
+ID: 2
+If you think of Dogecoin — the cryptocurrency based on a meme — you can’t help but also think of Elon Musk...
+
+Document: Elon Musk's Tweet Ignites Dogecoin's Future In Public Services
+ID: 3
+The market is heating up after Elon Musk's announcement about Dogecoin. Is this a new era for crypto?...
+
+The above is the knowledge base.
+
+<USER>: What's the Elon's view on dogecoin?
+
+<ASSISTANT>: Musk has consistently expressed his fondness for Dogecoin, often citing its humor and the inclusion of dogs in its branding. He has referred to it as his favorite cryptocurrency ##0$$ ##1$$.
+Recently, Musk has hinted at potential future roles for Dogecoin. His tweets have sparked speculation about Dogecoin's potential integration into public services ##3$$.
+Overall, while Musk enjoys Dogecoin and often promotes it, he also warns against over-investing in it, reflecting both his personal amusement and caution regarding its speculative nature.
+
+--- Example END ---
+
+"""
+
+
 def keyword_extraction(chat_mdl, content, topn=3):
     prompt = f"""
 Role: You're a text analyzer.

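Rendering code that consumes answers produced under this prompt has to recover the chunk IDs from the `##i$$` markers, which map back to the per-chunk `ID: {i}` labels added in `kb_prompt` above. A hedged sketch of such a parser, not the parser RAGFlow itself ships:

```python
import re

def extract_citations(answer: str) -> list[int]:
    """Pull chunk IDs out of '##i$$' citation markers, in order of appearance."""
    return [int(i) for i in re.findall(r"##(\d+)\$\$", answer)]

print(extract_citations("He called it his favorite cryptocurrency ##0$$ ##1$$."))  # [0, 1]
```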
@@ -26,7 +26,6 @@ from rag.prompts import keyword_extraction, question_proposal, content_tagging

 CONSUMER_NO = "0" if len(sys.argv) < 2 else sys.argv[1]
 CONSUMER_NAME = "task_executor_" + CONSUMER_NO
-initRootLogger(CONSUMER_NAME)

 import logging
 import os

@@ -40,10 +39,10 @@ from io import BytesIO
 from multiprocessing.context import TimeoutError
 from timeit import default_timer as timer
 import tracemalloc
-import resource
 import signal
 import trio
 import exceptiongroup
+import faulthandler

 import numpy as np
 from peewee import DoesNotExist

@@ -117,7 +116,13 @@ def start_tracemalloc_and_snapshot(signum, frame):
     snapshot = tracemalloc.take_snapshot()
     snapshot.dump(snapshot_file)
     current, peak = tracemalloc.get_traced_memory()
-    max_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
+    if sys.platform == "win32":
+        import psutil
+        process = psutil.Process()
+        max_rss = process.memory_info().rss / 1024
+    else:
+        import resource
+        max_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
     logging.info(f"taken snapshot {snapshot_file}. max RSS={max_rss / 1000:.2f} MB, current memory usage: {current / 10**6:.2f} MB, Peak memory usage: {peak / 10**6:.2f} MB")

 # SIGUSR2 handler: stop tracemalloc

@@ -134,30 +139,35 @@ class TaskCanceledException(Exception):


 def set_progress(task_id, from_page=0, to_page=-1, prog=None, msg="Processing..."):
-    if prog is not None and prog < 0:
-        msg = "[ERROR]" + msg
-    cancel = TaskService.do_cancel(task_id)
+    try:
+        if prog is not None and prog < 0:
+            msg = "[ERROR]" + msg
+        cancel = TaskService.do_cancel(task_id)

-    if cancel:
-        msg += " [Canceled]"
-        prog = -1
+        if cancel:
+            msg += " [Canceled]"
+            prog = -1

-    if to_page > 0:
-        if msg:
-            if from_page < to_page:
-                msg = f"Page({from_page + 1}~{to_page + 1}): " + msg
-    if msg:
-        msg = datetime.now().strftime("%H:%M:%S") + " " + msg
-    d = {"progress_msg": msg}
-    if prog is not None:
-        d["progress"] = prog
+        if to_page > 0:
+            if msg:
+                if from_page < to_page:
+                    msg = f"Page({from_page + 1}~{to_page + 1}): " + msg
+        if msg:
+            msg = datetime.now().strftime("%H:%M:%S") + " " + msg
+        d = {"progress_msg": msg}
+        if prog is not None:
+            d["progress"] = prog

-        logging.info(f"set_progress({task_id}), progress: {prog}, progress_msg: {msg}")
-    TaskService.update_progress(task_id, d)
+        TaskService.update_progress(task_id, d)

-    close_connection()
-    if cancel:
-        raise TaskCanceledException(msg)
+        close_connection()
+        if cancel:
+            raise TaskCanceledException(msg)
+        logging.info(f"set_progress({task_id}), progress: {prog}, progress_msg: {msg}")
+    except DoesNotExist:
+        logging.warning(f"set_progress({task_id}) got exception DoesNotExist")
+    except Exception:
+        logging.exception(f"set_progress({task_id}), progress: {prog}, progress_msg: {msg}, got exception")

 async def collect():
     global CONSUMER_NAME, DONE_TASKS, FAILED_TASKS

@@ -644,8 +654,9 @@ async def main():
     logging.info(f'TaskExecutor: RAGFlow version: {get_ragflow_version()}')
     settings.init_settings()
     print_rag_settings()
-    signal.signal(signal.SIGUSR1, start_tracemalloc_and_snapshot)
-    signal.signal(signal.SIGUSR2, stop_tracemalloc)
+    if sys.platform != "win32":
+        signal.signal(signal.SIGUSR1, start_tracemalloc_and_snapshot)
+        signal.signal(signal.SIGUSR2, stop_tracemalloc)
     TRACE_MALLOC_ENABLED = int(os.environ.get('TRACE_MALLOC_ENABLED', "0"))
     if TRACE_MALLOC_ENABLED:
         start_tracemalloc_and_snapshot(None, None)

@@ -658,4 +669,6 @@ async def main():
     logging.error("BUG!!! You should not reach here!!!")

 if __name__ == "__main__":
+    faulthandler.enable()
+    initRootLogger(CONSUMER_NAME)
     trio.run(main)

@@ -27,7 +27,8 @@ class Tavily:
         try:
             response = self.tavily_client.search(
                 query=query,
-                search_depth="advanced"
+                search_depth="advanced",
+                max_results=6
             )
             return [{"url": res["url"], "title": res["title"], "content": res["content"], "score": res["score"]} for res in response["results"]]
         except Exception as e:
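For context, the class above wraps the official `tavily-python` client; a minimal standalone sketch of the same call, with a placeholder API key:

```python
from tavily import TavilyClient  # pip install tavily-python

client = TavilyClient(api_key="tvly-...")  # placeholder key
response = client.search(
    query="RAGFlow deep document understanding",
    search_depth="advanced",
    max_results=6,  # the cap added in the hunk above
)
for res in response["results"]:
    print(res["score"], res["title"], res["url"])
```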
@@ -1,6 +1,6 @@
 [project]
 name = "ragflow-sdk"
-version = "0.17.1"
+version = "0.17.2"
 description = "Python client sdk of [RAGFlow](https://github.com/infiniflow/ragflow). RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding."
 authors = [
     { name = "Zhichang Yu", email = "yuzhichang@gmail.com" }

@@ -11,7 +11,13 @@ requires-python = ">=3.10,<3.13"
 dependencies = [
     "requests>=2.30.0,<3.0.0",
     "beartype>=0.18.5,<0.19.0",
-    "pytest>=8.0.0,<9.0.0"
+    "pytest>=8.0.0,<9.0.0",
+    "requests-toolbelt>=1.0.0",
+    "python-docx>=1.1.2",
+    "openpyxl>=3.1.5",
+    "python-pptx>=1.0.2",
+    "pillow>=11.1.0",
+    "reportlab>=4.3.1",
 ]

 [project.optional-dependencies]

@@ -23,4 +29,4 @@ test = [
 markers = [
     "slow: marks tests as slow (deselect with '-m \"not slow\"')",
     "wip: marks tests as work in progress (deselect with '-m \"not wip\"')"
-]
+]

@@ -1,29 +0,0 @@
-<svg width="32" height="34" viewBox="0 0 32 34" fill="none" xmlns="http://www.w3.org/2000/svg">
-    <path fill-rule="evenodd" clip-rule="evenodd"
-        d="M3.43265 20.7677C4.15835 21.5062 4.15834 22.7035 3.43262 23.4419L3.39546 23.4797C2.66974 24.2182 1.49312 24.2182 0.767417 23.4797C0.0417107 22.7412 0.0417219 21.544 0.767442 20.8055L0.804608 20.7677C1.53033 20.0292 2.70694 20.0293 3.43265 20.7677Z"
-        fill="#B2DDFF" />
-    <path fill-rule="evenodd" clip-rule="evenodd"
-        d="M12.1689 21.3375C12.8933 22.0773 12.8912 23.2746 12.1641 24.0117L7.01662 29.2307C6.2896 29.9678 5.11299 29.9657 4.38859 29.2259C3.66419 28.4861 3.66632 27.2888 4.39334 26.5517L9.54085 21.3327C10.2679 20.5956 11.4445 20.5977 12.1689 21.3375Z"
-        fill="#53B1FD" />
-    <path fill-rule="evenodd" clip-rule="evenodd"
-        d="M19.1551 30.3217C19.7244 29.4528 20.8781 29.218 21.7321 29.7973L21.8436 29.8729C22.6975 30.4522 22.9283 31.6262 22.359 32.4952C21.7897 33.3641 20.6359 33.5989 19.782 33.0196L19.6705 32.944C18.8165 32.3647 18.5858 31.1907 19.1551 30.3217Z"
-        fill="#B2DDFF" />
-    <path fill-rule="evenodd" clip-rule="evenodd"
-        d="M31.4184 20.6544C32.1441 21.3929 32.1441 22.5902 31.4184 23.3286L28.8911 25.9003C28.1654 26.6388 26.9887 26.6388 26.263 25.9003C25.5373 25.1619 25.5373 23.9646 26.263 23.2261L28.7903 20.6544C29.516 19.916 30.6927 19.916 31.4184 20.6544Z"
-        fill="#53B1FD" />
-    <path fill-rule="evenodd" clip-rule="evenodd"
-        d="M31.4557 11.1427C32.1814 11.8812 32.1814 13.0785 31.4557 13.8169L12.7797 32.8209C12.054 33.5594 10.8774 33.5594 10.1517 32.8209C9.42599 32.0825 9.42599 30.8852 10.1517 30.1467L28.8277 11.1427C29.5534 10.4043 30.73 10.4043 31.4557 11.1427Z"
-        fill="#1570EF" />
-    <path fill-rule="evenodd" clip-rule="evenodd"
-        d="M27.925 5.29994C28.6508 6.0384 28.6508 7.23568 27.925 7.97414L17.184 18.9038C16.4583 19.6423 15.2817 19.6423 14.556 18.9038C13.8303 18.1653 13.8303 16.9681 14.556 16.2296L25.297 5.29994C26.0227 4.56148 27.1993 4.56148 27.925 5.29994Z"
-        fill="#1570EF" />
-    <path fill-rule="evenodd" clip-rule="evenodd"
-        d="M22.256 1.59299C22.9822 2.33095 22.983 3.52823 22.2578 4.26718L8.45055 18.3358C7.72533 19.0748 6.54871 19.0756 5.82251 18.3376C5.09631 17.5996 5.09552 16.4024 5.82075 15.6634L19.6279 1.59478C20.3532 0.855827 21.5298 0.855022 22.256 1.59299Z"
-        fill="#1570EF" />
-    <path fill-rule="evenodd" clip-rule="evenodd"
-        d="M8.58225 6.09619C9.30671 6.83592 9.30469 8.0332 8.57772 8.77038L3.17006 14.2541C2.4431 14.9913 1.26649 14.9893 0.542025 14.2495C-0.182438 13.5098 -0.180413 12.3125 0.546548 11.5753L5.95421 6.09159C6.68117 5.3544 7.85778 5.35646 8.58225 6.09619Z"
-        fill="#53B1FD" />
-    <path fill-rule="evenodd" clip-rule="evenodd"
-        d="M11.893 0.624023C12.9193 0.624023 13.7513 1.47063 13.7513 2.51497V2.70406C13.7513 3.7484 12.9193 4.59501 11.893 4.59501C10.8667 4.59501 10.0347 3.7484 10.0347 2.70406V2.51497C10.0347 1.47063 10.8667 0.624023 11.893 0.624023Z"
-        fill="#B2DDFF" />
-</svg>
@@ -13,12 +13,3 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-
-import pytest
-from common import delete_dataset
-
-
-@pytest.fixture(scope="function", autouse=True)
-def clear_datasets(get_http_api_auth):
-    yield
-    delete_dataset(get_http_api_auth)
sdk/python/test/libs/utils/__init__.py (new file, 25 lines)
@@ -0,0 +1,25 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+import base64
+from pathlib import Path
+
+
+def encode_avatar(image_path):
+    with Path.open(image_path, "rb") as file:
+        binary_data = file.read()
+    base64_encoded = base64.b64encode(binary_data).decode("utf-8")
+    return base64_encoded
sdk/python/test/libs/utils/file_utils.py (new file, 107 lines)
@@ -0,0 +1,107 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+import json
+
+from docx import Document  # pip install python-docx
+from openpyxl import Workbook  # pip install openpyxl
+from PIL import Image, ImageDraw  # pip install Pillow
+from pptx import Presentation  # pip install python-pptx
+from reportlab.pdfgen import canvas  # pip install reportlab
+
+
+def create_docx_file(path):
+    doc = Document()
+    doc.add_paragraph("这是一个测试 DOCX 文件。")
+    doc.save(path)
+    return path
+
+
+def create_excel_file(path):
+    wb = Workbook()
+    ws = wb.active
+    ws["A1"] = "测试 Excel 文件"
+    wb.save(path)
+    return path
+
+
+def create_ppt_file(path):
+    prs = Presentation()
+    slide = prs.slides.add_slide(prs.slide_layouts[0])
+    slide.shapes.title.text = "测试 PPT 文件"
+    prs.save(path)
+    return path
+
+
+def create_image_file(path):
+    img = Image.new("RGB", (100, 100), color="blue")
+    draw = ImageDraw.Draw(img)
+    draw.text((10, 40), "Test", fill="white")
+    img.save(path)
+    return path
+
+
+def create_pdf_file(path):
+    if not isinstance(path, str):
+        path = str(path)
+    c = canvas.Canvas(path)
+    c.drawString(100, 750, "测试 PDF 文件")
+    c.save()
+    return path
+
+
+def create_txt_file(path):
+    with open(path, "w", encoding="utf-8") as f:
+        f.write("这是测试 TXT 文件的内容。")
+    return path
+
+
+def create_md_file(path):
+    md_content = "# 测试 MD 文件\n\n这是一份 Markdown 格式的测试文件。"
+    with open(path, "w", encoding="utf-8") as f:
+        f.write(md_content)
+    return path
+
+
+def create_json_file(path):
+    data = {"message": "这是测试 JSON 文件", "value": 123}
+    with open(path, "w", encoding="utf-8") as f:
+        json.dump(data, f, indent=2)
+    return path
+
+
+def create_eml_file(path):
+    eml_content = (
+        "From: sender@example.com\n"
+        "To: receiver@example.com\n"
+        "Subject: 测试 EML 文件\n\n"
+        "这是一封测试邮件的内容。\n"
+    )
+    with open(path, "w", encoding="utf-8") as f:
+        f.write(eml_content)
+    return path
+
+
+def create_html_file(path):
+    html_content = (
+        "<html>\n"
+        "<head><title>测试 HTML 文件</title></head>\n"
+        "<body><h1>这是一个测试 HTML 文件</h1></body>\n"
+        "</html>"
+    )
+    with open(path, "w", encoding="utf-8") as f:
+        f.write(html_content)
+    return path
@@ -68,7 +68,7 @@ def upload_file(auth, dataset_id, path):

 def list_document(auth, dataset_id):
     authorization = {"Authorization": auth}
-    url = f"{HOST_ADDRESS}/v1/document/list?kb_id={dataset_id}"
+    url = f"{HOST_ADDRESS}/v1/document/list?kb_id={dataset_id}"
     res = requests.get(url=url, headers=authorization)
     return res.json()

@@ -85,7 +85,7 @@ def parse_docs(auth, doc_ids):
     authorization = {"Authorization": auth}
     json_req = {
         "doc_ids": doc_ids,
-        "run": 1
+        "run": 1
     }
     url = f"{HOST_ADDRESS}/v1/document/run"
     res = requests.post(url=url, headers=authorization, json=json_req)

sdk/python/test/test_http_api/common.py (new file, 101 lines)
@ -0,0 +1,101 @@
|
||||
#
# Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import os
from pathlib import Path

import requests
from requests_toolbelt import MultipartEncoder

HEADERS = {"Content-Type": "application/json"}
HOST_ADDRESS = os.getenv("HOST_ADDRESS", "http://127.0.0.1:9380")
DATASETS_API_URL = "/api/v1/datasets"
FILE_API_URL = "/api/v1/datasets/{dataset_id}/documents"

INVALID_API_TOKEN = "invalid_key_123"
DATASET_NAME_LIMIT = 128
DOCUMENT_NAME_LIMIT = 128


# DATASET MANAGEMENT
def create_dataset(auth, payload):
    res = requests.post(
        url=f"{HOST_ADDRESS}{DATASETS_API_URL}",
        headers=HEADERS,
        auth=auth,
        json=payload,
    )
    return res.json()


def list_dataset(auth, params=None):
    res = requests.get(
        url=f"{HOST_ADDRESS}{DATASETS_API_URL}",
        headers=HEADERS,
        auth=auth,
        params=params,
    )
    return res.json()


def update_dataset(auth, dataset_id, payload):
    res = requests.put(
        url=f"{HOST_ADDRESS}{DATASETS_API_URL}/{dataset_id}",
        headers=HEADERS,
        auth=auth,
        json=payload,
    )
    return res.json()


def delete_dataset(auth, payload=None):
    res = requests.delete(
        url=f"{HOST_ADDRESS}{DATASETS_API_URL}",
        headers=HEADERS,
        auth=auth,
        json=payload,
    )
    return res.json()


def create_datasets(auth, num):
    ids = []
    for i in range(num):
        res = create_dataset(auth, {"name": f"dataset_{i}"})
        ids.append(res["data"]["id"])
    return ids


# FILE MANAGEMENT WITHIN DATASET
def upload_documents(auth, dataset_id, files_path=None):
    url = f"{HOST_ADDRESS}{FILE_API_URL}".format(dataset_id=dataset_id)

    if files_path is None:
        files_path = []

    fields = []
    for fp in files_path:
        p = Path(fp)
        fields.append(("file", (p.name, p.open("rb"))))
    m = MultipartEncoder(fields=fields)

    res = requests.post(
        url=url,
        headers={"Content-Type": m.content_type},
        auth=auth,
        data=m,
    )
    return res.json()
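Strung together, these helpers cover a complete request lifecycle against the HTTP API. A minimal sketch of the happy path (the RAGFLOW_API_KEY environment variable and the {"ids": [...]} delete payload are assumptions here, not something this module pins down):

import os

from common import create_dataset, delete_dataset, upload_documents
from libs.auth import RAGFlowHttpApiAuth

auth = RAGFlowHttpApiAuth(os.environ["RAGFLOW_API_KEY"])  # assumed env var
res = create_dataset(auth, {"name": "smoke_test"})
dataset_id = res["data"]["id"]
upload_documents(auth, dataset_id, ["./ragflow_test.txt"])  # file must already exist
delete_dataset(auth, {"ids": [dataset_id]})  # assumed payload shape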
sdk/python/test/test_http_api/conftest.py (new file, 73 lines)
@@ -0,0 +1,73 @@
#
# Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#


import pytest
from common import delete_dataset
from libs.utils.file_utils import (
    create_docx_file,
    create_eml_file,
    create_excel_file,
    create_html_file,
    create_image_file,
    create_json_file,
    create_md_file,
    create_pdf_file,
    create_ppt_file,
    create_txt_file,
)


@pytest.fixture(scope="function", autouse=True)
def clear_datasets(get_http_api_auth):
    yield
    delete_dataset(get_http_api_auth)


@pytest.fixture
def generate_test_files(tmp_path):
    files = {}
    files["docx"] = tmp_path / "ragflow_test.docx"
    create_docx_file(files["docx"])

    files["excel"] = tmp_path / "ragflow_test.xlsx"
    create_excel_file(files["excel"])

    files["ppt"] = tmp_path / "ragflow_test.pptx"
    create_ppt_file(files["ppt"])

    files["image"] = tmp_path / "ragflow_test.png"
    create_image_file(files["image"])

    files["pdf"] = tmp_path / "ragflow_test.pdf"
    create_pdf_file(files["pdf"])

    files["txt"] = tmp_path / "ragflow_test.txt"
    create_txt_file(files["txt"])

    files["md"] = tmp_path / "ragflow_test.md"
    create_md_file(files["md"])

    files["json"] = tmp_path / "ragflow_test.json"
    create_json_file(files["json"])

    files["eml"] = tmp_path / "ragflow_test.eml"
    create_eml_file(files["eml"])

    files["html"] = tmp_path / "ragflow_test.html"
    create_html_file(files["html"])

    return files
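A hypothetical consumer of these fixtures, to make the flow concrete: generate_test_files hands back a dict of pathlib.Path objects keyed by file type, while the autouse clear_datasets fixture tears every dataset down after the test body returns (create_datasets and upload_documents are the helpers from common.py above):

from common import create_datasets, upload_documents

def test_upload_generated_pdf(get_http_api_auth, generate_test_files):
    ids = create_datasets(get_http_api_auth, 1)
    res = upload_documents(get_http_api_auth, ids[0], [generate_test_files["pdf"]])
    assert res["code"] == 0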
@@ -1,57 +0,0 @@
#
# Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import os

import requests

HOST_ADDRESS = os.getenv("HOST_ADDRESS", "http://127.0.0.1:9380")
API_URL = f"{HOST_ADDRESS}/api/v1/datasets"
HEADERS = {"Content-Type": "application/json"}


INVALID_API_TOKEN = "invalid_key_123"
DATASET_NAME_LIMIT = 128


def create_dataset(auth, payload):
    res = requests.post(url=API_URL, headers=HEADERS, auth=auth, json=payload)
    return res.json()


def list_dataset(auth, params=None):
    res = requests.get(url=API_URL, headers=HEADERS, auth=auth, params=params)
    return res.json()


def update_dataset(auth, dataset_id, payload):
    res = requests.put(
        url=f"{API_URL}/{dataset_id}", headers=HEADERS, auth=auth, json=payload
    )
    return res.json()


def delete_dataset(auth, payload=None):
    res = requests.delete(url=API_URL, headers=HEADERS, auth=auth, json=payload)
    return res.json()


def create_datasets(auth, num):
    ids = []
    for i in range(num):
        res = create_dataset(auth, {"name": f"dataset_{i}"})
        ids.append(res["data"]["id"])
    return ids
@@ -13,12 +13,12 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
import base64
from pathlib import Path

import pytest
from common import DATASET_NAME_LIMIT, INVALID_API_TOKEN, create_dataset
from libs.auth import RAGFlowHttpApiAuth
from libs.utils import encode_avatar
from libs.utils.file_utils import create_image_file


class TestAuthorization:
@@ -75,18 +75,11 @@ class TestDatasetCreation:


class TestAdvancedConfigurations:
    def test_avatar(self, get_http_api_auth, request):
        def encode_avatar(image_path):
            with Path.open(image_path, "rb") as file:
                binary_data = file.read()
                base64_encoded = base64.b64encode(binary_data).decode("utf-8")
                return base64_encoded

    def test_avatar(self, get_http_api_auth, tmp_path):
        fn = create_image_file(tmp_path / "ragflow_test.png")
        payload = {
            "name": "avatar_test",
            "avatar": encode_avatar(
                Path(request.config.rootdir) / "test/data/logo.svg"
            ),
            "avatar": encode_avatar(fn),
        }
        res = create_dataset(get_http_api_auth, payload)
        assert res["code"] == 0
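The inline base64 helper deleted in this hunk moves behind the libs.utils import at the top of the file. Judging purely from the removed lines, encode_avatar would reduce to a thin wrapper along these lines (a sketch, not the actual libs/utils source):

import base64
from pathlib import Path


def encode_avatar(image_path):
    # Read the image bytes and return them base64-encoded as UTF-8 text,
    # exactly what the deleted inline helper did.
    with Path(image_path).open("rb") as file:
        return base64.b64encode(file.read()).decode("utf-8")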
@@ -224,7 +224,6 @@ class TestDatasetList:
    ):
        create_datasets(get_http_api_auth, 3)
        res = list_dataset(get_http_api_auth, params=params)
        # print(res)
        assert res["code"] == expected_code
        if expected_code == 0:
            if callable(assertions):
@@ -0,0 +1,289 @@
#
# Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from concurrent.futures import ThreadPoolExecutor

import pytest
from common import (
    DATASET_NAME_LIMIT,
    INVALID_API_TOKEN,
    create_datasets,
    list_dataset,
    update_dataset,
)
from libs.auth import RAGFlowHttpApiAuth
from libs.utils import encode_avatar
from libs.utils.file_utils import create_image_file

# TODO: Missing scenario for updating embedding_model with chunk_count != 0


class TestAuthorization:
    @pytest.mark.parametrize(
        "auth, expected_code, expected_message",
        [
            (None, 0, "`Authorization` can't be empty"),
            (
                RAGFlowHttpApiAuth(INVALID_API_TOKEN),
                109,
                "Authentication error: API key is invalid!",
            ),
        ],
    )
    def test_invalid_auth(
        self, get_http_api_auth, auth, expected_code, expected_message
    ):
        ids = create_datasets(get_http_api_auth, 1)
        res = update_dataset(auth, ids[0], {"name": "new_name"})
        assert res["code"] == expected_code
        assert res["message"] == expected_message

class TestDatasetUpdate:
    @pytest.mark.parametrize(
        "name, expected_code, expected_message",
        [
            ("valid_name", 0, ""),
            (
                "a" * (DATASET_NAME_LIMIT + 1),
                102,
                "Dataset name should not be longer than 128 characters.",
            ),
            (0, 100, """AttributeError("'int' object has no attribute 'strip'")"""),
            (
                None,
                100,
                """AttributeError("'NoneType' object has no attribute 'strip'")""",
            ),
            pytest.param("", 102, "", marks=pytest.mark.xfail(reason="issue#5915")),
            ("dataset_1", 102, "Duplicated dataset name in updating dataset."),
            ("DATASET_1", 102, "Duplicated dataset name in updating dataset."),
        ],
    )
    def test_name(self, get_http_api_auth, name, expected_code, expected_message):
        ids = create_datasets(get_http_api_auth, 2)
        res = update_dataset(get_http_api_auth, ids[0], {"name": name})
        assert res["code"] == expected_code
        if expected_code == 0:
            res = list_dataset(get_http_api_auth, {"id": ids[0]})
            assert res["data"][0]["name"] == name
        else:
            assert res["message"] == expected_message

    @pytest.mark.parametrize(
        "embedding_model, expected_code, expected_message",
        [
            ("BAAI/bge-large-zh-v1.5", 0, ""),
            ("BAAI/bge-base-en-v1.5", 0, ""),
            ("BAAI/bge-large-en-v1.5", 0, ""),
            ("BAAI/bge-small-en-v1.5", 0, ""),
            ("BAAI/bge-small-zh-v1.5", 0, ""),
            ("jinaai/jina-embeddings-v2-base-en", 0, ""),
            ("jinaai/jina-embeddings-v2-small-en", 0, ""),
            ("nomic-ai/nomic-embed-text-v1.5", 0, ""),
            ("sentence-transformers/all-MiniLM-L6-v2", 0, ""),
            ("text-embedding-v2", 0, ""),
            ("text-embedding-v3", 0, ""),
            ("maidalun1020/bce-embedding-base_v1", 0, ""),
            (
                "other_embedding_model",
                102,
                "`embedding_model` other_embedding_model doesn't exist",
            ),
            (None, 102, "`embedding_model` can't be empty"),
        ],
    )
    def test_embedding_model(
        self, get_http_api_auth, embedding_model, expected_code, expected_message
    ):
        ids = create_datasets(get_http_api_auth, 1)
        res = update_dataset(
            get_http_api_auth, ids[0], {"embedding_model": embedding_model}
        )
        assert res["code"] == expected_code
        if expected_code == 0:
            res = list_dataset(get_http_api_auth, {"id": ids[0]})
            assert res["data"][0]["embedding_model"] == embedding_model
        else:
            assert res["message"] == expected_message

    @pytest.mark.parametrize(
        "chunk_method, expected_code, expected_message",
        [
            ("naive", 0, ""),
            ("manual", 0, ""),
            ("qa", 0, ""),
            ("table", 0, ""),
            ("paper", 0, ""),
            ("book", 0, ""),
            ("laws", 0, ""),
            ("presentation", 0, ""),
            ("picture", 0, ""),
            ("one", 0, ""),
            ("knowledge_graph", 0, ""),
            ("email", 0, ""),
            ("tag", 0, ""),
            ("", 0, ""),
            (
                "other_chunk_method",
                102,
                "'other_chunk_method' is not in ['naive', 'manual', 'qa', 'table',"
                " 'paper', 'book', 'laws', 'presentation', 'picture', 'one', "
                "'knowledge_graph', 'email', 'tag']",
            ),
        ],
    )
    def test_chunk_method(
        self, get_http_api_auth, chunk_method, expected_code, expected_message
    ):
        ids = create_datasets(get_http_api_auth, 1)
        res = update_dataset(get_http_api_auth, ids[0], {"chunk_method": chunk_method})
        assert res["code"] == expected_code
        if expected_code == 0:
            res = list_dataset(get_http_api_auth, {"id": ids[0]})
            if chunk_method != "":
                assert res["data"][0]["chunk_method"] == chunk_method
            else:
                assert res["data"][0]["chunk_method"] == "naive"
        else:
            assert res["message"] == expected_message

    def test_avatar(self, get_http_api_auth, tmp_path):
        ids = create_datasets(get_http_api_auth, 1)
        fn = create_image_file(tmp_path / "ragflow_test.png")
        payload = {"avatar": encode_avatar(fn)}
        res = update_dataset(get_http_api_auth, ids[0], payload)
        assert res["code"] == 0

    def test_description(self, get_http_api_auth):
        ids = create_datasets(get_http_api_auth, 1)
        payload = {"description": "description"}
        res = update_dataset(get_http_api_auth, ids[0], payload)
        assert res["code"] == 0

        res = list_dataset(get_http_api_auth, {"id": ids[0]})
        assert res["data"][0]["description"] == "description"

    def test_pagerank(self, get_http_api_auth):
        ids = create_datasets(get_http_api_auth, 1)
        payload = {"pagerank": 1}
        res = update_dataset(get_http_api_auth, ids[0], payload)
        assert res["code"] == 0

        res = list_dataset(get_http_api_auth, {"id": ids[0]})
        assert res["data"][0]["pagerank"] == 1

    def test_similarity_threshold(self, get_http_api_auth):
        ids = create_datasets(get_http_api_auth, 1)
        payload = {"similarity_threshold": 1}
        res = update_dataset(get_http_api_auth, ids[0], payload)
        assert res["code"] == 0

        res = list_dataset(get_http_api_auth, {"id": ids[0]})
        assert res["data"][0]["similarity_threshold"] == 1

    @pytest.mark.parametrize(
        "permission, expected_code",
        [
            ("me", 0),
            ("team", 0),
            ("", 0),
            ("ME", 102),
            ("TEAM", 102),
            ("other_permission", 102),
        ],
    )
    def test_permission(self, get_http_api_auth, permission, expected_code):
        ids = create_datasets(get_http_api_auth, 1)
        payload = {"permission": permission}
        res = update_dataset(get_http_api_auth, ids[0], payload)
        assert res["code"] == expected_code

        res = list_dataset(get_http_api_auth, {"id": ids[0]})
        if expected_code == 0 and permission != "":
            assert res["data"][0]["permission"] == permission
        if permission == "":
            assert res["data"][0]["permission"] == "me"

    def test_vector_similarity_weight(self, get_http_api_auth):
        ids = create_datasets(get_http_api_auth, 1)
        payload = {"vector_similarity_weight": 1}
        res = update_dataset(get_http_api_auth, ids[0], payload)
        assert res["code"] == 0

        res = list_dataset(get_http_api_auth, {"id": ids[0]})
        assert res["data"][0]["vector_similarity_weight"] == 1

    def test_invalid_dataset_id(self, get_http_api_auth):
        create_datasets(get_http_api_auth, 1)
        res = update_dataset(
            get_http_api_auth, "invalid_dataset_id", {"name": "invalid_dataset_id"}
        )
        assert res["code"] == 102
        assert res["message"] == "You don't own the dataset"

    @pytest.mark.parametrize(
        "payload, expected_code, expected_message",
        [
            ({"chunk_count": 1}, 102, "Can't change `chunk_count`."),
            (
                {"create_date": "Tue, 11 Mar 2025 13:37:23 GMT"},
                102,
                "The input parameters are invalid.",
            ),
            ({"create_time": 1741671443322}, 102, "The input parameters are invalid."),
            ({"created_by": "aa"}, 102, "The input parameters are invalid."),
            ({"document_count": 1}, 102, "Can't change `document_count`."),
            ({"id": "id"}, 102, "The input parameters are invalid."),
            ({"status": "1"}, 102, "The input parameters are invalid."),
            (
                {"tenant_id": "e57c1966f99211efb41e9e45646e0111"},
                102,
                "Can't change `tenant_id`.",
            ),
            ({"token_num": 1}, 102, "The input parameters are invalid."),
            (
                {"update_date": "Tue, 11 Mar 2025 13:37:23 GMT"},
                102,
                "The input parameters are invalid.",
            ),
            ({"update_time": 1741671443339}, 102, "The input parameters are invalid."),
        ],
    )
    def test_modify_read_only_field(
        self, get_http_api_auth, payload, expected_code, expected_message
    ):
        ids = create_datasets(get_http_api_auth, 1)
        res = update_dataset(get_http_api_auth, ids[0], payload)
        assert res["code"] == expected_code
        assert res["message"] == expected_message

    def test_modify_unknown_field(self, get_http_api_auth):
        ids = create_datasets(get_http_api_auth, 1)
        res = update_dataset(get_http_api_auth, ids[0], {"unknown_field": 0})
        assert res["code"] == 100

    def test_concurrent_update(self, get_http_api_auth):
        ids = create_datasets(get_http_api_auth, 1)

        with ThreadPoolExecutor(max_workers=5) as executor:
            futures = [
                executor.submit(
                    update_dataset, get_http_api_auth, ids[0], {"name": f"dataset_{i}"}
                )
                for i in range(100)
            ]
        responses = [f.result() for f in futures]
        assert all(r["code"] == 0 for r in responses)
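A note on the concurrency test that closes this file: it only asserts that each of the 100 PUTs individually returns code 0. Because every update replaces `name` wholesale, the dataset presumably ends up with whichever `dataset_{i}` the server happened to apply last; the test intentionally makes no claim about which one wins.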
@@ -0,0 +1,230 @@
#
# Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import string
from concurrent.futures import ThreadPoolExecutor

import pytest
import requests
from common import (
    DOCUMENT_NAME_LIMIT,
    FILE_API_URL,
    HOST_ADDRESS,
    INVALID_API_TOKEN,
    create_datasets,
    list_dataset,
    upload_documents,
)
from libs.auth import RAGFlowHttpApiAuth
from libs.utils.file_utils import create_txt_file
from requests_toolbelt import MultipartEncoder


class TestAuthorization:
    @pytest.mark.parametrize(
        "auth, expected_code, expected_message",
        [
            (None, 0, "`Authorization` can't be empty"),
            (
                RAGFlowHttpApiAuth(INVALID_API_TOKEN),
                109,
                "Authentication error: API key is invalid!",
            ),
        ],
    )
    def test_invalid_auth(
        self, get_http_api_auth, auth, expected_code, expected_message
    ):
        ids = create_datasets(get_http_api_auth, 1)
        res = upload_documents(auth, ids[0])
        assert res["code"] == expected_code
        assert res["message"] == expected_message


class TestUploadDocuments:
    def test_valid_single_upload(self, get_http_api_auth, tmp_path):
        ids = create_datasets(get_http_api_auth, 1)
        fp = create_txt_file(tmp_path / "ragflow_test.txt")
        res = upload_documents(get_http_api_auth, ids[0], [fp])
        assert res["code"] == 0
        assert res["data"][0]["dataset_id"] == ids[0]
        assert res["data"][0]["name"] == fp.name

    @pytest.mark.parametrize(
        "file_type",
        [
            "docx",
            "excel",
            "ppt",
            "image",
            "pdf",
            "txt",
            "md",
            "json",
            "eml",
            "html",
        ],
    )
    def test_file_type_validation(
        self, get_http_api_auth, generate_test_files, file_type
    ):
        ids = create_datasets(get_http_api_auth, 1)
        fp = generate_test_files[file_type]
        res = upload_documents(get_http_api_auth, ids[0], [fp])
        assert res["code"] == 0
        assert res["data"][0]["dataset_id"] == ids[0]
        assert res["data"][0]["name"] == fp.name

    @pytest.mark.parametrize(
        "file_type",
        ["exe", "unknown"],
    )
    def test_unsupported_file_type(self, get_http_api_auth, tmp_path, file_type):
        ids = create_datasets(get_http_api_auth, 1)
        fp = tmp_path / f"ragflow_test.{file_type}"
        fp.touch()
        res = upload_documents(get_http_api_auth, ids[0], [fp])
        assert res["code"] == 500
        assert (
            res["message"]
            == f"ragflow_test.{file_type}: This type of file has not been supported yet!"
        )

    def test_missing_file(self, get_http_api_auth):
        ids = create_datasets(get_http_api_auth, 1)
        res = upload_documents(get_http_api_auth, ids[0])
        assert res["code"] == 101
        assert res["message"] == "No file part!"

    def test_empty_file(self, get_http_api_auth, tmp_path):
        ids = create_datasets(get_http_api_auth, 1)
        fp = tmp_path / "empty.txt"
        fp.touch()

        res = upload_documents(get_http_api_auth, ids[0], [fp])
        assert res["code"] == 0
        assert res["data"][0]["size"] == 0

    def test_filename_empty(self, get_http_api_auth, tmp_path):
        ids = create_datasets(get_http_api_auth, 1)
        fp = create_txt_file(tmp_path / "ragflow_test.txt")
        url = f"{HOST_ADDRESS}{FILE_API_URL}".format(dataset_id=ids[0])
        fields = (("file", ("", fp.open("rb"))),)
        m = MultipartEncoder(fields=fields)
        res = requests.post(
            url=url,
            headers={"Content-Type": m.content_type},
            auth=get_http_api_auth,
            data=m,
        )
        assert res.json()["code"] == 101
        assert res.json()["message"] == "No file selected!"

    def test_filename_exceeds_max_length(self, get_http_api_auth, tmp_path):
        ids = create_datasets(get_http_api_auth, 1)
        # 125 "a"s plus ".txt" gives 129 characters, one over DOCUMENT_NAME_LIMIT (128)
        fp = create_txt_file(tmp_path / f"{'a' * (DOCUMENT_NAME_LIMIT - 3)}.txt")
        res = upload_documents(get_http_api_auth, ids[0], [fp])
        assert res["code"] == 500
        assert (
            res["message"]
            == f"{'a' * (DOCUMENT_NAME_LIMIT - 3)}.txt: Exceed the maximum length of file name!"
        )

    def test_invalid_dataset_id(self, get_http_api_auth, tmp_path):
        fp = create_txt_file(tmp_path / "ragflow_test.txt")
        res = upload_documents(get_http_api_auth, "invalid_dataset_id", [fp])
        assert res["code"] == 100
        assert (
            res["message"]
            == """LookupError("Can\'t find the dataset with ID invalid_dataset_id!")"""
        )

    def test_duplicate_files(self, get_http_api_auth, tmp_path):
        ids = create_datasets(get_http_api_auth, 1)
        fp = create_txt_file(tmp_path / "ragflow_test.txt")
        res = upload_documents(get_http_api_auth, ids[0], [fp, fp])
        assert res["code"] == 0
        assert len(res["data"]) == 2
        for i in range(len(res["data"])):
            assert res["data"][i]["dataset_id"] == ids[0]
            expected_name = fp.name
            if i != 0:
                expected_name = f"{fp.stem}({i}){fp.suffix}"
            assert res["data"][i]["name"] == expected_name

    def test_same_file_repeat(self, get_http_api_auth, tmp_path):
        ids = create_datasets(get_http_api_auth, 1)
        fp = create_txt_file(tmp_path / "ragflow_test.txt")
        for i in range(10):
            res = upload_documents(get_http_api_auth, ids[0], [fp])
            assert res["code"] == 0
            assert len(res["data"]) == 1
            assert res["data"][0]["dataset_id"] == ids[0]
            expected_name = fp.name
            if i != 0:
                expected_name = f"{fp.stem}({i}){fp.suffix}"
            assert res["data"][0]["name"] == expected_name
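    # Editor's sketch: the two tests above assert the same server-side
    # collision-renaming rule, inferred from the expected names rather than
    # taken from the server code. A colliding filename gets "(i)" spliced
    # in before its extension:
    #
    #     def dedup_name(name: str, occurrence: int) -> str:
    #         p = PurePath(name)
    #         if occurrence == 0:
    #             return p.name
    #         return f"{p.stem}({occurrence}){p.suffix}"
    #
    #     dedup_name("ragflow_test.txt", 2)  # -> "ragflow_test(2).txt"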
    def test_filename_special_characters(self, get_http_api_auth, tmp_path):
        ids = create_datasets(get_http_api_auth, 1)
        illegal_chars = '<>:"/\\|?*'
        translation_table = str.maketrans({char: "_" for char in illegal_chars})
        safe_filename = string.punctuation.translate(translation_table)
        fp = tmp_path / f"{safe_filename}.txt"
        fp.write_text("Sample text content")

        res = upload_documents(get_http_api_auth, ids[0], [fp])
        assert res["code"] == 0
        assert len(res["data"]) == 1
        assert res["data"][0]["dataset_id"] == ids[0]
        assert res["data"][0]["name"] == fp.name

    def test_multiple_files(self, get_http_api_auth, tmp_path):
        ids = create_datasets(get_http_api_auth, 1)
        expected_document_count = 20
        fps = []
        for i in range(expected_document_count):
            fp = create_txt_file(tmp_path / f"ragflow_test_{i}.txt")
            fps.append(fp)
        res = upload_documents(get_http_api_auth, ids[0], fps)
        assert res["code"] == 0

        res = list_dataset(get_http_api_auth, {"id": ids[0]})
        assert res["data"][0]["document_count"] == expected_document_count

    @pytest.mark.xfail
    def test_concurrent_upload(self, get_http_api_auth, tmp_path):
        ids = create_datasets(get_http_api_auth, 1)

        expected_document_count = 20
        fps = []
        for i in range(expected_document_count):
            fp = create_txt_file(tmp_path / f"ragflow_test_{i}.txt")
            fps.append(fp)

        with ThreadPoolExecutor(max_workers=5) as executor:
            futures = [
                executor.submit(
                    upload_documents, get_http_api_auth, ids[0], fps[i : i + 1]
                )
                for i in range(expected_document_count)
            ]
        responses = [f.result() for f in futures]
        assert all(r["code"] == 0 for r in responses)

        res = list_dataset(get_http_api_auth, {"id": ids[0]})
        assert res["data"][0]["document_count"] == expected_document_count
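Throughout these tests, authentication is handled by passing RAGFlowHttpApiAuth (defined in libs/auth, which this diff does not show) as the `auth=` argument to requests. A minimal sketch of such an adapter, assuming the server expects a Bearer token:

import requests


class RAGFlowHttpApiAuth(requests.auth.AuthBase):
    # Hypothetical reconstruction; the real class lives in libs/auth.
    def __init__(self, token):
        self.token = token

    def __call__(self, r):
        # requests calls this hook on every prepared request, which is the
        # standard place to inject an Authorization header.
        r.headers["Authorization"] = f"Bearer {self.token}"
        return r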
sdk/python/uv.lock (generated, 228 lines)
@@ -19,6 +19,15 @@ wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/38/fc/bce832fd4fd99766c04d1ee0eead6b0ec6486fb100ae5e74c1d91292b982/certifi-2025.1.31-py3-none-any.whl", hash = "sha256:ca78db4565a652026a4db2bcdf68f2fb589ea80d0be70e03929ed730746b84fe" },
]

[[package]]
name = "chardet"
version = "5.2.0"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/f3/0d/f7b6ab21ec75897ed80c17d79b15951a719226b9fababf1e40ea74d69079/chardet-5.2.0.tar.gz", hash = "sha256:1b3b6ff479a8c414bc3fa2c0852995695c4a026dcd6d0633b2dd092ca39c1cf7" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/38/6f/f5fbc992a329ee4e0f288c1fe0e2ad9485ed064cac731ed2fe47dcc38cbf/chardet-5.2.0-py3-none-any.whl", hash = "sha256:e1cf59446890a00105fe7b7912492ea04b6e6f06d4b742b2c788469e34c82970" },
]

[[package]]
name = "charset-normalizer"
version = "3.4.1"
@@ -76,6 +85,15 @@ wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6" },
]

[[package]]
name = "et-xmlfile"
version = "2.0.0"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/d3/38/af70d7ab1ae9d4da450eeec1fa3918940a5fafb9055e934af8d6eb0c2313/et_xmlfile-2.0.0.tar.gz", hash = "sha256:dab3f4764309081ce75662649be815c4c9081e88f0837825f90fd28317d4da54" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/c1/8b/5fe2cc11fee489817272089c4203e679c63b570a5aaeb18d852ae3cbba6a/et_xmlfile-2.0.0-py3-none-any.whl", hash = "sha256:7a91720bc756843502c3b7504c77b8fe44217c85c537d85037f0f536151b2caa" },
]

[[package]]
name = "exceptiongroup"
version = "1.2.2"
@@ -103,6 +121,83 @@ wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/ef/a6/62565a6e1cf69e10f5727360368e451d4b7f58beeac6173dc9db836a5b46/iniconfig-2.0.0-py3-none-any.whl", hash = "sha256:b6a85871a79d2e3b22d2d1b94ac2824226a63c6b741c88f7ae975f18b6778374" },
]

[[package]]
name = "lxml"
version = "5.3.1"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/ef/f6/c15ca8e5646e937c148e147244817672cf920b56ac0bf2cc1512ae674be8/lxml-5.3.1.tar.gz", hash = "sha256:106b7b5d2977b339f1e97efe2778e2ab20e99994cbb0ec5e55771ed0795920c8" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/80/4b/73426192004a643c11a644ed2346dbe72da164c8e775ea2e70f60e63e516/lxml-5.3.1-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:a4058f16cee694577f7e4dd410263cd0ef75644b43802a689c2b3c2a7e69453b" },
    { url = "https://mirrors.aliyun.com/pypi/packages/30/c2/3b28f642b43fdf9580d936e8fdd3ec43c01a97ecfe17fd67f76ce9099752/lxml-5.3.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:364de8f57d6eda0c16dcfb999af902da31396949efa0e583e12675d09709881b" },
    { url = "https://mirrors.aliyun.com/pypi/packages/1f/a5/45279e464174b99d72d25bc018b097f9211c0925a174ca582a415609f036/lxml-5.3.1-cp310-cp310-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:528f3a0498a8edc69af0559bdcf8a9f5a8bf7c00051a6ef3141fdcf27017bbf5" },
    { url = "https://mirrors.aliyun.com/pypi/packages/f0/e7/10cd8b9e27ffb6b3465b76604725b67b7c70d4e399750ff88de1b38ab9eb/lxml-5.3.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:db4743e30d6f5f92b6d2b7c86b3ad250e0bad8dee4b7ad8a0c44bfb276af89a3" },
    { url = "https://mirrors.aliyun.com/pypi/packages/ce/54/2d6f634924920b17122445136345d44c6d69178c9c49e161aa8f206739d6/lxml-5.3.1-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:17b5d7f8acf809465086d498d62a981fa6a56d2718135bb0e4aa48c502055f5c" },
    { url = "https://mirrors.aliyun.com/pypi/packages/a2/fe/7f5ae8fd1f357fcb21b0d4e20416fae870d654380b6487adbcaaf0df9b31/lxml-5.3.1-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:928e75a7200a4c09e6efc7482a1337919cc61fe1ba289f297827a5b76d8969c2" },
    { url = "https://mirrors.aliyun.com/pypi/packages/af/70/22fecb6f2ca8dc77d14ab6be3cef767ff8340040bc95dca384b5b1cb333a/lxml-5.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5a997b784a639e05b9d4053ef3b20c7e447ea80814a762f25b8ed5a89d261eac" },
    { url = "https://mirrors.aliyun.com/pypi/packages/63/91/21619cc14f7fd1de3f1bdf86cc8106edacf4d685b540d658d84247a3a32a/lxml-5.3.1-cp310-cp310-manylinux_2_28_aarch64.whl", hash = "sha256:7b82e67c5feb682dbb559c3e6b78355f234943053af61606af126df2183b9ef9" },
    { url = "https://mirrors.aliyun.com/pypi/packages/50/0f/27183248fa3cdd2040047ceccd320ff1ed1344167f38a4ac26aed092268b/lxml-5.3.1-cp310-cp310-manylinux_2_28_ppc64le.whl", hash = "sha256:f1de541a9893cf8a1b1db9bf0bf670a2decab42e3e82233d36a74eda7822b4c9" },
    { url = "https://mirrors.aliyun.com/pypi/packages/c6/8d/9b7388d5b23ed2f239a992a478cbd0ce313aaa2d008dd73c4042b190b6a9/lxml-5.3.1-cp310-cp310-manylinux_2_28_s390x.whl", hash = "sha256:de1fc314c3ad6bc2f6bd5b5a5b9357b8c6896333d27fdbb7049aea8bd5af2d79" },
    { url = "https://mirrors.aliyun.com/pypi/packages/65/8e/590e20833220eac55b6abcde71d3ae629d38ac1c3543bcc2bfe1f3c2f5d1/lxml-5.3.1-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:7c0536bd9178f754b277a3e53f90f9c9454a3bd108b1531ffff720e082d824f2" },
    { url = "https://mirrors.aliyun.com/pypi/packages/4e/77/cabdf5569fd0415a88ebd1d62d7f2814e71422439b8564aaa03e7eefc069/lxml-5.3.1-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:68018c4c67d7e89951a91fbd371e2e34cd8cfc71f0bb43b5332db38497025d51" },
    { url = "https://mirrors.aliyun.com/pypi/packages/49/bd/f0b6d50ea7b8b54aaa5df4410cb1d5ae6ffa016b8e0503cae08b86c24674/lxml-5.3.1-cp310-cp310-musllinux_1_2_ppc64le.whl", hash = "sha256:aa826340a609d0c954ba52fd831f0fba2a4165659ab0ee1a15e4aac21f302406" },
    { url = "https://mirrors.aliyun.com/pypi/packages/fa/69/1793d00a4e3da7f27349edb5a6f3da947ed921263cd9a243fab11c6cbc07/lxml-5.3.1-cp310-cp310-musllinux_1_2_s390x.whl", hash = "sha256:796520afa499732191e39fc95b56a3b07f95256f2d22b1c26e217fb69a9db5b5" },
    { url = "https://mirrors.aliyun.com/pypi/packages/d3/c9/e2449129b6cb2054c898df8d850ea4dadd75b4c33695a6c4b0f35082f1e7/lxml-5.3.1-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:3effe081b3135237da6e4c4530ff2a868d3f80be0bda027e118a5971285d42d0" },
    { url = "https://mirrors.aliyun.com/pypi/packages/ed/63/e5da540eba6ab9a0d4188eeaa5c85767b77cafa8efeb70da0593d6cd3b81/lxml-5.3.1-cp310-cp310-win32.whl", hash = "sha256:a22f66270bd6d0804b02cd49dae2b33d4341015545d17f8426f2c4e22f557a23" },
    { url = "https://mirrors.aliyun.com/pypi/packages/08/71/853a3ad812cd24c35b7776977cb0ae40c2b64ff79ad6d6c36c987daffc49/lxml-5.3.1-cp310-cp310-win_amd64.whl", hash = "sha256:0bcfadea3cdc68e678d2b20cb16a16716887dd00a881e16f7d806c2138b8ff0c" },
    { url = "https://mirrors.aliyun.com/pypi/packages/57/bb/2faea15df82114fa27f2a86eec220506c532ee8ce211dff22f48881b353a/lxml-5.3.1-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:e220f7b3e8656ab063d2eb0cd536fafef396829cafe04cb314e734f87649058f" },
    { url = "https://mirrors.aliyun.com/pypi/packages/9f/d3/374114084abb1f96026eccb6cd48b070f85de82fdabae6c2f1e198fa64e5/lxml-5.3.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:0f2cfae0688fd01f7056a17367e3b84f37c545fb447d7282cf2c242b16262607" },
    { url = "https://mirrors.aliyun.com/pypi/packages/0f/fb/44a46efdc235c2dd763c1e929611d8ff3b920c32b8fcd9051d38f4d04633/lxml-5.3.1-cp311-cp311-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:67d2f8ad9dcc3a9e826bdc7802ed541a44e124c29b7d95a679eeb58c1c14ade8" },
    { url = "https://mirrors.aliyun.com/pypi/packages/3b/e5/168ddf9f16a90b590df509858ae97a8219d6999d5a132ad9f72427454bed/lxml-5.3.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:db0c742aad702fd5d0c6611a73f9602f20aec2007c102630c06d7633d9c8f09a" },
    { url = "https://mirrors.aliyun.com/pypi/packages/f9/0e/3e2742c6f4854b202eb8587c1f7ed760179f6a9fcb34a460497c8c8f3078/lxml-5.3.1-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:198bb4b4dd888e8390afa4f170d4fa28467a7eaf857f1952589f16cfbb67af27" },
    { url = "https://mirrors.aliyun.com/pypi/packages/b8/03/b2f2ab9e33c47609c80665e75efed258b030717e06693835413b34e797cb/lxml-5.3.1-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:d2a3e412ce1849be34b45922bfef03df32d1410a06d1cdeb793a343c2f1fd666" },
    { url = "https://mirrors.aliyun.com/pypi/packages/93/ad/0ecfb082b842358c8a9e3115ec944b7240f89821baa8cd7c0cb8a38e05cb/lxml-5.3.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2b8969dbc8d09d9cd2ae06362c3bad27d03f433252601ef658a49bd9f2b22d79" },
    { url = "https://mirrors.aliyun.com/pypi/packages/64/5b/3e93d8ebd2b7eb984c2ad74dfff75493ce96e7b954b12e4f5fc34a700414/lxml-5.3.1-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:5be8f5e4044146a69c96077c7e08f0709c13a314aa5315981185c1f00235fe65" },
    { url = "https://mirrors.aliyun.com/pypi/packages/91/83/7dc412362ee7a0259c7f64349393262525061fad551a1340ef92c59d9732/lxml-5.3.1-cp311-cp311-manylinux_2_28_ppc64le.whl", hash = "sha256:133f3493253a00db2c870d3740bc458ebb7d937bd0a6a4f9328373e0db305709" },
    { url = "https://mirrors.aliyun.com/pypi/packages/1e/41/c337f121d9dca148431f246825e021fa1a3f66a6b975deab1950530fdb04/lxml-5.3.1-cp311-cp311-manylinux_2_28_s390x.whl", hash = "sha256:52d82b0d436edd6a1d22d94a344b9a58abd6c68c357ed44f22d4ba8179b37629" },
    { url = "https://mirrors.aliyun.com/pypi/packages/a5/73/762c319c4906b3db67e4abc7cfe7d66c34996edb6d0e8cb60f462954d662/lxml-5.3.1-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:1b6f92e35e2658a5ed51c6634ceb5ddae32053182851d8cad2a5bc102a359b33" },
    { url = "https://mirrors.aliyun.com/pypi/packages/c1/e7/d1e296cb3b3b46371220a31350730948d7bea41cc9123c5fd219dea33c29/lxml-5.3.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:203b1d3eaebd34277be06a3eb880050f18a4e4d60861efba4fb946e31071a295" },
    { url = "https://mirrors.aliyun.com/pypi/packages/df/90/4adc854475105b93ead6c0c736f762d29371751340dcf5588cfcf8191b8a/lxml-5.3.1-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:155e1a5693cf4b55af652f5c0f78ef36596c7f680ff3ec6eb4d7d85367259b2c" },
    { url = "https://mirrors.aliyun.com/pypi/packages/f0/0d/39864efbd231c13eb53edee2ab91c742c24d2f93efe2af7d3fe4343e42c1/lxml-5.3.1-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:22ec2b3c191f43ed21f9545e9df94c37c6b49a5af0a874008ddc9132d49a2d9c" },
    { url = "https://mirrors.aliyun.com/pypi/packages/8d/7a/630a64ceb1088196de182e2e33b5899691c3e1ae21af688e394208bd6810/lxml-5.3.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:7eda194dd46e40ec745bf76795a7cccb02a6a41f445ad49d3cf66518b0bd9cff" },
    { url = "https://mirrors.aliyun.com/pypi/packages/b2/3d/091bc7b592333754cb346c1507ca948ab39bc89d83577ac8f1da3be4dece/lxml-5.3.1-cp311-cp311-win32.whl", hash = "sha256:fb7c61d4be18e930f75948705e9718618862e6fc2ed0d7159b2262be73f167a2" },
    { url = "https://mirrors.aliyun.com/pypi/packages/12/8c/7d47cfc0d04fd4e3639ec7e1c96c2561d5e890eb900de8f76eea75e0964a/lxml-5.3.1-cp311-cp311-win_amd64.whl", hash = "sha256:c809eef167bf4a57af4b03007004896f5c60bd38dc3852fcd97a26eae3d4c9e6" },
    { url = "https://mirrors.aliyun.com/pypi/packages/3b/f4/5121aa9ee8e09b8b8a28cf3709552efe3d206ca51a20d6fa471b60bb3447/lxml-5.3.1-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:e69add9b6b7b08c60d7ff0152c7c9a6c45b4a71a919be5abde6f98f1ea16421c" },
    { url = "https://mirrors.aliyun.com/pypi/packages/0a/ca/8e9aa01edddc74878f4aea85aa9ab64372f46aa804d1c36dda861bf9eabf/lxml-5.3.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:4e52e1b148867b01c05e21837586ee307a01e793b94072d7c7b91d2c2da02ffe" },
    { url = "https://mirrors.aliyun.com/pypi/packages/b2/b3/ea40a5c98619fbd7e9349df7007994506d396b97620ced34e4e5053d3734/lxml-5.3.1-cp312-cp312-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a4b382e0e636ed54cd278791d93fe2c4f370772743f02bcbe431a160089025c9" },
    { url = "https://mirrors.aliyun.com/pypi/packages/3a/5e/375418be35f8a695cadfe7e7412f16520e62e24952ed93c64c9554755464/lxml-5.3.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c2e49dc23a10a1296b04ca9db200c44d3eb32c8d8ec532e8c1fd24792276522a" },
    { url = "https://mirrors.aliyun.com/pypi/packages/79/7c/d258eaaa9560f6664f9b426a5165103015bee6512d8931e17342278bad0a/lxml-5.3.1-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:4399b4226c4785575fb20998dc571bc48125dc92c367ce2602d0d70e0c455eb0" },
    { url = "https://mirrors.aliyun.com/pypi/packages/03/bc/a041415be4135a1b3fdf017a5d873244cc16689456166fbdec4b27fba153/lxml-5.3.1-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:5412500e0dc5481b1ee9cf6b38bb3b473f6e411eb62b83dc9b62699c3b7b79f7" },
    { url = "https://mirrors.aliyun.com/pypi/packages/32/88/047f24967d5e3fc97848ea2c207eeef0f16239cdc47368c8b95a8dc93a33/lxml-5.3.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1c93ed3c998ea8472be98fb55aed65b5198740bfceaec07b2eba551e55b7b9ae" },
    { url = "https://mirrors.aliyun.com/pypi/packages/3d/b5/ecf5a20937ecd21af02c5374020f4e3a3538e10a32379a7553fca3d77094/lxml-5.3.1-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:63d57fc94eb0bbb4735e45517afc21ef262991d8758a8f2f05dd6e4174944519" },
    { url = "https://mirrors.aliyun.com/pypi/packages/a4/05/56c359e07275911ed5f35ab1d63c8cd3360d395fb91e43927a2ae90b0322/lxml-5.3.1-cp312-cp312-manylinux_2_28_ppc64le.whl", hash = "sha256:b450d7cabcd49aa7ab46a3c6aa3ac7e1593600a1a0605ba536ec0f1b99a04322" },
    { url = "https://mirrors.aliyun.com/pypi/packages/b7/f4/f95e3ae12e9f32fbcde00f9affa6b0df07f495117f62dbb796a9a31c84d6/lxml-5.3.1-cp312-cp312-manylinux_2_28_s390x.whl", hash = "sha256:4df0ec814b50275ad6a99bc82a38b59f90e10e47714ac9871e1b223895825468" },
    { url = "https://mirrors.aliyun.com/pypi/packages/c5/f8/309546aec092434166a6e11c7dcecb5c2d0a787c18c072d61e18da9eba57/lxml-5.3.1-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:d184f85ad2bb1f261eac55cddfcf62a70dee89982c978e92b9a74a1bfef2e367" },
    { url = "https://mirrors.aliyun.com/pypi/packages/71/1c/b951817cb5058ca7c332d012dfe8bc59dabd0f0a8911ddd7b7ea8e41cfbd/lxml-5.3.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:b725e70d15906d24615201e650d5b0388b08a5187a55f119f25874d0103f90dd" },
    { url = "https://mirrors.aliyun.com/pypi/packages/31/23/45feba8dae1d35fcca1e51b051f59dc4223cbd23e071a31e25f3f73938a8/lxml-5.3.1-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:a31fa7536ec1fb7155a0cd3a4e3d956c835ad0a43e3610ca32384d01f079ea1c" },
    { url = "https://mirrors.aliyun.com/pypi/packages/61/69/be245d7b2dbef81c542af59c97fcd641fbf45accf2dc1c325bae7d0d014c/lxml-5.3.1-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:3c3c8b55c7fc7b7e8877b9366568cc73d68b82da7fe33d8b98527b73857a225f" },
    { url = "https://mirrors.aliyun.com/pypi/packages/69/06/128af2ed04bac99b8f83becfb74c480f1aa18407b5c329fad457e08a1bf4/lxml-5.3.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:d61ec60945d694df806a9aec88e8f29a27293c6e424f8ff91c80416e3c617645" },
    { url = "https://mirrors.aliyun.com/pypi/packages/8a/2d/f03a21cf6cc75cdd083563e509c7b6b159d761115c4142abb5481094ed8c/lxml-5.3.1-cp312-cp312-win32.whl", hash = "sha256:f4eac0584cdc3285ef2e74eee1513a6001681fd9753b259e8159421ed28a72e5" },
    { url = "https://mirrors.aliyun.com/pypi/packages/2b/9c/8abe21585d20ef70ad9cec7562da4332b764ed69ec29b7389d23dfabcea0/lxml-5.3.1-cp312-cp312-win_amd64.whl", hash = "sha256:29bfc8d3d88e56ea0a27e7c4897b642706840247f59f4377d81be8f32aa0cfbf" },
    { url = "https://mirrors.aliyun.com/pypi/packages/d2/b4/89a68d05f267f05cc1b8b2f289a8242955705b1b0a9d246198227817ee46/lxml-5.3.1-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:afa578b6524ff85fb365f454cf61683771d0170470c48ad9d170c48075f86725" },
    { url = "https://mirrors.aliyun.com/pypi/packages/7f/0d/c034a541e7a1153527d7880c62493a74f2277f38e64de2480cadd0d4cf96/lxml-5.3.1-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:67f5e80adf0aafc7b5454f2c1cb0cde920c9b1f2cbd0485f07cc1d0497c35c5d" },
    { url = "https://mirrors.aliyun.com/pypi/packages/35/5c/38e183c2802f14fbdaa75c3266e11d0ca05c64d78e8cdab2ee84e954a565/lxml-5.3.1-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2dd0b80ac2d8f13ffc906123a6f20b459cb50a99222d0da492360512f3e50f84" },
    { url = "https://mirrors.aliyun.com/pypi/packages/18/5b/14f93b359b3c29673d5d282bc3a6edb3a629879854a77541841aba37607f/lxml-5.3.1-pp310-pypy310_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:422c179022ecdedbe58b0e242607198580804253da220e9454ffe848daa1cfd2" },
    { url = "https://mirrors.aliyun.com/pypi/packages/f6/08/8471de65f3dee70a3a50e7082fd7409f0ac7a1ace777c13fca4aea1a5759/lxml-5.3.1-pp310-pypy310_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:524ccfded8989a6595dbdda80d779fb977dbc9a7bc458864fc9a0c2fc15dc877" },
    { url = "https://mirrors.aliyun.com/pypi/packages/83/29/00b9b0322a473aee6cda87473401c9abb19506cd650cc69a8aa38277ea74/lxml-5.3.1-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:48fd46bf7155def2e15287c6f2b133a2f78e2d22cdf55647269977b873c65499" },
]

[[package]]
name = "openpyxl"
version = "3.1.5"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
dependencies = [
    { name = "et-xmlfile" },
]
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/3d/f9/88d94a75de065ea32619465d2f77b29a0469500e99012523b91cc4141cd1/openpyxl-3.1.5.tar.gz", hash = "sha256:cf0e3cf56142039133628b5acffe8ef0c12bc902d2aadd3e0fe5878dc08d1050" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/c0/da/977ded879c29cbd04de313843e76868e6e13408a94ed6b987245dc7c8506/openpyxl-3.1.5-py2.py3-none-any.whl", hash = "sha256:5282c12b107bffeef825f4617dc029afaf41d0ea60823bbb665ef3079dc79de2" },
]

[[package]]
name = "packaging"
version = "24.2"
@@ -112,6 +207,54 @@ wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/88/ef/eb23f262cca3c0c4eb7ab1933c3b1f03d021f2c48f54763065b6f0e321be/packaging-24.2-py3-none-any.whl", hash = "sha256:09abb1bccd265c01f4a3aa3f7a7db064b36514d2cba19a2f694fe6150451a759" },
]

[[package]]
name = "pillow"
version = "11.1.0"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/f3/af/c097e544e7bd278333db77933e535098c259609c4eb3b85381109602fb5b/pillow-11.1.0.tar.gz", hash = "sha256:368da70808b36d73b4b390a8ffac11069f8a5c85f29eff1f1b01bcf3ef5b2a20" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/50/1c/2dcea34ac3d7bc96a1fd1bd0a6e06a57c67167fec2cff8d95d88229a8817/pillow-11.1.0-cp310-cp310-macosx_10_10_x86_64.whl", hash = "sha256:e1abe69aca89514737465752b4bcaf8016de61b3be1397a8fc260ba33321b3a8" },
    { url = "https://mirrors.aliyun.com/pypi/packages/14/ca/6bec3df25e4c88432681de94a3531cc738bd85dea6c7aa6ab6f81ad8bd11/pillow-11.1.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:c640e5a06869c75994624551f45e5506e4256562ead981cce820d5ab39ae2192" },
    { url = "https://mirrors.aliyun.com/pypi/packages/d4/2c/668e18e5521e46eb9667b09e501d8e07049eb5bfe39d56be0724a43117e6/pillow-11.1.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a07dba04c5e22824816b2615ad7a7484432d7f540e6fa86af60d2de57b0fcee2" },
    { url = "https://mirrors.aliyun.com/pypi/packages/02/80/79f99b714f0fc25f6a8499ecfd1f810df12aec170ea1e32a4f75746051ce/pillow-11.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e267b0ed063341f3e60acd25c05200df4193e15a4a5807075cd71225a2386e26" },
    { url = "https://mirrors.aliyun.com/pypi/packages/81/aa/8d4ad25dc11fd10a2001d5b8a80fdc0e564ac33b293bdfe04ed387e0fd95/pillow-11.1.0-cp310-cp310-manylinux_2_28_aarch64.whl", hash = "sha256:bd165131fd51697e22421d0e467997ad31621b74bfc0b75956608cb2906dda07" },
    { url = "https://mirrors.aliyun.com/pypi/packages/84/7a/cd0c3eaf4a28cb2a74bdd19129f7726277a7f30c4f8424cd27a62987d864/pillow-11.1.0-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:abc56501c3fd148d60659aae0af6ddc149660469082859fa7b066a298bde9482" },
    { url = "https://mirrors.aliyun.com/pypi/packages/8f/8b/a907fdd3ae8f01c7670dfb1499c53c28e217c338b47a813af8d815e7ce97/pillow-11.1.0-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:54ce1c9a16a9561b6d6d8cb30089ab1e5eb66918cb47d457bd996ef34182922e" },
    { url = "https://mirrors.aliyun.com/pypi/packages/6f/9a/9f139d9e8cccd661c3efbf6898967a9a337eb2e9be2b454ba0a09533100d/pillow-11.1.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:73ddde795ee9b06257dac5ad42fcb07f3b9b813f8c1f7f870f402f4dc54b5269" },
    { url = "https://mirrors.aliyun.com/pypi/packages/a8/68/0d8d461f42a3f37432203c8e6df94da10ac8081b6d35af1c203bf3111088/pillow-11.1.0-cp310-cp310-win32.whl", hash = "sha256:3a5fe20a7b66e8135d7fd617b13272626a28278d0e578c98720d9ba4b2439d49" },
    { url = "https://mirrors.aliyun.com/pypi/packages/14/81/d0dff759a74ba87715509af9f6cb21fa21d93b02b3316ed43bda83664db9/pillow-11.1.0-cp310-cp310-win_amd64.whl", hash = "sha256:b6123aa4a59d75f06e9dd3dac5bf8bc9aa383121bb3dd9a7a612e05eabc9961a" },
    { url = "https://mirrors.aliyun.com/pypi/packages/ce/1f/8d50c096a1d58ef0584ddc37e6f602828515219e9d2428e14ce50f5ecad1/pillow-11.1.0-cp310-cp310-win_arm64.whl", hash = "sha256:a76da0a31da6fcae4210aa94fd779c65c75786bc9af06289cd1c184451ef7a65" },
    { url = "https://mirrors.aliyun.com/pypi/packages/dd/d6/2000bfd8d5414fb70cbbe52c8332f2283ff30ed66a9cde42716c8ecbe22c/pillow-11.1.0-cp311-cp311-macosx_10_10_x86_64.whl", hash = "sha256:e06695e0326d05b06833b40b7ef477e475d0b1ba3a6d27da1bb48c23209bf457" },
    { url = "https://mirrors.aliyun.com/pypi/packages/d9/45/3fe487010dd9ce0a06adf9b8ff4f273cc0a44536e234b0fad3532a42c15b/pillow-11.1.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:96f82000e12f23e4f29346e42702b6ed9a2f2fea34a740dd5ffffcc8c539eb35" },
    { url = "https://mirrors.aliyun.com/pypi/packages/e3/72/776b3629c47d9d5f1c160113158a7a7ad177688d3a1159cd3b62ded5a33a/pillow-11.1.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a3cd561ded2cf2bbae44d4605837221b987c216cff94f49dfeed63488bb228d2" },
    { url = "https://mirrors.aliyun.com/pypi/packages/e4/c2/e25199e7e4e71d64eeb869f5b72c7ddec70e0a87926398785ab944d92375/pillow-11.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f189805c8be5ca5add39e6f899e6ce2ed824e65fb45f3c28cb2841911da19070" },
    { url = "https://mirrors.aliyun.com/pypi/packages/c1/ed/51d6136c9d5911f78632b1b86c45241c712c5a80ed7fa7f9120a5dff1eba/pillow-11.1.0-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:dd0052e9db3474df30433f83a71b9b23bd9e4ef1de13d92df21a52c0303b8ab6" },
    { url = "https://mirrors.aliyun.com/pypi/packages/48/a4/fbfe9d5581d7b111b28f1d8c2762dee92e9821bb209af9fa83c940e507a0/pillow-11.1.0-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:837060a8599b8f5d402e97197d4924f05a2e0d68756998345c829c33186217b1" },
    { url = "https://mirrors.aliyun.com/pypi/packages/39/db/0b3c1a5018117f3c1d4df671fb8e47d08937f27519e8614bbe86153b65a5/pillow-11.1.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:aa8dd43daa836b9a8128dbe7d923423e5ad86f50a7a14dc688194b7be5c0dea2" },
    { url = "https://mirrors.aliyun.com/pypi/packages/d9/58/bc128da7fea8c89fc85e09f773c4901e95b5936000e6f303222490c052f3/pillow-11.1.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:0a2f91f8a8b367e7a57c6e91cd25af510168091fb89ec5146003e424e1558a96" },
    { url = "https://mirrors.aliyun.com/pypi/packages/5f/bb/58f34379bde9fe197f51841c5bbe8830c28bbb6d3801f16a83b8f2ad37df/pillow-11.1.0-cp311-cp311-win32.whl", hash = "sha256:c12fc111ef090845de2bb15009372175d76ac99969bdf31e2ce9b42e4b8cd88f" },
    { url = "https://mirrors.aliyun.com/pypi/packages/3a/c6/fce9255272bcf0c39e15abd2f8fd8429a954cf344469eaceb9d0d1366913/pillow-11.1.0-cp311-cp311-win_amd64.whl", hash = "sha256:fbd43429d0d7ed6533b25fc993861b8fd512c42d04514a0dd6337fb3ccf22761" },
    { url = "https://mirrors.aliyun.com/pypi/packages/c8/52/8ba066d569d932365509054859f74f2a9abee273edcef5cd75e4bc3e831e/pillow-11.1.0-cp311-cp311-win_arm64.whl", hash = "sha256:f7955ecf5609dee9442cbface754f2c6e541d9e6eda87fad7f7a989b0bdb9d71" },
    { url = "https://mirrors.aliyun.com/pypi/packages/95/20/9ce6ed62c91c073fcaa23d216e68289e19d95fb8188b9fb7a63d36771db8/pillow-11.1.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:2062ffb1d36544d42fcaa277b069c88b01bb7298f4efa06731a7fd6cc290b81a" },
    { url = "https://mirrors.aliyun.com/pypi/packages/b9/d8/f6004d98579a2596c098d1e30d10b248798cceff82d2b77aa914875bfea1/pillow-11.1.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:a85b653980faad27e88b141348707ceeef8a1186f75ecc600c395dcac19f385b" },
    { url = "https://mirrors.aliyun.com/pypi/packages/08/d9/892e705f90051c7a2574d9f24579c9e100c828700d78a63239676f960b74/pillow-11.1.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9409c080586d1f683df3f184f20e36fb647f2e0bc3988094d4fd8c9f4eb1b3b3" },
    { url = "https://mirrors.aliyun.com/pypi/packages/8c/aa/7f29711f26680eab0bcd3ecdd6d23ed6bce180d82e3f6380fb7ae35fcf3b/pillow-11.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7fdadc077553621911f27ce206ffcbec7d3f8d7b50e0da39f10997e8e2bb7f6a" },
    { url = "https://mirrors.aliyun.com/pypi/packages/c8/c4/8f0fe3b9e0f7196f6d0bbb151f9fba323d72a41da068610c4c960b16632a/pillow-11.1.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:93a18841d09bcdd774dcdc308e4537e1f867b3dec059c131fde0327899734aa1" },
    { url = "https://mirrors.aliyun.com/pypi/packages/38/0d/84200ed6a871ce386ddc82904bfadc0c6b28b0c0ec78176871a4679e40b3/pillow-11.1.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:9aa9aeddeed452b2f616ff5507459e7bab436916ccb10961c4a382cd3e03f47f" },
    { url = "https://mirrors.aliyun.com/pypi/packages/84/9c/9bcd66f714d7e25b64118e3952d52841a4babc6d97b6d28e2261c52045d4/pillow-11.1.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:3cdcdb0b896e981678eee140d882b70092dac83ac1cdf6b3a60e2216a73f2b91" },
    { url = "https://mirrors.aliyun.com/pypi/packages/db/61/ada2a226e22da011b45f7104c95ebda1b63dcbb0c378ad0f7c2a710f8fd2/pillow-11.1.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:36ba10b9cb413e7c7dfa3e189aba252deee0602c86c309799da5a74009ac7a1c" },
    { url = "https://mirrors.aliyun.com/pypi/packages/e7/c4/fc6e86750523f367923522014b821c11ebc5ad402e659d8c9d09b3c9d70c/pillow-11.1.0-cp312-cp312-win32.whl", hash = "sha256:cfd5cd998c2e36a862d0e27b2df63237e67273f2fc78f47445b14e73a810e7e6" },
    { url = "https://mirrors.aliyun.com/pypi/packages/08/5c/2104299949b9d504baf3f4d35f73dbd14ef31bbd1ddc2c1b66a5b7dfda44/pillow-11.1.0-cp312-cp312-win_amd64.whl", hash = "sha256:a697cd8ba0383bba3d2d3ada02b34ed268cb548b369943cd349007730c92bddf" },
    { url = "https://mirrors.aliyun.com/pypi/packages/37/f3/9b18362206b244167c958984b57c7f70a0289bfb59a530dd8af5f699b910/pillow-11.1.0-cp312-cp312-win_arm64.whl", hash = "sha256:4dd43a78897793f60766563969442020e90eb7847463eca901e41ba186a7d4a5" },
    { url = "https://mirrors.aliyun.com/pypi/packages/fa/c5/389961578fb677b8b3244fcd934f720ed25a148b9a5cc81c91bdf59d8588/pillow-11.1.0-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:8c730dc3a83e5ac137fbc92dfcfe1511ce3b2b5d7578315b63dbbb76f7f51d90" },
    { url = "https://mirrors.aliyun.com/pypi/packages/c4/fa/803c0e50ffee74d4b965229e816af55276eac1d5806712de86f9371858fd/pillow-11.1.0-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:7d33d2fae0e8b170b6a6c57400e077412240f6f5bb2a342cf1ee512a787942bb" },
    { url = "https://mirrors.aliyun.com/pypi/packages/dc/67/2a3a5f8012b5d8c63fe53958ba906c1b1d0482ebed5618057ef4d22f8076/pillow-11.1.0-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a8d65b38173085f24bc07f8b6c505cbb7418009fa1a1fcb111b1f4961814a442" },
    { url = "https://mirrors.aliyun.com/pypi/packages/e5/a0/514f0d317446c98c478d1872497eb92e7cde67003fed74f696441e647446/pillow-11.1.0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:015c6e863faa4779251436db398ae75051469f7c903b043a48f078e437656f83" },
    { url = "https://mirrors.aliyun.com/pypi/packages/cd/00/20f40a935514037b7d3f87adfc87d2c538430ea625b63b3af8c3f5578e72/pillow-11.1.0-pp310-pypy310_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:d44ff19eea13ae4acdaaab0179fa68c0c6f2f45d66a4d8ec1eda7d6cecbcc15f" },
    { url = "https://mirrors.aliyun.com/pypi/packages/28/3c/7de681727963043e093c72e6c3348411b0185eab3263100d4490234ba2f6/pillow-11.1.0-pp310-pypy310_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:d3d8da4a631471dfaf94c10c85f5277b1f8e42ac42bade1ac67da4b4a7359b73" },
    { url = "https://mirrors.aliyun.com/pypi/packages/41/67/936f9814bdd74b2dfd4822f1f7725ab5d8ff4103919a1664eb4874c58b2f/pillow-11.1.0-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:4637b88343166249fe8aa94e7c4a62a180c4b3898283bb5d3d2fd5fe10d8e4e0" },
]

[[package]]
name = "pluggy"
version = "1.5.0"
@@ -138,14 +281,48 @@ wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/30/3d/64ad57c803f1fa1e963a7946b6e0fea4a70df53c1a7fed304586539c2bac/pytest-8.3.5-py3-none-any.whl", hash = "sha256:c69214aa47deac29fad6c2a4f590b9c4a9fdb16a403176fe154b79c0b4d4d820" },
]

[[package]]
name = "python-docx"
version = "1.1.2"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
dependencies = [
    { name = "lxml" },
    { name = "typing-extensions" },
]
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/35/e4/386c514c53684772885009c12b67a7edd526c15157778ac1b138bc75063e/python_docx-1.1.2.tar.gz", hash = "sha256:0cf1f22e95b9002addca7948e16f2cd7acdfd498047f1941ca5d293db7762efd" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/3e/3d/330d9efbdb816d3f60bf2ad92f05e1708e4a1b9abe80461ac3444c83f749/python_docx-1.1.2-py3-none-any.whl", hash = "sha256:08c20d6058916fb19853fcf080f7f42b6270d89eac9fa5f8c15f691c0017fabe" },
]

[[package]]
name = "python-pptx"
version = "1.0.2"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
dependencies = [
    { name = "lxml" },
    { name = "pillow" },
    { name = "typing-extensions" },
    { name = "xlsxwriter" },
]
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/52/a9/0c0db8d37b2b8a645666f7fd8accea4c6224e013c42b1d5c17c93590cd06/python_pptx-1.0.2.tar.gz", hash = "sha256:479a8af0eaf0f0d76b6f00b0887732874ad2e3188230315290cd1f9dd9cc7095" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/d9/4f/00be2196329ebbff56ce564aa94efb0fbc828d00de250b1980de1a34ab49/python_pptx-1.0.2-py3-none-any.whl", hash = "sha256:160838e0b8565a8b1f67947675886e9fea18aa5e795db7ae531606d68e785cba" },
]

[[package]]
name = "ragflow-sdk"
version = "0.17.1"
version = "0.17.2"
source = { virtual = "." }
dependencies = [
    { name = "beartype" },
    { name = "openpyxl" },
    { name = "pillow" },
    { name = "pytest" },
    { name = "python-docx" },
    { name = "python-pptx" },
    { name = "reportlab" },
    { name = "requests" },
    { name = "requests-toolbelt" },
]

[package.optional-dependencies]
@@ -156,9 +333,28 @@ test = [
[package.metadata]
requires-dist = [
    { name = "beartype", specifier = ">=0.18.5,<0.19.0" },
    { name = "openpyxl", specifier = ">=3.1.5" },
    { name = "pillow", specifier = ">=11.1.0" },
    { name = "pytest", specifier = ">=8.0.0,<9.0.0" },
    { name = "pytest", marker = "extra == 'test'", specifier = ">=8.0.0,<9.0.0" },
    { name = "python-docx", specifier = ">=1.1.2" },
    { name = "python-pptx", specifier = ">=1.0.2" },
    { name = "reportlab", specifier = ">=4.3.1" },
    { name = "requests", specifier = ">=2.30.0,<3.0.0" },
    { name = "requests-toolbelt", specifier = ">=1.0.0" },
]

[[package]]
name = "reportlab"
version = "4.3.1"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
|
||||
dependencies = [
|
||||
{ name = "chardet" },
|
||||
{ name = "pillow" },
|
||||
]
|
||||
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/a7/5c/9b23c8a9a69f2bc1f1268ed545f393a60b59cbe5f9d861a28b676f809729/reportlab-4.3.1.tar.gz", hash = "sha256:230f78b21667194d8490ac9d12958d5c14686352db7fbe03b95140fafdf5aa97" }
|
||||
wheels = [
|
||||
{ url = "https://mirrors.aliyun.com/pypi/packages/ce/6b/42805895ed08a314a01be6110584b5d059328386988ddbc4f8f10014d30e/reportlab-4.3.1-py3-none-any.whl", hash = "sha256:0f37dd16652db3ef84363cf744632a28c38bd480d5bf94683466852d7bb678dd" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@ -176,6 +372,18 @@ wheels = [
|
||||
{ url = "https://mirrors.aliyun.com/pypi/packages/f9/9b/335f9764261e915ed497fcdeb11df5dfd6f7bf257d4a6a2a686d80da4d54/requests-2.32.3-py3-none-any.whl", hash = "sha256:70761cfe03c773ceb22aa2f671b4757976145175cdfca038c02654d061d6dcc6" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "requests-toolbelt"
|
||||
version = "1.0.0"
|
||||
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
|
||||
dependencies = [
|
||||
{ name = "requests" },
|
||||
]
|
||||
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/f3/61/d7545dafb7ac2230c70d38d31cbfe4cc64f7144dc41f6e4e4b78ecd9f5bb/requests-toolbelt-1.0.0.tar.gz", hash = "sha256:7681a0a3d047012b5bdc0ee37d7f8f07ebe76ab08caeccfc3921ce23c88d5bc6" }
|
||||
wheels = [
|
||||
{ url = "https://mirrors.aliyun.com/pypi/packages/3f/51/d4db610ef29373b879047326cbf6fa98b6c1969d6f6dc423279de2b1be2c/requests_toolbelt-1.0.0-py2.py3-none-any.whl", hash = "sha256:cccfdd665f0a24fcf4726e690f65639d272bb0637b9b92dfd91a5568ccf6bd06" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "tomli"
|
||||
version = "2.2.1"
|
||||
@ -205,6 +413,15 @@ wheels = [
|
||||
{ url = "https://mirrors.aliyun.com/pypi/packages/6e/c2/61d3e0f47e2b74ef40a68b9e6ad5984f6241a942f7cd3bbfbdbd03861ea9/tomli-2.2.1-py3-none-any.whl", hash = "sha256:cb55c73c5f4408779d0cf3eef9f762b9c9f147a77de7b258bef0a5628adc85cc" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "typing-extensions"
|
||||
version = "4.12.2"
|
||||
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
|
||||
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/df/db/f35a00659bc03fec321ba8bce9420de607a1d37f8342eee1863174c69557/typing_extensions-4.12.2.tar.gz", hash = "sha256:1a7ead55c7e559dd4dee8856e3a88b41225abfe1ce8df57b7c13915fe121ffb8" }
|
||||
wheels = [
|
||||
{ url = "https://mirrors.aliyun.com/pypi/packages/26/9f/ad63fc0248c5379346306f8668cda6e2e2e9c95e01216d2b8ffd9ff037d0/typing_extensions-4.12.2-py3-none-any.whl", hash = "sha256:04e5ca0351e0f3f85c6853954072df659d0d13fac324d0072316b67d7794700d" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "urllib3"
|
||||
version = "2.3.0"
|
||||
@ -213,3 +430,12 @@ sdist = { url = "https://mirrors.aliyun.com/pypi/packages/aa/63/e53da845320b757b
|
||||
wheels = [
|
||||
{ url = "https://mirrors.aliyun.com/pypi/packages/c8/19/4ec628951a74043532ca2cf5d97b7b14863931476d117c471e8e2b1eb39f/urllib3-2.3.0-py3-none-any.whl", hash = "sha256:1cee9ad369867bfdbbb48b7dd50374c0967a0bb7710050facf0dd6911440e3df" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "xlsxwriter"
|
||||
version = "3.2.2"
|
||||
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
|
||||
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/a1/08/26f69d1e9264e8107253018de9fc6b96f9219817d01c5f021e927384a8d1/xlsxwriter-3.2.2.tar.gz", hash = "sha256:befc7f92578a85fed261639fb6cde1fd51b79c5e854040847dde59d4317077dc" }
|
||||
wheels = [
|
||||
{ url = "https://mirrors.aliyun.com/pypi/packages/9b/07/df054f7413bdfff5e98f75056e4ed0977d0c8716424011fac2587864d1d3/XlsxWriter-3.2.2-py3-none-any.whl", hash = "sha256:272ce861e7fa5e82a4a6ebc24511f2cb952fde3461f6c6e1a1e81d3272db1471" },
|
||||
]
|
||||
|
||||
38 uv.lock generated
@ -14,7 +14,7 @@ resolution-markers = [

[[package]]
name = "accelerate"
version = "1.4.0"
version = "1.5.1"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
dependencies = [
    { name = "huggingface-hub" },

@ -25,18 +25,18 @@ dependencies = [
    { name = "safetensors" },
    { name = "torch" },
]
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/8f/02/24a4c4edb9cf0f1e0bc32bb6829e2138f1cc201442e7a24f0daf93b8a15a/accelerate-1.4.0.tar.gz", hash = "sha256:37d413e1b64cb8681ccd2908ae211cf73e13e6e636a2f598a96eccaa538773a5" }
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/64/fb/10daafb0efbb1af95d782c9907004bd50fcfd74d6e11e6a91945df37768e/accelerate-1.5.1.tar.gz", hash = "sha256:5d936faf3a31894c6160f2f2a984a38aecbba760ef919ae298b2ecd57ea9bf87" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/0a/f6/791b9d7eb371a2f385da3b7f1769ced72ead7bf09744637ea2985c83d7ee/accelerate-1.4.0-py3-none-any.whl", hash = "sha256:f6e1e7dfaf9d799a20a1dc45efbf4b1546163eac133faa5acd0d89177c896e55" },
    { url = "https://mirrors.aliyun.com/pypi/packages/4b/ef/2723a3c53d06619dac38c1630bac3d9b7aec91e1a18a82a08b93696b8baf/accelerate-1.5.1-py3-none-any.whl", hash = "sha256:4838cff9ed1bb0ddc9d967530ced62a1d74ea21cdb57688400359ab32682f03e" },
]

[[package]]
name = "aiohappyeyeballs"
version = "2.5.0"
version = "2.6.1"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/a2/0c/458958007041f4b4de2d307e6b75d9e7554dad0baf26fe7a48b741aac126/aiohappyeyeballs-2.5.0.tar.gz", hash = "sha256:18fde6204a76deeabc97c48bdd01d5801cfda5d6b9c8bbeb1aaaee9d648ca191" }
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/26/30/f84a107a9c4331c14b2b586036f40965c128aa4fee4dda5d3d51cb14ad54/aiohappyeyeballs-2.6.1.tar.gz", hash = "sha256:c3f9d0113123803ccadfdf3f0faa505bc78e6a72d1cc4806cbd719826e943558" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/1b/9a/e4886864ce06e1579bd428208127fbdc0d62049c751e4e9e3b509c0059dc/aiohappyeyeballs-2.5.0-py3-none-any.whl", hash = "sha256:0850b580748c7071db98bffff6d4c94028d0d3035acc20fd721a0ce7e8cac35d" },
    { url = "https://mirrors.aliyun.com/pypi/packages/0f/15/5bf3b99495fb160b63f95972b81750f18f7f4e02ad051373b669d17d44f2/aiohappyeyeballs-2.6.1-py3-none-any.whl", hash = "sha256:f349ba8f4b75cb25c99c5c2d84e997e485204d2902a9597802b0371f09331fb8" },
]

[[package]]

@ -149,7 +149,7 @@ wheels = [

[[package]]
name = "akshare"
version = "1.16.42"
version = "1.16.44"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
dependencies = [
    { name = "akracer", marker = "sys_platform == 'linux'" },

@ -168,9 +168,9 @@ dependencies = [
    { name = "urllib3" },
    { name = "xlrd" },
]
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/59/86/2571a0ea70a352e55100cabc3c2eb46708c78d1721693f2c708a96d9ab4c/akshare-1.16.42.tar.gz", hash = "sha256:e69fb98dbeaedd287002ed79963279ad17502cb463c7d8fa8b8692d215fcc80f" }
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/23/39/d431fcc82ec60a48f0e658d6ef5d7e8d08a7b948435da2d40185a4f84841/akshare-1.16.44.tar.gz", hash = "sha256:11cdb794bcf349b1aea208b9f0f697642b387947617a00f61eac674520a4040d" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/3b/e7/b1a3e515f34de3416ba4e96adba8697c12b8b8fd74aa1a7bd505dd450f57/akshare-1.16.42-py3-none-any.whl", hash = "sha256:154374adb01b030ed9baed992d836887858154fbec42db7885d818119d081a71" },
    { url = "https://mirrors.aliyun.com/pypi/packages/ce/67/93d2f437da14c20ef04d63de7815a1171727037d173ca364323ebdbdb2a4/akshare-1.16.44-py3-none-any.whl", hash = "sha256:f22af3ab0ef85143444880f93fd62b89eb62ee1945f3cfb02cc66e44d4cfcff2" },
]

[[package]]

@ -327,11 +327,11 @@ wheels = [

[[package]]
name = "attrs"
version = "25.1.0"
version = "25.2.0"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/49/7c/fdf464bcc51d23881d110abd74b512a42b3d5d376a55a831b44c603ae17f/attrs-25.1.0.tar.gz", hash = "sha256:1c97078a80c814273a76b2a298a932eb681c87415c11dee0a6921de7f1b02c3e" }
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/69/82/3c4e1d44f3cbaa2a578127d641fe385ba3bff6c38b789447ae11a21fa413/attrs-25.2.0.tar.gz", hash = "sha256:18a06db706db43ac232cce80443fcd9f2500702059ecf53489e3c5a3f417acaf" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/fc/30/d4986a882011f9df997a55e6becd864812ccfcd821d64aac8570ee39f719/attrs-25.1.0-py3-none-any.whl", hash = "sha256:c75a69e28a550a7e93789579c22aa26b0f5b83b75dc4e08fe092980051e1090a" },
    { url = "https://mirrors.aliyun.com/pypi/packages/03/33/7a7388b9ef94aab40539939d94461ec682afbd895458945ed25be07f03f6/attrs-25.2.0-py3-none-any.whl", hash = "sha256:611344ff0a5fed735d86d7784610c84f8126b95e549bcad9ff61b4242f2d386b" },
]

[[package]]

@ -1733,7 +1733,7 @@ grpc = [

[[package]]
name = "google-api-python-client"
version = "2.163.0"
version = "2.164.0"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
dependencies = [
    { name = "google-api-core" },

@ -1742,9 +1742,9 @@ dependencies = [
    { name = "httplib2" },
    { name = "uritemplate" },
]
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/29/9f/535346bb1469ec91139c38f0438ad70bd229a6b11452367065fe49303860/google_api_python_client-2.163.0.tar.gz", hash = "sha256:88dee87553a2d82176e2224648bf89272d536c8f04dcdda37ef0a71473886dd7" }
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/32/5b/4ed16fac5ef6928d0c1ca0fba42f27e73938f04729ef97e63d7a7bb5fd6d/google_api_python_client-2.164.0.tar.gz", hash = "sha256:116f5a05dfb95ed7f7ea0d0f561fc5464146709c583226cc814690f9bb221492" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/91/3d/203d1c18cb239313ac125721f284e94c01c7eee947adb2ef2d9a85ac8d66/google_api_python_client-2.163.0-py2.py3-none-any.whl", hash = "sha256:080e8bc0669cb4c1fb8efb8da2f5b91a2625d8f0e7796cfad978f33f7016c6c4" },
    { url = "https://mirrors.aliyun.com/pypi/packages/94/0d/4eacf5bff40a42e6be3086b85164f0624fee9724c11bb2c79305fbc2f355/google_api_python_client-2.164.0-py2.py3-none-any.whl", hash = "sha256:b2037c3d280793c8d5180b04317b16be4acd5f77af5dfa7213ace32d140a9ffe" },
]

[[package]]

@ -3068,16 +3068,16 @@ wheels = [

[[package]]
name = "msal"
version = "1.31.1"
version = "1.32.0"
source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
dependencies = [
    { name = "cryptography" },
    { name = "pyjwt", extra = ["crypto"] },
    { name = "requests" },
]
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/3f/f3/cdf2681e83a73c3355883c2884b6ff2f2d2aadfc399c28e9ac4edc3994fd/msal-1.31.1.tar.gz", hash = "sha256:11b5e6a3f802ffd3a72107203e20c4eac6ef53401961b880af2835b723d80578" }
sdist = { url = "https://mirrors.aliyun.com/pypi/packages/aa/5f/ef42ef25fba682e83a8ee326a1a788e60c25affb58d014495349e37bce50/msal-1.32.0.tar.gz", hash = "sha256:5445fe3af1da6be484991a7ab32eaa82461dc2347de105b76af92c610c3335c2" }
wheels = [
    { url = "https://mirrors.aliyun.com/pypi/packages/30/7c/489cd931a752d05753d730e848039f08f65f86237cf1b8724d0a1cbd700b/msal-1.31.1-py3-none-any.whl", hash = "sha256:29d9882de247e96db01386496d59f29035e5e841bcac892e6d7bf4390bf6bd17" },
    { url = "https://mirrors.aliyun.com/pypi/packages/93/5a/2e663ef56a5d89eba962941b267ebe5be8c5ea340a9929d286e2f5fac505/msal-1.32.0-py3-none-any.whl", hash = "sha256:9dbac5384a10bbbf4dae5c7ea0d707d14e087b92c5aa4954b3feaa2d1aa0bcb7" },
]

[[package]]

@ -4716,7 +4716,7 @@ wheels = [

[[package]]
name = "ragflow"
version = "0.17.1"
version = "0.17.2"
source = { virtual = "." }
dependencies = [
    { name = "akshare" },
@ -7,6 +7,7 @@ import enUS from 'antd/locale/en_US';
import vi_VN from 'antd/locale/vi_VN';
import zhCN from 'antd/locale/zh_CN';
import zh_HK from 'antd/locale/zh_HK';
import deDE from 'antd/locale/de_DE';
import dayjs from 'dayjs';
import advancedFormat from 'dayjs/plugin/advancedFormat';
import customParseFormat from 'dayjs/plugin/customParseFormat';

@ -32,6 +33,7 @@ const AntLanguageMap = {
  'zh-TRADITIONAL': zh_HK,
  vi: vi_VN,
  'pt-BR': pt_BR,
  de: deDE,
};

const queryClient = new QueryClient();

@ -6,7 +6,7 @@ interface IProps {
  max?: number;
}

const MaxTokenNumber = ({ initialValue = 128, max = 2048 }: IProps) => {
const MaxTokenNumber = ({ initialValue = 512, max = 2048 }: IProps) => {
  const { t } = useTranslate('knowledgeConfiguration');

  return (
25 web/src/components/tavily-item.tsx Normal file
@ -0,0 +1,25 @@
import { useTranslate } from '@/hooks/common-hooks';
import { Form, Input, Typography } from 'antd';

interface IProps {
  name?: string | string[];
}

export function TavilyItem({
  name = ['prompt_config', 'tavily_api_key'],
}: IProps) {
  const { t } = useTranslate('chat');

  return (
    <Form.Item label={'Tavily API Key'} tooltip={t('tavilyApiKeyTip')}>
      <div className="flex flex-col gap-1">
        <Form.Item name={name} noStyle>
          <Input.Password placeholder={t('tavilyApiKeyMessage')} />
        </Form.Item>
        <Typography.Link href="https://app.tavily.com/home" target={'_blank'}>
          {t('tavilyApiKeyHelp')}
        </Typography.Link>
      </div>
    </Form.Item>
  );
}
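For context, a minimal usage sketch of the new component (not part of this changeset; the host form is hypothetical, while the two name variants mirror how the component is consumed later in this diff):

import { Form } from 'antd';
import { TavilyItem } from '@/components/tavily-item';

// Sketch only: SettingFormSketch is an assumed host component.
function SettingFormSketch() {
  const [form] = Form.useForm();
  return (
    <Form form={form} layout="vertical">
      {/* Default nested field name: ['prompt_config', 'tavily_api_key'] */}
      <TavilyItem />
      {/* Flat field name, as the Retrieval operator form below passes */}
      <TavilyItem name={'tavily_api_key'} />
    </Form>
  );
}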
115 web/src/components/ui/breadcrumb.tsx Normal file
@ -0,0 +1,115 @@
import { Slot } from '@radix-ui/react-slot';
import { ChevronRight, MoreHorizontal } from 'lucide-react';
import * as React from 'react';

import { cn } from '@/lib/utils';

const Breadcrumb = React.forwardRef<
  HTMLElement,
  React.ComponentPropsWithoutRef<'nav'> & {
    separator?: React.ReactNode;
  }
>(({ ...props }, ref) => <nav ref={ref} aria-label="breadcrumb" {...props} />);
Breadcrumb.displayName = 'Breadcrumb';

const BreadcrumbList = React.forwardRef<
  HTMLOListElement,
  React.ComponentPropsWithoutRef<'ol'>
>(({ className, ...props }, ref) => (
  <ol
    ref={ref}
    className={cn(
      'flex flex-wrap items-center gap-1.5 break-words text-sm text-muted-foreground sm:gap-2.5',
      className,
    )}
    {...props}
  />
));
BreadcrumbList.displayName = 'BreadcrumbList';

const BreadcrumbItem = React.forwardRef<
  HTMLLIElement,
  React.ComponentPropsWithoutRef<'li'>
>(({ className, ...props }, ref) => (
  <li
    ref={ref}
    className={cn('inline-flex items-center gap-1.5', className)}
    {...props}
  />
));
BreadcrumbItem.displayName = 'BreadcrumbItem';

const BreadcrumbLink = React.forwardRef<
  HTMLAnchorElement,
  React.ComponentPropsWithoutRef<'a'> & {
    asChild?: boolean;
  }
>(({ asChild, className, ...props }, ref) => {
  const Comp = asChild ? Slot : 'a';

  return (
    <Comp
      ref={ref}
      className={cn('transition-colors hover:text-foreground', className)}
      {...props}
    />
  );
});
BreadcrumbLink.displayName = 'BreadcrumbLink';

const BreadcrumbPage = React.forwardRef<
  HTMLSpanElement,
  React.ComponentPropsWithoutRef<'span'>
>(({ className, ...props }, ref) => (
  <span
    ref={ref}
    role="link"
    aria-disabled="true"
    aria-current="page"
    className={cn('font-normal text-foreground', className)}
    {...props}
  />
));
BreadcrumbPage.displayName = 'BreadcrumbPage';

const BreadcrumbSeparator = ({
  children,
  className,
  ...props
}: React.ComponentProps<'li'>) => (
  <li
    role="presentation"
    aria-hidden="true"
    className={cn('[&>svg]:w-3.5 [&>svg]:h-3.5', className)}
    {...props}
  >
    {children ?? <ChevronRight />}
  </li>
);
BreadcrumbSeparator.displayName = 'BreadcrumbSeparator';

const BreadcrumbEllipsis = ({
  className,
  ...props
}: React.ComponentProps<'span'>) => (
  <span
    role="presentation"
    aria-hidden="true"
    className={cn('flex h-9 w-9 items-center justify-center', className)}
    {...props}
  >
    <MoreHorizontal className="h-4 w-4" />
    <span className="sr-only">More</span>
  </span>
);
BreadcrumbEllipsis.displayName = 'BreadcrumbEllipsis';

export {
  Breadcrumb,
  BreadcrumbEllipsis,
  BreadcrumbItem,
  BreadcrumbLink,
  BreadcrumbList,
  BreadcrumbPage,
  BreadcrumbSeparator,
};
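This file follows the shadcn/ui breadcrumb pattern: BreadcrumbLink renders a plain anchor unless asChild is set, in which case Radix's Slot forwards the props to the passed child (typically a router link), and BreadcrumbPage marks the current, non-navigable crumb. A minimal composition sketch; the route and labels are illustrative, not from the diff:

import {
  Breadcrumb,
  BreadcrumbItem,
  BreadcrumbLink,
  BreadcrumbList,
  BreadcrumbPage,
  BreadcrumbSeparator,
} from '@/components/ui/breadcrumb';

// Sketch only: '/knowledge' and the labels are hypothetical.
export function PageBreadcrumbSketch() {
  return (
    <Breadcrumb>
      <BreadcrumbList>
        <BreadcrumbItem>
          <BreadcrumbLink href="/knowledge">Knowledge Base</BreadcrumbLink>
        </BreadcrumbItem>
        <BreadcrumbSeparator />
        <BreadcrumbItem>
          <BreadcrumbPage>Configuration</BreadcrumbPage>
        </BreadcrumbItem>
      </BreadcrumbList>
    </Breadcrumb>
  );
}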
@ -10,15 +10,19 @@ import {
} from 'lucide-react';
import React from 'react';

type Item = {
export type TransferListItemType = {
  key: string;
  label: string;
  selected?: boolean;
};

export default function TransferList({ items }: { items: Item[] }) {
  const [leftList, setLeftList] = React.useState<Item[]>(items);
  const [rightList, setRightList] = React.useState<Item[]>([]);
export default function TransferList({
  items,
}: {
  items: TransferListItemType[];
}) {
  const [leftList, setLeftList] = React.useState<TransferListItemType[]>(items);
  const [rightList, setRightList] = React.useState<TransferListItemType[]>([]);
  const [leftSearch, setLeftSearch] = React.useState('');
  const [rightSearch, setRightSearch] = React.useState('');

@ -35,8 +39,8 @@ export default function TransferList({ items }: { items: Item[] }) {
  };

  const toggleSelection = (
    list: Item[],
    setList: React.Dispatch<React.SetStateAction<Item[]>>,
    list: TransferListItemType[],
    setList: React.Dispatch<React.SetStateAction<TransferListItemType[]>>,
    key: string,
  ) => {
    const updatedList = list.map((item) => {
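Exporting TransferListItemType lets callers type their item arrays against the component's own contract instead of re-declaring the shape. A usage sketch; the import path is assumed, since the diff does not show the file's location:

import TransferList, {
  TransferListItemType,
} from '@/components/ui/transfer-list'; // path assumed

// Sketch only: the column names are illustrative.
const columns: TransferListItemType[] = [
  { key: 'title', label: 'Title' },
  { key: 'author', label: 'Author', selected: true },
];

export function ColumnPickerSketch() {
  return <TransferList items={columns} />;
}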
@ -3,7 +3,7 @@ import { useTranslation } from 'react-i18next';
import { SwitchFormField } from './switch-fom-field';

type IProps = {
  filedName: string[];
  filedName: string[] | string;
};

export function UseKnowledgeGraphItem({ filedName }: IProps) {
@ -48,6 +48,7 @@ export const LanguageList = [
  'Vietnamese',
  'Japanese',
  'Portuguese BR',
  'German',
];

export const LanguageMap = {

@ -59,6 +60,7 @@ export const LanguageMap = {
  Vietnamese: 'Tiếng việt',
  Japanese: '日本語',
  'Portuguese BR': 'Português BR',
  German: 'German',
};

export enum LanguageAbbreviation {

@ -70,6 +72,7 @@ export enum LanguageAbbreviation {
  Es = 'es',
  Vi = 'vi',
  PtBr = 'pt-BR',
  De = 'de',
}

export const LanguageAbbreviationMap = {

@ -81,6 +84,7 @@ export const LanguageAbbreviationMap = {
  [LanguageAbbreviation.Vi]: 'Tiếng việt',
  [LanguageAbbreviation.Ja]: '日本語',
  [LanguageAbbreviation.PtBr]: 'Português BR',
  [LanguageAbbreviation.De]: 'Deutsch',
};

export const LanguageTranslationMap = {

@ -92,6 +96,7 @@ export const LanguageTranslationMap = {
  Vietnamese: 'vi',
  Japanese: 'ja',
  'Portuguese BR': 'pt-br',
  German: 'de',
};

export enum FileMimeType {
@ -8,6 +8,7 @@ import translation_es from './es';
import translation_id from './id';
import translation_ja from './ja';
import translation_pt_br from './pt-br';
import translation_de from './de';
import { createTranslationTable, flattenObject } from './until';
import translation_vi from './vi';
import translation_zh from './zh';

@ -22,6 +23,7 @@ const resources = {
  [LanguageAbbreviation.Es]: translation_es,
  [LanguageAbbreviation.Vi]: translation_vi,
  [LanguageAbbreviation.PtBr]: translation_pt_br,
  [LanguageAbbreviation.De]: translation_de,
};
const enFlattened = flattenObject(translation_en);
const viFlattened = flattenObject(translation_vi);

@ -30,6 +32,7 @@ const zhFlattened = flattenObject(translation_zh);
const jaFlattened = flattenObject(translation_ja);
const pt_brFlattened = flattenObject(translation_pt_br);
const zh_traditionalFlattened = flattenObject(translation_zh_traditional);
const deFlattened = flattenObject(translation_de);
export const translationTable = createTranslationTable(
  [
    enFlattened,

@ -39,8 +42,9 @@ export const translationTable = createTranslationTable(
    zh_traditionalFlattened,
    jaFlattened,
    pt_brFlattened,
    deFlattened,
  ],
  ['English', 'Vietnamese', 'Spanish', 'zh', 'zh-TRADITIONAL', 'ja', 'pt-BR'],
  ['English', 'Vietnamese', 'Spanish', 'zh', 'zh-TRADITIONAL', 'ja', 'pt-BR', 'Deutsch'],
);
i18n
  .use(initReactI18next)
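Once translation_de is registered in resources and the translation table, the German bundle is reachable through the standard i18next switch. A sketch, assuming the module's default i18n export and the constants path (both assumed, not shown in the diff):

import i18n from '@/locales/config'; // assumed export of this module
import { LanguageAbbreviation } from '@/constants/common'; // assumed path

// i18next resolves 'de' to the translation_de resources registered above.
i18n.changeLanguage(LanguageAbbreviation.De);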
1186 web/src/locales/de.ts Normal file
File diff suppressed because it is too large
@ -1,8 +1,9 @@
import KnowledgeBaseItem from '@/components/knowledge-base-item';
import { TavilyItem } from '@/components/tavily-item';
import { useTranslate } from '@/hooks/common-hooks';
import { useFetchTenantInfo } from '@/hooks/user-setting-hooks';
import { PlusOutlined } from '@ant-design/icons';
import { Form, Input, message, Select, Switch, Typography, Upload } from 'antd';
import { Form, Input, message, Select, Switch, Upload } from 'antd';
import classNames from 'classnames';
import { useCallback } from 'react';
import { ISegmentedContentProps } from '../interface';

@ -147,16 +148,7 @@ const AssistantSetting = ({
      >
        <Switch onChange={handleTtsChange} />
      </Form.Item>
      <Form.Item label={'Tavily API Key'} tooltip={t('tavilyApiKeyTip')}>
        <div className="flex flex-col gap-1">
          <Form.Item name={['prompt_config', 'tavily_api_key']} noStyle>
            <Input.Password placeholder={t('tavilyApiKeyMessage')} />
          </Form.Item>
          <Typography.Link href="https://app.tavily.com/home" target={'_blank'}>
            {t('tavilyApiKeyHelp')}
          </Typography.Link>
        </div>
      </Form.Item>
      <TavilyItem></TavilyItem>
      <KnowledgeBaseItem
        required={false}
        onChange={handleChange}
@ -397,6 +397,7 @@ export const initialRetrievalValues = {
  similarity_threshold: 0.2,
  keywords_similarity_weight: 0.3,
  top_n: 8,
  use_kg: false,
  ...initialQueryBaseValues,
};
@ -1,7 +1,9 @@
import KnowledgeBaseItem from '@/components/knowledge-base-item';
import Rerank from '@/components/rerank';
import SimilaritySlider from '@/components/similarity-slider';
import { TavilyItem } from '@/components/tavily-item';
import TopNItem from '@/components/top-n-item';
import { UseKnowledgeGraphItem } from '@/components/use-knowledge-graph-item';
import { useTranslate } from '@/hooks/common-hooks';
import type { FormProps } from 'antd';
import { Form, Input } from 'antd';

@ -39,6 +41,8 @@ const RetrievalForm = ({ onValuesChange, form, node }: IOperatorForm) => {
      ></SimilaritySlider>
      <TopNItem></TopNItem>
      <Rerank></Rerank>
      <TavilyItem name={'tavily_api_key'}></TavilyItem>
      <UseKnowledgeGraphItem filedName={'use_kg'}></UseKnowledgeGraphItem>
      <KnowledgeBaseItem></KnowledgeBaseItem>
      <Form.Item
        name={'empty_response'}
@ -13,6 +13,7 @@ function UserSettingLocale() {
        'zh-TRADITIONAL',
        'ja',
        'pt-br',
        'German',
      ]}
    />
  );