Feat: add SearXNG search tool to Agent (frontend + backend, i18n) (#9699)

### What problem does this PR solve?

This PR integrates SearXNG as a new search tool for Agents. It adds the corresponding form/config UI on the frontend and a new tool implementation on the backend, enabling aggregated web searches via a self-hosted SearXNG instance within chats/workflows. It also adds multilingual copy to support internationalized presentation and configuration guidance.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

### What’s Changed

- Frontend: new SearXNG tool configuration, forms, and command wiring
  - Main changes under `web/src/pages/agent/`
  - New components and form entries are connected to Agent tool selection and workflow node configuration
- Backend: new tool implementation
  - `agent/tools/searxng.py`: connects to a SearXNG instance and performs a search based on the provided instance URL and query parameters
- i18n updates
  - Added/updated keys under `web/src/locales/`: `searXNG` and `searXNGDescription`
  - English reference in `web/src/locales/en.ts`:
    - `searXNG: 'SearXNG'`
    - `searXNGDescription: 'A component that searches via your provided SearXNG instance URL. Specify TopN and the instance URL.'`
  - Other languages have `searXNG` and `searXNGDescription` added as well, but accuracy is only guaranteed for English, Simplified Chinese, and Traditional Chinese.

---------

Co-authored-by: xurui <xurui@crscd.com.cn>
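
For readers unfamiliar with SearXNG, here is a minimal sketch of what querying an instance by URL and keeping the top N hits could look like. It is not the code in `agent/tools/searxng.py`: the function names `build_search_url`, `top_n_results`, and `search` are hypothetical, and it assumes the instance has the JSON output format enabled in its settings.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(instance_url: str, query: str) -> str:
    """Build the JSON search URL for a SearXNG instance (hypothetical helper)."""
    base = instance_url.rstrip("/")
    params = urlencode({"q": query, "format": "json"})
    return f"{base}/search?{params}"


def top_n_results(payload: dict, top_n: int = 10) -> list:
    """Keep the first top_n hits, reduced to title, url, and content."""
    hits = payload.get("results", [])[:top_n]
    return [
        {
            "title": hit.get("title", ""),
            "url": hit.get("url", ""),
            "content": hit.get("content", ""),
        }
        for hit in hits
    ]


def search(instance_url: str, query: str, top_n: int = 10) -> list:
    """Query the instance over HTTP and return the trimmed hits."""
    with urlopen(build_search_url(instance_url, query), timeout=10) as resp:
        return top_n_results(json.load(resp), top_n)
```

A call such as `search("http://localhost:8080", "retrieval augmented generation", top_n=5)` would need a reachable instance; the URL here is only a placeholder.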
Fix: Fixed the issue that similarity threshold modification in chat and search configuration failed #3221 (#9821)
2026-01-04 03:25:30 +08:00 · 2025-08-29 14:15:40 +08:00 · 2025-08-29 14:10:10 +08:00 · 2025-08-29 13:35:41 +08:00 · 2025-08-29 10:57:29 +08:00 · 2025-08-29 10:40:41 +08:00
281 changed files with 6793 additions and 3102 deletions
--- a/README.md
+++ b/README.md
@ -22,7 +22,7 @@
        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.2">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -190,7 +190,7 @@ releases! 🌟
 > All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
 > If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.

-   > The command below downloads the `v0.20.2-slim` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.20.2-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2` for the full edition `v0.20.2`.
+   > The command below downloads the `v0.20.4-slim` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.20.4-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4` for the full edition `v0.20.4`.

   ```bash
   $ cd ragflow/docker
@ -203,8 +203,8 @@ releases! 🌟

   | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
   |-------------------|-----------------|-----------------------|--------------------------|
-   | v0.20.2           | &approx;9       | :heavy_check_mark:    | Stable release           |
-   | v0.20.2-slim      | &approx;2       | ❌                   | Stable release            |
+   | v0.20.4           | &approx;9       | :heavy_check_mark:    | Stable release           |
+   | v0.20.4-slim      | &approx;2       | ❌                   | Stable release            |
   | nightly           | &approx;9       | :heavy_check_mark:    | _Unstable_ nightly build |
   | nightly-slim      | &approx;2       | ❌                   | _Unstable_ nightly build  |

@ -307,7 +307,7 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly

 ## 🔨 Launch service from source for development

-1. Install uv, or skip this step if it is already installed:
+1. Install `uv` and `pre-commit`, or skip this step if they are already installed:

   ```bash
   pipx install uv pre-commit
--- a/README_id.md
+++ b/README_id.md
@ -22,7 +22,7 @@
        <img alt="Lencana Daring" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.2">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Rilis%20Terbaru" alt="Rilis Terbaru">
@ -181,7 +181,7 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
 > Semua gambar Docker dibangun untuk platform x86. Saat ini, kami tidak menawarkan gambar Docker untuk ARM64.
 > Jika Anda menggunakan platform ARM64, [silakan gunakan panduan ini untuk membangun gambar Docker yang kompatibel dengan sistem Anda](https://ragflow.io/docs/dev/build_docker_image).

-> Perintah di bawah ini mengunduh edisi v0.20.2-slim dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.20.2-slim, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server. Misalnya, atur RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2 untuk edisi lengkap v0.20.2.
+> Perintah di bawah ini mengunduh edisi v0.20.4-slim dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.20.4-slim, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server. Misalnya, atur RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4 untuk edisi lengkap v0.20.4.

 ```bash
 $ cd ragflow/docker
@ -194,8 +194,8 @@ $ docker compose -f docker-compose.yml up -d

 | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
 | ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.20.2           | &approx;9       | :heavy_check_mark:    | Stable release           |
-| v0.20.2-slim      | &approx;2       | ❌                    | Stable release           |
+| v0.20.4           | &approx;9       | :heavy_check_mark:    | Stable release           |
+| v0.20.4-slim      | &approx;2       | ❌                    | Stable release           |
 | nightly           | &approx;9       | :heavy_check_mark:    | _Unstable_ nightly build |
 | nightly-slim      | &approx;2       | ❌                    | _Unstable_ nightly build |

@ -271,7 +271,7 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly

 ## 🔨 Menjalankan Aplikasi dari untuk Pengembangan

-1. Instal uv, atau lewati langkah ini jika sudah terinstal:
+1. Instal `uv` dan `pre-commit`, atau lewati langkah ini jika sudah terinstal:

   ```bash
   pipx install uv pre-commit
--- a/README_ja.md
+++ b/README_ja.md
@ -22,7 +22,7 @@
        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.2">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -160,7 +160,7 @@
 > 現在、公式に提供されているすべての Docker イメージは x86 アーキテクチャ向けにビルドされており、ARM64 用の Docker イメージは提供されていません。
 > ARM64 アーキテクチャのオペレーティングシステムを使用している場合は、[このドキュメント](https://ragflow.io/docs/dev/build_docker_image)を参照して Docker イメージを自分でビルドしてください。

-   > 以下のコマンドは、RAGFlow Docker イメージの v0.20.2-slim エディションをダウンロードします。異なる RAGFlow エディションの説明については、以下の表を参照してください。v0.20.2-slim とは異なるエディションをダウンロードするには、docker/.env ファイルの RAGFLOW_IMAGE 変数を適宜更新し、docker compose を使用してサーバーを起動してください。例えば、完全版 v0.20.2 をダウンロードするには、RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2 と設定します。
+   > 以下のコマンドは、RAGFlow Docker イメージの v0.20.4-slim エディションをダウンロードします。異なる RAGFlow エディションの説明については、以下の表を参照してください。v0.20.4-slim とは異なるエディションをダウンロードするには、docker/.env ファイルの RAGFLOW_IMAGE 変数を適宜更新し、docker compose を使用してサーバーを起動してください。例えば、完全版 v0.20.4 をダウンロードするには、RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4 と設定します。

   ```bash
   $ cd ragflow/docker
@ -173,8 +173,8 @@

   | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
   | ----------------- | --------------- | --------------------- | ------------------------ |
-   | v0.20.2           | &approx;9       | :heavy_check_mark:    | Stable release           |
-   | v0.20.2-slim      | &approx;2       | ❌                    | Stable release           |
+   | v0.20.4           | &approx;9       | :heavy_check_mark:    | Stable release           |
+   | v0.20.4-slim      | &approx;2       | ❌                    | Stable release           |
   | nightly           | &approx;9       | :heavy_check_mark:    | _Unstable_ nightly build |
   | nightly-slim      | &approx;2       | ❌                     | _Unstable_ nightly build |

@ -266,7 +266,7 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly

 ## 🔨 ソースコードからサービスを起動する方法

-1. uv をインストールする。すでにインストールされている場合は、このステップをスキップしてください:
+1. `uv` と `pre-commit` をインストールする。すでにインストールされている場合は、このステップをスキップしてください:

   ```bash
   pipx install uv pre-commit
--- a/README_ko.md
+++ b/README_ko.md
@ -22,7 +22,7 @@
        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.2">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -160,7 +160,7 @@
 > 모든 Docker 이미지는 x86 플랫폼을 위해 빌드되었습니다. 우리는 현재 ARM64 플랫폼을 위한 Docker 이미지를 제공하지 않습니다.
 > ARM64 플랫폼을 사용 중이라면, [시스템과 호환되는 Docker 이미지를 빌드하려면 이 가이드를 사용해 주세요](https://ragflow.io/docs/dev/build_docker_image).

-   > 아래 명령어는 RAGFlow Docker 이미지의 v0.20.2-slim 버전을 다운로드합니다. 다양한 RAGFlow 버전에 대한 설명은 다음 표를 참조하십시오. v0.20.2-slim과 다른 RAGFlow 버전을 다운로드하려면, docker/.env 파일에서 RAGFLOW_IMAGE 변수를 적절히 업데이트한 후 docker compose를 사용하여 서버를 시작하십시오. 예를 들어, 전체 버전인 v0.20.2을 다운로드하려면 RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2로 설정합니다.
+   > 아래 명령어는 RAGFlow Docker 이미지의 v0.20.4-slim 버전을 다운로드합니다. 다양한 RAGFlow 버전에 대한 설명은 다음 표를 참조하십시오. v0.20.4-slim과 다른 RAGFlow 버전을 다운로드하려면, docker/.env 파일에서 RAGFLOW_IMAGE 변수를 적절히 업데이트한 후 docker compose를 사용하여 서버를 시작하십시오. 예를 들어, 전체 버전인 v0.20.4을 다운로드하려면 RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4로 설정합니다.

   ```bash
   $ cd ragflow/docker
@ -173,8 +173,8 @@

   | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
   | ----------------- | --------------- | --------------------- | ------------------------ |
-   | v0.20.2           | &approx;9       | :heavy_check_mark:    | Stable release           |
-   | v0.20.2-slim      | &approx;2       | ❌                    | Stable release           |
+   | v0.20.4           | &approx;9       | :heavy_check_mark:    | Stable release           |
+   | v0.20.4-slim      | &approx;2       | ❌                    | Stable release           |
   | nightly           | &approx;9       | :heavy_check_mark:    | _Unstable_ nightly build |
   | nightly-slim      | &approx;2       | ❌                     | _Unstable_ nightly build |

@ -265,7 +265,7 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly

 ## 🔨 소스 코드로 서비스를 시작합니다.

-1. uv를 설치하거나 이미 설치된 경우 이 단계를 건너뜁니다:
+1. `uv` 와 `pre-commit` 을 설치하거나, 이미 설치된 경우 이 단계를 건너뜁니다:

   ```bash
   pipx install uv pre-commit
--- a/README_pt_br.md
+++ b/README_pt_br.md
@ -22,7 +22,7 @@
        <img alt="Badge Estático" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.2">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Última%20Relese" alt="Última Versão">
@ -180,7 +180,7 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
 > Todas as imagens Docker são construídas para plataformas x86. Atualmente, não oferecemos imagens Docker para ARM64.
 > Se você estiver usando uma plataforma ARM64, por favor, utilize [este guia](https://ragflow.io/docs/dev/build_docker_image) para construir uma imagem Docker compatível com o seu sistema.

-    > O comando abaixo baixa a edição `v0.20.2-slim` da imagem Docker do RAGFlow. Consulte a tabela a seguir para descrições de diferentes edições do RAGFlow. Para baixar uma edição do RAGFlow diferente da `v0.20.2-slim`, atualize a variável `RAGFLOW_IMAGE` conforme necessário no **docker/.env** antes de usar `docker compose` para iniciar o servidor. Por exemplo: defina `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2` para a edição completa `v0.20.2`.
+    > O comando abaixo baixa a edição `v0.20.4-slim` da imagem Docker do RAGFlow. Consulte a tabela a seguir para descrições de diferentes edições do RAGFlow. Para baixar uma edição do RAGFlow diferente da `v0.20.4-slim`, atualize a variável `RAGFLOW_IMAGE` conforme necessário no **docker/.env** antes de usar `docker compose` para iniciar o servidor. Por exemplo: defina `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4` para a edição completa `v0.20.4`.

    ```bash
    $ cd ragflow/docker
@ -193,8 +193,8 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).

    | Tag da imagem RAGFlow | Tamanho da imagem (GB) | Possui modelos de incorporação? | Estável?                 |
    | --------------------- | ---------------------- | ------------------------------- | ------------------------ |
-    | v0.20.2               | ~9                     | :heavy_check_mark:              | Lançamento estável       |
-    | v0.20.2-slim          | ~2                     | ❌                              | Lançamento estável       |
+    | v0.20.4               | ~9                     | :heavy_check_mark:              | Lançamento estável       |
+    | v0.20.4-slim          | ~2                     | ❌                              | Lançamento estável       |
    | nightly               | ~9                     | :heavy_check_mark:              | _Instável_ build noturno |
    | nightly-slim          | ~2                     | ❌                               | _Instável_ build noturno |

@ -289,7 +289,7 @@ docker build --platform linux/amd64 -f Dockerfile -t infiniflow/ragflow:nightly

 ## 🔨 Lançar o serviço a partir do código-fonte para desenvolvimento

-1. Instale o `uv`, ou pule esta etapa se ele já estiver instalado:
+1. Instale o `uv` e o `pre-commit`, ou pule esta etapa se eles já estiverem instalados:

   ```bash
   pipx install uv pre-commit
--- a/README_tzh.md
+++ b/README_tzh.md
@ -22,7 +22,7 @@
        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.2">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -183,7 +183,7 @@
 > 所有 Docker 映像檔都是為 x86 平台建置的。目前，我們不提供 ARM64 平台的 Docker 映像檔。
 > 如果您使用的是 ARM64 平台，請使用 [這份指南](https://ragflow.io/docs/dev/build_docker_image) 來建置適合您系統的 Docker 映像檔。

-   > 執行以下指令會自動下載 RAGFlow slim Docker 映像 `v0.20.2-slim`。請參考下表查看不同 Docker 發行版的說明。如需下載不同於 `v0.20.2-slim` 的 Docker 映像，請在執行 `docker compose` 啟動服務之前先更新 **docker/.env** 檔案內的 `RAGFLOW_IMAGE` 變數。例如，你可以透過設定 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2` 來下載 RAGFlow 鏡像的 `v0.20.2` 完整發行版。
+   > 執行以下指令會自動下載 RAGFlow slim Docker 映像 `v0.20.4-slim`。請參考下表查看不同 Docker 發行版的說明。如需下載不同於 `v0.20.4-slim` 的 Docker 映像，請在執行 `docker compose` 啟動服務之前先更新 **docker/.env** 檔案內的 `RAGFLOW_IMAGE` 變數。例如，你可以透過設定 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4` 來下載 RAGFlow 鏡像的 `v0.20.4` 完整發行版。

   ```bash
   $ cd ragflow/docker
@ -196,8 +196,8 @@

   | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
   | ----------------- | --------------- | --------------------- | ------------------------ |
-   | v0.20.2           | &approx;9       | :heavy_check_mark:    | Stable release           |
-   | v0.20.2-slim      | &approx;2       | ❌                    | Stable release           |
+   | v0.20.4           | &approx;9       | :heavy_check_mark:    | Stable release           |
+   | v0.20.4-slim      | &approx;2       | ❌                    | Stable release           |
   | nightly           | &approx;9       | :heavy_check_mark:    | _Unstable_ nightly build |
   | nightly-slim      | &approx;2       | ❌                     | _Unstable_ nightly build |

@ -301,7 +301,7 @@ docker build --platform linux/amd64 --build-arg NEED_MIRROR=1 -f Dockerfile -t i

 ## 🔨 以原始碼啟動服務

-1. 安裝 uv。如已安裝，可跳過此步驟：
+1. 安裝 `uv` 和 `pre-commit`。如已安裝，可跳過此步驟：

   ```bash
   pipx install uv pre-commit
--- a/README_zh.md
+++ b/README_zh.md
@ -22,7 +22,7 @@
        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.2">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.4">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -183,7 +183,7 @@
 > 请注意，目前官方提供的所有 Docker 镜像均基于 x86 架构构建，并不提供基于 ARM64 的 Docker 镜像。
 > 如果你的操作系统是 ARM64 架构，请参考[这篇文档](https://ragflow.io/docs/dev/build_docker_image)自行构建 Docker 镜像。

-   > 运行以下命令会自动下载 RAGFlow slim Docker 镜像 `v0.20.2-slim`。请参考下表查看不同 Docker 发行版的描述。如需下载不同于 `v0.20.2-slim` 的 Docker 镜像，请在运行 `docker compose` 启动服务之前先更新 **docker/.env** 文件内的 `RAGFLOW_IMAGE` 变量。比如，你可以通过设置 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2` 来下载 RAGFlow 镜像的 `v0.20.2` 完整发行版。
+   > 运行以下命令会自动下载 RAGFlow slim Docker 镜像 `v0.20.4-slim`。请参考下表查看不同 Docker 发行版的描述。如需下载不同于 `v0.20.4-slim` 的 Docker 镜像，请在运行 `docker compose` 启动服务之前先更新 **docker/.env** 文件内的 `RAGFLOW_IMAGE` 变量。比如，你可以通过设置 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4` 来下载 RAGFlow 镜像的 `v0.20.4` 完整发行版。

   ```bash
   $ cd ragflow/docker
@ -196,8 +196,8 @@

   | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
   | ----------------- | --------------- | --------------------- | ------------------------ |
-   | v0.20.2           | &approx;9       | :heavy_check_mark:    | Stable release           |
-   | v0.20.2-slim      | &approx;2       | ❌                    | Stable release           |
+   | v0.20.4           | &approx;9       | :heavy_check_mark:    | Stable release           |
+   | v0.20.4-slim      | &approx;2       | ❌                    | Stable release           |
   | nightly           | &approx;9       | :heavy_check_mark:    | _Unstable_ nightly build |
   | nightly-slim      | &approx;2       | ❌                     | _Unstable_ nightly build |

@ -301,7 +301,7 @@ docker build --platform linux/amd64 --build-arg NEED_MIRROR=1 -f Dockerfile -t i

 ## 🔨 以源代码启动服务

-1. 安装 uv。如已经安装，可跳过本步骤：
+1. 安装 `uv` 和 `pre-commit`。如已经安装，可跳过本步骤：

   ```bash
   pipx install uv pre-commit
--- a/agent/canvas.py
+++ b/agent/canvas.py
@ -29,83 +29,52 @@ from api.utils import get_uuid, hash_str2int
 from rag.prompts.prompts import chunks_format
 from rag.utils.redis_conn import REDIS_CONN

-
-class Canvas:
+class Graph:
    """
-    dsl = {
-        "components": {
-            "begin": {
-                "obj":{
-                    "component_name": "Begin",
-                    "params": {},
-                },
-                "downstream": ["answer_0"],
-                "upstream": [],
-            },
-            "retrieval_0": {
-                "obj": {
-                    "component_name": "Retrieval",
-                    "params": {}
-                },
-                "downstream": ["generate_0"],
-                "upstream": ["answer_0"],
-            },
-            "generate_0": {
-                "obj": {
-                    "component_name": "Generate",
-                    "params": {}
-                },
-                "downstream": ["answer_0"],
-                "upstream": ["retrieval_0"],
-            }
-        },
-        "history": [],
-        "path": ["begin"],
-        "retrieval": {"chunks": [], "doc_aggs": []},
-        "globals": {
-            "sys.query": "",
-            "sys.user_id": tenant_id,
-            "sys.conversation_turns": 0,
-            "sys.files": []
-        }
-    }
-    """
-
-    def __init__(self, dsl: str, tenant_id=None, task_id=None):
-        self.path = []
-        self.history = []
-        self.components = {}
-        self.error = ""
-        self.globals = {
-            "sys.query": "",
-            "sys.user_id": tenant_id,
-            "sys.conversation_turns": 0,
-            "sys.files": []
-        }
-        self.dsl = json.loads(dsl) if dsl else {
+        dsl = {
            "components": {
                "begin": {
-                    "obj": {
+                    "obj":{
                        "component_name": "Begin",
-                        "params": {
-                            "prologue": "Hi there!"
-                        }
+                        "params": {},
                    },
-                    "downstream": [],
+                    "downstream": ["answer_0"],
                    "upstream": [],
-                    "parent_id": ""
+                },
+                "retrieval_0": {
+                    "obj": {
+                        "component_name": "Retrieval",
+                        "params": {}
+                    },
+                    "downstream": ["generate_0"],
+                    "upstream": ["answer_0"],
+                },
+                "generate_0": {
+                    "obj": {
+                        "component_name": "Generate",
+                        "params": {}
+                    },
+                    "downstream": ["answer_0"],
+                    "upstream": ["retrieval_0"],
                }
            },
            "history": [],
-            "path": [],
-            "retrieval": [],
+            "path": ["begin"],
+            "retrieval": {"chunks": [], "doc_aggs": []},
            "globals": {
                "sys.query": "",
-                "sys.user_id": "",
+                "sys.user_id": tenant_id,
                "sys.conversation_turns": 0,
                "sys.files": []
            }
        }
+        """
+
+    def __init__(self, dsl: str, tenant_id=None, task_id=None):
+        self.path = []
+        self.components = {}
+        self.error = ""
+        self.dsl = json.loads(dsl)
        self._tenant_id = tenant_id
        self.task_id = task_id if task_id else get_uuid()
        self.load()
@ -116,8 +85,6 @@ class Canvas:
        for k, cpn in self.components.items():
            cpn_nms.add(cpn["obj"]["component_name"])

-        assert "Begin" in cpn_nms, "There have to be an 'Begin' component."
-
        for k, cpn in self.components.items():
            cpn_nms.add(cpn["obj"]["component_name"])
            param = component_class(cpn["obj"]["component_name"] + "Param")()
@ -130,18 +97,10 @@ class Canvas:
            cpn["obj"] = component_class(cpn["obj"]["component_name"])(self, k, param)

        self.path = self.dsl["path"]
-        self.history = self.dsl["history"]
-        self.globals = self.dsl["globals"]
-        self.retrieval = self.dsl["retrieval"]
-        self.memory = self.dsl.get("memory", [])

    def __str__(self):
        self.dsl["path"] = self.path
-        self.dsl["history"] = self.history
-        self.dsl["globals"] = self.globals
        self.dsl["task_id"] = self.task_id
-        self.dsl["retrieval"] = self.retrieval
-        self.dsl["memory"] = self.memory
        dsl = {
            "components": {}
        }
@ -160,14 +119,79 @@ class Canvas:
                dsl["components"][k][c] = deepcopy(cpn[c])
        return json.dumps(dsl, ensure_ascii=False)

-    def reset(self, mem=False):
+    def reset(self):
        self.path = []
+        for k, cpn in self.components.items():
+            self.components[k]["obj"].reset()
+        try:
+            REDIS_CONN.delete(f"{self.task_id}-logs")
+        except Exception as e:
+            logging.exception(e)
+
+    def get_component_name(self, cid):
+        for n in self.dsl.get("graph", {}).get("nodes", []):
+            if cid == n["id"]:
+                return n["data"]["name"]
+        return ""
+
+    def run(self, **kwargs):
+        raise NotImplementedError()
+
+    def get_component(self, cpn_id) -> Union[None, dict[str, Any]]:
+        return self.components.get(cpn_id)
+
+    def get_component_obj(self, cpn_id) -> ComponentBase:
+        return self.components.get(cpn_id)["obj"]
+
+    def get_component_type(self, cpn_id) -> str:
+        return self.components.get(cpn_id)["obj"].component_name
+
+    def get_component_input_form(self, cpn_id) -> dict:
+        return self.components.get(cpn_id)["obj"].get_input_form()
+
+    def get_tenant_id(self):
+        return self._tenant_id
+
+
+class Canvas(Graph):
+
+    def __init__(self, dsl: str, tenant_id=None, task_id=None):
+        self.globals = {
+            "sys.query": "",
+            "sys.user_id": tenant_id,
+            "sys.conversation_turns": 0,
+            "sys.files": []
+        }
+        super().__init__(dsl, tenant_id, task_id)
+
+    def load(self):
+        super().load()
+        self.history = self.dsl["history"]
+        if "globals" in self.dsl:
+            self.globals = self.dsl["globals"]
+        else:
+            self.globals = {
+            "sys.query": "",
+            "sys.user_id": "",
+            "sys.conversation_turns": 0,
+            "sys.files": []
+        }
+            
+        self.retrieval = self.dsl["retrieval"]
+        self.memory = self.dsl.get("memory", [])
+
+    def __str__(self):
+        self.dsl["history"] = self.history
+        self.dsl["retrieval"] = self.retrieval
+        self.dsl["memory"] = self.memory
+        return super().__str__()
+
+    def reset(self, mem=False):
+        super().reset()
        if not mem:
            self.history = []
            self.retrieval = []
            self.memory = []
-        for k, cpn in self.components.items():
-            self.components[k]["obj"].reset()

        for k in self.globals.keys():
            if isinstance(self.globals[k], str):
@ -183,22 +207,13 @@ class Canvas:
            else:
                self.globals[k] = None

-        try:
-            REDIS_CONN.delete(f"{self.task_id}-logs")
-        except Exception as e:
-            logging.exception(e)
-
-    def get_component_name(self, cid):
-        for n in self.dsl.get("graph", {}).get("nodes", []):
-            if cid == n["id"]:
-                return n["data"]["name"]
-        return ""
-
    def run(self, **kwargs):
        st = time.perf_counter()
        self.message_id = get_uuid()
        created_at = int(time.time())
        self.add_user_input(kwargs.get("query"))
+        for k, cpn in self.components.items():
+            self.components[k]["obj"].reset(True)

        for k in kwargs.keys():
            if k in ["query", "user_id", "files"] and kwargs[k]:
@ -377,18 +392,6 @@ class Canvas:
                       })
            self.history.append(("assistant", self.get_component_obj(self.path[-1]).output()))

-    def get_component(self, cpn_id) -> Union[None, dict[str, Any]]:
-        return self.components.get(cpn_id)
-
-    def get_component_obj(self, cpn_id) -> ComponentBase:
-        return self.components.get(cpn_id)["obj"]
-
-    def get_component_type(self, cpn_id) -> str:
-        return self.components.get(cpn_id)["obj"].component_name
-
-    def get_component_input_form(self, cpn_id) -> dict:
-        return self.components.get(cpn_id)["obj"].get_input_form()
-
    def is_reff(self, exp: str) -> bool:
        exp = exp.strip("{").strip("}")
        if exp.find("@") < 0:
@ -410,14 +413,11 @@ class Canvas:
            raise Exception(f"Can't find variable: '{cpn_id}@{var_nm}'")
        return cpn["obj"].output(var_nm)

-    def get_tenant_id(self):
-        return self._tenant_id
-
    def get_history(self, window_size):
        convs = []
        if window_size <= 0:
            return convs
-        for role, obj in self.history[window_size * -1:]:
+        for role, obj in self.history[window_size * -2:]:
            if isinstance(obj, dict):
                convs.append({"role": role, "content": obj.get("content", "")})
            else:
@@ -427,39 +427,12 @@ class Canvas:
    def add_user_input(self, question):
        self.history.append(("user", question))

-    def _find_loop(self, max_loops=6):
-        path = self.path[-1][::-1]
-        if len(path) < 2:
-            return False
-
-        for i in range(len(path)):
-            if path[i].lower().find("answer") == 0 or path[i].lower().find("iterationitem") == 0:
-                path = path[:i]
-                break
-
-        if len(path) < 2:
-            return False
-
-        for loc in range(2, len(path) // 2):
-            pat = ",".join(path[0:loc])
-            path_str = ",".join(path)
-            if len(pat) >= len(path_str):
-                return False
-            loop = max_loops
-            while path_str.find(pat) == 0 and loop >= 0:
-                loop -= 1
-                if len(pat)+1 >= len(path_str):
-                    return False
-                path_str = path_str[len(pat)+1:]
-            if loop < 0:
-                pat = " => ".join([p.split(":")[0] for p in path[0:loc]])
-                return pat + " => " + pat
-
-        return False
-
    def get_prologue(self):
        return self.components["begin"]["obj"]._param.prologue

+    def get_mode(self):
+        return self.components["begin"]["obj"]._param.mode
+
    def set_global_param(self, **kwargs):
        self.globals.update(kwargs)
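
Note on the `get_history` hunk above: slicing with `window_size * -2` instead of `window_size * -1` reads as counting the window in turns (one user entry plus one assistant entry) rather than in single messages. A minimal sketch of that slicing, assuming the history is a flat list of `(role, content)` tuples as in the hunk. The standalone function, the sample data, and the `str(obj)` fallback for the elided `else` branch are illustrative assumptions, not code from this commit:

```python
from typing import Any


def get_history(history: list[tuple[str, Any]], window_size: int) -> list[dict[str, str]]:
    """Return the last `window_size` turns, assuming two entries per turn."""
    if window_size <= 0:
        return []
    convs = []
    for role, obj in history[window_size * -2:]:
        if isinstance(obj, dict):
            convs.append({"role": role, "content": obj.get("content", "")})
        else:
            # Assumption: non-dict entries are stringified; the hunk elides this branch.
            convs.append({"role": role, "content": str(obj)})
    return convs


history = [
    ("user", "q1"), ("assistant", "a1"),
    ("user", "q2"), ("assistant", {"content": "a2"}),
    ("user", "q3"), ("assistant", "a3"),
]
# Two turns -> the last four entries, starting at the second user question.
recent = get_history(history, 2)
```

With the old `* -1` multiplier the same call would have returned only the last two entries, that is, a single turn.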

--- a/agent/component/__init__.py
+++ b/agent/component/__init__.py
@@ -50,8 +50,9 @@ del _package_path, _import_submodules, _extract_classes_from_module


 def component_class(class_name):
-    m = importlib.import_module("agent.component")
-    try:
-        return getattr(m, class_name)
-    except Exception:
-        return getattr(importlib.import_module("agent.tools"), class_name)
+    for mdl in ["agent.component", "agent.tools", "rag.flow"]:
+        try:
+            return getattr(importlib.import_module(mdl), class_name)
+        except Exception:
+            pass
+    assert False, f"Can't import {class_name}"
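
Note on `component_class` above: the new version tries each package in order and returns the first match. A minimal sketch of the same ordered lookup, using standard-library modules as stand-ins for `agent.component`, `agent.tools`, and `rag.flow`. The function name `resolve_class` and the choice to raise `ImportError` are assumptions of this sketch; the hunk itself ends with `assert False`, which is skipped when Python runs with `-O`:

```python
import importlib


def resolve_class(class_name: str, modules: list[str]):
    """Return the first attribute named `class_name` found in `modules`, in order."""
    for mdl in modules:
        try:
            return getattr(importlib.import_module(mdl), class_name)
        except (ImportError, AttributeError):
            # Module missing or name not defined there: try the next candidate.
            continue
    # Raising keeps the failure visible even when assertions are disabled.
    raise ImportError(f"Can't import {class_name}")


# Stand-in modules from the standard library, not the RAGFlow packages.
cls = resolve_class("OrderedDict", ["json", "collections"])
```

The order of the module list decides which class wins if two packages export the same name.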
--- a/agent/component/base.py
+++ b/agent/component/base.py
@@ -16,7 +16,7 @@

 import re
 import time
-from abc import ABC, abstractmethod
+from abc import ABC
 import builtins
 import json
 import os
@@ -36,7 +36,7 @@ _IS_RAW_CONF = "_is_raw_conf"

 class ComponentParamBase(ABC):
    def __init__(self):
-        self.message_history_window_size = 22
+        self.message_history_window_size = 13
        self.inputs = {}
        self.outputs = {}
        self.description = ""
@@ -410,8 +410,8 @@ class ComponentBase(ABC):
        )

    def __init__(self, canvas, id, param: ComponentParamBase):
-        from agent.canvas import Canvas  # Local import to avoid cyclic dependency
-        assert isinstance(canvas, Canvas), "canvas must be an instance of Canvas"
+        from agent.canvas import Graph  # Local import to avoid cyclic dependency
+        assert isinstance(canvas, Graph), "canvas must be an instance of Graph"
        self._canvas = canvas
        self._id = id
        self._param = param
@@ -448,9 +448,11 @@ class ComponentBase(ABC):
    def error(self):
        return self._param.outputs.get("_ERROR", {}).get("value")

-    def reset(self):
+    def reset(self, only_output=False):
        for k in self._param.outputs.keys():
            self._param.outputs[k]["value"] = None
+        if only_output:
+            return
        for k in self._param.inputs.keys():
            self._param.inputs[k]["value"] = None
        self._param.debug_inputs = {}
@@ -526,6 +528,10 @@ class ComponentBase(ABC):
        cpn_nms = self._canvas.get_component(self._id)['upstream']
        return cpn_nms

+    def get_downstream(self) -> List[str]:
+        cpn_nms = self._canvas.get_component(self._id)['downstream']
+        return cpn_nms
+
    @staticmethod
    def string_format(content: str, kv: dict[str, str]) -> str:
        for n, v in kv.items():
@@ -554,6 +560,5 @@ class ComponentBase(ABC):
    def set_exception_default_value(self):
        self.set_output("result", self.get_exception_default_value())

-    @abstractmethod
    def thoughts(self) -> str:
-        ...
+        raise NotImplementedError()
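
Note on `reset(only_output=False)` above: `Canvas.run` now calls `reset(True)` on every component, which clears the outputs of the previous run while leaving inputs and debug inputs in place. A minimal sketch of the two modes; the `Param` and `Component` classes here are simplified stand-ins, not the real `ComponentParamBase` and `ComponentBase`:

```python
class Param:
    """Simplified stand-in holding the three fields touched by reset()."""

    def __init__(self):
        self.inputs = {"query": {"value": "hello"}}
        self.outputs = {"content": {"value": "world"}}
        self.debug_inputs = {"query": {"value": "debug"}}


class Component:
    def __init__(self, param: Param):
        self._param = param

    def reset(self, only_output: bool = False) -> None:
        # Outputs are always cleared.
        for k in self._param.outputs:
            self._param.outputs[k]["value"] = None
        if only_output:
            return
        # A full reset also clears inputs and debug inputs.
        for k in self._param.inputs:
            self._param.inputs[k]["value"] = None
        self._param.debug_inputs = {}


partial = Component(Param())
partial.reset(only_output=True)   # inputs survive

full = Component(Param())
full.reset()                      # everything cleared
```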
--- a/agent/component/llm.py
+++ b/agent/component/llm.py
@@ -18,11 +18,8 @@ import logging
 import os
 import re
 from typing import Any, Generator
-
 import json_repair
-from copy import deepcopy
 from functools import partial
-
 from api.db import LLMType
 from api.db.services.llm_service import LLMBundle
 from api.db.services.tenant_llm_service import TenantLLMService
@@ -130,7 +127,7 @@ class LLM(ComponentBase):

        args = {}
        vars = self.get_input_elements() if not self._param.debug_inputs else self._param.debug_inputs
-        prompt = self._param.sys_prompt
+        sys_prompt = self._param.sys_prompt
        for k, o in vars.items():
            args[k] = o["value"]
            if not isinstance(args[k], str):
@@ -141,14 +138,18 @@ class LLM(ComponentBase):
            self.set_input_value(k, args[k])

        msg = self._canvas.get_history(self._param.message_history_window_size)[:-1]
-        msg.extend(deepcopy(self._param.prompts))
-        prompt = self.string_format(prompt, args)
+        for p in self._param.prompts:
+            if msg and msg[-1]["role"] == p["role"]:
+                continue
+            msg.append(p)
+
+        sys_prompt = self.string_format(sys_prompt, args)
        for m in msg:
            m["content"] = self.string_format(m["content"], args)
        if self._param.cite and self._canvas.get_reference()["chunks"]:
-            prompt += citation_prompt()
+            sys_prompt += citation_prompt()

-        return prompt, msg
+        return sys_prompt, msg

    def _generate(self, msg:list[dict], **kwargs) -> str:
        if not self.imgs:
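
Note on the prompt loop in the `llm.py` hunk above: a configured prompt is appended only when its role differs from the role of the last message, so the message list never ends up with two consecutive entries of the same role. A minimal sketch of that merge; the function name `merge_prompts` and the sample messages are hypothetical. The `deepcopy` is an assumption of this sketch (the hunk appends `p` directly) so that later in-place formatting of `m["content"]` would not alter the configured template:

```python
from copy import deepcopy


def merge_prompts(history: list[dict], prompts: list[dict]) -> list[dict]:
    """Append each prompt unless it repeats the role of the last message."""
    msg = list(history)
    for p in prompts:
        if msg and msg[-1]["role"] == p["role"]:
            continue
        # Copy so that formatting the merged list leaves the template untouched.
        msg.append(deepcopy(p))
    return msg


prompts = [{"role": "user", "content": "# User Query\n {sys.query}"}]

# History ends with an assistant message: the user prompt is appended.
merged = merge_prompts(
    [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}],
    prompts,
)

# History already ends with a user message: the user prompt is skipped.
skipped = merge_prompts([{"role": "user", "content": "hi"}], prompts)
```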
--- a/agent/templates/choose_your_knowledge_base_agent.json
+++ b/agent/templates/choose_your_knowledge_base_agent.json
@@ -1,8 +1,12 @@
 {
    "id": 19,
-    "title": "Choose Your Knowledge Base Agent",
-    "description": "Select your desired knowledge base from the dropdown menu. The Agent will only retrieve from the selected knowledge base and use this content  to generate responses.",
-    "canvas_type": "Agent", 
+    "title": {
+		"en": "Choose Your Knowledge Base Agent",
+		"zh": "选择知识库智能体"},
+    "description": {
+		"en": "Select your desired knowledge base from the dropdown menu. The Agent will only retrieve from the selected knowledge base and use this content  to generate responses.",
+		"zh": "从下拉菜单中选择知识库，智能体将仅根据所选知识库内容生成回答。"},
+	"canvas_type": "Agent", 
    "dsl": {
 		"components": {
 			"Agent:BraveParksJoke": {
--- a/agent/templates/choose_your_knowledge_base_workflow.json
+++ b/agent/templates/choose_your_knowledge_base_workflow.json
@@ -1,8 +1,12 @@
 {
    "id": 18,
-    "title": "Choose Your Knowledge Base Workflow",
-    "description": "Select your desired knowledge base from the dropdown menu. The retrieval assistant will only use data from your selected knowledge base to generate responses.",
-    "canvas_type": "Other",
+    "title": {
+		"en": "Choose Your Knowledge Base Workflow",
+		"zh": "选择知识库工作流"},
+    "description": {
+		"en": "Select your desired knowledge base from the dropdown menu. The retrieval assistant will only use data from your selected knowledge base to generate responses.",
+		"zh": "从下拉菜单中选择知识库，工作流将仅根据所选知识库内容生成回答。"},
+	"canvas_type": "Other",
    "dsl": {
 		"components": {
 			"Agent:ProudDingosShout": {
--- a/agent/templates/customer_review_analysis.json
+++ b/agent/templates/customer_review_analysis.json
@@ -1,9 +1,13 @@

 {
    "id": 11,
-    "title": "Customer Review Analysis",
-    "description": "Automatically classify customer reviews using LLM (Large Language Model) and route them via email to the relevant departments.",
-    "canvas_type": "Customer Support",
+    "title": {
+		"en": "Customer Review Analysis",
+		"zh": "客户评价分析"},
+    "description": {
+		"en": "Automatically classify customer reviews using LLM (Large Language Model) and route them via email to the relevant departments.",
+		"zh": "大模型将自动分类客户评价，并通过电子邮件将结果发送到相关部门。"},
+	"canvas_type": "Customer Support",
    "dsl": {
 		"components": {
 			"Categorize:FourTeamsFold": {
--- a/agent/templates/customer_service.json
+++ b/agent/templates/customer_service.json
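
Note on the template hunks in this commit: `title` and `description` change from plain strings to objects keyed by language code (`en`, `zh`). A minimal sketch of how a consumer might read either shape; the helper `localized` and its fallback order are hypothetical and not taken from this commit:

```python
def localized(value, lang: str, default_lang: str = "en") -> str:
    """Pick a language from a {"en": ..., "zh": ...} object, or pass a plain string through."""
    if isinstance(value, dict):
        # Fall back to the default language when the requested one is missing.
        return value.get(lang) or value.get(default_lang) or ""
    return value or ""


template = {"title": {"en": "Customer Support", "zh": "客户支持"}}
legacy = {"title": "Customer Support"}  # pre-change shape

zh_title = localized(template["title"], "zh")
fallback_title = localized(template["title"], "de")
legacy_title = localized(legacy["title"], "zh")
```

Accepting both shapes keeps older templates that still carry plain strings readable.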
--- a/agent/templates/customer_support.json
+++ b/agent/templates/customer_support.json
@@ -1,8 +1,12 @@

 {
    "id": 10,
-    "title": "Customer Support",
-    "description": "This is an intelligent customer service processing system workflow based on user intent classification. It uses LLM to identify user demand types and transfers them to the corresponding professional agent for processing.",
+    "title": {
+        "en":"Customer Support",
+        "zh": "客户支持"},
+    "description": {
+        "en": "This is an intelligent customer service processing system workflow based on user intent classification. It uses LLM to identify user demand types and transfers them to the corresponding professional agent for processing.",
+        "zh": "工作流系统，用于智能客服场景。基于用户意图分类。使用大模型识别用户需求类型，并将需求转移给相应的智能体进行处理。"},
    "canvas_type": "Customer Support",
    "dsl": {
            "components": {
--- a/agent/templates/cv_analysis_and_candidate_evaluation.json
+++ b/agent/templates/cv_analysis_and_candidate_evaluation.json
@@ -1,8 +1,12 @@

 {
    "id": 15,
-    "title": "CV Analysis and Candidate Evaluation",
-    "description": "This is a workflow that helps companies evaluate resumes, HR uploads a job description first, then submits multiple resumes via the chat window for evaluation.",
+    "title": {
+        "en": "CV Analysis and Candidate Evaluation",
+        "zh": "简历分析和候选人评估"},
+    "description": {
+        "en": "This is a workflow that helps companies evaluate resumes, HR uploads a job description first, then submits multiple resumes via the chat window for evaluation.",
+        "zh": "帮助公司评估简历的工作流。HR首先上传职位描述，通过聊天窗口提交多份简历进行评估。"},
    "canvas_type": "Other",
    "dsl": {
            "components": {
--- a/agent/templates/cv_evaluation.json
+++ b/agent/templates/cv_evaluation.json
--- a/agent/templates/deep_research.json
+++ b/agent/templates/deep_research.json
@@ -1,8 +1,12 @@
      
 {
    "id": 1,
-    "title": "Deep Research",
-    "description": "For professionals in sales, marketing, policy, or consulting, the Multi-Agent Deep Research Agent conducts structured, multi-step investigations across diverse sources and delivers consulting-style reports with clear citations.",
+    "title": {
+        "en": "Deep Research",
+        "zh": "深度研究"},
+    "description": {
+        "en": "For professionals in sales, marketing, policy, or consulting, the Multi-Agent Deep Research Agent conducts structured, multi-step investigations across diverse sources and delivers consulting-style reports with clear citations.",
+        "zh": "专为销售、市场、政策或咨询领域的专业人士设计，多智能体的深度研究会结合多源信息进行结构化、多步骤地回答问题，并附带有清晰的引用。"},
    "canvas_type": "Recommended",
    "dsl": {
            "components": {
--- a/agent/templates/deep_search_r.json
+++ b/agent/templates/deep_search_r.json
@@ -1,8 +1,12 @@

 {
    "id": 6,
-    "title": "Deep Research",
-    "description": "For professionals in sales, marketing, policy, or consulting, the Multi-Agent Deep Research Agent conducts structured, multi-step investigations across diverse sources and delivers consulting-style reports with clear citations.",
+    "title": {
+        "en": "Deep Research",
+        "zh": "深度研究"},
+    "description": {
+        "en": "For professionals in sales, marketing, policy, or consulting, the Multi-Agent Deep Research Agent conducts structured, multi-step investigations across diverse sources and delivers consulting-style reports with clear citations.",
+        "zh": "专为销售、市场、政策或咨询领域的专业人士设计，多智能体的深度研究会结合多源信息进行结构化、多步骤地回答问题，并附带有清晰的引用。"},
    "canvas_type": "Agent",
    "dsl": {
            "components": {
--- a/agent/templates/ecommerce_customer_service_workflow.json
+++ b/agent/templates/ecommerce_customer_service_workflow.json
--- a/agent/templates/generate_SEO_blog.json
+++ b/agent/templates/generate_SEO_blog.json
@@ -1,7 +1,11 @@
 {
    "id": 8,
-    "title": "Generate SEO Blog",
-    "description": "This is a multi-agent version of the SEO blog generation workflow. It simulates a small team of AI “writers”, where each agent plays a specialized role — just like a real editorial team.",
+    "title": {
+        "en": "Generate SEO Blog",
+        "zh": "生成SEO博客"},
+    "description": {
+        "en": "This is a multi-agent version of the SEO blog generation workflow. It simulates a small team of AI “writers”, where each agent plays a specialized role — just like a real editorial team.",
+        "zh": "多智能体架构可根据简单的用户输入自动生成完整的SEO博客文章。模拟小型“作家”团队，其中每个智能体扮演一个专业角色——就像真正的编辑团队。"},
    "canvas_type": "Agent",
    "dsl": {
            "components": {
--- a/agent/templates/image_lingo.json
+++ b/agent/templates/image_lingo.json
@@ -1,7 +1,11 @@
 {
    "id": 13,
-    "title": "ImageLingo",
-    "description": "ImageLingo lets you snap any photo containing text—menus, signs, or documents—and instantly recognize and translate it into your language of choice using advanced AI-powered translation technology.",
+    "title": {
+        "en": "ImageLingo",
+        "zh": "图片解析"},
+    "description": {
+        "en": "ImageLingo lets you snap any photo containing text—menus, signs, or documents—and instantly recognize and translate it into your language of choice using advanced AI-powered translation technology.",
+        "zh": "多模态大模型允许您拍摄任何包含文本的照片——菜单、标志或文档——立即识别并转换成您选择的语言。"},
    "canvas_type": "Consumer App",
    "dsl": {
            "components": {
--- a/agent/templates/knowledge_base_report.json
+++ b/agent/templates/knowledge_base_report.json
@@ -1,7 +1,11 @@
 {
    "id": 20,
-    "title": "Report Agent Using Knowledge Base",
-    "description": "A report generation assistant using local knowledge base, with advanced capabilities in task planning, reasoning, and reflective analysis. Recommended for academic research paper Q&A",
+    "title": {
+        "en": "Report Agent Using Knowledge Base",
+        "zh": "知识库检索智能体"},
+    "description": {
+        "en": "A report generation assistant using local knowledge base, with advanced capabilities in task planning, reasoning, and reflective analysis. Recommended for academic research paper Q&A",
+        "zh": "一个使用本地知识库的报告生成助手，具备高级能力，包括任务规划、推理和反思性分析。推荐用于学术研究论文问答。"},
    "canvas_type": "Agent",
    "dsl": {
        "components": {
--- a/agent/templates/knowledge_base_report_r.json
+++ b/agent/templates/knowledge_base_report_r.json
@@ -0,0 +1,331 @@
+{
+    "id": 21,
+    "title": {
+        "en": "Report Agent Using Knowledge Base", 
+        "zh": "知识库检索智能体"},
+    "description": {
+        "en": "A report generation assistant using local knowledge base, with advanced capabilities in task planning, reasoning, and reflective analysis. Recommended for academic research paper Q&A",
+        "zh": "一个使用本地知识库的报告生成助手，具备高级能力，包括任务规划、推理和反思性分析。推荐用于学术研究论文问答。"},
+    "canvas_type": "Recommended",
+    "dsl": {
+        "components": {
+            "Agent:NewPumasLick": {
+                "downstream": [
+                    "Message:OrangeYearsShine"
+                ],
+                "obj": {
+                    "component_name": "Agent",
+                    "params": {
+                        "delay_after_error": 1,
+                        "description": "",
+                        "exception_comment": "",
+                        "exception_default_value": "",
+                        "exception_goto": [],
+                        "exception_method": null,
+                        "frequencyPenaltyEnabled": false,
+                        "frequency_penalty": 0.5,
+                        "llm_id": "qwen3-235b-a22b-instruct-2507@Tongyi-Qianwen",
+                        "maxTokensEnabled": true,
+                        "max_retries": 3,
+                        "max_rounds": 3,
+                        "max_tokens": 128000,
+                        "mcp": [],
+                        "message_history_window_size": 12,
+                        "outputs": {
+                            "content": {
+                                "type": "string",
+                                "value": ""
+                            }
+                        },
+                        "parameter": "Precise",
+                        "presencePenaltyEnabled": false,
+                        "presence_penalty": 0.5,
+                        "prompts": [
+                            {
+                                "content": "# User Query\n {sys.query}",
+                                "role": "user"
+                            }
+                        ],
+                        "sys_prompt": "## Role & Task\nYou are a **\u201cKnowledge Base Retrieval Q\\&A Agent\u201d** whose goal is to break down the user\u2019s question into retrievable subtasks, and then produce a multi-source-verified, structured, and actionable research report using the internal knowledge base.\n## Execution Framework (Detailed Steps & Key Points)\n1. **Assessment & Decomposition**\n   * Actions:\n     * Automatically extract: main topic, subtopics, entities (people/organizations/products/technologies), time window, geographic/business scope.\n     * Output as a list: N facts/data points that must be collected (*N* ranges from 5\u201320 depending on question complexity).\n2. **Query Type Determination (Rule-Based)**\n   * Example rules:\n     * If the question involves a single issue but requests \u201cmethod comparison/multiple explanations\u201d \u2192 use **depth-first**.\n     * If the question can naturally be split into \u22653 independent sub-questions \u2192 use **breadth-first**.\n     * If the question can be answered by a single fact/specification/definition \u2192 use **simple query**.\n3. **Research Plan Formulation**\n   * Depth-first: define 3\u20135 perspectives (methodology/stakeholders/time dimension/technical route, etc.), assign search keywords, target document types, and output format for each perspective.\n   * Breadth-first: list subtasks, prioritize them, and assign search terms.\n   * Simple query: directly provide the search sentence and required fields.\n4. **Retrieval Execution**\n   * After retrieval: perform coverage check (does it contain the key facts?) and quality check (source diversity, authority, latest update time).\n   * If standards are not met, automatically loop: rewrite queries (synonyms/cross-domain terms) and retry \u22643 times, or flag as requiring external search.\n5. **Integration & Reasoning**\n   * Build the answer using a **fact\u2013evidence\u2013reasoning** chain. 
For each conclusion, attach 1\u20132 strongest pieces of evidence.\n---\n## Quality Gate Checklist (Verify at Each Stage)\n* **Stage 1 (Decomposition)**:\n  * [ ] Key concepts and expected outputs identified\n  * [ ] Required facts/data points listed\n* **Stage 2 (Retrieval)**:\n  * [ ] Meets quality standards (see above)\n  * [ ] If not met: execute query iteration\n* **Stage 3 (Generation)**:\n  * [ ] Each conclusion has at least one direct evidence source\n  * [ ] State assumptions/uncertainties\n  * [ ] Provide next-step suggestions or experiment/retrieval plans\n  * [ ] Final length and depth match user expectations (comply with word count/format if specified)\n---\n## Core Principles\n1. **Strict reliance on the knowledge base**: answers must be **fully bounded** by the content retrieved from the knowledge base.\n2. **No fabrication**: do not generate, infer, or create information that is not explicitly present in the knowledge base.\n3. **Accuracy first**: prefer incompleteness over inaccurate content.\n4. **Output format**:\n   * Hierarchically clear modular structure\n   * Logical grouping according to the MECE principle\n   * Professionally presented formatting\n   * Step-by-step cognitive guidance\n   * Reasonable use of headings and dividers for clarity\n   * *Italicize* key parameters\n   * **Bold** critical information\n5. **LaTeX formula requirements**:\n   * Inline formulas: start and end with `$`\n   * Block formulas: start and end with `$$`, each `$$` on its own line\n   * Block formula content must comply with LaTeX math syntax\n   * Verify formula correctness\n---\n## Additional Notes (Interaction & Failure Strategy)\n* If the knowledge base does not cover critical facts: explicitly inform the user (with sample wording)\n* For time-sensitive issues: enforce time filtering in the search request, and indicate the latest retrieval date in the answer.\n* Language requirement: answer in the user\u2019s preferred language\n",
+                        "temperature": "0.1",
+                        "temperatureEnabled": true,
+                        "tools": [
+                            {
+                                "component_name": "Retrieval",
+                                "name": "Retrieval",
+                                "params": {
+                                    "cross_languages": [],
+                                    "description": "",
+                                    "empty_response": "",
+                                    "kb_ids": [],
+                                    "keywords_similarity_weight": 0.7,
+                                    "outputs": {
+                                        "formalized_content": {
+                                            "type": "string",
+                                            "value": ""
+                                        }
+                                    },
+                                    "rerank_id": "",
+                                    "similarity_threshold": 0.2,
+                                    "top_k": 1024,
+                                    "top_n": 8,
+                                    "use_kg": false
+                                }
+                            }
+                        ],
+                        "topPEnabled": false,
+                        "top_p": 0.75,
+                        "user_prompt": "",
+                        "visual_files_var": ""
+                    }
+                },
+                "upstream": [
+                    "begin"
+                ]
+            },
+            "Message:OrangeYearsShine": {
+                "downstream": [],
+                "obj": {
+                    "component_name": "Message",
+                    "params": {
+                        "content": [
+                            "{Agent:NewPumasLick@content}"
+                        ]
+                    }
+                },
+                "upstream": [
+                    "Agent:NewPumasLick"
+                ]
+            },
+            "begin": {
+                "downstream": [
+                    "Agent:NewPumasLick"
+                ],
+                "obj": {
+                    "component_name": "Begin",
+                    "params": {
+                        "enablePrologue": true,
+                        "inputs": {},
+                        "mode": "conversational",
+                        "prologue": "\u4f60\u597d\uff01 \u6211\u662f\u4f60\u7684\u52a9\u7406\uff0c\u6709\u4ec0\u4e48\u53ef\u4ee5\u5e2e\u5230\u4f60\u7684\u5417\uff1f"
+                    }
+                },
+                "upstream": []
+            }
+        },
+        "globals": {
+            "sys.conversation_turns": 0,
+            "sys.files": [],
+            "sys.query": "",
+            "sys.user_id": ""
+        },
+        "graph": {
+            "edges": [
+                {
+                    "data": {
+                        "isHovered": false
+                    },
+                    "id": "xy-edge__beginstart-Agent:NewPumasLickend",
+                    "source": "begin",
+                    "sourceHandle": "start",
+                    "target": "Agent:NewPumasLick",
+                    "targetHandle": "end"
+                },
+                {
+                    "data": {
+                        "isHovered": false
+                    },
+                    "id": "xy-edge__Agent:NewPumasLickstart-Message:OrangeYearsShineend",
+                    "markerEnd": "logo",
+                    "source": "Agent:NewPumasLick",
+                    "sourceHandle": "start",
+                    "style": {
+                        "stroke": "rgba(91, 93, 106, 1)",
+                        "strokeWidth": 1
+                    },
+                    "target": "Message:OrangeYearsShine",
+                    "targetHandle": "end",
+                    "type": "buttonEdge",
+                    "zIndex": 1001
+                },
+                {
+                    "data": {
+                        "isHovered": false
+                    },
+                    "id": "xy-edge__Agent:NewPumasLicktool-Tool:AllBirdsNailend",
+                    "selected": false,
+                    "source": "Agent:NewPumasLick",
+                    "sourceHandle": "tool",
+                    "target": "Tool:AllBirdsNail",
+                    "targetHandle": "end"
+                }
+            ],
+            "nodes": [
+                {
+                    "data": {
+                        "form": {
+                            "enablePrologue": true,
+                            "inputs": {},
+                            "mode": "conversational",
+                            "prologue": "\u4f60\u597d\uff01 \u6211\u662f\u4f60\u7684\u52a9\u7406\uff0c\u6709\u4ec0\u4e48\u53ef\u4ee5\u5e2e\u5230\u4f60\u7684\u5417\uff1f"
+                        },
+                        "label": "Begin",
+                        "name": "begin"
+                    },
+                    "dragging": false,
+                    "id": "begin",
+                    "measured": {
+                        "height": 48,
+                        "width": 200
+                    },
+                    "position": {
+                        "x": -9.569875358221438,
+                        "y": 205.84018385864917
+                    },
+                    "selected": false,
+                    "sourcePosition": "left",
+                    "targetPosition": "right",
+                    "type": "beginNode"
+                },
+                {
+                    "data": {
+                        "form": {
+                            "content": [
+                                "{Agent:NewPumasLick@content}"
+                            ]
+                        },
+                        "label": "Message",
+                        "name": "Response"
+                    },
+                    "dragging": false,
+                    "id": "Message:OrangeYearsShine",
+                    "measured": {
+                        "height": 56,
+                        "width": 200
+                    },
+                    "position": {
+                        "x": 734.4061285881053,
+                        "y": 199.9706031723009
+                    },
+                    "selected": false,
+                    "sourcePosition": "right",
+                    "targetPosition": "left",
+                    "type": "messageNode"
+                },
+                {
+                    "data": {
+                        "form": {
+                            "delay_after_error": 1,
+                            "description": "",
+                            "exception_comment": "",
+                            "exception_default_value": "",
+                            "exception_goto": [],
+                            "exception_method": null,
+                            "frequencyPenaltyEnabled": false,
+                            "frequency_penalty": 0.5,
+                            "llm_id": "qwen3-235b-a22b-instruct-2507@Tongyi-Qianwen",
+                            "maxTokensEnabled": true,
+                            "max_retries": 3,
+                            "max_rounds": 3,
+                            "max_tokens": 128000,
+                            "mcp": [],
+                            "message_history_window_size": 12,
+                            "outputs": {
+                                "content": {
+                                    "type": "string",
+                                    "value": ""
+                                }
+                            },
+                            "parameter": "Precise",
+                            "presencePenaltyEnabled": false,
+                            "presence_penalty": 0.5,
+                            "prompts": [
+                                {
+                                    "content": "# User Query\n {sys.query}",
+                                    "role": "user"
+                                }
+                            ],
+                            "sys_prompt": "## Role & Task\nYou are a **\u201cKnowledge Base Retrieval Q\\&A Agent\u201d** whose goal is to break down the user\u2019s question into retrievable subtasks, and then produce a multi-source-verified, structured, and actionable research report using the internal knowledge base.\n## Execution Framework (Detailed Steps & Key Points)\n1. **Assessment & Decomposition**\n   * Actions:\n     * Automatically extract: main topic, subtopics, entities (people/organizations/products/technologies), time window, geographic/business scope.\n     * Output as a list: N facts/data points that must be collected (*N* ranges from 5\u201320 depending on question complexity).\n2. **Query Type Determination (Rule-Based)**\n   * Example rules:\n     * If the question involves a single issue but requests \u201cmethod comparison/multiple explanations\u201d \u2192 use **depth-first**.\n     * If the question can naturally be split into \u22653 independent sub-questions \u2192 use **breadth-first**.\n     * If the question can be answered by a single fact/specification/definition \u2192 use **simple query**.\n3. **Research Plan Formulation**\n   * Depth-first: define 3\u20135 perspectives (methodology/stakeholders/time dimension/technical route, etc.), assign search keywords, target document types, and output format for each perspective.\n   * Breadth-first: list subtasks, prioritize them, and assign search terms.\n   * Simple query: directly provide the search sentence and required fields.\n4. **Retrieval Execution**\n   * After retrieval: perform coverage check (does it contain the key facts?) and quality check (source diversity, authority, latest update time).\n   * If standards are not met, automatically loop: rewrite queries (synonyms/cross-domain terms) and retry \u22643 times, or flag as requiring external search.\n5. **Integration & Reasoning**\n   * Build the answer using a **fact\u2013evidence\u2013reasoning** chain. 
For each conclusion, attach 1\u20132 strongest pieces of evidence.\n---\n## Quality Gate Checklist (Verify at Each Stage)\n* **Stage 1 (Decomposition)**:\n  * [ ] Key concepts and expected outputs identified\n  * [ ] Required facts/data points listed\n* **Stage 2 (Retrieval)**:\n  * [ ] Meets quality standards (see above)\n  * [ ] If not met: execute query iteration\n* **Stage 3 (Generation)**:\n  * [ ] Each conclusion has at least one direct evidence source\n  * [ ] State assumptions/uncertainties\n  * [ ] Provide next-step suggestions or experiment/retrieval plans\n  * [ ] Final length and depth match user expectations (comply with word count/format if specified)\n---\n## Core Principles\n1. **Strict reliance on the knowledge base**: answers must be **fully bounded** by the content retrieved from the knowledge base.\n2. **No fabrication**: do not generate, infer, or create information that is not explicitly present in the knowledge base.\n3. **Accuracy first**: prefer incompleteness over inaccurate content.\n4. **Output format**:\n   * Hierarchically clear modular structure\n   * Logical grouping according to the MECE principle\n   * Professionally presented formatting\n   * Step-by-step cognitive guidance\n   * Reasonable use of headings and dividers for clarity\n   * *Italicize* key parameters\n   * **Bold** critical information\n5. **LaTeX formula requirements**:\n   * Inline formulas: start and end with `$`\n   * Block formulas: start and end with `$$`, each `$$` on its own line\n   * Block formula content must comply with LaTeX math syntax\n   * Verify formula correctness\n---\n## Additional Notes (Interaction & Failure Strategy)\n* If the knowledge base does not cover critical facts: explicitly inform the user (with sample wording)\n* For time-sensitive issues: enforce time filtering in the search request, and indicate the latest retrieval date in the answer.\n* Language requirement: answer in the user\u2019s preferred language\n",
+                            "temperature": "0.1",
+                            "temperatureEnabled": true,
+                            "tools": [
+                                {
+                                    "component_name": "Retrieval",
+                                    "name": "Retrieval",
+                                    "params": {
+                                        "cross_languages": [],
+                                        "description": "",
+                                        "empty_response": "",
+                                        "kb_ids": [],
+                                        "keywords_similarity_weight": 0.7,
+                                        "outputs": {
+                                            "formalized_content": {
+                                                "type": "string",
+                                                "value": ""
+                                            }
+                                        },
+                                        "rerank_id": "",
+                                        "similarity_threshold": 0.2,
+                                        "top_k": 1024,
+                                        "top_n": 8,
+                                        "use_kg": false
+                                    }
+                                }
+                            ],
+                            "topPEnabled": false,
+                            "top_p": 0.75,
+                            "user_prompt": "",
+                            "visual_files_var": ""
+                        },
+                        "label": "Agent",
+                        "name": "Knowledge Base Agent"
+                    },
+                    "dragging": false,
+                    "id": "Agent:NewPumasLick",
+                    "measured": {
+                        "height": 84,
+                        "width": 200
+                    },
+                    "position": {
+                        "x": 347.00048227952215,
+                        "y": 186.49109364794631
+                    },
+                    "selected": false,
+                    "sourcePosition": "right",
+                    "targetPosition": "left",
+                    "type": "agentNode"
+                },
+                {
+                    "data": {
+                        "form": {
+                            "description": "This is an agent for a specific task.",
+                            "user_prompt": "This is the order you need to send to the agent."
+                        },
+                        "label": "Tool",
+                        "name": "flow.tool_10"
+                    },
+                    "dragging": false,
+                    "id": "Tool:AllBirdsNail",
+                    "measured": {
+                        "height": 48,
+                        "width": 200
+                    },
+                    "position": {
+                        "x": 220.24819746977118,
+                        "y": 403.31576836482583
+                    },
+                    "selected": false,
+                    "sourcePosition": "right",
+                    "targetPosition": "left",
+                    "type": "toolNode"
+                }
+            ]
+        },
+        "history": [],
+        "memory": [],
+        "messages": [],
+        "path": [],
+        "retrieval": []
+    },
+    "avatar": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAADAAAAAwCAYAAABXAvmHAAAH0klEQVR4nO2ZC1BU1wGG/3uRp/IygG+DGK0GOjE1U6cxI4tT03Y0E+kENbaJbKpj60wzgNMwnTjuEtu0miGasY+0krI202kMVEnVxtoOLG00oVa0LajVBDcSEI0REFBgkZv/3GWXfdzdvctuHs7kmzmec9//d+45914XCXc4Xwjk1+59VJGGF7C5QAFSWBvgyWmWLl7IKiny6QNL173B5YjB84bOyrpKA4B1DLySdQpLKAiZGtZ7a/KMVoQJz6UfEZyhTWwaEBmssiLvCueu6BJg8EwFqGTTAC+uvNWC9w82sRWcux/JwaSHstjywcogRt4RG0KExwWG4QsVYCebKSwe3L5lR9OOWjyzfg2WL/0a1/jncO3b2FHxGnKeWYqo+Giu8UEMrWJKWBACPMY/DG+63txhvnKshUu+DF2/hayMDFRsL+VScDb++AVc6OjAuInxXPJl2tfnIikrzUyJMi7qQmLRhOEr2fOFbX/7P6STF7BqoWevfdij4NWGQfx+57OYO2sG1wSnsek8Nm15EU8sikF6ouelXz9ph7JwDqYt+5IIZaGEkauDIrH4wPBmhjexCSEws+VdVG1M4NIoj+2xYzBuJtavWcEl/VS8dggx/ZdQvcGzQwp+cxOXsu5RBQQMVkYJM4LA/Txh+ELFMWFVPARS5kFiabZdx8Olh7l17BzdvhzZmROhdJ3j6D/nIyBgOCMlLAgA9xmF4TMV4BSbrgnrLiBl5rOsRCRRbDUsBzQFiJjY91PCBj9w+yiP1lXWsTLAjc9YQGB9I8+Yx1oTiUWFvW9QgDo2PdASaDp/EQ8/sRnhcPTVcuTMncXwQQVESL9DidscaPW+QEtAICRu9PSxFTpJiePV8AI9AsTvXZBY/Pa+wJ9ApNApIILm8S5Y4QXXQwhYFH6csemDP4G3G5v579i5d04mknknQhDYS4HCrCVr/mC3D305KnbCEpvVIia5Onw6WaWw+KAl0Np+FUXbdiMcyoqfUoeRHoFrJ1uRtnBG1/9Mf/3LtElp+VwF2wcd7woJib1vUPwMH4GWQCQJJtBa/V9cPmFD8uQUpMdNGDhY8bNYrobh8acHu270/l0ImJWRt64Wn6WACN9z5gq2lXwPW8pfweT0icP/fH23vO9QLYq3/QKyLBmFQI3CUcT9NdESEEPItKsSN3r7MBaSJoxHWZERM6ZmMLy2gDP8/pd/og418dTL37hFSUpMUC5f+UiWZcnY9s5+ixCwUiCXx2iiJdDNx6f4pgkH8Q3lbxK7h8+enoHha1cRNdMp8axiHxo6+/5bVdk8DSROYIW1X7QEIom3wHD3gEf4vu1bVYEJZeWQ0zJQvmcfyiv2QZak6raG/QWfK4Ez9mTc5v8xPMJfuojoxXmIX/9DOMe+FCWbcHu4BJJ0YEwCx0824bFNW9HesB+CqYu+jepfPYcHF+aoPXS8sQl/+vU2bgmOU2C+qRc9/YrrPPbGBtzavd0nvCxLxui4pJrBm911PFwak4CYA80cj+JCAiGUzYkmxrSY4N2c3GLi6UEIFL/wRxxqkhmHnTEpDQcrfq6ea+hcE8bNy3GFzyq4H22HW1Kd4WMSkg1jmsSRpKj0Rzhy4gNUv/y8Gjrv8SJK3OWScA+fMn/ysVPPvTmeh6nh1TcxBUJ+jEaKYr7N36x7h+Edj0pB6+WrLokn87+BrTt/p4ZPzZ6MM7/8R2//h33vOcNzdwgBMwVMbGvySQmo4a0NqOZccU7YmGXLEfPQUlUid/XT6B8YdIU/99vjsPcOdEhDsfOd4QVCwKB8yp8SWuG1njbTl83DpMWz1PCKAswuWPDI0e8WebyAJBbxNdrF7cls+hBpAb3h3XtehL/3+4u7D35rQwpP4YFTwMJ91rHpQyQFQgmf9sAMNL9Ur4afv/FBjIuPVj+n4Y
VTwMD96tj0IVICoYYXv/q1VJ1Sl8UveQyaRwErvOB6B5SwKhqP00gI6A0vhsycJ7/KIzxhyHqGN0ADbnNAAYOicRfCFdAb/p50Gbfuc/wy5w1D5lOghk0fuG0USlgVr7sQjoDe8C8WxKGKPy2KjzlvAQb02/sCbh+FApngX1QUtyeSuwDi0hxFByV7L+LIf3r5kvpp4PBr07Hqvn71Y85bgOG6WS2ggA1+4D6eUKKQApVsqngI6KSkqh9HzsoM/3zg8Oz5VQ9E8wjf30YFDGdkeAsCwH18oYRZGXk7C4HuYxcwe6rjQsFovzaEvoFxqNkTOPzMjGikJso8wsF77XYkLx6dAwxWxvBmBIH7aUMJi8J3w0DnTVz7dyvX6KPzVBt+kL8cmzesRq9ps2Z48bRJmOIapS7E4zM2lXNt5CcU6ID7+ocSZkqY2NRN6ysnsHbJEpR8ZwV6t5Yg+iuLELf2KVd48VwXQf3BQGUMb4ZOuH9gKFEIYJfiNrEDcXZHHV4q3YRv5i7ikgM94RlETNgihrcgBHhccCiRCf7VhBK5rAPyr9I/Y/WKPEyfksH/9NjQ2dODhsYzwcLXsypkeBtCRGLRDUUMAMyKHxEx4dtrzyP97nQMygripiQiKi4aSbPvQmKW7+OXF69ntYvBa1iPCYklZEZECsGm4ja0Ops7EJsaj4SprlU+8IJiqIjAFga3Ikx4vvAYkTGALxyWFArlsnbBC9Sz6mI5zWKNRGh3JJY7mjte4GOz+r4tkRbxQQAAAABJRU5ErkJggg=="
+}
--- a/agent/templates/market_generate_seo_blog.json
+++ b/agent/templates/market_generate_seo_blog.json
@ -1,7 +1,11 @@
 {
    "id": 12,
-    "title": "Generate SEO Blog",
-    "description": "This workflow automatically generates a complete SEO-optimized blog article based on a simple user input. You don’t need any writing experience. Just provide a topic or short request — the system will handle the rest.",
+    "title": {
+        "en": "Generate SEO Blog",
+        "zh": "生成SEO博客"},
+    "description": {
+        "en": "This workflow automatically generates a complete SEO-optimized blog article based on a simple user input. You don’t need any writing experience. Just provide a topic or short request — the system will handle the rest.",
+        "zh": "此工作流根据简单的用户输入自动生成完整的SEO博客文章。你无需任何写作经验，只需提供一个主题或简短请求，系统将处理其余部分。"},
    "canvas_type": "Marketing",
    "dsl": {
            "components": {
--- a/agent/templates/seo_blog.json
+++ b/agent/templates/seo_blog.json
@ -1,7 +1,11 @@
 {
    "id": 4,
-    "title": "Generate SEO Blog",
-    "description": "This workflow automatically generates a complete SEO-optimized blog article based on a simple user input. You don’t need any writing experience. Just provide a topic or short request — the system will handle the rest.",
+    "title": {
+        "en": "Generate SEO Blog",
+        "zh": "生成SEO博客"},
+    "description": {
+        "en": "This workflow automatically generates a complete SEO-optimized blog article based on a simple user input. You don’t need any writing experience. Just provide a topic or short request — the system will handle the rest.",
+        "zh": "此工作流根据简单的用户输入自动生成完整的SEO博客文章。你无需任何写作经验，只需提供一个主题或简短请求，系统将处理其余部分。"},
    "canvas_type": "Recommended",
    "dsl": {
            "components": {
--- a/agent/templates/sql_assistant.json
+++ b/agent/templates/sql_assistant.json
@ -1,7 +1,11 @@
 {
    "id": 17,
-    "title": "SQL Assistant",
-    "description": "SQL Assistant is an AI-powered tool that lets business users turn plain-English questions into fully formed SQL queries. Simply type your question (e.g., “Show me last quarter’s top 10 products by revenue”) and SQL Assistant generates the exact SQL, runs it against your database, and returns the results in seconds. ",
+    "title": {
+        "en": "SQL Assistant",
+        "zh": "SQL助理"},
+    "description": {
+        "en": "SQL Assistant is an AI-powered tool that lets business users turn plain-English questions into fully formed SQL queries. Simply type your question (e.g., “Show me last quarter’s top 10 products by revenue”) and SQL Assistant generates the exact SQL, runs it against your database, and returns the results in seconds. ",
+        "zh": "用户能够将简单文本问题转化为完整的SQL查询并输出结果。只需输入您的问题（例如，“展示上个季度前十名按收入排序的产品”），SQL助理就会生成精确的SQL语句，对其运行您的数据库，并几秒钟内返回结果。"},
    "canvas_type": "Marketing",
    "dsl": {
            "components": {
--- a/agent/templates/technical_docs.json
+++ b/agent/templates/technical_docs.json
--- a/agent/templates/technical_docs_qa.json
+++ b/agent/templates/technical_docs_qa.json
@ -1,8 +1,12 @@

 {
    "id": 9,
-    "title": "Technical Docs QA",
-    "description": "This is a document question-and-answer system based on a knowledge base. When a user asks a question, it retrieves relevant document content to provide accurate answers.",
+    "title": {
+        "en": "Technical Docs QA",
+        "zh": "技术文档问答"},
+    "description": {
+        "en": "This is a document question-and-answer system based on a knowledge base. When a user asks a question, it retrieves relevant document content to provide accurate answers.",
+        "zh": "基于知识库的文档问答系统，当用户提出问题时，会检索相关本地文档并提供准确回答。"},
    "canvas_type": "Customer Support",
    "dsl": {
            "components": {
--- a/agent/templates/trip_planner.json
+++ b/agent/templates/trip_planner.json
@ -1,9 +1,13 @@

 {
    "id": 14,
-    "title": "Trip Planner",
-    "description": "This smart trip planner utilizes LLM technology to automatically generate customized travel itineraries, with optional tool integration for enhanced reliability.",
-    "canvas_type": "Consumer App",
+    "title": {
+		"en": "Trip Planner",
+		"zh": "旅行规划"},
+    "description": {
+		"en": "This smart trip planner utilizes LLM technology to automatically generate customized travel itineraries, with optional tool integration for enhanced reliability.",
+		"zh": "智能旅行规划将利用大模型自动生成定制化的旅行行程，附带可选工具集成，以增强可靠性。"},
+	"canvas_type": "Consumer App",
    "dsl": {
 		"components": {
 			"Agent:OddGuestsPump": {
--- a/agent/templates/web_search_assistant.json
+++ b/agent/templates/web_search_assistant.json
@ -1,9 +1,13 @@

 {
    "id": 16,
-    "title": "WebSearch Assistant",
-    "description": "A chat assistant template that integrates information extracted from a knowledge base and web searches to respond to queries. Let's start by setting up your knowledge base in 'Retrieval'!",
-    "canvas_type": "Other",
+    "title": {
+		"en": "WebSearch Assistant",
+		"zh": "网页搜索助手"},
+    "description": {
+		"en": "A chat assistant template that integrates information extracted from a knowledge base and web searches to respond to queries. Let's start by setting up your knowledge base in 'Retrieval'!",
+		"zh": "集成了从知识库和网络搜索中提取的信息回答用户问题。让我们从设置您的知识库开始检索！"},
+	"canvas_type": "Other",
    "dsl": {
 		"components": {
 			"Agent:SmartSchoolsCross": {
--- a/agent/tools/code_exec.py
+++ b/agent/tools/code_exec.py
@ -79,7 +79,7 @@ def main() -> dict:
    return {
        "result": fibonacci_recursive(100),
    }
-    
+
 Here's a code example for Javascript(`main` function MUST be included and exported):
 const axios = require('axios');
 async function main(args) {
@ -156,7 +156,7 @@ class CodeExec(ToolBase, ABC):
            self.set_output("_ERROR", "construct code request error: " + str(e))

        try:
-            resp = requests.post(url=f"http://{settings.SANDBOX_HOST}:9385/run", json=code_req, timeout=10)
+            resp = requests.post(url=f"http://{settings.SANDBOX_HOST}:9385/run", json=code_req, timeout=float(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10*60)))
            logging.info(f"http://{settings.SANDBOX_HOST}:9385/run", code_req, resp.status_code)
            if resp.status_code != 200:
                resp.raise_for_status()
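The request timeout in this hunk is taken from the `COMPONENT_EXEC_TIMEOUT` environment variable. Environment values always arrive as strings, while `requests` expects a number, so a small parsing helper can be useful. The sketch below is only an illustration: `read_timeout_seconds` is a hypothetical name, not part of this PR, and the fallback of 600 seconds simply mirrors the `10*60` default above.

```python
import os


def read_timeout_seconds(name: str = "COMPONENT_EXEC_TIMEOUT", default: float = 600.0) -> float:
    """Hypothetical helper: return a positive timeout in seconds, or `default`."""
    raw = os.environ.get(name)
    # Unset or blank values fall back to the default.
    if raw is None or not raw.strip():
        return default
    try:
        value = float(raw)
    except ValueError:
        # Non-numeric input (for example "abc") is ignored rather than raised.
        return default
    # Zero, negative, and NaN values are not usable as a timeout.
    return value if value > 0 else default
```

A call site could then pass `timeout=read_timeout_seconds()` to `requests.post`.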
--- a/agent/tools/crawler.py
+++ b/agent/tools/crawler.py
@ -16,9 +16,8 @@
 from abc import ABC
 import asyncio
 from crawl4ai import AsyncWebCrawler
-
 from agent.tools.base import ToolParamBase, ToolBase
-from api.utils.web_utils import is_valid_url
+


 class CrawlerParam(ToolParamBase):
@ -39,6 +38,7 @@ class Crawler(ToolBase, ABC):
    component_name = "Crawler"

    def _run(self, history, **kwargs):
+        from api.utils.web_utils import is_valid_url
        ans = self.get_input()
        ans = " - ".join(ans["content"]) if "content" in ans else ""
        if not is_valid_url(ans):
--- a/agent/tools/searxng.py
+++ b/agent/tools/searxng.py
@ -0,0 +1,156 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import logging
+import os
+import time
+from abc import ABC
+import requests
+from agent.tools.base import ToolMeta, ToolParamBase, ToolBase
+from api.utils.api_utils import timeout
+
+
+class SearXNGParam(ToolParamBase):
+    """
+    Define the SearXNG component parameters.
+    """
+
+    def __init__(self):
+        self.meta: ToolMeta = {
+            "name": "searxng_search",
+            "description": "SearXNG is a privacy-focused metasearch engine that aggregates results from multiple search engines without tracking users. It provides comprehensive web search capabilities.",
+            "parameters": {
+                "query": {
+                    "type": "string",
+                    "description": "The search keywords to execute with SearXNG. The keywords should be the most important words/terms (including synonyms) from the original request.",
+                    "default": "{sys.query}",
+                    "required": True
+                },
+                "searxng_url": {
+                    "type": "string",
+                    "description": "The base URL of your SearXNG instance (e.g., http://localhost:4000). This is required to connect to your SearXNG server.",
+                    "required": False,
+                    "default": ""
+                }
+            }
+        }
+        super().__init__()
+        self.top_n = 10
+        self.searxng_url = ""
+
+    def check(self):
+        # Keep validation lenient so opening try-run panel won't fail without URL.
+        # Coerce top_n to int if it comes as string from UI.
+        try:
+            if isinstance(self.top_n, str):
+                self.top_n = int(self.top_n.strip())
+        except Exception:
+            pass
+        self.check_positive_integer(self.top_n, "Top N")
+
+    def get_input_form(self) -> dict[str, dict]:
+        return {
+            "query": {
+                "name": "Query",
+                "type": "line"
+            },
+            "searxng_url": {
+                "name": "SearXNG URL",
+                "type": "line",
+                "placeholder": "http://localhost:4000"
+            }
+        }
+
+
+class SearXNG(ToolBase, ABC):
+    component_name = "SearXNG"
+
+    @timeout(os.environ.get("COMPONENT_EXEC_TIMEOUT", 12))
+    def _invoke(self, **kwargs):
+        # Gracefully handle try-run without inputs
+        query = kwargs.get("query")
+        if not query or not isinstance(query, str) or not query.strip():
+            self.set_output("formalized_content", "")
+            return ""
+
+        searxng_url = (kwargs.get("searxng_url") or getattr(self._param, "searxng_url", "") or "").strip()
+        # In try-run, if no URL configured, just return empty instead of raising
+        if not searxng_url:
+            self.set_output("formalized_content", "")
+            return ""
+
+        last_e = ""
+        for _ in range(self._param.max_retries+1):
+            try:
+                # Build the search parameters
+                search_params = {
+                    'q': query,
+                    'format': 'json',
+                    'categories': 'general',
+                    'language': 'auto',
+                    'safesearch': 1,
+                    'pageno': 1
+                }
+
+                # Send the search request
+                response = requests.get(
+                    f"{searxng_url}/search",
+                    params=search_params,
+                    timeout=10
+                )
+                response.raise_for_status()
+                
+                data = response.json()
+                
+                # Validate the response data
+                if not data or not isinstance(data, dict):
+                    raise ValueError("Invalid response from SearXNG")
+                
+                results = data.get("results", [])
+                if not isinstance(results, list):
+                    raise ValueError("Invalid results format from SearXNG")
+                
+                # Limit the number of results
+                results = results[:self._param.top_n]
+                
+                # Process the search results
+                self._retrieve_chunks(results,
+                                      get_title=lambda r: r.get("title", ""),
+                                      get_url=lambda r: r.get("url", ""),
+                                      get_content=lambda r: r.get("content", ""))
+                
+                self.set_output("json", results)
+                return self.output("formalized_content")
+
+            except requests.RequestException as e:
+                last_e = f"Network error: {e}"
+                logging.exception(f"SearXNG network error: {e}")
+                time.sleep(self._param.delay_after_error)
+            except Exception as e:
+                last_e = str(e)
+                logging.exception(f"SearXNG error: {e}")
+                time.sleep(self._param.delay_after_error)
+
+        if last_e:
+            self.set_output("_ERROR", last_e)
+            return f"SearXNG error: {last_e}"
+
+        assert False, self.output()
+
+    def thoughts(self) -> str:
+        return """
+Keywords: {} 
+Searching with SearXNG for relevant results...
+                """.format(self.get_input().get("query", "-_-!"))
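The request shape used by `_invoke` can be exercised without a running instance. The sketch below rebuilds the same query parameters and applies the same `top_n` trimming to a decoded response. `build_search_url` and `top_results` are hypothetical helpers introduced only for illustration; the tool itself passes the parameters through `requests.get(..., params=...)`. It is also worth noting, as an assumption about deployment rather than something this PR states, that a SearXNG instance usually has to allow the `json` output format in its settings before `format=json` requests succeed.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, query: str, page: int = 1) -> str:
    """Hypothetical helper: compose the /search URL with the parameters used above."""
    params = {
        "q": query,
        "format": "json",
        "categories": "general",
        "language": "auto",
        "safesearch": 1,
        "pageno": page,
    }
    # Strip a trailing slash so the path is not doubled.
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


def top_results(payload: dict, top_n: int) -> list:
    """Hypothetical helper: validate the payload and keep the first `top_n` hits."""
    results = payload.get("results", [])
    if not isinstance(results, list):
        raise ValueError("Invalid results format from SearXNG")
    # Keep only the three fields the tool reads from each result.
    return [
        {"title": r.get("title", ""), "url": r.get("url", ""), "content": r.get("content", "")}
        for r in results[:top_n]
    ]
```

Stripping the trailing slash is a small design choice: it lets users paste either `http://localhost:4000` or `http://localhost:4000/` into the URL field.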
--- a/api/apps/chunk_app.py
+++ b/api/apps/chunk_app.py
@ -93,6 +93,7 @@ def list_chunk():
 def get():
    chunk_id = request.args["chunk_id"]
    try:
+        chunk = None
        tenants = UserTenantService.query(user_id=current_user.id)
        if not tenants:
            return get_data_error_result(message="Tenant not found!")
--- a/api/apps/dialog_app.py
+++ b/api/apps/dialog_app.py
@ -66,7 +66,7 @@ def set_dialog():

    if not is_create:
        if not req.get("kb_ids", []) and not prompt_config.get("tavily_api_key") and "{knowledge}" in prompt_config['system']:
-            return get_data_error_result(message="Please remove `{knowledge}` in system prompt since no knowledge base/Tavily used here.")
+            return get_data_error_result(message="Please remove `{knowledge}` in system prompt since no knowledge base / Tavily used here.")

        for p in prompt_config["parameters"]:
            if p["optional"]:
--- a/api/apps/llm_app.py
+++ b/api/apps/llm_app.py
@ -243,7 +243,7 @@ def add_llm():
                model_name=mdl_nm,
                base_url=llm["api_base"]
            )
-            arr, tc = mdl.similarity("Hello~ Ragflower!", ["Hi, there!", "Ohh, my friend!"])
+            arr, tc = mdl.similarity("Hello~ RAGFlower!", ["Hi, there!", "Ohh, my friend!"])
            if len(arr) == 0:
                raise Exception("Not known.")
        except KeyError:
@ -271,7 +271,7 @@ def add_llm():
            key=llm["api_key"], model_name=mdl_nm, base_url=llm["api_base"]
        )
        try:
-            for resp in mdl.tts("Hello~ Ragflower!"):
+            for resp in mdl.tts("Hello~ RAGFlower!"):
                pass
        except RuntimeError as e:
            msg += f"\nFail to access model({factory}/{mdl_nm})." + str(e)
--- a/api/apps/mcp_server_app.py
+++ b/api/apps/mcp_server_app.py
@ -82,7 +82,7 @@ def create() -> Response:

    server_name = req.get("name", "")
    if not server_name or len(server_name.encode("utf-8")) > 255:
-        return get_data_error_result(message=f"Invaild MCP name or length is {len(server_name)} which is large than 255.")
+        return get_data_error_result(message=f"Invalid MCP name or length is {len(server_name)} which is larger than 255.")

    e, _ = MCPServerService.get_by_name_and_tenant(name=server_name, tenant_id=current_user.id)
    if e:
@ -90,7 +90,7 @@ def create() -> Response:

    url = req.get("url", "")
    if not url:
-        return get_data_error_result(message="Invaild url.")
+        return get_data_error_result(message="Invalid url.")

    headers = safe_json_parse(req.get("headers", {}))
    req["headers"] = headers
@ -141,10 +141,10 @@ def update() -> Response:
        return get_data_error_result(message="Unsupported MCP server type.")
    server_name = req.get("name", mcp_server.name)
    if server_name and len(server_name.encode("utf-8")) > 255:
-        return get_data_error_result(message=f"Invaild MCP name or length is {len(server_name)} which is large than 255.")
+        return get_data_error_result(message=f"Invalid MCP name or length is {len(server_name)} which is larger than 255.")
    url = req.get("url", mcp_server.url)
    if not url:
-        return get_data_error_result(message="Invaild url.")
+        return get_data_error_result(message="Invalid url.")

    headers = safe_json_parse(req.get("headers", mcp_server.headers))
    req["headers"] = headers
@ -218,7 +218,7 @@ def import_multiple() -> Response:
                continue

            if not server_name or len(server_name.encode("utf-8")) > 255:
-                results.append({"server": server_name, "success": False, "message": f"Invaild MCP name or length is {len(server_name)} which is large than 255."})
+                results.append({"server": server_name, "success": False, "message": f"Invalid MCP name or length is {len(server_name)} which is larger than 255."})
                continue

            base_name = server_name
@ -409,7 +409,7 @@ def test_mcp() -> Response:

    url = req.get("url", "")
    if not url:
-        return get_data_error_result(message="Invaild MCP url.")
+        return get_data_error_result(message="Invalid MCP url.")

    server_type = req.get("server_type", "")
    if server_type not in VALID_MCP_SERVER_TYPES:
--- a/api/apps/sdk/chat.py
+++ b/api/apps/sdk/chat.py
@ -150,10 +150,10 @@ def update(tenant_id, chat_id):
    if not DialogService.query(tenant_id=tenant_id, id=chat_id, status=StatusEnum.VALID.value):
        return get_error_data_result(message="You do not own the chat")
    req = request.json
-    ids = req.get("dataset_ids")
+    ids = req.get("dataset_ids", [])
    if "show_quotation" in req:
        req["do_refer"] = req.pop("show_quotation")
-    if ids is not None:
+    if ids:
        for kb_id in ids:
            kbs = KnowledgebaseService.accessible(kb_id=kb_id, user_id=tenant_id)
            if not kbs:
--- a/api/apps/sdk/dify_retrieval.py
+++ b/api/apps/sdk/dify_retrieval.py
@ -24,6 +24,7 @@ from api.db.services.llm_service import LLMBundle
 from api import settings
 from api.utils.api_utils import validate_request, build_error_result, apikey_required
 from rag.app.tag import label_question
+from api.db.services.dialog_service import meta_filter


@manager.route('/dify/retrieval', methods=['POST'])  # noqa: F821
@ -37,18 +38,23 @@ def retrieval(tenant_id):
    retrieval_setting = req.get("retrieval_setting", {})
    similarity_threshold = float(retrieval_setting.get("score_threshold", 0.0))
    top = int(retrieval_setting.get("top_k", 1024))
-
+    metadata_condition = req.get("metadata_condition", {})
+    metas = DocumentService.get_meta_by_kbs([kb_id])
+
+    doc_ids = []
    try:

        e, kb = KnowledgebaseService.get_by_id(kb_id)
        if not e:
            return build_error_result(message="Knowledgebase not found!", code=settings.RetCode.NOT_FOUND)

-        if kb.tenant_id != tenant_id:
-            return build_error_result(message="Knowledgebase not found!", code=settings.RetCode.NOT_FOUND)
-
        embd_mdl = LLMBundle(kb.tenant_id, LLMType.EMBEDDING.value, llm_name=kb.embd_id)
-
+        conditions = convert_conditions(metadata_condition)
+        doc_ids.extend(meta_filter(metas, conditions))
+        logging.debug("metadata_condition=%s conditions=%s doc_ids=%s", metadata_condition, conditions, doc_ids)
+        # Force an empty result only when a metadata filter was actually supplied.
+        if metadata_condition and not doc_ids:
+            doc_ids = ['-999']
        ranks = settings.retrievaler.retrieval(
            question,
            embd_mdl,
@ -59,6 +65,7 @@ def retrieval(tenant_id):
            similarity_threshold=similarity_threshold,
            vector_similarity_weight=0.3,
            top=top,
+            doc_ids=doc_ids,
            rank_feature=label_question(question, [kb])
        )

@ -93,3 +100,20 @@ def retrieval(tenant_id):
            )
        logging.exception(e)
        return build_error_result(message=str(e), code=settings.RetCode.SERVER_ERROR)
+
+def convert_conditions(metadata_condition):
+    if metadata_condition is None:
+        metadata_condition = {}
+    op_mapping = {
+        "is": "=",
+        "not is": "≠"
+    }
+    return [
+        {
+            "op": op_mapping.get(cond["comparison_operator"], cond["comparison_operator"]),
+            "key": cond["name"],
+            "value": cond["value"]
+        }
+        for cond in metadata_condition.get("conditions", [])
+    ]
+
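To make the mapping in `convert_conditions` concrete, the following standalone copy of the function can be run against a small payload. The sample `metadata_condition` is an assumption made up for illustration (field names `author` and `year` do not come from this PR); only the `conditions`, `name`, `comparison_operator`, and `value` keys are taken from the code above.

```python
def convert_conditions(metadata_condition):
    """Standalone copy of the function added in this diff, for illustration."""
    if metadata_condition is None:
        metadata_condition = {}
    # Only these two operators are renamed; any other operator passes through unchanged.
    op_mapping = {
        "is": "=",
        "not is": "≠",
    }
    return [
        {
            "op": op_mapping.get(cond["comparison_operator"], cond["comparison_operator"]),
            "key": cond["name"],
            "value": cond["value"],
        }
        for cond in metadata_condition.get("conditions", [])
    ]


# Hypothetical request fragment; the field names are invented for this example.
sample = {
    "conditions": [
        {"name": "author", "comparison_operator": "is", "value": "alice"},
        {"name": "year", "comparison_operator": ">", "value": "2020"},
    ]
}
converted = convert_conditions(sample)
```

Note that a condition without a `value` key would raise `KeyError` in this form, so callers may want to validate the payload first.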
--- a/api/apps/sdk/session.py
+++ b/api/apps/sdk/session.py
@ -16,8 +16,10 @@
 import json
 import re
 import time
+
 import tiktoken
 from flask import Response, jsonify, request
+
 from agent.canvas import Canvas
 from api import settings
 from api.db import LLMType, StatusEnum
@ -27,7 +29,8 @@ from api.db.services.canvas_service import UserCanvasService, completionOpenAI
 from api.db.services.canvas_service import completion as agent_completion
 from api.db.services.conversation_service import ConversationService, iframe_completion
 from api.db.services.conversation_service import completion as rag_completion
-from api.db.services.dialog_service import DialogService, ask, chat, gen_mindmap
+from api.db.services.dialog_service import DialogService, ask, chat, gen_mindmap, meta_filter
+from api.db.services.document_service import DocumentService
 from api.db.services.knowledgebase_service import KnowledgebaseService
 from api.db.services.llm_service import LLMBundle
 from api.db.services.search_service import SearchService
@ -37,7 +40,7 @@ from api.utils.api_utils import check_duplicate_ids, get_data_openai, get_error_
 from rag.app.tag import label_question
 from rag.prompts import chunks_format
 from rag.prompts.prompt_template import load_prompt
-from rag.prompts.prompts import cross_languages, keyword_extraction
+from rag.prompts.prompts import cross_languages, gen_meta_filter, keyword_extraction


@manager.route("/chats/<chat_id>/sessions", methods=["POST"])  # noqa: F821
@ -81,21 +84,13 @@ def create_agent_session(tenant_id, agent_id):
    if not isinstance(cvs.dsl, str):
        cvs.dsl = json.dumps(cvs.dsl, ensure_ascii=False)

-    session_id=get_uuid()
+    session_id = get_uuid()
    canvas = Canvas(cvs.dsl, tenant_id, agent_id)
    canvas.reset()
-    conv = {
-        "id": session_id,
-        "dialog_id": cvs.id,
-        "user_id": user_id,
-        "message": [],
-        "source": "agent",
-        "dsl": cvs.dsl
-    }
-    API4ConversationService.save(**conv)

    cvs.dsl = json.loads(str(canvas))
    conv = {"id": session_id, "dialog_id": cvs.id, "user_id": user_id, "message": [{"role": "assistant", "content": canvas.get_prologue()}], "source": "agent", "dsl": cvs.dsl}
+    API4ConversationService.save(**conv)
    conv["agent_id"] = conv.pop("dialog_id")
    return get_result(data=conv)

@ -419,7 +414,7 @@ def agents_completion_openai_compatibility(tenant_id, agent_id):
                tenant_id,
                agent_id,
                question,
-                session_id=req.get("id", req.get("metadata", {}).get("id", "")),
+                session_id=req.get("session_id", req.get("id", "") or req.get("metadata", {}).get("id", "")),
                stream=True,
                **req,
            ),
@ -437,7 +432,7 @@ def agents_completion_openai_compatibility(tenant_id, agent_id):
                tenant_id,
                agent_id,
                question,
-                session_id=req.get("id", req.get("metadata", {}).get("id", "")),
+                session_id=req.get("session_id", req.get("id", "") or req.get("metadata", {}).get("id", "")),
                stream=False,
                **req,
            )
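The lookup order for the session identifier in the two hunks above is easy to misread, so here is a minimal sketch of the same expression wrapped in a function. `resolve_session_id` is a hypothetical name used only for this illustration. One consequence of the expression as written: when `session_id` is present in the request, even as an empty string, the fallbacks are not consulted.

```python
def resolve_session_id(req: dict) -> str:
    """Hypothetical wrapper around the expression used in the diff."""
    # 1. An explicit "session_id" key wins whenever it is present.
    # 2. Otherwise a non-empty top-level "id" is used.
    # 3. Otherwise "metadata.id" is used, defaulting to an empty string.
    return req.get("session_id", req.get("id", "") or req.get("metadata", {}).get("id", ""))
```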
@ -450,7 +445,6 @@ def agents_completion_openai_compatibility(tenant_id, agent_id):
 def agent_completions(tenant_id, agent_id):
    req = request.json

-    ans = {}
    if req.get("stream", True):

        def generate():
@ -461,7 +455,7 @@ def agent_completions(tenant_id, agent_id):
                    except Exception:
                        continue

-                if ans.get("event") != "message":
+                if ans.get("event") not in ["message", "message_end"]:
                    continue

                yield answer
@ -475,12 +469,25 @@ def agent_completions(tenant_id, agent_id):
        resp.headers.add_header("Content-Type", "text/event-stream; charset=utf-8")
        return resp

+    full_content = ""
+    reference = {}
+    final_ans = ""
    for answer in agent_completion(tenant_id=tenant_id, agent_id=agent_id, **req):
        try:
-            ans = json.loads(answer[5:])  # remove "data:"
+            ans = json.loads(answer[5:])
+
+            if ans["event"] == "message":
+                full_content += ans["data"]["content"]
+
+            if ans.get("data", {}).get("reference", None):
+                reference.update(ans["data"]["reference"])
+
+            final_ans = ans
        except Exception as e:
            return get_result(data=f"**ERROR**: {str(e)}")
-    return get_result(data=ans)
+    final_ans["data"]["content"] = full_content
+    final_ans["data"]["reference"] = reference
+    return get_result(data=final_ans)


@manager.route("/chats/<chat_id>/sessions", methods=["GET"])  # noqa: F821
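The non-streaming branch above folds a sequence of `data:` events into one response: `message` events contribute content, any event may contribute references, and the last event supplies the envelope. A rough standalone sketch of that folding follows. `aggregate_events` is a hypothetical name, the event payloads in the usage note are invented, and, unlike the code in the diff, this version returns `None` when no event was received instead of indexing into an empty value.

```python
import json


def aggregate_events(lines):
    """Hypothetical helper: fold 'data:'-prefixed JSON events into one final answer."""
    full_content = ""
    reference = {}
    final = None
    for line in lines:
        if not line.startswith("data:"):
            continue
        event = json.loads(line[len("data:"):])
        data = event.get("data")
        if not isinstance(data, dict):
            data = {}
        # Concatenate the text carried by incremental message events.
        if event.get("event") == "message":
            full_content += data.get("content", "")
        # Merge any references attached to this event.
        if data.get("reference"):
            reference.update(data["reference"])
        # Remember the most recent event as the envelope of the final answer.
        final = {**event, "data": dict(data)}
    if final is None:
        return None
    final["data"]["content"] = full_content
    final["data"]["reference"] = reference
    return final
```

For example, feeding it two `message` events with contents `"Hel"` and `"lo"` followed by a `message_end` event carrying a reference yields a single `message_end` envelope whose content is the joined text.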
@ -575,12 +582,12 @@ def list_agent_session(tenant_id, agent_id):
                if message_num != 0 and messages[message_num]["role"] != "user":
                    chunk_list = []
                    # Add boundary and type checks to prevent KeyError
-                    if (chunk_num < len(conv["reference"]) and
-                            conv["reference"][chunk_num] is not None and
-                            isinstance(conv["reference"][chunk_num], dict) and
-                            "chunks" in conv["reference"][chunk_num]):
+                    if chunk_num < len(conv["reference"]) and conv["reference"][chunk_num] is not None and isinstance(conv["reference"][chunk_num], dict) and "chunks" in conv["reference"][chunk_num]:
                        chunks = conv["reference"][chunk_num]["chunks"]
                        for chunk in chunks:
+                            # Ensure chunk is a dictionary before calling get method
+                            if not isinstance(chunk, dict):
+                                continue
                            new_chunk = {
                                "id": chunk.get("chunk_id", chunk.get("id")),
                                "content": chunk.get("content_with_weight", chunk.get("content")),
@ -876,14 +883,7 @@ def begin_inputs(agent_id):
        return get_error_data_result(f"Can't find agent by ID: {agent_id}")

    canvas = Canvas(json.dumps(cvs.dsl), objs[0].tenant_id)
-    return get_result(
-        data={
-            "title": cvs.title,
-            "avatar": cvs.avatar,
-            "inputs": canvas.get_component_input_form("begin"),
-            "prologue": canvas.get_prologue()
-        }
-    )
+    return get_result(data={"title": cvs.title, "avatar": cvs.avatar, "inputs": canvas.get_component_input_form("begin"), "prologue": canvas.get_prologue(), "mode": canvas.get_mode()})


@manager.route("/searchbots/ask", methods=["POST"])  # noqa: F821
@ -909,7 +909,7 @@ def ask_about_embedded():
    def stream():
        nonlocal req, uid
        try:
-            for ans in ask(req["question"], req["kb_ids"], uid, search_config):
+            for ans in ask(req["question"], req["kb_ids"], uid, search_config=search_config):
                yield "data:" + json.dumps({"code": 0, "message": "", "data": ans}, ensure_ascii=False) + "\n\n"
        except Exception as e:
            yield "data:" + json.dumps({"code": 500, "message": str(e), "data": {"answer": "**ERROR**: " + str(e), "reference": []}}, ensure_ascii=False) + "\n\n"
@ -923,7 +923,7 @@ def ask_about_embedded():
    return resp


-@manager.route("/searchbots/retrieval_test", methods=['POST'])  # noqa: F821
+@manager.route("/searchbots/retrieval_test", methods=["POST"])  # noqa: F821
@validate_request("kb_id", "question")
 def retrieval_test_embedded():
    token = request.headers.get("Authorization").split()
@ -953,18 +953,30 @@ def retrieval_test_embedded():
    if not tenant_id:
        return get_error_data_result(message="permission denined.")

+    if req.get("search_id", ""):
+        search_config = SearchService.get_detail(req.get("search_id", "")).get("search_config", {})
+        meta_data_filter = search_config.get("meta_data_filter", {})
+        metas = DocumentService.get_meta_by_kbs(kb_ids)
+        if meta_data_filter.get("method") == "auto":
+            chat_mdl = LLMBundle(tenant_id, LLMType.CHAT, llm_name=search_config.get("chat_id", ""))
+            filters = gen_meta_filter(chat_mdl, metas, question)
+            doc_ids.extend(meta_filter(metas, filters))
+            if not doc_ids:
+                doc_ids = None
+        elif meta_data_filter.get("method") == "manual":
+            doc_ids.extend(meta_filter(metas, meta_data_filter["manual"]))
+            if not doc_ids:
+                doc_ids = None
+
    try:
        tenants = UserTenantService.query(user_id=tenant_id)
        for kb_id in kb_ids:
            for tenant in tenants:
-                if KnowledgebaseService.query(
-                        tenant_id=tenant.tenant_id, id=kb_id):
+                if KnowledgebaseService.query(tenant_id=tenant.tenant_id, id=kb_id):
                    tenant_ids.append(tenant.tenant_id)
                    break
            else:
-                return get_json_result(
-                    data=False, message='Only owner of knowledgebase authorized for this operation.',
-                    code=settings.RetCode.OPERATING_ERROR)
+                return get_json_result(data=False, message="Only owner of knowledgebase authorized for this operation.", code=settings.RetCode.OPERATING_ERROR)

        e, kb = KnowledgebaseService.get_by_id(kb_ids[0])
        if not e:
@ -984,17 +996,11 @@ def retrieval_test_embedded():
            question += keyword_extraction(chat_mdl, question)

        labels = label_question(question, [kb])
-        ranks = settings.retrievaler.retrieval(question, embd_mdl, tenant_ids, kb_ids, page, size,
-                               similarity_threshold, vector_similarity_weight, top,
-                               doc_ids, rerank_mdl=rerank_mdl, highlight=req.get("highlight"),
-                               rank_feature=labels
-                               )
+        ranks = settings.retrievaler.retrieval(
+            question, embd_mdl, tenant_ids, kb_ids, page, size, similarity_threshold, vector_similarity_weight, top, doc_ids, rerank_mdl=rerank_mdl, highlight=req.get("highlight"), rank_feature=labels
+        )
        if use_kg:
-            ck = settings.kg_retrievaler.retrieval(question,
-                                                   tenant_ids,
-                                                   kb_ids,
-                                                   embd_mdl,
-                                                   LLMBundle(kb.tenant_id, LLMType.CHAT))
+            ck = settings.kg_retrievaler.retrieval(question, tenant_ids, kb_ids, embd_mdl, LLMBundle(kb.tenant_id, LLMType.CHAT))
            if ck["content_with_weight"]:
                ranks["chunks"].insert(0, ck)

@ -1005,8 +1011,7 @@ def retrieval_test_embedded():
        return get_json_result(data=ranks)
    except Exception as e:
        if str(e).find("not_found") > 0:
-            return get_json_result(data=False, message='No chunk found! Check the chunk status please!',
-                                   code=settings.RetCode.DATA_ERROR)
+            return get_json_result(data=False, message="No chunk found! Check the chunk status please!", code=settings.RetCode.DATA_ERROR)
        return server_error_response(e)


--- a/api/apps/search_app.py
+++ b/api/apps/search_app.py
@ -43,7 +43,7 @@ def create():
        return get_data_error_result(message=f"Search name length is {len(search_name)} which is large than 255.")
    e, _ = TenantService.get_by_id(current_user.id)
    if not e:
-        return get_data_error_result(message="Authorizationd identity.")
+        return get_data_error_result(message="Unauthorized identity.")

    search_name = search_name.strip()
    search_name = duplicate_name(SearchService.query, name=search_name, tenant_id=current_user.id, status=StatusEnum.VALID.value)
@ -78,7 +78,7 @@ def update():
    tenant_id = req["tenant_id"]
    e, _ = TenantService.get_by_id(tenant_id)
    if not e:
-        return get_data_error_result(message="Authorizationd identity.")
+        return get_data_error_result(message="Unauthorized identity.")

    search_id = req["search_id"]
    if not SearchService.accessible4deletion(search_id, current_user.id):
@ -155,8 +155,9 @@ def list_search_app():
    owner_ids = req.get("owner_ids", [])
    try:
        if not owner_ids:
-            tenants = TenantService.get_joined_tenants_by_user_id(current_user.id)
-            tenants = [m["tenant_id"] for m in tenants]
+            # tenants = TenantService.get_joined_tenants_by_user_id(current_user.id)
+            # tenants = [m["tenant_id"] for m in tenants]
+            tenants = []
            search_apps, total = SearchService.get_by_tenant_ids(tenants, current_user.id, page_number, items_per_page, orderby, desc, keywords)
        else:
            tenants = owner_ids
--- a/api/db/db_models.py
+++ b/api/db/db_models.py
@ -824,9 +824,8 @@ class UserCanvas(DataBaseModel):
 class CanvasTemplate(DataBaseModel):
    id = CharField(max_length=32, primary_key=True)
    avatar = TextField(null=True, help_text="avatar base64 string")
-    title = CharField(max_length=255, null=True, help_text="Canvas title")
-
-    description = TextField(null=True, help_text="Canvas description")
+    title = JSONField(null=True, default=dict, help_text="Canvas title")
+    description = JSONField(null=True, default=dict, help_text="Canvas description")
    canvas_type = CharField(max_length=32, null=True, help_text="Canvas type", index=True)
    dsl = JSONField(null=True, default={})

@ -1021,4 +1020,13 @@ def migrate_db():
        migrate(migrator.add_column("dialog", "meta_data_filter", JSONField(null=True, default={})))
    except Exception:
        pass
+
+    try:
+        migrate(migrator.alter_column_type("canvas_template", "title", JSONField(null=True, default=dict, help_text="Canvas title")))
+    except Exception:
+        pass
+    try:
+        migrate(migrator.alter_column_type("canvas_template", "description", JSONField(null=True, default=dict, help_text="Canvas description")))
+    except Exception:
+        pass
    logging.disable(logging.NOTSET)
--- a/api/db/services/canvas_service.py
+++ b/api/db/services/canvas_service.py
@ -134,6 +134,7 @@ class UserCanvasService(CommonService):
            return False
        return True

+
 def completion(tenant_id, agent_id, session_id=None, **kwargs):
    query = kwargs.get("query", "") or kwargs.get("question", "")
    files = kwargs.get("files", [])
@ -163,7 +164,8 @@ def completion(tenant_id, agent_id, session_id=None, **kwargs):
            "user_id": user_id,
            "message": [],
            "source": "agent",
-            "dsl": cvs.dsl
+            "dsl": cvs.dsl,
+            "reference": []
        }
        API4ConversationService.save(**conv)
        conv = API4Conversation(**conv)
@ -211,28 +213,33 @@ def completionOpenAI(tenant_id, agent_id, question, session_id=None, stream=True
                    except Exception as e:
                        logging.exception(f"Agent OpenAI-Compatible completionOpenAI parse answer failed: {e}")
                        continue
-
-                if ans.get("event") != "message":
+                if ans.get("event") not in ["message", "message_end"]:
                    continue

-                content_piece = ans["data"]["content"]
+                content_piece = ""
+                if ans["event"] == "message":
+                    content_piece = ans["data"]["content"]
+
                completion_tokens += len(tiktokenenc.encode(content_piece))

-                yield "data: " + json.dumps(
-                    get_data_openai(
+                openai_data = get_data_openai(
                        id=session_id or str(uuid4()),
                        model=agent_id,
                        content=content_piece,
                        prompt_tokens=prompt_tokens,
                        completion_tokens=completion_tokens,
                        stream=True
-                    ),
-                    ensure_ascii=False
-                ) + "\n\n"
+                    )
+
+                if ans.get("data", {}).get("reference", None):
+                    openai_data["choices"][0]["delta"]["reference"] = ans["data"]["reference"]
+
+                yield "data: " + json.dumps(openai_data, ensure_ascii=False) + "\n\n"

            yield "data: [DONE]\n\n"

        except Exception as e:
+            logging.exception(e)
            yield "data: " + json.dumps(
                get_data_openai(
                    id=session_id or str(uuid4()),
@ -250,6 +257,7 @@ def completionOpenAI(tenant_id, agent_id, question, session_id=None, stream=True
    else:
        try:
            all_content = ""
+            reference = {}
            for ans in completion(
                tenant_id=tenant_id,
                agent_id=agent_id,
@ -260,13 +268,18 @@ def completionOpenAI(tenant_id, agent_id, question, session_id=None, stream=True
            ):
                if isinstance(ans, str):
                    ans = json.loads(ans[5:])
-                if ans.get("event") != "message":
+                if ans.get("event") not in ["message", "message_end"]:
                    continue
-                all_content += ans["data"]["content"]
+
+                if ans["event"] == "message":
+                    all_content += ans["data"]["content"]
+
+                if ans.get("data", {}).get("reference", None):
+                    reference.update(ans["data"]["reference"])

            completion_tokens = len(tiktokenenc.encode(all_content))

-            yield get_data_openai(
+            openai_data = get_data_openai(
                id=session_id or str(uuid4()),
                model=agent_id,
                prompt_tokens=prompt_tokens,
@ -276,7 +289,12 @@ def completionOpenAI(tenant_id, agent_id, question, session_id=None, stream=True
                param=None
            )

+            if reference:
+                openai_data["choices"][0]["message"]["reference"] = reference
+
+            yield openai_data
        except Exception as e:
+            logging.exception(e)
            yield get_data_openai(
                id=session_id or str(uuid4()),
                model=agent_id,
--- a/api/db/services/dialog_service.py
+++ b/api/db/services/dialog_service.py
@ -256,10 +256,10 @@ def repair_bad_citation_formats(answer: str, kbinfos: dict, idx: set):


 def meta_filter(metas: dict, filters: list[dict]):
-    doc_ids = []
+    doc_ids = set([])

    def filter_out(v2docs, operator, value):
-        nonlocal doc_ids
+        ids = []
        for input, docids in v2docs.items():
            try:
                input = float(input)
@ -284,16 +284,24 @@ def meta_filter(metas: dict, filters: list[dict]):
                ]:
                try:
                    if all(conds):
-                        doc_ids.extend(docids)
+                        ids.extend(docids)
+                        break
                except Exception:
                    pass
+        return ids

    for k, v2docs in metas.items():
        for f in filters:
            if k != f["key"]:
                continue
-            filter_out(v2docs, f["op"], f["value"])
-    return doc_ids
+            ids = filter_out(v2docs, f["op"], f["value"])
+            if not doc_ids:
+                doc_ids = set(ids)
+            else:
+                doc_ids = doc_ids & set(ids)
+            if not doc_ids:
+                return []
+    return list(doc_ids)


 def chat(dialog, messages, stream=True, **kwargs):
--- a/api/db/services/llm_service.py
+++ b/api/db/services/llm_service.py
@ -152,7 +152,7 @@ class LLMBundle(LLM4Tenant):

    def describe_with_prompt(self, image, prompt):
        if self.langfuse:
-            generation = self.language.start_generation(trace_context=self.trace_context, name="describe_with_prompt", metadata={"model": self.llm_name, "prompt": prompt})
+            generation = self.langfuse.start_generation(trace_context=self.trace_context, name="describe_with_prompt", metadata={"model": self.llm_name, "prompt": prompt})

        txt, used_tokens = self.mdl.describe_with_prompt(image, prompt)
        if not TenantLLMService.increase_usage(self.tenant_id, self.llm_type, used_tokens):
--- a/api/db/services/user_service.py
+++ b/api/db/services/user_service.py
@ -133,6 +133,13 @@ class UserService(CommonService):
                cls.model.update(user_dict).where(
                    cls.model.id == user_id).execute()

+    @classmethod
+    @DB.connection_context()
+    def is_admin(cls, user_id):
+        return cls.model.select().where(
+            cls.model.id == user_id,
+            cls.model.is_superuser == 1).count() > 0
+

 class TenantService(CommonService):
    """Service class for managing tenant-related database operations.
--- a/api/utils/api_utils.py
+++ b/api/utils/api_utils.py
@ -17,6 +17,7 @@ import asyncio
 import functools
 import json
 import logging
+import os
 import queue
 import random
 import threading
@ -353,7 +354,7 @@ def get_parser_config(chunk_method, parser_config):
    if not chunk_method:
        chunk_method = "naive"

-    # Define default configurations for each chunk method
+    # Define default configurations for each chunking method
    key_mapping = {
        "naive": {"chunk_token_num": 512, "delimiter": r"\n", "html4excel": False, "layout_recognize": "DeepDOC", "raptor": {"use_raptor": False}, "graphrag": {"use_graphrag": False}},
        "qa": {"raptor": {"use_raptor": False}, "graphrag": {"use_graphrag": False}},
@ -667,7 +668,10 @@ def timeout(seconds: float | int = None, attempts: int = 2, *, exception: Option

            for a in range(attempts):
                try:
-                    result = result_queue.get(timeout=seconds)
+                    if os.environ.get("ENABLE_TIMEOUT_ASSERTION"):
+                        result = result_queue.get(timeout=seconds)
+                    else:
+                        result = result_queue.get()
                    if isinstance(result, Exception):
                        raise result
                    return result
@ -682,7 +686,10 @@ def timeout(seconds: float | int = None, attempts: int = 2, *, exception: Option

            for a in range(attempts):
                try:
-                    with trio.fail_after(seconds):
+                    if os.environ.get("ENABLE_TIMEOUT_ASSERTION"):
+                        with trio.fail_after(seconds):
+                            return await func(*args, **kwargs)
+                    else:
                        return await func(*args, **kwargs)
                except trio.TooSlowError:
                    if a < attempts - 1:
--- a/conf/llm_factories.json
+++ b/conf/llm_factories.json
@ -532,23 +532,65 @@
            "tags": "LLM,TEXT EMBEDDING,SPEECH2TEXT,MODERATION",
            "status": "1",
            "llm": [
+                {
+                    "llm_name": "glm-4.5",
+                    "tags": "LLM,CHAT,128K",
+                    "max_tokens": 128000,
+                    "model_type": "chat",
+                    "is_tools": true
+                },
+                {
+                    "llm_name": "glm-4.5-x",
+                    "tags": "LLM,CHAT,128K",
+                    "max_tokens": 128000,
+                    "model_type": "chat",
+                    "is_tools": true
+                },
+                {
+                    "llm_name": "glm-4.5-air",
+                    "tags": "LLM,CHAT,128K",
+                    "max_tokens": 128000,
+                    "model_type": "chat",
+                    "is_tools": true
+                },
+                {
+                    "llm_name": "glm-4.5-airx",
+                    "tags": "LLM,CHAT,128K",
+                    "max_tokens": 128000,
+                    "model_type": "chat",
+                    "is_tools": true
+                },
+                {
+                    "llm_name": "glm-4.5-flash",
+                    "tags": "LLM,CHAT,128K",
+                    "max_tokens": 128000,
+                    "model_type": "chat",
+                    "is_tools": true
+                },
+                {
+                    "llm_name": "glm-4.5v",
+                    "tags": "LLM,IMAGE2TEXT,64K",
+                    "max_tokens": 64000,
+                    "model_type": "image2text",
+                    "is_tools": false
+                },
                {
                    "llm_name": "glm-4-plus",
-                    "tags": "LLM,CHAT,",
+                    "tags": "LLM,CHAT,128K",
                    "max_tokens": 128000,
                    "model_type": "chat",
                    "is_tools": true
                },
                {
                    "llm_name": "glm-4-0520",
-                    "tags": "LLM,CHAT,",
+                    "tags": "LLM,CHAT,128K",
                    "max_tokens": 128000,
                    "model_type": "chat",
                    "is_tools": true
                },
                {
                    "llm_name": "glm-4",
-                    "tags": "LLM,CHAT,",
+                    "tags": "LLM,CHAT,128K",
                    "max_tokens": 128000,
                    "model_type": "chat",
                    "is_tools": true
--- a/deepdoc/parser/init.py
+++ b/deepdoc/parser/init.py
@ -14,13 +14,15 @@
 #  limitations under the License.
 #

-from .pdf_parser import RAGFlowPdfParser as PdfParser, PlainParser
 from .docx_parser import RAGFlowDocxParser as DocxParser
 from .excel_parser import RAGFlowExcelParser as ExcelParser
-from .ppt_parser import RAGFlowPptParser as PptParser
 from .html_parser import RAGFlowHtmlParser as HtmlParser
 from .json_parser import RAGFlowJsonParser as JsonParser
+from .markdown_parser import MarkdownElementExtractor
 from .markdown_parser import RAGFlowMarkdownParser as MarkdownParser
+from .pdf_parser import PlainParser
+from .pdf_parser import RAGFlowPdfParser as PdfParser
+from .ppt_parser import RAGFlowPptParser as PptParser
 from .txt_parser import RAGFlowTxtParser as TxtParser

 __all__ = [
@ -33,4 +35,6 @@ __all__ = [
    "JsonParser",
    "MarkdownParser",
    "TxtParser",
-]
+    "MarkdownElementExtractor",
+]
+
--- a/deepdoc/parser/excel_parser.py
+++ b/deepdoc/parser/excel_parser.py
@ -131,6 +131,12 @@ class RAGFlowExcelParser:

        return tb_chunks

+    def markdown(self, fnm):
+        import pandas as pd
+        file_like_object = BytesIO(fnm) if not isinstance(fnm, str) else fnm
+        df = pd.read_excel(file_like_object)
+        return df.to_markdown(index=False)
+
    def __call__(self, fnm):
        file_like_object = BytesIO(fnm) if not isinstance(fnm, str) else fnm
        wb = RAGFlowExcelParser._load_excel_to_workbook(file_like_object)
--- a/deepdoc/parser/html_parser.py
+++ b/deepdoc/parser/html_parser.py
@ -15,35 +15,200 @@
 #  limitations under the License.
 #

-from rag.nlp import find_codec
-import readability
-import html_text
+from rag.nlp import find_codec, rag_tokenizer
+import uuid
 import chardet
-
+from bs4 import BeautifulSoup, NavigableString, Tag, Comment
+import html

 def get_encoding(file):
    with open(file,'rb') as f:
        tmp = chardet.detect(f.read())
        return tmp['encoding']

+BLOCK_TAGS = [
+    "h1", "h2", "h3", "h4", "h5", "h6",
+    "p", "div", "article", "section", "aside",
+    "ul", "ol", "li",
+    "table", "pre", "code", "blockquote",
+    "figure", "figcaption"
+]
+TITLE_TAGS = {"h1": "#", "h2": "##", "h3": "###", "h4": "####", "h5": "#####", "h6": "######"}
+

 class RAGFlowHtmlParser:
-    def __call__(self, fnm, binary=None):
+    def __call__(self, fnm, binary=None, chunk_token_num=None):
        if binary:
            encoding = find_codec(binary)
            txt = binary.decode(encoding, errors="ignore")
        else:
            with open(fnm, "r",encoding=get_encoding(fnm)) as f:
                txt = f.read()
-        return self.parser_txt(txt)
+        return self.parser_txt(txt, chunk_token_num)

    @classmethod
-    def parser_txt(cls, txt):
+    def parser_txt(cls, txt, chunk_token_num):
        if not isinstance(txt, str):
            raise TypeError("txt type should be string!")
-        html_doc = readability.Document(txt)
-        title = html_doc.title()
-        content = html_text.extract_text(html_doc.summary(html_partial=True))
-        txt = f"{title}\n{content}"
-        sections = txt.split("\n")
+
+        temp_sections = []
+        soup = BeautifulSoup(txt, "html5lib")
+        # delete <style> and <script> tags
+        for style_tag in soup.find_all(["style", "script"]):
+            style_tag.decompose()
+        # delete <script> tag in <div>
+        for div_tag in soup.find_all("div"):
+            for script_tag in div_tag.find_all("script"):
+                script_tag.decompose()
+        # delete inline style
+        for tag in soup.find_all(True):
+            if 'style' in tag.attrs:
+                del tag.attrs['style']
+        # delete HTML comment
+        for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
+            comment.extract()
+
+        cls.read_text_recursively(soup.body, temp_sections, chunk_token_num=chunk_token_num)
+        block_txt_list, table_list = cls.merge_block_text(temp_sections)
+        sections = cls.chunk_block(block_txt_list, chunk_token_num=chunk_token_num)
+        for table in table_list:
+            sections.append(table.get("content", ""))
        return sections
+
+    @classmethod
+    def split_table(cls, html_table, chunk_token_num=512):
+        soup = BeautifulSoup(html_table, "html.parser")
+        rows = soup.find_all("tr")
+        tables = []
+        current_table = []
+        current_count = 0
+        table_str_list = []
+        for row in rows:
+            tks_str = rag_tokenizer.tokenize(str(row))
+            token_count = len(tks_str.split(" ")) if tks_str else 0
+            if current_count + token_count > chunk_token_num:
+                tables.append(current_table)
+                current_table = []
+                current_count = 0
+            current_table.append(row)
+            current_count += token_count
+        if current_table:
+            tables.append(current_table)
+
+        for table_rows in tables:
+            new_table = soup.new_tag("table")
+            for row in table_rows:
+                new_table.append(row)
+            table_str_list.append(str(new_table))
+
+        return table_str_list
+
+    @classmethod
+    def read_text_recursively(cls, element, parser_result, chunk_token_num=512, parent_name=None, block_id=None):
+        if isinstance(element, NavigableString):
+            content = element.strip()
+
+            def is_valid_html(content):
+                try:
+                    soup = BeautifulSoup(content, "html.parser")
+                    return bool(soup.find())
+                except Exception:
+                    return False
+
+            return_info = []
+            if content:
+                if is_valid_html(content):
+                    soup = BeautifulSoup(content, "html.parser")
+                    child_info = cls.read_text_recursively(soup, parser_result, chunk_token_num, element.name, block_id)
+                    parser_result.extend(child_info)
+                else:
+                    info = {"content": element.strip(), "tag_name": "inner_text", "metadata": {"block_id": block_id}}
+                    if parent_name:
+                        info["tag_name"] = parent_name
+                    return_info.append(info)
+            return return_info
+        elif isinstance(element, Tag):
+
+            if str.lower(element.name) == "table":
+                table_info_list = []
+                table_id = str(uuid.uuid1())
+                table_list = [html.unescape(str(element))]
+                for t in table_list:
+                    table_info_list.append({"content": t, "tag_name": "table",
+                                            "metadata": {"table_id": table_id, "index": table_list.index(t)}})
+                return table_info_list
+            else:
+                block_id = None
+                if str.lower(element.name) in BLOCK_TAGS:
+                    block_id = str(uuid.uuid1())
+                for child in element.children:
+                    child_info = cls.read_text_recursively(child, parser_result, chunk_token_num, element.name,
+                                                           block_id)
+                    parser_result.extend(child_info)
+        return []
+
+    @classmethod
+    def merge_block_text(cls, parser_result):
+        block_content = []
+        current_content = ""
+        table_info_list = []
+        last_block_id = None
+        for item in parser_result:
+            content = item.get("content")
+            tag_name = item.get("tag_name")
+            title_flag = tag_name in TITLE_TAGS
+            block_id = item.get("metadata", {}).get("block_id")
+            if block_id:
+                if title_flag:
+                    content = f"{TITLE_TAGS[tag_name]} {content}"
+                if last_block_id != block_id:
+                    if last_block_id is not None:
+                        block_content.append(current_content)
+                    current_content = content
+                    last_block_id = block_id
+                else:
+                    current_content += (" " if current_content else "") + content
+            else:
+                if tag_name == "table":
+                    table_info_list.append(item)
+                else:
+                    current_content += (" " if current_content else "") + content
+        if current_content:
+            block_content.append(current_content)
+        return block_content, table_info_list
+
+    @classmethod
+    def chunk_block(cls, block_txt_list, chunk_token_num=512):
+        chunks = []
+        current_block = ""
+        current_token_count = 0
+
+        for block in block_txt_list:
+            tks_str = rag_tokenizer.tokenize(block)
+            block_token_count = len(tks_str.split(" ")) if tks_str else 0
+            if block_token_count > chunk_token_num:
+                if current_block:
+                    chunks.append(current_block)
+                start = 0
+                tokens = tks_str.split(" ")
+                while start < len(tokens):
+                    end = start + chunk_token_num
+                    split_tokens = tokens[start:end]
+                    chunks.append(" ".join(split_tokens))
+                    start = end
+                current_block = ""
+                current_token_count = 0
+            else:
+                if current_token_count + block_token_count <= chunk_token_num:
+                    current_block += ("\n" if current_block else "") + block
+                    current_token_count += block_token_count
+                else:
+                    chunks.append(current_block)
+                    current_block = block
+                    current_token_count = block_token_count
+
+        if current_block:
+            chunks.append(current_block)
+
+        return chunks
+
--- a/deepdoc/parser/markdown_parser.py
+++ b/deepdoc/parser/markdown_parser.py
@ -17,8 +17,10 @@

 import re

+import mistune
 from markdown import markdown

+
 class RAGFlowMarkdownParser:
    def __init__(self, chunk_token_num=128):
        self.chunk_token_num = int(chunk_token_num)
@ -35,40 +37,44 @@ class RAGFlowMarkdownParser:
                table_list.append(raw_table)
                if separate_tables:
                    # Skip this match (i.e., remove it)
-                    new_text += working_text[last_end:match.start()] + "\n\n"
+                    new_text += working_text[last_end : match.start()] + "\n\n"
                else:
                    # Replace with rendered HTML
-                    html_table = markdown(raw_table, extensions=['markdown.extensions.tables']) if render else raw_table
-                    new_text += working_text[last_end:match.start()] + html_table + "\n\n"
+                    html_table = markdown(raw_table, extensions=["markdown.extensions.tables"]) if render else raw_table
+                    new_text += working_text[last_end : match.start()] + html_table + "\n\n"
                last_end = match.end()
            new_text += working_text[last_end:]
            return new_text

-        if "|" in markdown_text: # for optimize performance
+        if "|" in markdown_text:  # performance optimization
            # Standard Markdown table
            border_table_pattern = re.compile(
-                r'''
+                r"""
                (?:\n|^)
                (?:\|.*?\|.*?\|.*?\n)
                (?:\|(?:\s*[:-]+[-| :]*\s*)\|.*?\n)
                (?:\|.*?\|.*?\|.*?\n)+
-            ''', re.VERBOSE)
+            """,
+                re.VERBOSE,
+            )
            working_text = replace_tables_with_rendered_html(border_table_pattern, tables)

            # Borderless Markdown table
            no_border_table_pattern = re.compile(
-                r'''
+                r"""
                (?:\n|^)
                (?:\S.*?\|.*?\n)
                (?:(?:\s*[:-]+[-| :]*\s*).*?\n)
                (?:\S.*?\|.*?\n)+
-                ''', re.VERBOSE)
+                """,
+                re.VERBOSE,
+            )
            working_text = replace_tables_with_rendered_html(no_border_table_pattern, tables)

-        if "<table>" in working_text.lower(): # for optimize performance
-            #HTML table extraction - handle possible html/body wrapper tags
+        if "<table>" in working_text.lower():  # cheap pre-check: skip the HTML regex when no <table> tag is present
+            # HTML table extraction - handle possible html/body wrapper tags
            html_table_pattern = re.compile(
-            r'''
+                r"""
            (?:\n|^)
            \s*
            (?:
@ -83,9 +89,10 @@ class RAGFlowMarkdownParser:
            )
            \s*
            (?=\n|$)
-            ''',
-            re.VERBOSE | re.DOTALL | re.IGNORECASE
+            """,
+                re.VERBOSE | re.DOTALL | re.IGNORECASE,
            )
+
            def replace_html_tables():
                nonlocal working_text
                new_text = ""
@ -94,9 +101,9 @@ class RAGFlowMarkdownParser:
                    raw_table = match.group()
                    tables.append(raw_table)
                    if separate_tables:
-                        new_text += working_text[last_end:match.start()] + "\n\n"
+                        new_text += working_text[last_end : match.start()] + "\n\n"
                    else:
-                        new_text += working_text[last_end:match.start()] + raw_table + "\n\n"
+                        new_text += working_text[last_end : match.start()] + raw_table + "\n\n"
                    last_end = match.end()
                new_text += working_text[last_end:]
                working_text = new_text
@ -104,3 +111,163 @@ class RAGFlowMarkdownParser:
            replace_html_tables()

        return working_text, tables
+
+
+class MarkdownElementExtractor:
+    def __init__(self, markdown_content):
+        self.markdown_content = markdown_content
+        self.lines = markdown_content.split("\n")
+        self.ast_parser = mistune.create_markdown(renderer="ast")
+        self.ast_nodes = self.ast_parser(markdown_content)
+
+    def extract_elements(self):
+        """Extract individual elements (headers, code blocks, lists, etc.)"""
+        sections = []
+
+        i = 0
+        while i < len(self.lines):
+            line = self.lines[i]
+
+            if re.match(r"^#{1,6}\s+.*$", line):
+                # header
+                element = self._extract_header(i)
+                sections.append(element["content"])
+                i = element["end_line"] + 1
+            elif line.strip().startswith("```"):
+                # code block
+                element = self._extract_code_block(i)
+                sections.append(element["content"])
+                i = element["end_line"] + 1
+            elif re.match(r"^\s*[-*+]\s+.*$", line) or re.match(r"^\s*\d+\.\s+.*$", line):
+                # list block
+                element = self._extract_list_block(i)
+                sections.append(element["content"])
+                i = element["end_line"] + 1
+            elif line.strip().startswith(">"):
+                # blockquote
+                element = self._extract_blockquote(i)
+                sections.append(element["content"])
+                i = element["end_line"] + 1
+            elif line.strip():
+                # text block (paragraphs and inline elements until next block element)
+                element = self._extract_text_block(i)
+                sections.append(element["content"])
+                i = element["end_line"] + 1
+            else:
+                i += 1
+
+        sections = [section for section in sections if section.strip()]
+        return sections
+
+    def _extract_header(self, start_pos):
+        return {
+            "type": "header",
+            "content": self.lines[start_pos],
+            "start_line": start_pos,
+            "end_line": start_pos,
+        }
+
+    def _extract_code_block(self, start_pos):
+        end_pos = start_pos
+        content_lines = [self.lines[start_pos]]
+
+        # Find the end of the code block
+        for i in range(start_pos + 1, len(self.lines)):
+            content_lines.append(self.lines[i])
+            end_pos = i
+            if self.lines[i].strip().startswith("```"):
+                break
+
+        return {
+            "type": "code_block",
+            "content": "\n".join(content_lines),
+            "start_line": start_pos,
+            "end_line": end_pos,
+        }
+
+    def _extract_list_block(self, start_pos):
+        end_pos = start_pos
+        content_lines = []
+
+        i = start_pos
+        while i < len(self.lines):
+            line = self.lines[i]
+            # check if this line is a list item or continuation of a list
+            if (
+                re.match(r"^\s*[-*+]\s+.*$", line)
+                or re.match(r"^\s*\d+\.\s+.*$", line)
+                or (i > start_pos and not line.strip())
+                or (i > start_pos and re.match(r"^\s{2,}[-*+]\s+.*$", line))
+                or (i > start_pos and re.match(r"^\s{2,}\d+\.\s+.*$", line))
+                or (i > start_pos and re.match(r"^\s+\w+.*$", line))
+            ):
+                content_lines.append(line)
+                end_pos = i
+                i += 1
+            else:
+                break
+
+        return {
+            "type": "list_block",
+            "content": "\n".join(content_lines),
+            "start_line": start_pos,
+            "end_line": end_pos,
+        }
+
+    def _extract_blockquote(self, start_pos):
+        end_pos = start_pos
+        content_lines = []
+
+        i = start_pos
+        while i < len(self.lines):
+            line = self.lines[i]
+            if line.strip().startswith(">") or (i > start_pos and not line.strip()):
+                content_lines.append(line)
+                end_pos = i
+                i += 1
+            else:
+                break
+
+        return {
+            "type": "blockquote",
+            "content": "\n".join(content_lines),
+            "start_line": start_pos,
+            "end_line": end_pos,
+        }
+
+    def _extract_text_block(self, start_pos):
+        """Extract a text block (paragraphs, inline elements) until next block element"""
+        end_pos = start_pos
+        content_lines = [self.lines[start_pos]]
+
+        i = start_pos + 1
+        while i < len(self.lines):
+            line = self.lines[i]
+            # stop if we encounter a block element
+            if re.match(r"^#{1,6}\s+.*$", line) or line.strip().startswith("```") or re.match(r"^\s*[-*+]\s+.*$", line) or re.match(r"^\s*\d+\.\s+.*$", line) or line.strip().startswith(">"):
+                break
+            elif not line.strip():
+                # check if the next line is a block element
+                if i + 1 < len(self.lines) and (
+                    re.match(r"^#{1,6}\s+.*$", self.lines[i + 1])
+                    or self.lines[i + 1].strip().startswith("```")
+                    or re.match(r"^\s*[-*+]\s+.*$", self.lines[i + 1])
+                    or re.match(r"^\s*\d+\.\s+.*$", self.lines[i + 1])
+                    or self.lines[i + 1].strip().startswith(">")
+                ):
+                    break
+                else:
+                    content_lines.append(line)
+                    end_pos = i
+                    i += 1
+            else:
+                content_lines.append(line)
+                end_pos = i
+                i += 1
+
+        return {
+            "type": "text_block",
+            "content": "\n".join(content_lines),
+            "start_line": start_pos,
+            "end_line": end_pos,
+        }
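
The dispatch in `extract_elements` above picks a block type from the first line of each block. A minimal, self-contained sketch of that classification step follows. It reuses the regular expressions from the diff, but `classify_line` and the `FENCE` constant are hypothetical names introduced only for illustration and are not part of the PR:

```python
import re

# Same patterns as in MarkdownElementExtractor.extract_elements (copied from the diff).
HEADER = re.compile(r"^#{1,6}\s+.*$")
BULLET = re.compile(r"^\s*[-*+]\s+.*$")
ORDERED = re.compile(r"^\s*\d+\.\s+.*$")
FENCE = "`" * 3  # three backticks, built this way to keep this example's own fence intact


def classify_line(line: str) -> str:
    """Return the block type that a line would open (hypothetical helper)."""
    if HEADER.match(line):
        return "header"
    if line.strip().startswith(FENCE):
        return "code_block"
    if BULLET.match(line) or ORDERED.match(line):
        return "list_block"
    if line.strip().startswith(">"):
        return "blockquote"
    if line.strip():
        return "text_block"
    return "blank"


if __name__ == "__main__":
    for sample in ["# Title", "- item", "1. first", "> quote", "#hashtag", ""]:
        print(repr(sample), "->", classify_line(sample))
```

The order of the checks matters: the header pattern requires whitespace after the `#` run, so a line such as `#hashtag` falls through to `text_block`, and `**bold**` is not mistaken for a bullet because the list pattern also requires whitespace after the marker.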
--- a/deepdoc/parser/pdf_parser.py
+++ b/deepdoc/parser/pdf_parser.py
@ -93,6 +93,7 @@ class RAGFlowPdfParser:
                model_dir, "updown_concat_xgb.model"))

        self.page_from = 0
+        self.column_num = 1

    def __char_width(self, c):
        return (c["x1"] - c["x0"]) // max(len(c["text"]), 1)
@ -427,10 +428,18 @@ class RAGFlowPdfParser:
            i += 1
        self.boxes = bxs

-    def _naive_vertical_merge(self):
+    def _naive_vertical_merge(self, zoomin=3):
        bxs = Recognizer.sort_Y_firstly(
            self.boxes, np.median(
                self.mean_height) / 3)
+
+        column_width = np.median([b["x1"] - b["x0"] for b in self.boxes])
+        self.column_num = int(self.page_images[0].size[0] / zoomin / column_width)
+        if column_width < self.page_images[0].size[0] / zoomin / self.column_num:
+            logging.info("Multi-column................... {} {}".format(column_width,
+                  self.page_images[0].size[0] / zoomin / self.column_num))
+            self.boxes = self.sort_X_by_page(self.boxes, column_width / self.column_num)
+
        i = 0
        while i + 1 < len(bxs):
            b = bxs[i]
@ -1139,20 +1148,94 @@ class RAGFlowPdfParser:
            need_image, zoomin, return_html, False)
        return self.__filterout_scraps(deepcopy(self.boxes), zoomin), tbls

+    def parse_into_bboxes(self, fnm, callback=None, zoomin=3):
+        start = timer()
+        self.__images__(fnm, zoomin)
+        if callback:
+            callback(0.40, "OCR finished ({:.2f}s)".format(timer() - start))
+
+        start = timer()
+        self._layouts_rec(zoomin)
+        if callback:
+            callback(0.63, "Layout analysis ({:.2f}s)".format(timer() - start))
+
+        start = timer()
+        self._table_transformer_job(zoomin)
+        if callback:
+            callback(0.83, "Table analysis ({:.2f}s)".format(timer() - start))
+
+        start = timer()
+        self._text_merge()
+        self._concat_downward()
+        self._naive_vertical_merge(zoomin)
+        if callback:
+            callback(0.92, "Text merged ({:.2f}s)".format(timer() - start))
+
+        start = timer()
+        tbls, figs = self._extract_table_figure(True, zoomin, True, True, True)
+
+        def insert_table_figures(tbls_or_figs, layout_type):
+            def min_rectangle_distance(rect1, rect2):
+                import math
+                pn1, left1, right1, top1, bottom1 = rect1
+                pn2, left2, right2, top2, bottom2 = rect2
+                if (right1 >= left2 and right2 >= left1 and
+                        bottom1 >= top2 and bottom2 >= top1):
+                    return abs(pn1-pn2)*10000
+                if right1 < left2:
+                    dx = left2 - right1
+                elif right2 < left1:
+                    dx = left1 - right2
+                else:
+                    dx = 0
+                if bottom1 < top2:
+                    dy = top2 - bottom1
+                elif bottom2 < top1:
+                    dy = top1 - bottom2
+                else:
+                    dy = 0
+                return math.sqrt(dx*dx + dy*dy) + abs(pn1-pn2)*10000
+
+            for (img, txt), poss in tbls_or_figs:
+                bboxes = [(i, (b["page_number"], b["x0"], b["x1"], b["top"], b["bottom"])) for i, b in enumerate(self.boxes)]
+                dists = [(min_rectangle_distance((pn, left, right, top, bott), rect),i) for i, rect in bboxes for pn, left, right, top, bott in poss]
+                min_i = np.argmin(dists, axis=0)[0]
+                min_i, rect = bboxes[dists[min_i][-1]]
+                if isinstance(txt, list):
+                    txt = "\n".join(txt)
+                self.boxes.insert(min_i, {
+                    "page_number": rect[0], "x0": rect[1], "x1": rect[2], "top": rect[3], "bottom": rect[4], "layout_type": layout_type, "text": txt, "image": img
+                })
+
+        for b in self.boxes:
+            b["position_tag"] = self._line_tag(b, zoomin)
+            b["image"] = self.crop(b["position_tag"], zoomin)
+
+        insert_table_figures(tbls, "table")
+        insert_table_figures(figs, "figure")
+        if callback:
+            callback(1, "Structured ({:.2f}s)".format(timer() - start))
+        return deepcopy(self.boxes)
+
    @staticmethod
    def remove_tag(txt):
        return re.sub(r"@@[\t0-9.-]+?##", "", txt)

-    def crop(self, text, ZM=3, need_position=False):
-        imgs = []
+    @staticmethod
+    def extract_positions(txt):
        poss = []
-        for tag in re.findall(r"@@[0-9-]+\t[0-9.\t]+##", text):
+        for tag in re.findall(r"@@[0-9-]+\t[0-9.\t]+##", txt):
            pn, left, right, top, bottom = tag.strip(
                "#").strip("@").split("\t")
            left, right, top, bottom = float(left), float(
                right), float(top), float(bottom)
            poss.append(([int(p) - 1 for p in pn.split("-")],
                         left, right, top, bottom))
+        return poss
+
+    def crop(self, text, ZM=3, need_position=False):
+        imgs = []
+        poss = self.extract_positions(text)
        if not poss:
            if need_position:
                return None, None
@ -1296,8 +1379,8 @@ class VisionParser(RAGFlowPdfParser):

    def __call__(self, filename, from_page=0, to_page=100000, **kwargs):
        callback = kwargs.get("callback", lambda prog, msg: None)
-
-        self.__images__(fnm=filename, zoomin=3, page_from=from_page, page_to=to_page, **kwargs)
+        zoomin = kwargs.get("zoomin", 3)
+        self.__images__(fnm=filename, zoomin=zoomin, page_from=from_page, page_to=to_page, callback=callback)

        total_pdf_pages = self.total_page

@ -1311,16 +1394,19 @@ class VisionParser(RAGFlowPdfParser):
            if pdf_page_num < start_page or pdf_page_num >= end_page:
                continue

-            docs = picture_vision_llm_chunk(
+            text = picture_vision_llm_chunk(
                binary=img_binary,
                vision_model=self.vision_model,
                prompt=vision_llm_describe_prompt(page=pdf_page_num+1),
                callback=callback,
            )
+            if kwargs.get("callback"):
+                kwargs["callback"](idx*1./len(self.page_images), f"Processed: {idx+1}/{len(self.page_images)}")

-            if docs:
-                all_docs.append(docs)
-        return [(doc, "") for doc in all_docs], []
+            if text:
+                width, height = self.page_images[idx].size
+                all_docs.append((text, f"{pdf_page_num+1} 0 {width/zoomin} 0 {height/zoomin}"))
+        return all_docs, []


 if __name__ == "__main__":
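
The position-tag handling that this diff factors out into `extract_positions` can be tried in isolation. The sketch below copies the two regular expressions from the diff into standalone functions; the sample string and its coordinates are made up for illustration, and the field order (page range, left, right, top, bottom) is taken from the unpacking in the diff:

```python
import re

# Tag layout per the diff: @@<pages>\t<left>\t<right>\t<top>\t<bottom>##
TAG_PATTERN = re.compile(r"@@[0-9-]+\t[0-9.\t]+##")


def extract_positions(txt):
    """Standalone copy of the static method added in the diff."""
    poss = []
    for tag in TAG_PATTERN.findall(txt):
        pn, left, right, top, bottom = tag.strip("#").strip("@").split("\t")
        # Page numbers in the tag are 1-based; they are converted to 0-based indices.
        pages = [int(p) - 1 for p in pn.split("-")]
        poss.append((pages, float(left), float(right), float(top), float(bottom)))
    return poss


def remove_tag(txt):
    """Standalone copy of RAGFlowPdfParser.remove_tag."""
    return re.sub(r"@@[\t0-9.-]+?##", "", txt)


if __name__ == "__main__":
    # Hypothetical chunk text followed by one position tag spanning pages 3-4.
    sample = "Revenue grew 12%.@@3-4\t72.0\t540.5\t100.0\t130.25##"
    print(extract_positions(sample))
    print(remove_tag(sample))
```

Because the extraction no longer depends on instance state, callers can recover coordinates from tagged text without constructing a parser, which is presumably why the diff marks it `@staticmethod`.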
--- a/deepdoc/vision/seeit.py
+++ b/deepdoc/vision/seeit.py
@ -31,11 +31,11 @@ def save_results(image_list, results, labels, output_dir='output/', threshold=0.
        logging.debug("save result to: " + out_path)


-def draw_box(im, result, lables, threshold=0.5):
+def draw_box(im, result, labels, threshold=0.5):
    draw_thickness = min(im.size) // 320
    draw = ImageDraw.Draw(im)
-    color_list = get_color_map_list(len(lables))
-    clsid2color = {n.lower():color_list[i] for i,n in enumerate(lables)}
+    color_list = get_color_map_list(len(labels))
+    clsid2color = {n.lower():color_list[i] for i,n in enumerate(labels)}
    result = [r for r in result if r["score"] >= threshold]

    for dt in result:
--- a/docker/.env
+++ b/docker/.env
@ -93,13 +93,13 @@ REDIS_PASSWORD=infini_rag_flow
 SVR_HTTP_PORT=9380

 # The RAGFlow Docker image to download.
-# Defaults to the v0.20.1-slim edition, which is the RAGFlow Docker image without embedding models.
-RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2-slim
+# Defaults to the v0.20.4-slim edition, which is the RAGFlow Docker image without embedding models.
+RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4-slim
 #
 # To download the RAGFlow Docker image with embedding models, uncomment the following line instead:
-# RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.1
+# RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4
 #
-# The Docker image of the v0.20.1 edition includes built-in embedding models:
+# The Docker image of the v0.20.4 edition includes built-in embedding models:
 #   - BAAI/bge-large-zh-v1.5
 #   - maidalun1020/bce-embedding-base_v1
 #
--- a/docker/README.md
+++ b/docker/README.md
@ -79,8 +79,8 @@ The [.env](./.env) file contains important environment variables for Docker.
 - `RAGFLOW_IMAGE`  
  The Docker image edition. Available editions:  
  
-  - `infiniflow/ragflow:v0.20.2-slim` (default): The RAGFlow Docker image without embedding models.  
-  - `infiniflow/ragflow:v0.20.2`: The RAGFlow Docker image with embedding models including:
+  - `infiniflow/ragflow:v0.20.4-slim` (default): The RAGFlow Docker image without embedding models.  
+  - `infiniflow/ragflow:v0.20.4`: The RAGFlow Docker image with embedding models including:
    - Built-in embedding models:
      - `BAAI/bge-large-zh-v1.5` 
      - `maidalun1020/bce-embedding-base_v1`
--- a/docs/configurations.md
+++ b/docs/configurations.md
@ -99,8 +99,8 @@ RAGFlow utilizes MinIO as its object storage solution, leveraging its scalabilit
 - `RAGFLOW_IMAGE`  
  The Docker image edition. Available editions:  
  
-  - `infiniflow/ragflow:v0.20.2-slim` (default): The RAGFlow Docker image without embedding models.  
-  - `infiniflow/ragflow:v0.20.2`: The RAGFlow Docker image with embedding models including:
+  - `infiniflow/ragflow:v0.20.4-slim` (default): The RAGFlow Docker image without embedding models.  
+  - `infiniflow/ragflow:v0.20.4`: The RAGFlow Docker image with embedding models including:
    - Built-in embedding models:
      - `BAAI/bge-large-zh-v1.5` 
      - `maidalun1020/bce-embedding-base_v1`
--- a/docs/develop/acquire_ragflow_api_key.md
+++ b/docs/develop/acquire_ragflow_api_key.md
@ -11,7 +11,7 @@ An API key is required for the RAGFlow server to authenticate your HTTP/Python o
 2. Click **API** to switch to the **API** page.
 3. Obtain a RAGFlow API key:

-![ragflow_api_key](https://github.com/user-attachments/assets/f461ed61-04c6-4faf-b3d8-6b5fa56be4e7)
+![ragflow_api_key](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/ragflow_api_key.jpg)

 :::tip NOTE
 See the [RAGFlow HTTP API reference](../references/http_api_reference.md) or the [RAGFlow Python API reference](../references/python_api_reference.md) for a complete reference of RAGFlow's HTTP or Python APIs.
--- a/docs/develop/build_docker_image.mdx
+++ b/docs/develop/build_docker_image.mdx
@ -77,7 +77,7 @@ After building the infiniflow/ragflow:nightly-slim image, you are ready to launc

 1. Edit Docker Compose Configuration

-Open the `docker/.env` file. Find the `RAGFLOW_IMAGE` setting and change the image reference from `infiniflow/ragflow:v0.20.2-slim` to `infiniflow/ragflow:nightly-slim` to use the pre-built image.
+Open the `docker/.env` file. Find the `RAGFLOW_IMAGE` setting and change the image reference from `infiniflow/ragflow:v0.20.4-slim` to `infiniflow/ragflow:nightly-slim` to use the pre-built image.


 2. Launch the Service
--- a/docs/faq.mdx
+++ b/docs/faq.mdx
@ -30,17 +30,17 @@ The "garbage in garbage out" status quo remains unchanged despite the fact that

 Each RAGFlow release is available in two editions:

- **Slim edition**: excludes built-in embedding models and is identified by a **-slim** suffix added to the version name. Example: `infiniflow/ragflow:v0.20.2-slim`
- **Full edition**: includes built-in embedding models and has no suffix added to the version name. Example: `infiniflow/ragflow:v0.20.2`
+- **Slim edition**: excludes built-in embedding models and is identified by a **-slim** suffix added to the version name. Example: `infiniflow/ragflow:v0.20.4-slim`
+- **Full edition**: includes built-in embedding models and has no suffix added to the version name. Example: `infiniflow/ragflow:v0.20.4`

 ---

 ### Which embedding models can be deployed locally?

-RAGFlow offers two Docker image editions, `v0.20.2-slim` and `v0.20.2`:  
+RAGFlow offers two Docker image editions, `v0.20.4-slim` and `v0.20.4`:  
  
- `infiniflow/ragflow:v0.20.2-slim` (default): The RAGFlow Docker image without embedding models.  
- `infiniflow/ragflow:v0.20.2`: The RAGFlow Docker image with embedding models including:
+- `infiniflow/ragflow:v0.20.4-slim` (default): The RAGFlow Docker image without embedding models.  
+- `infiniflow/ragflow:v0.20.4`: The RAGFlow Docker image with embedding models including:
  - Built-in embedding models:
    - `BAAI/bge-large-zh-v1.5`
    - `maidalun1020/bce-embedding-base_v1`
--- a/docs/guides/agent/agent_component_reference/agent.mdx
+++ b/docs/guides/agent/agent_component_reference/agent.mdx
@ -9,7 +9,7 @@ The component equipped with reasoning, tool usage, and multi-agent collaboration

 ---

-An **Agent** component fine-tunes the LLM and sets its prompt. From v0.20.2 onwards, an **Agent** component is able to work independently and with the following capabilities:
+An **Agent** component fine-tunes the LLM and sets its prompt. From v0.20.4 onwards, an **Agent** component is able to work independently and with the following capabilities:

 - Autonomous reasoning with reflection and adjustment based on environmental feedback.
 - Use of tools or subagents to complete tasks.
--- a/docs/guides/agent/agent_component_reference/retrieval.mdx
+++ b/docs/guides/agent/agent_component_reference/retrieval.mdx
@ -9,7 +9,7 @@ A component that retrieves information from specified datasets.

 ## Scenarios

-A **Retrieval** component is essential in most RAG scenarios, where information is extracted from designated knowledge bases before being sent to the LLM for content generation. As of v0.20.2, a **Retrieval** component can operate either as a workflow component or as a tool of an **Agent**, enabling the Agent to control its invocation and search queries.
+A **Retrieval** component is essential in most RAG scenarios, where information is extracted from designated knowledge bases before being sent to the LLM for content generation. As of v0.20.4, a **Retrieval** component can operate either as a workflow component or as a tool of an **Agent**, enabling the Agent to control its invocation and search queries.

 ## Configurations

--- a/docs/guides/chat/start_chat.md
+++ b/docs/guides/chat/start_chat.md
@ -48,7 +48,7 @@ You start an AI conversation by creating an assistant.
     - If no target language is selected, the system will search only in the language of your query, which may cause relevant information in other languages to be missed.
   - **Variable** refers to the variables (keys) to be used in the system prompt. `{knowledge}` is a reserved variable. Click **Add** to add more variables for the system prompt.
      - If you are uncertain about the logic behind **Variable**, leave it *as-is*.
-      - As of v0.20.2, if you add custom variables here, the only way you can pass in their values is to call:
+      - As of v0.20.4, if you add custom variables here, the only way you can pass in their values is to call:
         - HTTP method [Converse with chat assistant](../../references/http_api_reference.md#converse-with-chat-assistant), or
         - Python method [Converse with chat assistant](../../references/python_api_reference.md#converse-with-chat-assistant).

--- a/docs/guides/dataset/configure_knowledge_base.md
+++ b/docs/guides/dataset/configure_knowledge_base.md
@ -128,7 +128,7 @@ See [Run retrieval test](./run_retrieval_test.md) for details.

 ## Search for knowledge base

-As of RAGFlow v0.20.2, the search feature is still in a rudimentary form, supporting only knowledge base search by name.
+As of RAGFlow v0.20.4, the search feature is still in a rudimentary form, supporting only knowledge base search by name.

 ![search knowledge base](https://github.com/infiniflow/ragflow/assets/93570324/836ae94c-2438-42be-879e-c7ad2a59693e)

--- a/docs/guides/manage_files.md
+++ b/docs/guides/manage_files.md
@ -87,4 +87,4 @@ RAGFlow's file management allows you to download an uploaded file:

 ![download_file](https://github.com/infiniflow/ragflow/assets/93570324/cf3b297f-7d9b-4522-bf5f-4f45743e4ed5)

-> As of RAGFlow v0.20.2, bulk download is not supported, nor can you download an entire folder. 
+> As of RAGFlow v0.20.4, bulk download is not supported, nor can you download an entire folder. 
--- a/docs/guides/tracing.mdx
+++ b/docs/guides/tracing.mdx
@ -18,7 +18,7 @@ RAGFlow ships with a built-in [Langfuse](https://langfuse.com) integration so th
 Langfuse stores traces, spans and prompt payloads in a purpose-built observability backend and offers filtering and visualisations on top.  

 :::info NOTE
-• RAGFlow **≥ 0.20.2** (contains the Langfuse connector)  
+• RAGFlow **≥ 0.20.4** (contains the Langfuse connector)  
 • A Langfuse workspace (cloud or self-hosted) with a _Project Public Key_ and _Secret Key_
 :::

--- a/docs/guides/upgrade_ragflow.mdx
+++ b/docs/guides/upgrade_ragflow.mdx
@ -66,10 +66,10 @@ To upgrade RAGFlow, you must upgrade **both** your code **and** your Docker imag
   git clone https://github.com/infiniflow/ragflow.git
   ```

-2. Switch to the latest, officially published release, e.g., `v0.20.2`:
+2. Switch to the latest, officially published release, e.g., `v0.20.4`:

   ```bash
-   git checkout -f v0.20.2
+   git checkout -f v0.20.4
   ```

 3. Update **ragflow/docker/.env**:
@ -83,14 +83,14 @@ To upgrade RAGFlow, you must upgrade **both** your code **and** your Docker imag
  <TabItem value="slim">

 ```bash
-RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2-slim
+RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4-slim
 ```

  </TabItem>
  <TabItem value="full">

 ```bash
-RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2
+RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4
 ```

  </TabItem>
@ -114,10 +114,10 @@ No, you do not need to. Upgrading RAGFlow in itself will *not* remove your uploa
 1. From an environment with Internet access, pull the required Docker image.
 2. Save the Docker image to a **.tar** file.
   ```bash
-   docker save -o ragflow.v0.20.2.tar infiniflow/ragflow:v0.20.2
+   docker save -o ragflow.v0.20.4.tar infiniflow/ragflow:v0.20.4
   ```
 3. Copy the **.tar** file to the target server.
 4. Load the **.tar** file into Docker:
   ```bash
-   docker load -i ragflow.v0.20.2.tar
+   docker load -i ragflow.v0.20.4.tar
   ```
--- a/docs/quickstart.mdx
+++ b/docs/quickstart.mdx
@ -44,7 +44,7 @@ This section provides instructions on setting up the RAGFlow server on Linux. If

   `vm.max_map_count`. This value sets the maximum number of memory map areas a process may have. Its default value is 65530. While most applications require fewer than a thousand maps, reducing this value can result in abnormal behavior, and the system will throw out-of-memory errors when a process reaches the limit.

-   RAGFlow v0.20.2 uses Elasticsearch or [Infinity](https://github.com/infiniflow/infinity) for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.
+   RAGFlow v0.20.4 uses Elasticsearch or [Infinity](https://github.com/infiniflow/infinity) for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.

 <Tabs
  defaultValue="linux"
@ -184,13 +184,13 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
   ```bash
   $ git clone https://github.com/infiniflow/ragflow.git
   $ cd ragflow/docker
-   $ git checkout -f v0.20.2
+   $ git checkout -f v0.20.4
   ```

 3. Use the pre-built Docker images and start up the server:

   :::tip NOTE
-   The command below downloads the `v0.20.2-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.20.2-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.2` for the full edition `v0.20.2`.
+   The command below downloads the `v0.20.4-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.20.4-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.4` for the full edition `v0.20.4`.
   :::

   ```bash
@ -207,8 +207,8 @@ This section provides instructions on setting up the RAGFlow server on Linux. If

 | RAGFlow image tag   | Image size (GB) | Has embedding models and Python packages? | Stable?                  |
 | ------------------- | --------------- | ----------------------------------------- | ------------------------ |
-| `v0.20.2`           | &approx;9       | :heavy_check_mark:                        | Stable release           |
-| `v0.20.2-slim`      | &approx;2       | ❌                                        | Stable release           |
+| `v0.20.4`           | &approx;9       | :heavy_check_mark:                        | Stable release           |
+| `v0.20.4-slim`      | &approx;2       | ❌                                        | Stable release           |
 | `nightly`           | &approx;9       | :heavy_check_mark:                        | *Unstable* nightly build |
 | `nightly-slim`      | &approx;2       | ❌                                        | *Unstable* nightly build |

@ -217,7 +217,7 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
 ```

 :::danger IMPORTANT
-The embedding models included in `v0.20.2` and `nightly` are:
+The embedding models included in `v0.20.4` and `nightly` are:

 - BAAI/bge-large-zh-v1.5
 - maidalun1020/bce-embedding-base_v1
--- a/docs/references/glossary.mdx
+++ b/docs/references/glossary.mdx
@ -19,7 +19,7 @@ import TOCInline from '@theme/TOCInline';

 ### Cross-language search

-Cross-language search (also known as cross-lingual retrieval) is a feature introduced in version 0.20.2. It enables users to submit queries in one language (for example, English) and retrieve relevant documents written in other languages such as Chinese or Spanish. This feature is enabled by the system’s default chat model, which translates queries to ensure accurate matching of semantic meaning across languages.
+Cross-language search (also known as cross-lingual retrieval) is a feature introduced in version 0.20.4. It enables users to submit queries in one language (for example, English) and retrieve relevant documents written in other languages such as Chinese or Spanish. This feature is enabled by the system’s default chat model, which translates queries to ensure accurate matching of semantic meaning across languages.

 By enabling cross-language search, users can effortlessly access a broader range of information regardless of language barriers, significantly enhancing the system’s usability and inclusiveness.

--- a/docs/references/http_api_reference.md
+++ b/docs/references/http_api_reference.md
@ -5,7 +5,7 @@ slug: /http_api_reference

 # HTTP API

-A complete reference for RAGFlow's RESTful API. Before proceeding, please ensure you [have your RAGFlow API key ready for authentication](../develop/acquire_ragflow_api_key.md).
+A complete reference for RAGFlow's RESTful API. Before proceeding, please ensure you [have your RAGFlow API key ready for authentication](https://ragflow.io/docs/dev/acquire_ragflow_api_key).

 ---

@ -143,7 +143,6 @@ Non-stream:
 }
 ```

-
 Failure:

 ```json
@ -200,19 +199,24 @@ curl --request POST \
 - `stream` (*Body parameter*) `boolean`  
  Whether to receive the response as a stream. Set this to `false` explicitly if you prefer to receive the entire response in one go instead of as a stream.

+- `session_id` (*Body parameter*) `string`  
+  The ID of the agent session.
+
 #### Response

 Stream:

 ```json
+...
+
 data: {
-    "id": "5fa65c94-e316-4954-800a-06dfd5827052",
+    "id": "c39f6f9c83d911f0858253708ecb6573",
    "object": "chat.completion.chunk",
-    "model": "99ee29d6783511f09c921a6272e682d8",
+    "model": "d1f79142831f11f09cc51795b9eb07c0",
    "choices": [
        {
            "delta": {
-                "content": "Hello"
+                "content": " terminal"
            },
            "finish_reason": null,
            "index": 0
@ -220,21 +224,83 @@ data: {
    ]
 }

-data: {"id": "518022d9-545b-4100-89ed-ecd9e46fa753", "object": "chat.completion.chunk", "model": "99ee29d6783511f09c921a6272e682d8", "choices": [{"delta": {"content": "!"}, "finish_reason": null, "index": 0}]}
+data: {
+    "id": "c39f6f9c83d911f0858253708ecb6573",
+    "object": "chat.completion.chunk",
+    "model": "d1f79142831f11f09cc51795b9eb07c0",
+    "choices": [
+        {
+            "delta": {
+                "content": "."
+            },
+            "finish_reason": null,
+            "index": 0
+        }
+    ]
+}

-data: {"id": "f37c4af0-8187-4c86-8186-048c3c6ffe4e", "object": "chat.completion.chunk", "model": "99ee29d6783511f09c921a6272e682d8", "choices": [{"delta": {"content": " How"}, "finish_reason": null, "index": 0}]}
-
-data: {"id": "3ebc0fcb-0f85-4024-b4a5-3b03234a16df", "object": "chat.completion.chunk", "model": "99ee29d6783511f09c921a6272e682d8", "choices": [{"delta": {"content": " can"}, "finish_reason": null, "index": 0}]}
-
-data: {"id": "efa1f3cf-7bc4-47a4-8e53-cd696f290587", "object": "chat.completion.chunk", "model": "99ee29d6783511f09c921a6272e682d8", "choices": [{"delta": {"content": " I"}, "finish_reason": null, "index": 0}]}
-
-data: {"id": "2eb6f741-50a3-4d3d-8418-88be27895611", "object": "chat.completion.chunk", "model": "99ee29d6783511f09c921a6272e682d8", "choices": [{"delta": {"content": " assist"}, "finish_reason": null, "index": 0}]}
-
-data: {"id": "f1227e4f-bf8b-462c-8632-8f5269492ce9", "object": "chat.completion.chunk", "model": "99ee29d6783511f09c921a6272e682d8", "choices": [{"delta": {"content": " you"}, "finish_reason": null, "index": 0}]}
-
-data: {"id": "35b669d0-b2be-4c0c-88d8-17ff98592b21", "object": "chat.completion.chunk", "model": "99ee29d6783511f09c921a6272e682d8", "choices": [{"delta": {"content": " today"}, "finish_reason": null, "index": 0}]}
-
-data: {"id": "f00d8a39-af60-4f32-924f-d64106a7fdf1", "object": "chat.completion.chunk", "model": "99ee29d6783511f09c921a6272e682d8", "choices": [{"delta": {"content": "?"}, "finish_reason": null, "index": 0}]}
+data: {
+    "id": "c39f6f9c83d911f0858253708ecb6573",
+    "object": "chat.completion.chunk",
+    "model": "d1f79142831f11f09cc51795b9eb07c0",
+    "choices": [
+        {
+            "delta": {
+                "content": "",
+                "reference": {
+                    "chunks": {
+                        "20": {
+                            "id": "4b8935ac0a22deb1",
+                            "content": "```cd /usr/ports/editors/neovim/ && make install```## Android[Termux](https://github.com/termux/termux-app) offers a Neovim package.",
+                            "document_id": "4bdd2ff65e1511f0907f09f583941b45",
+                            "document_name": "INSTALL22.md",
+                            "dataset_id": "456ce60c5e1511f0907f09f583941b45",
+                            "image_id": "",
+                            "positions": [
+                                [
+                                    12,
+                                    11,
+                                    11,
+                                    11,
+                                    11
+                                ]
+                            ],
+                            "url": null,
+                            "similarity": 0.5697155305154673,
+                            "vector_similarity": 0.7323851005515574,
+                            "term_similarity": 0.5000000005,
+                            "doc_type": ""
+                        }
+                    },
+                    "doc_aggs": {
+                        "INSTALL22.md": {
+                            "doc_name": "INSTALL22.md",
+                            "doc_id": "4bdd2ff65e1511f0907f09f583941b45",
+                            "count": 3
+                        },
+                        "INSTALL.md": {
+                            "doc_name": "INSTALL.md",
+                            "doc_id": "4bd7fdd85e1511f0907f09f583941b45",
+                            "count": 2
+                        },
+                        "INSTALL(1).md": {
+                            "doc_name": "INSTALL(1).md",
+                            "doc_id": "4bdfb42e5e1511f0907f09f583941b45",
+                            "count": 2
+                        },
+                        "INSTALL3.md": {
+                            "doc_name": "INSTALL3.md",
+                            "doc_id": "4bdab5825e1511f0907f09f583941b45",
+                            "count": 1
+                        }
+                    }
+                }
+            },
+            "finish_reason": null,
+            "index": 0
+        }
+    ]
+}

 data: [DONE]
 ```
@ -249,30 +315,77 @@ Non-stream:
            "index": 0,
            "logprobs": null,
            "message": {
-                "content": "Hello! How can I assist you today?",
+                "content": "\nTo install Neovim, the process varies depending on your operating system:\n\n### For Windows:\n1. **Download from GitHub**: \n   - Visit the [Neovim releases page](https://github.com/neovim/neovim/releases)\n   - Download the latest Windows installer (nvim-win64.msi)\n   - Run the installer and follow the prompts\n\n2. **Using winget** (Windows Package Manager):\n...",
+                "reference": {
+                    "chunks": {
+                        "20": {
+                            "content": "```cd /usr/ports/editors/neovim/ && make install```## Android[Termux](https://github.com/termux/termux-app) offers a Neovim package.",
+                            "dataset_id": "456ce60c5e1511f0907f09f583941b45",
+                            "doc_type": "",
+                            "document_id": "4bdd2ff65e1511f0907f09f583941b45",
+                            "document_name": "INSTALL22.md",
+                            "id": "4b8935ac0a22deb1",
+                            "image_id": "",
+                            "positions": [
+                                [
+                                    12,
+                                    11,
+                                    11,
+                                    11,
+                                    11
+                                ]
+                            ],
+                            "similarity": 0.5697155305154673,
+                            "term_similarity": 0.5000000005,
+                            "url": null,
+                            "vector_similarity": 0.7323851005515574
+                        }
+                    },
+                    "doc_aggs": {
+                        "INSTALL(1).md": {
+                            "count": 2,
+                            "doc_id": "4bdfb42e5e1511f0907f09f583941b45",
+                            "doc_name": "INSTALL(1).md"
+                        },
+                        "INSTALL.md": {
+                            "count": 2,
+                            "doc_id": "4bd7fdd85e1511f0907f09f583941b45",
+                            "doc_name": "INSTALL.md"
+                        },
+                        "INSTALL22.md": {
+                            "count": 3,
+                            "doc_id": "4bdd2ff65e1511f0907f09f583941b45",
+                            "doc_name": "INSTALL22.md"
+                        },
+                        "INSTALL3.md": {
+                            "count": 1,
+                            "doc_id": "4bdab5825e1511f0907f09f583941b45",
+                            "doc_name": "INSTALL3.md"
+                        }
+                    }
+                },
                "role": "assistant"
            }
        }
    ],
    "created": null,
-    "id": "17aa4ec5-6d36-40c6-9a96-1b069c216d59",
-    "model": "99ee29d6783511f09c921a6272e682d8",
+    "id": "c39f6f9c83d911f0858253708ecb6573",
+    "model": "d1f79142831f11f09cc51795b9eb07c0",
    "object": "chat.completion",
    "param": null,
    "usage": {
-        "completion_tokens": 9,
+        "completion_tokens": 415,
        "completion_tokens_details": {
            "accepted_prediction_tokens": 0,
            "reasoning_tokens": 0,
            "rejected_prediction_tokens": 0
        },
-        "prompt_tokens": 1,
-        "total_tokens": 10
+        "prompt_tokens": 6,
+        "total_tokens": 421
    }
 }
 ```

-
 Failure:

 ```json
@ -383,7 +496,7 @@ curl --request POST \
    - `"layout_recognize"`: `string`
      - Defaults to `DeepDOC`
    - `"tag_kb_ids"`: `array<string>` refer to [Use tag set](https://ragflow.io/docs/dev/use_tag_sets)
-      - Must include a list of dataset IDs, where each dataset is parsed using the Tag Chunk Method
+      - Must include a list of dataset IDs, where each dataset is parsed using the Tag Chunking Method
    - `"task_page_size"`: `int` For PDF only.
      - Defaults to `12`
      - Minimum: `1`
@ -604,7 +717,7 @@ curl --request PUT \
    - `"layout_recognize"`: `string`
      - Defaults to `DeepDOC`
    - `"tag_kb_ids"`: `array<string>` refer to [Use tag set](https://ragflow.io/docs/dev/use_tag_sets)
-      - Must include a list of dataset IDs, where each dataset is parsed using the Tag Chunk Method
+      - Must include a list of dataset IDs, where each dataset is parsed using the Tag Chunking Method
    - `"task_page_size"`: `int` For PDF only.
      - Defaults to `12`
      - Minimum: `1`
@ -729,9 +842,10 @@ Failure:
    "message": "The dataset doesn't exist"
 }
 ```
+
 ---

-## Get dataset's knowledge graph
+### Get knowledge graph

 **GET** `/api/v1/datasets/{dataset_id}/knowledge_graph`

@ -808,9 +922,10 @@ Failure:
    "message": "The dataset doesn't exist"
 }
 ```
+
 ---

-## Delete dataset's knowledge graph
+### Delete knowledge graph

 **DELETE** `/api/v1/datasets/{dataset_id}/knowledge_graph`

@ -855,6 +970,7 @@ Failure:
    "message": "The dataset doesn't exist"
 }
 ```
+
 ---

 ## FILE MANAGEMENT WITHIN DATASET
@ -3017,41 +3133,88 @@ success without `session_id` provided and with no variables specified in the **B
 Stream:

 ```json
-data:{
-    "event": "message",
-    "message_id": "eb0c0a5e783511f0b9b61a6272e682d8",
-    "created_at": 1755083342,
-    "task_id": "99ee29d6783511f09c921a6272e682d8",
-    "data": {
-        "content": "Hello"
-    },
-    "session_id": "eaf19a8e783511f0b9b61a6272e682d8"
-}
-
-data:{
-    "event": "message",
-    "message_id": "eb0c0a5e783511f0b9b61a6272e682d8",
-    "created_at": 1755083342,
-    "task_id": "99ee29d6783511f09c921a6272e682d8",
-    "data": {
-        "content": "!"
-    },
-    "session_id": "eaf19a8e783511f0b9b61a6272e682d8"
-}
-
-data:{
-    "event": "message",
-    "message_id": "eb0c0a5e783511f0b9b61a6272e682d8",
-    "created_at": 1755083342,
-    "task_id": "99ee29d6783511f09c921a6272e682d8",
-    "data": {
-        "content": " How"
-    },
-    "session_id": "eaf19a8e783511f0b9b61a6272e682d8"
-}
-
 ...

+data: {
+    "event": "message",
+    "message_id": "cecdcb0e83dc11f0858253708ecb6573",
+    "created_at": 1756364483,
+    "task_id": "d1f79142831f11f09cc51795b9eb07c0",
+    "data": {
+        "content": " themes"
+    },
+    "session_id": "cd097ca083dc11f0858253708ecb6573"
+}
+
+data: {
+    "event": "message",
+    "message_id": "cecdcb0e83dc11f0858253708ecb6573",
+    "created_at": 1756364483,
+    "task_id": "d1f79142831f11f09cc51795b9eb07c0",
+    "data": {
+        "content": "."
+    },
+    "session_id": "cd097ca083dc11f0858253708ecb6573"
+}
+
+data: {
+    "event": "message_end",
+    "message_id": "cecdcb0e83dc11f0858253708ecb6573",
+    "created_at": 1756364483,
+    "task_id": "d1f79142831f11f09cc51795b9eb07c0",
+    "data": {
+        "reference": {
+            "chunks": {
+                "20": {
+                    "id": "4b8935ac0a22deb1",
+                    "content": "```cd /usr/ports/editors/neovim/ && make install```## Android[Termux](https://github.com/termux/termux-app) offers a Neovim package.",
+                    "document_id": "4bdd2ff65e1511f0907f09f583941b45",
+                    "document_name": "INSTALL22.md",
+                    "dataset_id": "456ce60c5e1511f0907f09f583941b45",
+                    "image_id": "",
+                    "positions": [
+                        [
+                            12,
+                            11,
+                            11,
+                            11,
+                            11
+                        ]
+                    ],
+                    "url": null,
+                    "similarity": 0.5705525104787287,
+                    "vector_similarity": 0.7351750337624289,
+                    "term_similarity": 0.5000000005,
+                    "doc_type": ""
+                }
+            },
+            "doc_aggs": {
+                "INSTALL22.md": {
+                    "doc_name": "INSTALL22.md",
+                    "doc_id": "4bdd2ff65e1511f0907f09f583941b45",
+                    "count": 3
+                },
+                "INSTALL.md": {
+                    "doc_name": "INSTALL.md",
+                    "doc_id": "4bd7fdd85e1511f0907f09f583941b45",
+                    "count": 2
+                },
+                "INSTALL(1).md": {
+                    "doc_name": "INSTALL(1).md",
+                    "doc_id": "4bdfb42e5e1511f0907f09f583941b45",
+                    "count": 2
+                },
+                "INSTALL3.md": {
+                    "doc_name": "INSTALL3.md",
+                    "doc_id": "4bdab5825e1511f0907f09f583941b45",
+                    "count": 1
+                }
+            }
+        }
+    },
+    "session_id": "cd097ca083dc11f0858253708ecb6573"
+}
+
 data:[DONE]
 ```
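The streamed agent response above can be consumed with a small client-side loop. The following is a minimal sketch, not RAGFlow SDK code: `collect_agent_stream` is a hypothetical helper, and it assumes that each event arrives on the wire as a single `data:` line (the pretty-printed JSON above being a documentation convenience). It joins the `content` fragments of `message` events and keeps the `reference` object carried by the `message_end` event.

```python
import json
from typing import Iterable


def collect_agent_stream(lines: Iterable[str]) -> tuple[str, dict]:
    """Join streamed answer fragments and return them with the final reference payload.

    Hypothetical helper: assumes one JSON event per `data:` line.
    """
    parts: list[str] = []
    reference: dict = {}
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data field
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker, with or without a space after the colon
        event = json.loads(payload)
        data = event.get("data") or {}
        if event.get("event") == "message":
            parts.append(data.get("content", ""))
        elif event.get("event") == "message_end":
            reference = data.get("reference") or {}
    return "".join(parts), reference


# Trimmed-down events modelled on the example above.
sample = [
    'data: {"event": "message", "data": {"content": " themes"}}',
    "",
    'data: {"event": "message", "data": {"content": "."}}',
    'data: {"event": "message_end", "data": {"reference": {"chunks": {}, "doc_aggs": {}}}}',
    "data:[DONE]",
]
text, reference = collect_agent_stream(sample)
```

In a real client the `lines` argument would come from whatever HTTP library is used to read the response incrementally; that part is left out here on purpose.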

@ -3061,21 +3224,77 @@ Non-stream:
 {
    "code": 0,
    "data": {
-        "created_at": 1755083440,
+        "created_at": 1756363177,
        "data": {
-            "created_at": 547061.147866385,
-            "elapsed_time": 2.595433341921307,
-            "inputs": {},
+            "content": "\nTo install Neovim, the process varies depending on your operating system:\n\n### For macOS:\nUsing Homebrew:\n```bash\nbrew install neovim\n```\n\n### For Linux (Debian/Ubuntu):\n```bash\nsudo apt update\nsudo apt install neovim\n```\n\nFor other Linux distributions, you can use their respective package managers or build from source.\n\n### For Windows:\n1. Download the latest Windows installer from the official Neovim GitHub releases page\n2. Run the installer and follow the prompts\n3. Add Neovim to your PATH if not done automatically\n\n### From source (Unix-like systems):\n```bash\ngit clone https://github.com/neovim/neovim.git\ncd neovim\nmake CMAKE_BUILD_TYPE=Release\nsudo make install\n```\n\nAfter installation, you can verify it by running `nvim --version` in your terminal.",
+            "created_at": 18129.044975627,
+            "elapsed_time": 10.0157331670016,
+            "inputs": {
+                "var1": {
+                    "value": "I am var1"
+                },
+                "var2": {
+                    "value": "I am var2"
+                }
+            },
            "outputs": {
-                "_created_time": 547061.149137775,
-                "_elapsed_time": 8.720310870558023e-05,
-                "content": "Hello! How can I assist you today?"
+                "_created_time": 18129.502422278,
+                "_elapsed_time": 0.00013378599760471843,
+                "content": "\nTo install Neovim, the process varies depending on your operating system:\n\n### For macOS:\nUsing Homebrew:\n```bash\nbrew install neovim\n```\n\n### For Linux (Debian/Ubuntu):\n```bash\nsudo apt update\nsudo apt install neovim\n```\n\nFor other Linux distributions, you can use their respective package managers or build from source.\n\n### For Windows:\n1. Download the latest Windows installer from the official Neovim GitHub releases page\n2. Run the installer and follow the prompts\n3. Add Neovim to your PATH if not done automatically\n\n### From source (Unix-like systems):\n```bash\ngit clone https://github.com/neovim/neovim.git\ncd neovim\nmake CMAKE_BUILD_TYPE=Release\nsudo make install\n```\n\nAfter installation, you can verify it by running `nvim --version` in your terminal."
+            },
+            "reference": {
+                "chunks": {
+                    "20": {
+                        "content": "```cd /usr/ports/editors/neovim/ && make install```## Android[Termux](https://github.com/termux/termux-app) offers a Neovim package.",
+                        "dataset_id": "456ce60c5e1511f0907f09f583941b45",
+                        "doc_type": "",
+                        "document_id": "4bdd2ff65e1511f0907f09f583941b45",
+                        "document_name": "INSTALL22.md",
+                        "id": "4b8935ac0a22deb1",
+                        "image_id": "",
+                        "positions": [
+                            [
+                                12,
+                                11,
+                                11,
+                                11,
+                                11
+                            ]
+                        ],
+                        "similarity": 0.5705525104787287,
+                        "term_similarity": 0.5000000005,
+                        "url": null,
+                        "vector_similarity": 0.7351750337624289
+                    }
+                },
+                "doc_aggs": {
+                    "INSTALL(1).md": {
+                        "count": 2,
+                        "doc_id": "4bdfb42e5e1511f0907f09f583941b45",
+                        "doc_name": "INSTALL(1).md"
+                    },
+                    "INSTALL.md": {
+                        "count": 2,
+                        "doc_id": "4bd7fdd85e1511f0907f09f583941b45",
+                        "doc_name": "INSTALL.md"
+                    },
+                    "INSTALL22.md": {
+                        "count": 3,
+                        "doc_id": "4bdd2ff65e1511f0907f09f583941b45",
+                        "doc_name": "INSTALL22.md"
+                    },
+                    "INSTALL3.md": {
+                        "count": 1,
+                        "doc_id": "4bdab5825e1511f0907f09f583941b45",
+                        "doc_name": "INSTALL3.md"
+                    }
+                }
            }
        },
        "event": "workflow_finished",
-        "message_id": "25807f94783611f095171a6272e682d8",
-        "session_id": "25663198783611f095171a6272e682d8",
-        "task_id": "99ee29d6783511f09c921a6272e682d8"
+        "message_id": "c4692a2683d911f0858253708ecb6573",
+        "session_id": "c39f6f9c83d911f0858253708ecb6573",
+        "task_id": "d1f79142831f11f09cc51795b9eb07c0"
    }
 }
 ```
@ -3501,7 +3720,7 @@ Failure:

 ### Generate related questions

-**POST** `/v1/sessions/related_questions`
+**POST** `/api/v1/sessions/related_questions`

 Generates five to ten alternative question strings from the user's original query to retrieve more relevant search results.

@ -3516,7 +3735,7 @@ The chat model autonomously determines the number of questions to generate based
 #### Request

 - Method: POST
- URL: `/v1/sessions/related_questions`
+- URL: `/api/v1/sessions/related_questions`
 - Headers:
  - `'content-Type: application/json'`
  - `'Authorization: Bearer <YOUR_LOGIN_TOKEN>'`
@ -3528,7 +3747,7 @@ The chat model autonomously determines the number of questions to generate based

 ```bash
 curl --request POST \
-     --url http://{address}/v1/sessions/related_questions \
+     --url http://{address}/api/v1/sessions/related_questions \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer <YOUR_LOGIN_TOKEN>' \
     --data '
--- a/docs/references/python_api_reference.md
+++ b/docs/references/python_api_reference.md
@ -5,7 +5,7 @@ slug: /python_api_reference

 # Python API

-A complete reference for RAGFlow's Python APIs. Before proceeding, please ensure you [have your RAGFlow API key ready for authentication](../develop/acquire_ragflow_api_key.md).
+A complete reference for RAGFlow's Python APIs. Before proceeding, please ensure you [have your RAGFlow API key ready for authentication](https://ragflow.io/docs/dev/acquire_ragflow_api_key).

 :::tip NOTE
 Run the following command to download the Python SDK:
--- a/docs/release_notes.md
+++ b/docs/release_notes.md
@ -9,8 +9,8 @@ Key features, improvements and bug fixes in the latest releases.

 :::info
 Each RAGFlow release is available in two editions:
- **Slim edition**: excludes built-in embedding models and is identified by a **-slim** suffix added to the version name. Example: `infiniflow/ragflow:v0.20.1-slim`
- **Full edition**: includes built-in embedding models and has no suffix added to the version name. Example: `infiniflow/ragflow:v0.20.1`
+- **Slim edition**: excludes built-in embedding models and is identified by a **-slim** suffix added to the version name. Example: `infiniflow/ragflow:v0.20.4-slim`
+- **Full edition**: includes built-in embedding models and has no suffix added to the version name. Example: `infiniflow/ragflow:v0.20.4`
 :::

 :::danger IMPORTANT
@ -22,9 +22,41 @@ The embedding models included in a full edition are:
 These two embedding models are optimized specifically for English and Chinese, so performance may be compromised if you use them to embed documents in other languages.
 :::

-## v0.20.2
+## v0.20.4

-Released on August 19, 2025.
+Released on August 27, 2025.
+
+### Improvements
+
+- Agent component: Completes Chinese localization for the Agent component.
+- Introduces the `ENABLE_TIMEOUT_ASSERTION` environment variable to enable or disable timeout assertions for file parsing tasks.
+- Dataset:
+  - Improves Markdown file parsing, with AST support to avoid unintended chunking.
+  - Enhances HTML parsing, supporting bs4-based HTML tag traversal.
+
+### Added models
+
+ZHIPU GLM-4.5
+
+### New Agent templates
+
+Ecommerce Customer Service Workflow: A template designed to handle enquiries about product features and multi-product comparisons using the internal knowledge base, as well as to manage installation appointment bookings.
+
+### Fixed issues
+
+- Dataset:  
+  - Unable to share resources with the team.
+  - Inappropriate restrictions on the number and size of uploaded files.
+- Chat:
+  - Unable to preview referenced files in responses.
+  - Unable to send out messages after file uploads.
+- An OAuth2 authentication failure.
+- A logical error in multi-conditioned metadata searches within a dataset.
+- Citations infinitely increased in multi-turn conversations.
+
+## v0.20.3
+
+Released on August 20, 2025.

 ### Improvements

--- a/graphrag/entity_resolution.py
+++ b/graphrag/entity_resolution.py
@ -15,6 +15,7 @@
 #
 import logging
 import itertools
+import os
 import re
 from dataclasses import dataclass
 from typing import Any, Callable
@ -106,7 +107,8 @@ class EntityResolution(Extractor):
            nonlocal remain_candidates_to_resolve, callback
            async with semaphore:
                try:
-                    with trio.move_on_after(280) as cancel_scope:
+                    enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
+                    with trio.move_on_after(280 if enable_timeout_assertion else 1000000000) as cancel_scope:
                        await self._resolve_candidate(candidate_batch, result_set, result_lock)
                        remain_candidates_to_resolve = remain_candidates_to_resolve - len(candidate_batch[1])
                        callback(msg=f"Resolved {len(candidate_batch[1])} pairs, {remain_candidates_to_resolve} are remained to resolve. ")
@ -169,7 +171,8 @@ class EntityResolution(Extractor):
        logging.info(f"Created resolution prompt {len(text)} bytes for {len(candidate_resolution_i[1])} entity pairs of type {candidate_resolution_i[0]}")
        async with chat_limiter:
            try:
-                with trio.move_on_after(240) as cancel_scope:
+                enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
+                with trio.move_on_after(280 if enable_timeout_assertion else 1000000000) as cancel_scope:
                    response = await trio.to_thread.run_sync(self._chat, text, [{"role": "user", "content": "Output:"}], {})
                if cancel_scope.cancelled_caught:
                    logging.warning("_resolve_candidate._chat timeout, skipping...")
--- a/graphrag/general/community_reports_extractor.py
+++ b/graphrag/general/community_reports_extractor.py
@ -7,6 +7,7 @@ Reference:

 import logging
 import json
+import os
 import re
 from typing import Callable
 from dataclasses import dataclass
@ -51,6 +52,7 @@ class CommunityReportsExtractor(Extractor):
        self._max_report_length = max_report_length or 1500

    async def __call__(self, graph: nx.Graph, callback: Callable | None = None):
+        enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
        for node_degree in graph.degree:
            graph.nodes[str(node_degree[0])]["rank"] = int(node_degree[1])

@ -92,7 +94,7 @@ class CommunityReportsExtractor(Extractor):
            text = perform_variable_replacements(self._extraction_prompt, variables=prompt_variables)
            async with chat_limiter:
                try:
-                    with trio.move_on_after(180) as cancel_scope:
+                    with trio.move_on_after(180 if enable_timeout_assertion else 1000000000) as cancel_scope:
                        response = await trio.to_thread.run_sync( self._chat, text, [{"role": "user", "content": "Output:"}], {})
                    if cancel_scope.cancelled_caught:
                        logging.warning("extract_community_report._chat timeout, skipping...")
--- a/graphrag/general/extractor.py
+++ b/graphrag/general/extractor.py
@ -47,7 +47,7 @@ class Extractor:
        self._language = language
        self._entity_types = entity_types or DEFAULT_ENTITY_TYPES

-    @timeout(60*5)
+    @timeout(60*20)
    def _chat(self, system, history, gen_conf={}):
        hist = deepcopy(history)
        conf = deepcopy(gen_conf)
--- a/graphrag/general/index.py
+++ b/graphrag/general/index.py
@ -15,6 +15,8 @@
 #
 import json
 import logging
+import os
+
 import networkx as nx
 import trio

@ -49,6 +51,7 @@ async def run_graphrag(
    embedding_model,
    callback,
 ):
+    enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
    start = trio.current_time()
    tenant_id, kb_id, doc_id = row["tenant_id"], str(row["kb_id"]), row["doc_id"]
    chunks = []
@ -57,7 +60,7 @@ async def run_graphrag(
    ):
        chunks.append(d["content_with_weight"])

-    with trio.fail_after(max(120, len(chunks)*120)):
+    with trio.fail_after(max(120, len(chunks)*60*10) if enable_timeout_assertion else 10000000000):
        subgraph = await generate_subgraph(
            LightKGExt
            if "method" not in row["kb_parser_config"].get("graphrag", {}) or row["kb_parser_config"]["graphrag"]["method"] != "general"
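To make the effect of the new `trio.fail_after` expression in `run_graphrag` concrete, here is a small sketch that evaluates the old and new deadlines side by side. The function names are hypothetical and exist only for illustration; the arithmetic is copied from the two versions of the line in this hunk.

```python
def old_graphrag_deadline(num_chunks: int) -> int:
    """Deadline in seconds before this change: 120 s per chunk, with a 120 s floor."""
    return max(120, num_chunks * 120)


def new_graphrag_deadline(num_chunks: int, enforce: bool) -> int:
    """Deadline in seconds after this change.

    `enforce` stands in for a truthy ENABLE_TIMEOUT_ASSERTION value.
    """
    if not enforce:
        return 10_000_000_000  # sentinel from the diff, effectively "no deadline"
    return max(120, num_chunks * 60 * 10)  # 10 minutes per chunk, with a 120 s floor


for n in (0, 1, 5):
    print(n, old_graphrag_deadline(n), new_graphrag_deadline(n, enforce=True))
```

With the variable set, the per-chunk budget grows from 2 minutes to 10 minutes; with it unset, the deadline is large enough that it will not fire in practice.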
--- a/graphrag/light/graph_prompt.py
+++ b/graphrag/light/graph_prompt.py
@ -130,7 +130,36 @@ Output:

 PROMPTS[
    "entiti_continue_extraction"
-] = """MANY entities were missed in the last extraction.  Add them below using the same format:
+] = """
+MANY entities and relationships were missed in the last extraction. Please find only the missing entities and relationships from the previous text.
+
+---Remember Steps---
+
+1. Identify all entities. For each identified entity, extract the following information:
+- entity_name: Name of the entity, use same language as input text. If English, capitalize the name
+- entity_type: One of the following types: [{entity_types}]
+- entity_description: Provide a comprehensive description of the entity's attributes and activities *based solely on the information present in the input text*. **Do not infer or hallucinate information not explicitly stated.** If the text provides insufficient information to create a comprehensive description, state "Description not available in text."
+Format each entity as ("entity"{tuple_delimiter}<entity_name>{tuple_delimiter}<entity_type>{tuple_delimiter}<entity_description>)
+
+2. From the entities identified in step 1, identify all pairs of (source_entity, target_entity) that are *clearly related* to each other.
+For each pair of related entities, extract the following information:
+- source_entity: name of the source entity, as identified in step 1
+- target_entity: name of the target entity, as identified in step 1
+- relationship_description: explanation as to why you think the source entity and the target entity are related to each other
+- relationship_strength: a numeric score indicating strength of the relationship between the source entity and target entity
+- relationship_keywords: one or more high-level key words that summarize the overarching nature of the relationship, focusing on concepts or themes rather than specific details
+Format each relationship as ("relationship"{tuple_delimiter}<source_entity>{tuple_delimiter}<target_entity>{tuple_delimiter}<relationship_description>{tuple_delimiter}<relationship_keywords>{tuple_delimiter}<relationship_strength>)
+
+3. Identify high-level key words that summarize the main concepts, themes, or topics of the entire text. These should capture the overarching ideas present in the document.
+Format the content-level key words as ("content_keywords"{tuple_delimiter}<high_level_keywords>)
+
+4. Return output in {language} as a single list of all the entities and relationships identified in steps 1 and 2. Use **{record_delimiter}** as the list delimiter.
+
+5. When finished, output {completion_delimiter}
+
+---Output---
+
+Add new entities and relations below using the same format, and do not include entities and relations that have been previously extracted:
 """

 PROMPTS[
@ -252,4 +281,4 @@ When handling information with timestamps:
 - List up to 5 most important reference sources at the end under "References", clearly indicating whether each source is from Knowledge Graph (KG) or Vector Data (VD)
  Format: [KG/VD] Source content

-Add sections and commentary to the response as appropriate for the length and format. If the provided information is insufficient to answer the question, clearly state that you don't know or cannot provide an answer in the same language as the user's question."""
+Add sections and commentary to the response as appropriate for the length and format. If the provided information is insufficient to answer the question, clearly state that you don't know or cannot provide an answer in the same language as the user's question."""
--- a/graphrag/utils.py
+++ b/graphrag/utils.py
@ -307,6 +307,7 @@ def chunk_id(chunk):

 async def graph_node_to_chunk(kb_id, embd_mdl, ent_name, meta, chunks):
    global chat_limiter
+    enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
    chunk = {
        "id": get_uuid(),
        "important_kwd": [ent_name],
@ -324,7 +325,7 @@ async def graph_node_to_chunk(kb_id, embd_mdl, ent_name, meta, chunks):
    ebd = get_embed_cache(embd_mdl.llm_name, ent_name)
    if ebd is None:
        async with chat_limiter:
-            with trio.fail_after(3):
+            with trio.fail_after(3 if enable_timeout_assertion else 30000000):
                ebd, _ = await trio.to_thread.run_sync(lambda: embd_mdl.encode([ent_name]))
        ebd = ebd[0]
        set_embed_cache(embd_mdl.llm_name, ent_name, ebd)
@ -362,6 +363,7 @@ def get_relation(tenant_id, kb_id, from_ent_name, to_ent_name, size=1):


 async def graph_edge_to_chunk(kb_id, embd_mdl, from_ent_name, to_ent_name, meta, chunks):
+    enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
    chunk = {
        "id": get_uuid(),
        "from_entity_kwd": from_ent_name,
@ -380,7 +382,7 @@ async def graph_edge_to_chunk(kb_id, embd_mdl, from_ent_name, to_ent_name, meta,
    ebd = get_embed_cache(embd_mdl.llm_name, txt)
    if ebd is None:
        async with chat_limiter:
-            with trio.fail_after(3):
+            with trio.fail_after(3 if enable_timeout_assertion else 30000000):
                ebd, _ = await trio.to_thread.run_sync(lambda: embd_mdl.encode([txt+f": {meta['description']}"]))
        ebd = ebd[0]
        set_embed_cache(embd_mdl.llm_name, txt, ebd)
@ -514,9 +516,10 @@ async def set_graph(tenant_id: str, kb_id: str, embd_mdl, graph: nx.Graph, chang
        callback(msg=f"set_graph converted graph change to {len(chunks)} chunks in {now - start:.2f}s.")
    start = now

+    enable_timeout_assertion = os.environ.get("ENABLE_TIMEOUT_ASSERTION")
    es_bulk_size = 4
    for b in range(0, len(chunks), es_bulk_size):
-        with trio.fail_after(3):
+        with trio.fail_after(3 if enable_timeout_assertion else 30000000):
            doc_store_result = await trio.to_thread.run_sync(lambda: settings.docStoreConn.insert(chunks[b:b + es_bulk_size], search.index_name(tenant_id), kb_id))
        if b % 100 == es_bulk_size and callback:
            callback(msg=f"Insert chunks: {b}/{len(chunks)}")
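Review note: the three hunks above replace the fixed `trio.fail_after(3)` with a timeout chosen from the `ENABLE_TIMEOUT_ASSERTION` environment variable. The sketch below pulls that choice into a hypothetical helper, `resolve_timeout`, which is not part of this diff. It shows one consequence of testing the raw value: `os.environ.get` returns a string, so any non-empty value, including `"0"`, selects the short timeout.

```python
import os


def resolve_timeout(short=3, long=30000000, env=None):
    """Hypothetical helper mirroring `3 if enable_timeout_assertion else 30000000`."""
    env = os.environ if env is None else env
    enabled = env.get("ENABLE_TIMEOUT_ASSERTION")  # a str, or None when unset
    # Truthiness of a string: only None and "" count as disabled.
    return short if enabled else long


print(resolve_timeout(env={}))                                 # variable unset
print(resolve_timeout(env={"ENABLE_TIMEOUT_ASSERTION": "1"}))  # enabled
print(resolve_timeout(env={"ENABLE_TIMEOUT_ASSERTION": "0"}))  # still enabled
```

If `"0"` or `"false"` should disable the assertion, the value would need explicit parsing; the diff does not do that.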
--- a/helm/values.yaml
+++ b/helm/values.yaml
@ -56,7 +56,7 @@ env:
 ragflow:
  image:
    repository: infiniflow/ragflow
-    tag: v0.20.2-slim
+    tag: v0.20.4-slim
    pullPolicy: IfNotPresent
    pullSecrets: []
  # Optional service configuration overrides
--- a/mcp/server/server.py
+++ b/mcp/server/server.py
@ -16,6 +16,9 @@

 import json
 import logging
+import random
+import time
+from collections import OrderedDict
 from collections.abc import AsyncIterator
 from contextlib import asynccontextmanager
 from functools import wraps
@ -53,6 +56,13 @@ JSON_RESPONSE = True


 class RAGFlowConnector:
+    _MAX_DATASET_CACHE = 32
+    _MAX_DOCUMENT_CACHE = 128
+    _CACHE_TTL = 300
+
+    _dataset_metadata_cache: OrderedDict[str, tuple[dict, float | int]] = OrderedDict()  # "dataset_id" -> (metadata, expiry_ts)
+    _document_metadata_cache: OrderedDict[str, tuple[list[tuple[str, dict]], float | int]] = OrderedDict()  # "dataset_id" -> ([(document_id, doc_metadata)], expiry_ts)
+
    def __init__(self, base_url: str, version="v1"):
        self.base_url = base_url
        self.version = version
@ -72,6 +82,43 @@ class RAGFlowConnector:
        res = requests.get(url=self.api_url + path, params=params, headers=self.authorization_header, json=json)
        return res

+    def _is_cache_valid(self, ts):
+        return time.time() < ts
+
+    def _get_expiry_timestamp(self):
+        offset = random.randint(-30, 30)
+        return time.time() + self._CACHE_TTL + offset
+
+    def _get_cached_dataset_metadata(self, dataset_id):
+        entry = self._dataset_metadata_cache.get(dataset_id)
+        if entry:
+            data, ts = entry
+            if self._is_cache_valid(ts):
+                self._dataset_metadata_cache.move_to_end(dataset_id)
+                return data
+        return None
+
+    def _set_cached_dataset_metadata(self, dataset_id, metadata):
+        self._dataset_metadata_cache[dataset_id] = (metadata, self._get_expiry_timestamp())
+        self._dataset_metadata_cache.move_to_end(dataset_id)
+        if len(self._dataset_metadata_cache) > self._MAX_DATASET_CACHE:
+            self._dataset_metadata_cache.popitem(last=False)
+
+    def _get_cached_document_metadata_by_dataset(self, dataset_id):
+        entry = self._document_metadata_cache.get(dataset_id)
+        if entry:
+            data_list, ts = entry
+            if self._is_cache_valid(ts):
+                self._document_metadata_cache.move_to_end(dataset_id)
+                return {doc_id: doc_meta for doc_id, doc_meta in data_list}
+        return None
+
+    def _set_cached_document_metadata_by_dataset(self, dataset_id, doc_id_meta_list):
+        self._document_metadata_cache[dataset_id] = (doc_id_meta_list, self._get_expiry_timestamp())
+        self._document_metadata_cache.move_to_end(dataset_id)
+        if len(self._document_metadata_cache) > self._MAX_DOCUMENT_CACHE:
+            self._document_metadata_cache.popitem(last=False)
+
    def list_datasets(self, page: int = 1, page_size: int = 1000, orderby: str = "create_time", desc: bool = True, id: str | None = None, name: str | None = None):
        res = self._get("/datasets", {"page": page, "page_size": page_size, "orderby": orderby, "desc": desc, "id": id, "name": name})
        if not res:
@ -87,10 +134,38 @@ class RAGFlowConnector:
        return ""

    def retrieval(
-        self, dataset_ids, document_ids=None, question="", page=1, page_size=30, similarity_threshold=0.2, vector_similarity_weight=0.3, top_k=1024, rerank_id: str | None = None, keyword: bool = False
+        self,
+        dataset_ids,
+        document_ids=None,
+        question="",
+        page=1,
+        page_size=30,
+        similarity_threshold=0.2,
+        vector_similarity_weight=0.3,
+        top_k=1024,
+        rerank_id: str | None = None,
+        keyword: bool = False,
+        force_refresh: bool = False,
    ):
        if document_ids is None:
            document_ids = []
+
+        # If no dataset_ids provided or empty list, get all available dataset IDs
+        if not dataset_ids:
+            dataset_list_str = self.list_datasets()
+            dataset_ids = []
+
+            # Parse the dataset list to extract IDs
+            if dataset_list_str:
+                for line in dataset_list_str.strip().split('\n'):
+                    if line.strip():
+                        try:
+                            dataset_info = json.loads(line.strip())
+                            dataset_ids.append(dataset_info["id"])
+                        except (json.JSONDecodeError, KeyError):
+                            # Skip malformed lines
+                            continue
+
        data_json = {
            "page": page,
            "page_size": page_size,
@ -110,12 +185,127 @@ class RAGFlowConnector:

        res = res.json()
        if res.get("code") == 0:
+            data = res["data"]
            chunks = []
-            for chunk_data in res["data"].get("chunks"):
-                chunks.append(json.dumps(chunk_data, ensure_ascii=False))
-            return [types.TextContent(type="text", text="\n".join(chunks))]
+
+            # Cache document metadata and dataset information
+            document_cache, dataset_cache = self._get_document_metadata_cache(dataset_ids, force_refresh=force_refresh)
+
+            # Process chunks with enhanced field mapping including per-chunk metadata
+            for chunk_data in data.get("chunks", []):
+                enhanced_chunk = self._map_chunk_fields(chunk_data, dataset_cache, document_cache)
+                chunks.append(enhanced_chunk)
+
+            # Build structured response (no longer need response-level document_metadata)
+            response = {
+                "chunks": chunks,
+                "pagination": {
+                    "page": data.get("page", page),
+                    "page_size": data.get("page_size", page_size),
+                    "total_chunks": data.get("total", len(chunks)),
+                    "total_pages": (data.get("total", len(chunks)) + page_size - 1) // page_size,
+                },
+                "query_info": {
+                    "question": question,
+                    "similarity_threshold": similarity_threshold,
+                    "vector_weight": vector_similarity_weight,
+                    "keyword_search": keyword,
+                    "dataset_count": len(dataset_ids),
+                },
+            }
+
+            return [types.TextContent(type="text", text=json.dumps(response, ensure_ascii=False))]
+
        raise Exception([types.TextContent(type="text", text=res.get("message"))])

+    def _get_document_metadata_cache(self, dataset_ids, force_refresh=False):
+        """Cache document metadata for all documents in the specified datasets"""
+        document_cache = {}
+        dataset_cache = {}
+
+        try:
+            for dataset_id in dataset_ids:
+                dataset_meta = None if force_refresh else self._get_cached_dataset_metadata(dataset_id)
+                if not dataset_meta:
+                    # First get dataset info for name
+                    dataset_res = self._get("/datasets", {"id": dataset_id, "page_size": 1})
+                    if dataset_res and dataset_res.status_code == 200:
+                        dataset_data = dataset_res.json()
+                        if dataset_data.get("code") == 0 and dataset_data.get("data"):
+                            dataset_info = dataset_data["data"][0]
+                            dataset_meta = {"name": dataset_info.get("name", "Unknown"), "description": dataset_info.get("description", "")}
+                            self._set_cached_dataset_metadata(dataset_id, dataset_meta)
+                if dataset_meta:
+                    dataset_cache[dataset_id] = dataset_meta
+
+                docs = None if force_refresh else self._get_cached_document_metadata_by_dataset(dataset_id)
+                if docs is None:
+                    docs_res = self._get(f"/datasets/{dataset_id}/documents")
+                    docs_data = docs_res.json()
+                    if docs_data.get("code") == 0 and docs_data.get("data", {}).get("docs"):
+                        doc_id_meta_list = []
+                        docs = {}
+                        for doc in docs_data["data"]["docs"]:
+                            doc_id = doc.get("id")
+                            if not doc_id:
+                                continue
+                            doc_meta = {
+                                "document_id": doc_id,
+                                "name": doc.get("name", ""),
+                                "location": doc.get("location", ""),
+                                "type": doc.get("type", ""),
+                                "size": doc.get("size"),
+                                "chunk_count": doc.get("chunk_count"),
+                                # "chunk_method": doc.get("chunk_method", ""),
+                                "create_date": doc.get("create_date", ""),
+                                "update_date": doc.get("update_date", ""),
+                                # "process_begin_at": doc.get("process_begin_at", ""),
+                                # "process_duration": doc.get("process_duration"),
+                                # "progress": doc.get("progress"),
+                                # "progress_msg": doc.get("progress_msg", ""),
+                                # "status": doc.get("status", ""),
+                                # "run": doc.get("run", ""),
+                                "token_count": doc.get("token_count"),
+                                # "source_type": doc.get("source_type", ""),
+                                "thumbnail": doc.get("thumbnail", ""),
+                                "dataset_id": doc.get("dataset_id", dataset_id),
+                                "meta_fields": doc.get("meta_fields", {}),
+                                # "parser_config": doc.get("parser_config", {})
+                            }
+                            doc_id_meta_list.append((doc_id, doc_meta))
+                            docs[doc_id] = doc_meta
+                        self._set_cached_document_metadata_by_dataset(dataset_id, doc_id_meta_list)
+                if docs:
+                    document_cache.update(docs)
+
+        except Exception:
+            # Gracefully handle metadata cache failures
+            pass
+
+        return document_cache, dataset_cache
+
+    def _map_chunk_fields(self, chunk_data, dataset_cache, document_cache):
+        """Preserve all original API fields and add per-chunk document metadata"""
+        # Start with ALL raw data from API (preserve everything like original version)
+        mapped = dict(chunk_data)
+
+        # Add dataset name enhancement
+        dataset_id = chunk_data.get("dataset_id") or chunk_data.get("kb_id")
+        if dataset_id and dataset_id in dataset_cache:
+            mapped["dataset_name"] = dataset_cache[dataset_id]["name"]
+        else:
+            mapped["dataset_name"] = "Unknown"
+
+        # Add document name convenience field
+        mapped["document_name"] = chunk_data.get("document_keyword", "")
+
+        # Add per-chunk document metadata
+        document_id = chunk_data.get("document_id")
+        if document_id and document_id in document_cache:
+            mapped["document_metadata"] = document_cache[document_id]
+
+        return mapped
+

 class RAGFlowCtx:
    def __init__(self, connector: RAGFlowConnector):
@ -195,7 +385,58 @@ async def list_tools(*, connector) -> list[types.Tool]:
                        "items": {"type": "string"},
                        "description": "Optional array of document IDs to search within."
                    },
-                    "question": {"type": "string", "description": "The question or query to search for."},
+                    "question": {
+                        "type": "string",
+                        "description": "The question or query to search for."
+                    },
+                    "page": {
+                        "type": "integer",
+                        "description": "Page number for pagination",
+                        "default": 1,
+                        "minimum": 1,
+                    },
+                    "page_size": {
+                        "type": "integer",
+                        "description": "Number of results to return per page (default: 10, max recommended: 50 to avoid token limits)",
+                        "default": 10,
+                        "minimum": 1,
+                        "maximum": 100,
+                    },
+                    "similarity_threshold": {
+                        "type": "number",
+                        "description": "Minimum similarity threshold for results",
+                        "default": 0.2,
+                        "minimum": 0.0,
+                        "maximum": 1.0,
+                    },
+                    "vector_similarity_weight": {
+                        "type": "number",
+                        "description": "Weight for vector similarity vs term similarity",
+                        "default": 0.3,
+                        "minimum": 0.0,
+                        "maximum": 1.0,
+                    },
+                    "keyword": {
+                        "type": "boolean",
+                        "description": "Enable keyword-based search",
+                        "default": False,
+                    },
+                    "top_k": {
+                        "type": "integer",
+                        "description": "Maximum results to consider before ranking",
+                        "default": 1024,
+                        "minimum": 1,
+                        "maximum": 1024,
+                    },
+                    "rerank_id": {
+                        "type": "string",
+                        "description": "Optional reranking model identifier",
+                    },
+                    "force_refresh": {
+                        "type": "boolean",
+                        "description": "Set to true only if fresh dataset and document metadata is explicitly required. Otherwise, cached metadata is used (default: false).",
+                        "default": False,
+                    },
                },
                "required": ["question"],
            },
@ -209,6 +450,16 @@ async def call_tool(name: str, arguments: dict, *, connector) -> list[types.Text
    if name == "ragflow_retrieval":
        document_ids = arguments.get("document_ids", [])
        dataset_ids = arguments.get("dataset_ids", [])
+        question = arguments.get("question", "")
+        page = arguments.get("page", 1)
+        page_size = arguments.get("page_size", 10)
+        similarity_threshold = arguments.get("similarity_threshold", 0.2)
+        vector_similarity_weight = arguments.get("vector_similarity_weight", 0.3)
+        keyword = arguments.get("keyword", False)
+        top_k = arguments.get("top_k", 1024)
+        rerank_id = arguments.get("rerank_id")
+        force_refresh = arguments.get("force_refresh", False)
+
        
        # If no dataset_ids provided or empty list, get all available dataset IDs
        if not dataset_ids:
@ -229,7 +480,15 @@ async def call_tool(name: str, arguments: dict, *, connector) -> list[types.Text
        return connector.retrieval(
            dataset_ids=dataset_ids,
            document_ids=document_ids,
-            question=arguments["question"],
+            question=question,
+            page=page,
+            page_size=page_size,
+            similarity_threshold=similarity_threshold,
+            vector_similarity_weight=vector_similarity_weight,
+            keyword=keyword,
+            top_k=top_k,
+            rerank_id=rerank_id,
+            force_refresh=force_refresh,
        )
    raise ValueError(f"Tool not found: {name}")

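Review note: the metadata caching added to `RAGFlowConnector` combines an `OrderedDict` used as an LRU with a per-entry expiry timestamp and a random jitter of up to 30 seconds either way. Below is a minimal standalone sketch of that pattern. The class `TTLCache`, its constructor arguments, and the injectable `clock` are assumptions made for illustration and do not appear in the diff. The sketch also differs in two ways: it deletes expired entries on read, and it keeps state per instance, whereas the diff stores both caches as class attributes shared by every connector.

```python
import random
import time
from collections import OrderedDict


class TTLCache:
    """Hypothetical LRU cache with jittered expiry, modelled on the connector's caches."""

    def __init__(self, max_entries=32, ttl=300, jitter=30, clock=time.time):
        self._entries = OrderedDict()  # key -> (value, expiry_ts)
        self._max_entries = max_entries
        self._ttl = ttl
        self._jitter = jitter
        self._clock = clock

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expiry_ts = entry
        if self._clock() >= expiry_ts:
            del self._entries[key]  # drop the stale entry (the diff leaves it in place)
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key, value):
        offset = random.randint(-self._jitter, self._jitter)
        self._entries[key] = (value, self._clock() + self._ttl + offset)
        self._entries.move_to_end(key)
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used key


now = [0.0]  # fake clock so the example is deterministic
cache = TTLCache(max_entries=2, ttl=300, jitter=0, clock=lambda: now[0])
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # touching "a" makes "b" the oldest entry
cache.set("c", 3)  # exceeds max_entries, so "b" is evicted
print(cache.get("b"), cache.get("a"))
```

The jitter spreads expiry times so that entries cached together do not all refresh in the same request.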
--- a/pyproject.toml
+++ b/pyproject.toml
@ -1,6 +1,6 @@
 [project]
 name = "ragflow"
-version = "0.20.2"
+version = "0.20.4"
 description = "[RAGFlow](https://ragflow.io/) is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It offers a streamlined RAG workflow for businesses of any scale, combining LLM (Large Language Models) to provide truthful question-answering capabilities, backed by well-founded citations from various complex formatted data."
 authors = [{ name = "Zhichang Yu", email = "yuzhichang@gmail.com" }]
 license-files = ["LICENSE"]
@ -45,7 +45,7 @@ dependencies = [
    "html-text==0.6.2",
    "httpx[socks]==0.27.2",
    "huggingface-hub>=0.25.0,<0.26.0",
-    "infinity-sdk==0.6.0-dev4",
+    "infinity-sdk==0.6.0.dev5",
    "infinity-emb>=0.0.66,<0.0.67",
    "itsdangerous==2.1.2",
    "json-repair==0.35.0",
--- a/rag/app/naive.py
+++ b/rag/app/naive.py
@ -30,7 +30,7 @@ from tika import parser

 from api.db import LLMType
 from api.db.services.llm_service import LLMBundle
-from deepdoc.parser import DocxParser, ExcelParser, HtmlParser, JsonParser, MarkdownParser, PdfParser, TxtParser
+from deepdoc.parser import DocxParser, ExcelParser, HtmlParser, JsonParser, MarkdownElementExtractor, MarkdownParser, PdfParser, TxtParser
 from deepdoc.parser.figure_parser import VisionFigureParser, vision_figure_parser_figure_data_wrapper
 from deepdoc.parser.pdf_parser import PlainParser, VisionParser
 from rag.nlp import concat_img, find_codec, naive_merge, naive_merge_with_images, naive_merge_docx, rag_tokenizer, tokenize_chunks, tokenize_chunks_with_images, tokenize_table
@ -289,7 +289,7 @@ class Pdf(PdfParser):
            return [(b["text"], self._line_tag(b, zoomin)) for b in self.boxes], tbls, figures
        else:
            tbls = self._extract_table_figure(True, zoomin, True, True)
-            # self._naive_vertical_merge()
+            self._naive_vertical_merge()
            self._concat_downward()
            # self._filter_forpages()
            logging.info("layouts cost: {}s".format(timer() - first_start))
@ -350,17 +350,14 @@ class Markdown(MarkdownParser):
        else:
            with open(filename, "r") as f:
                txt = f.read()
+
        remainder, tables = self.extract_tables_and_remainder(f'{txt}\n', separate_tables=separate_tables)
-        sections = []
+
+        extractor = MarkdownElementExtractor(txt)
+        element_sections = extractor.extract_elements()
+        sections = [(element, "") for element in element_sections]
+
        tbls = []
-        for sec in remainder.split("\n"):
-            if sec.strip().find("#") == 0:
-                sections.append((sec, ""))
-            elif sections and sections[-1][0].strip().find("#") == 0:
-                sec_, _ = sections.pop(-1)
-                sections.append((sec_ + "\n" + sec, ""))
-            else:
-                sections.append((sec, ""))
        for table in tables:
            tbls.append(((None, markdown(table, extensions=['markdown.extensions.tables'])), ""))
        return sections, tbls
@ -520,7 +517,8 @@ def chunk(filename, binary=None, from_page=0, to_page=100000,

    elif re.search(r"\.(htm|html)$", filename, re.IGNORECASE):
        callback(0.1, "Start to parse.")
-        sections = HtmlParser()(filename, binary)
+        chunk_token_num = int(parser_config.get("chunk_token_num", 128))
+        sections = HtmlParser()(filename, binary, chunk_token_num)
        sections = [(_, "") for _ in sections if _]
        callback(0.8, "Finish parsing.")

--- a/rag/flow/init.py
+++ b/rag/flow/init.py
@ -0,0 +1,49 @@
+#
+#  Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+
+import os
+import importlib
+import inspect
+from types import ModuleType
+from typing import Dict, Type
+
+_package_path = os.path.dirname(__file__)
+__all_classes: Dict[str, Type] = {}
+
+def _import_submodules() -> None:
+    for filename in os.listdir(_package_path): # noqa: F821
+        if filename.startswith("__") or not filename.endswith(".py") or filename.startswith("base"):
+            continue
+        module_name = filename[:-3]
+
+        try:
+            module = importlib.import_module(f".{module_name}", package=__name__)
+            _extract_classes_from_module(module)  # noqa: F821
+        except ImportError as e:
+            print(f"Warning: Failed to import module {module_name}: {str(e)}")
+
+def _extract_classes_from_module(module: ModuleType) -> None:
+    for name, obj in inspect.getmembers(module):
+        if (inspect.isclass(obj) and
+                obj.__module__ == module.__name__ and not name.startswith("_")):
+            __all_classes[name] = obj
+            globals()[name] = obj
+
+_import_submodules()
+
+__all__ = list(__all_classes.keys()) + ["__all_classes"]
+
+del _package_path, _import_submodules, _extract_classes_from_module
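Review note: the package initializer above registers every public class that a submodule defines itself, and skips classes the submodule merely imported. The filter can be tried in isolation. In this sketch, `collect_public_classes` and the throwaway module `demo_flow` are hypothetical names used only for the demonstration.

```python
import inspect
import types


def collect_public_classes(module):
    """Return classes defined in `module` itself whose names have no leading underscore."""
    found = {}
    for name, obj in inspect.getmembers(module):
        if inspect.isclass(obj) and obj.__module__ == module.__name__ and not name.startswith("_"):
            found[name] = obj
    return found


# Build an in-memory module with one public class, one private class,
# and one class imported from elsewhere.
demo = types.ModuleType("demo_flow")
exec(
    "from collections import OrderedDict\n"
    "class Parser: pass\n"
    "class _Hidden: pass\n",
    demo.__dict__,
)

print(sorted(collect_public_classes(demo)))
```

Only `Parser` passes: `_Hidden` fails the name check, and `OrderedDict` reports `collections` as its `__module__`.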
--- a/rag/flow/base.py
+++ b/rag/flow/base.py
@ -0,0 +1,59 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import time
+import os
+import logging
+from functools import partial
+from typing import Any
+import trio
+from agent.component.base import ComponentParamBase, ComponentBase
+from api.utils.api_utils import timeout
+
+
+class ProcessParamBase(ComponentParamBase):
+    def __init__(self):
+        super().__init__()
+        self.timeout = 100000000
+        self.persist_logs = True
+
+
+class ProcessBase(ComponentBase):
+
+    def __init__(self, pipeline, id, param: ProcessParamBase):
+        super().__init__(pipeline, id, param)
+        self.callback = partial(self._canvas.callback, self.component_name)
+
+    async def invoke(self, **kwargs) -> dict[str, Any]:
+        self.set_output("_created_time", time.perf_counter())
+        for k,v in kwargs.items():
+            self.set_output(k, v)
+        try:
+            with trio.fail_after(self._param.timeout):
+                await self._invoke(**kwargs)
+                self.callback(1, "Done")
+        except Exception as e:
+            if self.get_exception_default_value():
+                self.set_exception_default_value()
+            else:
+                self.set_output("_ERROR", str(e))
+            logging.exception(e)
+            self.callback(-1, str(e))
+        self.set_output("_elapsed_time", time.perf_counter() - self.output("_created_time"))
+        return self.output()
+
+    @timeout(int(os.environ.get("COMPONENT_EXEC_TIMEOUT", 10*60)))
+    async def _invoke(self, **kwargs):
+        raise NotImplementedError()
--- a/rag/flow/begin.py
+++ b/rag/flow/begin.py
@ -0,0 +1,47 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+from api.db.services.document_service import DocumentService
+from api.db.services.file2document_service import File2DocumentService
+from api.db.services.file_service import FileService
+from rag.flow.base import ProcessBase, ProcessParamBase
+from rag.utils.storage_factory import STORAGE_IMPL
+
+
+class FileParam(ProcessParamBase):
+    def __init__(self):
+        super().__init__()
+
+    def check(self):
+        pass
+
+
+class File(ProcessBase):
+    component_name = "File"
+
+    async def _invoke(self, **kwargs):
+        if self._canvas._doc_id:
+            e, doc = DocumentService.get_by_id(self._canvas._doc_id)
+            if not e:
+                self.set_output("_ERROR", f"Document({self._canvas._doc_id}) not found!")
+                return
+
+            b, n = File2DocumentService.get_storage_address(doc_id=self._canvas._doc_id)
+            self.set_output("blob", STORAGE_IMPL.get(b, n))
+            self.set_output("name", doc.name)
+        else:
+            file = kwargs.get("file")
+            self.set_output("name", file["name"])
+            self.set_output("blob", FileService.get_blob(file["created_by"], file["id"]))
--- a/rag/flow/chunker.py
+++ b/rag/flow/chunker.py
@ -0,0 +1,160 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+import random
+import trio
+from api.db import LLMType
+from api.db.services.llm_service import LLMBundle
+from deepdoc.parser.pdf_parser import RAGFlowPdfParser
+from graphrag.utils import get_llm_cache, chat_limiter, set_llm_cache
+from rag.flow.base import ProcessBase, ProcessParamBase
+from rag.nlp import naive_merge, naive_merge_with_images
+from rag.prompts.prompts import keyword_extraction, question_proposal
+
+
+class ChunkerParam(ProcessParamBase):
+    def __init__(self):
+        super().__init__()
+        self.method_options = ["general", "q&a", "resume", "manual", "table", "paper", "book", "laws", "presentation", "one"]
+        self.method = "general"
+        self.chunk_token_size = 512
+        self.delimiter = "\n"
+        self.overlapped_percent = 0
+        self.page_rank = 0
+        self.auto_keywords = 0
+        self.auto_questions = 0
+        self.tag_sets = []
+        self.llm_setting = {
+            "llm_name": "",
+            "lang": "Chinese"
+        }
+
+    def check(self):
+        self.check_valid_value(self.method.lower(), "Chunk method abnormal.", self.method_options)
+        self.check_positive_integer(self.chunk_token_size, "Chunk token size.")
+        self.check_nonnegative_number(self.page_rank, "Page rank value: (0, 10]")
+        self.check_nonnegative_number(self.auto_keywords, "Auto-keyword value: (0, 10]")
+        self.check_nonnegative_number(self.auto_questions, "Auto-question value: (0, 10]")
+        self.check_decimal_float(self.overlapped_percent, "Overlapped percentage: [0, 1)")
+
+
+class Chunker(ProcessBase):
+    component_name = "Chunker"
+
+    def _general(self, **kwargs):
+        self.callback(random.randint(1,5)/100., "Start to chunk via `General`.")
+        if kwargs.get("output_format") in ["markdown", "text"]:
+            cks = naive_merge(kwargs.get(kwargs["output_format"]), self._param.chunk_token_size, self._param.delimiter, self._param.overlapped_percent)
+            return [{"text": c} for c in cks]
+
+        sections, section_images = [], []
+        for o in kwargs["json"]:
+            sections.append((o["text"], o.get("position_tag","")))
+            section_images.append(o.get("image"))
+
+        chunks, images = naive_merge_with_images(sections, section_images,self._param.chunk_token_size, self._param.delimiter, self._param.overlapped_percent)
+        return [{
+            "text": RAGFlowPdfParser.remove_tag(c),
+            "image": img,
+            "positions": RAGFlowPdfParser.extract_positions(c)
+        } for c,img in zip(chunks,images)]
+
+    def _q_and_a(self, **kwargs):
+        pass
+
+    def _resume(self, **kwargs):
+        pass
+
+    def _manual(self, **kwargs):
+        pass
+
+    def _table(self, **kwargs):
+        pass
+
+    def _paper(self, **kwargs):
+        pass
+
+    def _book(self, **kwargs):
+        pass
+
+    def _laws(self, **kwargs):
+        pass
+
+    def _presentation(self, **kwargs):
+        pass
+
+    def _one(self, **kwargs):
+        pass
+
+    async def _invoke(self, **kwargs):
+        function_map = {
+            "general": self._general,
+            "q&a": self._q_and_a,
+            "resume": self._resume,
+            "manual": self._manual,
+            "table": self._table,
+            "paper": self._paper,
+            "book": self._book,
+            "laws": self._laws,
+            "presentation": self._presentation,
+            "one": self._one,
+        }
+        chunks = function_map[self._param.method](**kwargs)
+        llm_setting = self._param.llm_setting
+
+        async def auto_keywords():
+            nonlocal chunks, llm_setting
+            chat_mdl = LLMBundle(self._canvas._tenant_id, LLMType.CHAT, llm_name=llm_setting["llm_name"], lang=llm_setting["lang"])
+
+            async def doc_keyword_extraction(chat_mdl, ck, topn):
+                cached = get_llm_cache(chat_mdl.llm_name, ck["text"], "keywords", {"topn": topn})
+                if not cached:
+                    async with chat_limiter:
+                        cached = await trio.to_thread.run_sync(lambda: keyword_extraction(chat_mdl, ck["text"], topn))
+                    set_llm_cache(chat_mdl.llm_name, ck["text"], cached, "keywords", {"topn": topn})
+                if cached:
+                    ck["keywords"] = cached.split(",")
+
+            async with trio.open_nursery() as nursery:
+                for ck in chunks:
+                    nursery.start_soon(doc_keyword_extraction, chat_mdl, ck, self._param.auto_keywords)
+
+        async def auto_questions():
+            nonlocal chunks, llm_setting
+            chat_mdl = LLMBundle(self._canvas._tenant_id, LLMType.CHAT, llm_name=llm_setting["llm_name"], lang=llm_setting["lang"])
+
+            async def doc_question_proposal(chat_mdl, d, topn):
+                cached = get_llm_cache(chat_mdl.llm_name, d["text"], "question", {"topn": topn})
+                if not cached:
+                    async with chat_limiter:
+                        cached = await trio.to_thread.run_sync(lambda: question_proposal(chat_mdl, d["text"], topn))
+                    set_llm_cache(chat_mdl.llm_name, d["text"], cached, "question", {"topn": topn})
+                if cached:
+                    d["questions"] = cached.split("\n")
+
+            async with trio.open_nursery() as nursery:
+                for ck in chunks:
+                    nursery.start_soon(doc_question_proposal, chat_mdl, ck, self._param.auto_questions)
+
+        async with trio.open_nursery() as nursery:
+            if self._param.auto_questions:
+                nursery.start_soon(auto_questions)
+            if self._param.auto_keywords:
+                nursery.start_soon(auto_keywords)
+
+        if self._param.page_rank:
+            for ck in chunks:
+                ck["page_rank"] = self._param.page_rank
+
+        self.set_output("chunks", chunks)
--- a/rag/flow/parser.py
+++ b/rag/flow/parser.py
@ -0,0 +1,107 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+import random
+import trio
+from api.db import LLMType
+from api.db.services.llm_service import LLMBundle
+from deepdoc.parser.pdf_parser import RAGFlowPdfParser, PlainParser, VisionParser
+from rag.flow.base import ProcessBase, ProcessParamBase
+from rag.llm.cv_model import Base as VLM
+from deepdoc.parser import ExcelParser
+
+
+class ParserParam(ProcessParamBase):
+    def __init__(self):
+        super().__init__()
+        self.setups = {
+            "pdf": {
+                "parse_method": "deepdoc", # deepdoc/plain_text/vlm
+                "vlm_name": "",
+                "lang": "Chinese",
+                "suffix": ["pdf"],
+                "output_format": "json"
+            },
+            "excel": {
+                "output_format": "html"
+            },
+            "ppt": {},
+            "image": {
+                "parse_method": "ocr"
+            },
+            "email": {},
+            "text": {},
+            "audio": {},
+            "video": {},
+        }
+
+    def check(self):
+        if self.setups["pdf"].get("parse_method") not in ["deepdoc", "plain_text"]:
+            assert self.setups["pdf"].get("vlm_name"), "No VLM specified."
+            assert self.setups["pdf"].get("lang"), "No language specified."
+
+
+class Parser(ProcessBase):
+    component_name = "Parser"
+
+    def _pdf(self, blob):
+        self.callback(random.randint(1,5)/100., "Start to work on a PDF.")
+        conf = self._param.setups["pdf"]
+        self.set_output("output_format", conf["output_format"])
+        if conf.get("parse_method") == "deepdoc":
+            bboxes = RAGFlowPdfParser().parse_into_bboxes(blob, callback=self.callback)
+        elif conf.get("parse_method") == "plain_text":
+            lines,_ = PlainParser()(blob)
+            bboxes = [{"text": t} for t,_ in lines]
+        else:
+            assert conf.get("vlm_name")
+            vision_model = LLMBundle(self._canvas._tenant_id, LLMType.IMAGE2TEXT, llm_name=conf.get("vlm_name"), lang=conf.get("lang"))
+            lines, _ = VisionParser(vision_model=vision_model)(blob, callback=self.callback)
+            bboxes = []
+            for t, poss in lines:
+                pn, x0, x1, top, bott = poss.split(" ")
+                bboxes.append({"page_number": int(pn), "x0": int(x0), "x1": int(x1), "top": int(top), "bottom": int(bott), "text": t})
+
+        self.set_output("json", bboxes)
+        mkdn = ""
+        for b in bboxes:
+            if b.get("layout_type", "") == "title":
+                mkdn += "\n## "
+            if b.get("layout_type", "") == "figure":
+                mkdn += "\n![Image]({})".format(VLM.image2base64(b["image"]))
+                continue
+            mkdn += b.get("text", "") + "\n"
+        self.set_output("markdown", mkdn)
+
+    def _excel(self, blob):
+        self.callback(random.randint(1,5)/100., "Start to work on an Excel file.")
+        conf = self._param.setups["excel"]
+        excel_parser = ExcelParser()
+        if conf.get("output_format") == "html":
+            html = excel_parser.html(blob,1000000000)
+            self.set_output("html", html)
+        elif conf.get("output_format") == "json":
+            self.set_output("json", [{"text": txt} for txt in excel_parser(blob) if txt])
+        elif conf.get("output_format") == "markdown":
+            self.set_output("markdown", excel_parser.markdown(blob))
+
+    async def _invoke(self, **kwargs):
+        function_map = {
+            "pdf": self._pdf,
+        }
+        for p_type, conf in self._param.setups.items():
+            if kwargs.get("name", "").split(".")[-1].lower() not in conf.get("suffix", []):
+                continue
+            await trio.to_thread.run_sync(function_map[p_type], kwargs["blob"])
+            break
--- a/rag/flow/pipeline.py
+++ b/rag/flow/pipeline.py
@ -0,0 +1,121 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import datetime
+import json
+import logging
+import random
+import time
+import trio
+from agent.canvas import Graph
+from api.db.services.document_service import DocumentService
+from rag.utils.redis_conn import REDIS_CONN
+
+
+class Pipeline(Graph):
+
+    def __init__(self, dsl: str, tenant_id=None, doc_id=None, task_id=None, flow_id=None):
+        super().__init__(dsl, tenant_id, task_id)
+        self._doc_id = doc_id
+        self._flow_id = flow_id
+        self._kb_id = None
+        if doc_id:
+            self._kb_id = DocumentService.get_knowledgebase_id(doc_id)
+            assert self._kb_id, f"Can't find KB of this document: {doc_id}"
+
+    def callback(self, component_name: str, progress: float|int|None=None, message: str = "") -> None:
+        log_key = f"{self._flow_id}-{self.task_id}-logs"
+        try:
+            bin = REDIS_CONN.get(log_key)
+            obj = json.loads(bin.encode("utf-8")) if bin else []
+            if obj:
+                if obj[-1]["component_name"] == component_name:
+                    obj[-1]["trace"].append({"progress": progress, "message": message, "datetime": datetime.datetime.now().strftime("%H:%M:%S")})
+                else:
+                    obj.append({
+                        "component_name": component_name,
+                        "trace": [{"progress": progress, "message": message, "datetime": datetime.datetime.now().strftime("%H:%M:%S")}]
+                    })
+            else:
+                obj = [{
+                    "component_name": component_name,
+                    "trace": [{"progress": progress, "message": message, "datetime": datetime.datetime.now().strftime("%H:%M:%S")}]
+                }]
+            REDIS_CONN.set_obj(log_key, obj, 60*10)
+        except Exception as e:
+            logging.exception(e)
+
+    def fetch_logs(self):
+        log_key = f"{self._flow_id}-{self.task_id}-logs"
+        try:
+            bin = REDIS_CONN.get(log_key)
+            if bin:
+                return json.loads(bin.encode("utf-8"))
+        except Exception as e:
+            logging.exception(e)
+        return []
+
+    def reset(self):
+        super().reset()
+        log_key = f"{self._flow_id}-{self.task_id}-logs"
+        try:
+            REDIS_CONN.set_obj(log_key, [], 60*10)
+        except Exception as e:
+            logging.exception(e)
+
+    async def run(self, **kwargs):
+        st = time.perf_counter()
+        if not self.path:
+            self.path.append("begin")
+
+        if self._doc_id:
+            DocumentService.update_by_id(self._doc_id, {
+                "progress": random.randint(0,5)/100.,
+                "progress_msg": "Start the pipeline...",
+                "process_begin_at": datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+            })
+
+        self.error = ""
+        idx = len(self.path) - 1
+        if idx == 0:
+            cpn_obj = self.get_component_obj(self.path[0])
+            await cpn_obj.invoke(**kwargs)
+            if cpn_obj.error():
+                self.error = "[ERROR]" + cpn_obj.error()
+            else:
+                idx += 1
+                self.path.extend(cpn_obj.get_downstream())
+
+        while idx < len(self.path) and not self.error:
+            last_cpn = self.get_component_obj(self.path[idx-1])
+            cpn_obj = self.get_component_obj(self.path[idx])
+            async def invoke():
+                nonlocal last_cpn, cpn_obj
+                await cpn_obj.invoke(**last_cpn.output())
+            async with trio.open_nursery() as nursery:
+                nursery.start_soon(invoke)
+            if cpn_obj.error():
+                self.error = "[ERROR]" + cpn_obj.error()
+                break
+            idx += 1
+            self.path.extend(cpn_obj.get_downstream())
+
+        if self._doc_id:
+            DocumentService.update_by_id(self._doc_id, {
+                "progress": 1 if not self.error else -1,
+                "progress_msg": "Pipeline finished...\n" + self.error,
+                "process_duration": time.perf_counter() - st
+            })
+
--- a/rag/flow/tests/client.py
+++ b/rag/flow/tests/client.py
@ -0,0 +1,57 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+#
+import argparse
+import json
+import os
+import time
+from concurrent.futures import ThreadPoolExecutor
+import trio
+from api import settings
+from rag.flow.pipeline import Pipeline
+
+
+def print_logs(pipeline):
+    last_logs = "[]"
+    while True:
+        time.sleep(5)
+        logs = pipeline.fetch_logs()
+        logs_str = json.dumps(logs)
+        if logs_str != last_logs:
+            print(logs_str)
+        last_logs = logs_str
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser()
+    dsl_default_path = os.path.join(
+        os.path.dirname(os.path.realpath(__file__)),
+        "dsl_examples",
+        "general_pdf_all.json",
+    )
+    parser.add_argument('-s', '--dsl', default=dsl_default_path, help="input dsl", action='store')
+    parser.add_argument('-d', '--doc_id', help="Document ID", action='store', required=True)
+    parser.add_argument('-t', '--tenant_id', help="Tenant ID", action='store', required=True)
+    args = parser.parse_args()
+
+    settings.init_settings()
+    pipeline = Pipeline(open(args.dsl, "r").read(), tenant_id=args.tenant_id, doc_id=args.doc_id, task_id="xxxx", flow_id="xxx")
+    pipeline.reset()
+
+    exe = ThreadPoolExecutor(max_workers=5)
+    thr = exe.submit(print_logs, pipeline)
+
+    trio.run(pipeline.run)
+    thr.result()
--- a/rag/flow/tests/dsl_examples/general_pdf_all.json
+++ b/rag/flow/tests/dsl_examples/general_pdf_all.json
@ -0,0 +1,54 @@
+{
+  "components": {
+    "begin": {
+        "obj":{
+            "component_name": "File",
+            "params": {
+            }
+        },
+        "downstream": ["parser:0"],
+        "upstream": []
+    },
+    "parser:0": {
+        "obj": {
+            "component_name": "Parser",
+            "params": {
+              "setups": {
+                "pdf": {
+                  "parse_method": "deepdoc",
+                  "vlm_name": "",
+                  "lang": "Chinese",
+                  "suffix": [
+                    "pdf"
+                  ],
+                  "output_format": "json"
+                }
+              }
+            }
+        },
+        "downstream": ["chunker:0"],
+        "upstream": ["begin"]
+    },
+    "chunker:0": {
+        "obj": {
+            "component_name": "Chunker",
+            "params": {
+              "method": "general",
+              "auto_keywords": 5
+            }
+        },
+        "downstream": ["tokenizer:0"],
+        "upstream": ["parser:0"]
+    },
+    "tokenizer:0": {
+        "obj": {
+            "component_name": "Tokenizer",
+            "params": {
+            }
+        },
+        "downstream": [],
+        "upstream": ["chunker:0"]
+    }
+  },
+  "path": []
+}
--- a/rag/flow/tokenizer.py
+++ b/rag/flow/tokenizer.py
@ -0,0 +1,134 @@
+#
+#  Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License");
+#  you may not use this file except in compliance with the License.
+#  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+import random
+import re
+
+import numpy as np
+import trio
+
+from api.db import LLMType
+from api.db.services.knowledgebase_service import KnowledgebaseService
+from api.db.services.llm_service import LLMBundle
+from api.db.services.user_service import TenantService
+from api.utils.api_utils import timeout
+from rag.flow.base import ProcessBase, ProcessParamBase
+from rag.nlp import rag_tokenizer
+from rag.settings import EMBEDDING_BATCH_SIZE
+from rag.svr.task_executor import embed_limiter
+from rag.utils import truncate
+
+
+class TokenizerParam(ProcessParamBase):
+    def __init__(self):
+        super().__init__()
+        self.search_method = ["full_text", "embedding"]
+        self.filename_embd_weight = 0.1
+
+    def check(self):
+        for v in self.search_method:
+            self.check_valid_value(v.lower(), "Search method abnormal.", ["full_text", "embedding"])
+
+
+class Tokenizer(ProcessBase):
+    component_name = "Tokenizer"
+
+    async def _embedding(self, name, chunks):
+        parts = sum(["full_text" in self._param.search_method, "embedding" in self._param.search_method])
+        token_count = 0
+        if self._canvas._kb_id:
+            e, kb = KnowledgebaseService.get_by_id(self._canvas._kb_id)
+            embedding_id = kb.embd_id
+        else:
+            e, ten = TenantService.get_by_id(self._canvas._tenant_id)
+            embedding_id = ten.embd_id
+        embedding_model = LLMBundle(self._canvas._tenant_id, LLMType.EMBEDDING, llm_name=embedding_id)
+        texts = []
+        for c in chunks:
+            if c.get("questions"):
+                texts.append("\n".join(c["questions"]))
+            else:
+                texts.append(re.sub(r"</?(table|td|caption|tr|th)( [^<>]{0,12})?>", " ", c["text"]))
+        vts, c = embedding_model.encode([name])
+        token_count += c
+        tts = np.tile(vts[0], (len(texts), 1))  # one copy of the filename embedding per chunk, shape (n_chunks, dim)
+
+        @timeout(60)
+        def batch_encode(txts):
+            nonlocal embedding_model
+            return embedding_model.encode([truncate(c, embedding_model.max_length-10) for c in txts])
+
+        cnts_ = np.array([])
+        for i in range(0, len(texts), EMBEDDING_BATCH_SIZE):
+            async with embed_limiter:
+                vts, c = await trio.to_thread.run_sync(lambda: batch_encode(texts[i: i + EMBEDDING_BATCH_SIZE]))
+            if len(cnts_) == 0:
+                cnts_ = vts
+            else:
+                cnts_ = np.concatenate((cnts_, vts), axis=0)
+            token_count += c
+            if i % 33 == 32:
+                self.callback(i*1./len(texts)/parts/EMBEDDING_BATCH_SIZE + 0.5*(parts-1))
+
+        cnts = cnts_
+        title_w = float(self._param.filename_embd_weight)
+        vects = (title_w * tts + (1 - title_w) * cnts) if len(tts) == len(cnts) else cnts
+
+        assert len(vects) == len(chunks)
+        for i, ck in enumerate(chunks):
+            v = vects[i].tolist()
+            ck["q_%d_vec" % len(v)] = v
+        return chunks, token_count
+
+    async def _invoke(self, **kwargs):
+        parts = sum(["full_text" in self._param.search_method, "embedding" in self._param.search_method])
+        if "full_text" in self._param.search_method:
+            self.callback(random.randint(1,5)/100., "Start to tokenize.")
+            if kwargs.get("chunks"):
+                chunks = kwargs["chunks"]
+                for i, ck in enumerate(chunks):
+                    if ck.get("questions"):
+                        ck["question_tks"] = rag_tokenizer.tokenize("\n".join(ck["questions"]))
+                    if ck.get("keywords"):
+                        ck["important_tks"] = rag_tokenizer.tokenize("\n".join(ck["keywords"]))
+                    ck["content_ltks"] = rag_tokenizer.tokenize(ck["text"])
+                    ck["content_sm_ltks"] = rag_tokenizer.fine_grained_tokenize(ck["content_ltks"])
+                    if i % 100 == 99:
+                        self.callback(i*1./len(chunks)/parts)
+            elif kwargs.get("output_format") in ["markdown", "text"]:
+                ck = {
+                    "text": kwargs.get(kwargs["output_format"], "")
+                }
+                if "full_text" in self._param.search_method:
+                    ck["content_ltks"] = rag_tokenizer.tokenize(kwargs.get(kwargs["output_format"], ""))
+                    ck["content_sm_ltks"] = rag_tokenizer.fine_grained_tokenize(ck["content_ltks"])
+                chunks = [ck]
+            else:
+                chunks = kwargs["json"]
+                for i, ck in enumerate(chunks):
+                    ck["content_ltks"] = rag_tokenizer.tokenize(ck["text"])
+                    ck["content_sm_ltks"] = rag_tokenizer.fine_grained_tokenize(ck["content_ltks"])
+                    if i % 100 == 99:
+                        self.callback(i*1./len(chunks)/parts)
+
+            self.callback(1./parts, "Finish tokenizing.")
+
+        if "embedding" in self._param.search_method:
+            self.callback(random.randint(1,5)/100. + 0.5*(parts-1), "Start embedding inference.")
+            chunks, token_count = await self._embedding(kwargs.get("name", ""), chunks)
+            self.set_output("embedding_token_consumption", token_count)
+
+            self.callback(1., "Finish embedding.")
+
+        self.set_output("chunks", chunks)
--- a/rag/llm/__init__.py
+++ b/rag/llm/__init__.py
@ -36,12 +36,14 @@ class SupportedLiteLLMProvider(StrEnum):
    Nvidia = "NVIDIA"
    TogetherAI = "TogetherAI"
    Anthropic = "Anthropic"
+    Ollama = "Ollama"


 FACTORY_DEFAULT_BASE_URL = {
    SupportedLiteLLMProvider.Tongyi_Qianwen: "https://dashscope.aliyuncs.com/compatible-mode/v1",
    SupportedLiteLLMProvider.Dashscope: "https://dashscope.aliyuncs.com/compatible-mode/v1",
    SupportedLiteLLMProvider.Moonshot: "https://api.moonshot.cn/v1",
+    SupportedLiteLLMProvider.Ollama: "",
 }


@ -59,6 +61,7 @@ LITELLM_PROVIDER_PREFIX = {
    SupportedLiteLLMProvider.Nvidia: "nvidia_nim/",
    SupportedLiteLLMProvider.TogetherAI: "together_ai/",
    SupportedLiteLLMProvider.Anthropic: "",  # don't need a prefix
+    SupportedLiteLLMProvider.Ollama: "ollama_chat/",
 }

 ChatModel = globals().get("ChatModel", {})
--- a/rag/llm/chat_model.py
+++ b/rag/llm/chat_model.py
@ -29,7 +29,6 @@ import json_repair
 import litellm
 import openai
 import requests
-from ollama import Client
 from openai import OpenAI
 from openai.lib.azure import AzureOpenAI
 from strenum import StrEnum
@ -112,6 +111,32 @@ class Base(ABC):
    def _clean_conf(self, gen_conf):
        if "max_tokens" in gen_conf:
            del gen_conf["max_tokens"]
+
+        allowed_conf = {
+            "temperature",
+            "max_completion_tokens",
+            "top_p",
+            "stream",
+            "stream_options",
+            "stop",
+            "n",
+            "presence_penalty",
+            "frequency_penalty",
+            "functions",
+            "function_call",
+            "logit_bias",
+            "user",
+            "response_format",
+            "seed",
+            "tools",
+            "tool_choice",
+            "logprobs",
+            "top_logprobs",
+            "extra_headers",
+        }
+
+        gen_conf = {k: v for k, v in gen_conf.items() if k in allowed_conf}
+
        return gen_conf

    def _chat(self, history, gen_conf, **kwargs):
@ -657,73 +682,6 @@ class ZhipuChat(Base):
        return super().chat_streamly_with_tools(system, history, gen_conf)


-class OllamaChat(Base):
-    _FACTORY_NAME = "Ollama"
-
-    def __init__(self, key, model_name, base_url=None, **kwargs):
-        super().__init__(key, model_name, base_url=base_url, **kwargs)
-
-        self.client = Client(host=base_url) if not key or key == "x" else Client(host=base_url, headers={"Authorization": f"Bearer {key}"})
-        self.model_name = model_name
-        self.keep_alive = kwargs.get("ollama_keep_alive", int(os.environ.get("OLLAMA_KEEP_ALIVE", -1)))
-
-    def _clean_conf(self, gen_conf):
-        options = {}
-        if "max_tokens" in gen_conf:
-            options["num_predict"] = gen_conf["max_tokens"]
-        for k in ["temperature", "top_p", "presence_penalty", "frequency_penalty"]:
-            if k not in gen_conf:
-                continue
-            options[k] = gen_conf[k]
-        return options
-
-    def _chat(self, history, gen_conf={}, **kwargs):
-        # Calculate context size
-        ctx_size = self._calculate_dynamic_ctx(history)
-
-        gen_conf["num_ctx"] = ctx_size
-        response = self.client.chat(model=self.model_name, messages=history, options=gen_conf, keep_alive=self.keep_alive)
-        ans = response["message"]["content"].strip()
-        token_count = response.get("eval_count", 0) + response.get("prompt_eval_count", 0)
-        return ans, token_count
-
-    def chat_streamly(self, system, history, gen_conf={}, **kwargs):
-        if system:
-            history.insert(0, {"role": "system", "content": system})
-        if "max_tokens" in gen_conf:
-            del gen_conf["max_tokens"]
-        try:
-            # Calculate context size
-            ctx_size = self._calculate_dynamic_ctx(history)
-            options = {"num_ctx": ctx_size}
-            if "temperature" in gen_conf:
-                options["temperature"] = gen_conf["temperature"]
-            if "max_tokens" in gen_conf:
-                options["num_predict"] = gen_conf["max_tokens"]
-            if "top_p" in gen_conf:
-                options["top_p"] = gen_conf["top_p"]
-            if "presence_penalty" in gen_conf:
-                options["presence_penalty"] = gen_conf["presence_penalty"]
-            if "frequency_penalty" in gen_conf:
-                options["frequency_penalty"] = gen_conf["frequency_penalty"]
-
-            ans = ""
-            try:
-                response = self.client.chat(model=self.model_name, messages=history, stream=True, options=options, keep_alive=self.keep_alive)
-                for resp in response:
-                    if resp["done"]:
-                        token_count = resp.get("prompt_eval_count", 0) + resp.get("eval_count", 0)
-                        yield token_count
-                    ans = resp["message"]["content"]
-                    yield ans
-            except Exception as e:
-                yield ans + "\n**ERROR**: " + str(e)
-            yield 0
-        except Exception as e:
-            yield "**ERROR**: " + str(e)
-            yield 0
-
-
 class LocalAIChat(Base):
    _FACTORY_NAME = "LocalAI"

@ -1396,7 +1354,7 @@ class Ai302Chat(Base):


 class LiteLLMBase(ABC):
-    _FACTORY_NAME = ["Tongyi-Qianwen", "Bedrock", "Moonshot", "xAI", "DeepInfra", "Groq", "Cohere", "Gemini", "DeepSeek", "NVIDIA", "TogetherAI", "Anthropic"]
+    _FACTORY_NAME = ["Tongyi-Qianwen", "Bedrock", "Moonshot", "xAI", "DeepInfra", "Groq", "Cohere", "Gemini", "DeepSeek", "NVIDIA", "TogetherAI", "Anthropic", "Ollama"]

    def __init__(self, key, model_name, base_url=None, **kwargs):
        self.timeout = int(os.environ.get("LM_TIMEOUT_SECONDS", 600))
@ -1404,7 +1362,7 @@ class LiteLLMBase(ABC):
        self.prefix = LITELLM_PROVIDER_PREFIX.get(self.provider, "")
        self.model_name = f"{self.prefix}{model_name}"
        self.api_key = key
-        self.base_url = base_url or FACTORY_DEFAULT_BASE_URL.get(self.provider, "")
+        self.base_url = (base_url or FACTORY_DEFAULT_BASE_URL.get(self.provider, "")).rstrip('/')
        # Configure retry parameters
        self.max_retries = kwargs.get("max_retries", int(os.environ.get("LLM_MAX_RETRIES", 5)))
        self.base_delay = kwargs.get("retry_interval", float(os.environ.get("LLM_BASE_DELAY", 2.0)))
--- a/rag/llm/rerank_model.py
+++ b/rag/llm/rerank_model.py
@ -44,14 +44,17 @@ class Base(ABC):
        raise NotImplementedError("Please implement encode method!")

    def total_token_count(self, resp):
-        try:
-            return resp.usage.total_tokens
-        except Exception:
-            pass
-        try:
-            return resp["usage"]["total_tokens"]
-        except Exception:
-            pass
+        if hasattr(resp, "usage") and hasattr(resp.usage, "total_tokens"):
+            try:
+                return resp.usage.total_tokens
+            except Exception:
+                pass
+
+        if isinstance(resp, dict) and 'usage' in resp and 'total_tokens' in resp['usage']:
+            try:
+                return resp["usage"]["total_tokens"]
+            except Exception:
+                pass
        return 0


--- a/rag/nlp/__init__.py
+++ b/rag/nlp/__init__.py
@ -554,8 +554,8 @@ def naive_merge(sections, chunk_token_num=128, delimiter="\n。；！？", overl
        if num_tokens_from_string(sec) < chunk_token_num:
            add_chunk(sec, pos)
            continue
-        splited_sec = re.split(r"(%s)" % dels, sec, flags=re.DOTALL)
-        for sub_sec in splited_sec:
+        split_sec = re.split(r"(%s)" % dels, sec, flags=re.DOTALL)
+        for sub_sec in split_sec:
            if re.match(f"^{dels}$", sub_sec):
                continue
            add_chunk(sub_sec, pos)
@ -563,7 +563,8 @@ def naive_merge(sections, chunk_token_num=128, delimiter="\n。；！？", overl
    return cks


-def naive_merge_with_images(texts, images, chunk_token_num=128, delimiter="\n。；！？"):
+def naive_merge_with_images(texts, images, chunk_token_num=128, delimiter="\n。；！？", overlapped_percent=0):
+    from deepdoc.parser.pdf_parser import RAGFlowPdfParser
    if not texts or len(texts) != len(images):
        return [], []
    cks = [""]
@ -578,7 +579,10 @@ def naive_merge_with_images(texts, images, chunk_token_num=128, delimiter="\n。
        if tnum < 8:
            pos = ""
        # Ensure that the length of the merged chunk does not exceed chunk_token_num
-        if cks[-1] == "" or tk_nums[-1] > chunk_token_num:
+        if cks[-1] == "" or tk_nums[-1] > chunk_token_num * (100 - overlapped_percent)/100.:
+            if cks:
+                overlapped = RAGFlowPdfParser.remove_tag(cks[-1])
+                t = overlapped[int(len(overlapped)*(100-overlapped_percent)/100.):] + t
            if t.find(pos) < 0:
                t += pos
            cks.append(t)
@ -600,14 +604,14 @@ def naive_merge_with_images(texts, images, chunk_token_num=128, delimiter="\n。
        if isinstance(text, tuple):
            text_str = text[0]
            text_pos = text[1] if len(text) > 1 else ""
-            splited_sec = re.split(r"(%s)" % dels, text_str)
-            for sub_sec in splited_sec:
+            split_sec = re.split(r"(%s)" % dels, text_str)
+            for sub_sec in split_sec:
                if re.match(f"^{dels}$", sub_sec):
                    continue
                add_chunk(sub_sec, image, text_pos)
        else:
-            splited_sec = re.split(r"(%s)" % dels, text)
-            for sub_sec in splited_sec:
+            split_sec = re.split(r"(%s)" % dels, text)
+            for sub_sec in split_sec:
                if re.match(f"^{dels}$", sub_sec):
                    continue
                add_chunk(sub_sec, image)
@ -684,8 +688,8 @@ def naive_merge_docx(sections, chunk_token_num=128, delimiter="\n。；！？"):

    dels = get_delimiters(delimiter)
    for sec, image in sections:
-        splited_sec = re.split(r"(%s)" % dels, sec)
-        for sub_sec in splited_sec:
+        split_sec = re.split(r"(%s)" % dels, sec)
+        for sub_sec in split_sec:
            if re.match(f"^{dels}$", sub_sec):
                continue
            add_chunk(sub_sec, image,"")
--- a/rag/prompts/prompts.py
+++ b/rag/prompts/prompts.py
@ -114,6 +114,8 @@ def kb_prompt(kbinfos, max_tokens, hash_id=False):
    docs = {d.id: d.meta_fields for d in docs}

    def draw_node(k, line):
+        if line is not None and not isinstance(line, str):
+            line = str(line)
        if not line:
            return ""
        return f"\n├── {k}: " + re.sub(r"\n+", " ", line, flags=re.DOTALL)
--- a/rag/raptor.py
+++ b/rag/raptor.py
@ -42,9 +42,12 @@ class RecursiveAbstractiveProcessing4TreeOrganizedRetrieval:
        self._prompt = prompt
        self._max_token = max_token

-    @timeout(60)
+    @timeout(60*20)
    async def _chat(self, system, history, gen_conf):
-        response = get_llm_cache(self._llm_model.llm_name, system, history, gen_conf)
+        response = await trio.to_thread.run_sync(
+            lambda: get_llm_cache(self._llm_model.llm_name, system, history, gen_conf)
+        )
+
        if response:
            return response
        response = await trio.to_thread.run_sync(
@ -53,19 +56,23 @@ class RecursiveAbstractiveProcessing4TreeOrganizedRetrieval:
        response = re.sub(r"^.*</think>", "", response, flags=re.DOTALL)
        if response.find("**ERROR**") >= 0:
            raise Exception(response)
-        set_llm_cache(self._llm_model.llm_name, system, response, history, gen_conf)
+        await trio.to_thread.run_sync(
+            lambda: set_llm_cache(self._llm_model.llm_name, system, response, history, gen_conf)
+        )
        return response

-    @timeout(2)
+    @timeout(20)
    async def _embedding_encode(self, txt):
-        response = get_embed_cache(self._embd_model.llm_name, txt)
+        response = await trio.to_thread.run_sync(
+            lambda: get_embed_cache(self._embd_model.llm_name, txt)
+        )
        if response is not None:
            return response
        embds, _ = await trio.to_thread.run_sync(lambda: self._embd_model.encode([txt]))
        if len(embds) < 1 or len(embds[0]) < 1:
            raise Exception("Embedding error: ")
        embds = embds[0]
-        set_embed_cache(self._embd_model.llm_name, txt, embds)
+        await trio.to_thread.run_sync(lambda: set_embed_cache(self._embd_model.llm_name, txt, embds))
        return embds

    def _get_optimal_clusters(self, embeddings: np.ndarray, random_state: int):
@ -86,7 +93,7 @@ class RecursiveAbstractiveProcessing4TreeOrganizedRetrieval:
        layers = [(0, len(chunks))]
        start, end = 0, len(chunks)

-        @timeout(60)
+        @timeout(60*20)
        async def summarize(ck_idx: list[int]):
            nonlocal chunks
            texts = [chunks[i][0] for i in ck_idx]
--- a/rag/svr/task_executor.py
+++ b/rag/svr/task_executor.py
@ -21,7 +21,7 @@ import sys
 import threading
 import time

-from api.utils.api_utils import timeout, is_strong_enough
+from api.utils.api_utils import timeout
 from api.utils.log_utils import init_root_logger, get_project_base_directory
 from graphrag.general.index import run_graphrag
 from graphrag.utils import get_llm_cache, set_llm_cache, get_tags_from_cache, set_tags_to_cache
@ -293,8 +293,7 @@ async def build_chunks(task, progress_callback):
                docs.append(d)
                return

-            output_buffer = BytesIO()
-            try:
+            with BytesIO() as output_buffer:
                if isinstance(d["image"], bytes):
                    output_buffer.write(d["image"])
                    output_buffer.seek(0)
@ -317,8 +316,6 @@ async def build_chunks(task, progress_callback):
                    d["image"].close()
                del d["image"]  # Remove image reference
                docs.append(d)
-            finally:
-                output_buffer.close()  # Ensure BytesIO is always closed
        except Exception:
            logging.exception(
                "Saving image of chunk {}/{}/{} got exception".format(task["location"], task["name"], d["id"]))
@ -478,8 +475,6 @@ async def embedding(docs, mdl, parser_config=None, callback=None):

@timeout(3600)
 async def run_raptor(row, chat_mdl, embd_mdl, vector_size, callback=None):
-    # Pressure test for GraphRAG task
-    await is_strong_enough(chat_mdl, embd_mdl)
    chunks = []
    vctr_nm = "q_%d_vec"%vector_size
    for d in settings.retrievaler.chunk_list(row["doc_id"], row["tenant_id"], [str(row["kb_id"])],
@ -553,7 +548,6 @@ async def do_handle_task(task):
    try:
        # bind embedding model
        embedding_model = LLMBundle(task_tenant_id, LLMType.EMBEDDING, llm_name=task_embedding_id, lang=task_language)
-        await is_strong_enough(None, embedding_model)
        vts, _ = embedding_model.encode(["ok"])
        vector_size = len(vts[0])
    except Exception as e:
@ -568,7 +562,6 @@ async def do_handle_task(task):
    if task.get("task_type", "") == "raptor":
        # bind LLM for raptor
        chat_model = LLMBundle(task_tenant_id, LLMType.CHAT, llm_name=task_llm_id, lang=task_language)
-        await is_strong_enough(chat_model, None)
        # run RAPTOR
        async with kg_limiter:
            chunks, token_count = await run_raptor(task, chat_model, embedding_model, vector_size, progress_callback)
@ -580,7 +573,6 @@ async def do_handle_task(task):
        graphrag_conf = task["kb_parser_config"].get("graphrag", {})
        start_ts = timer()
        chat_model = LLMBundle(task_tenant_id, LLMType.CHAT, llm_name=task_llm_id, lang=task_language)
-        await is_strong_enough(chat_model, None)
        with_resolution = graphrag_conf.get("resolution", False)
        with_community = graphrag_conf.get("community", False)
        async with kg_limiter: