Fix admin: can't read config and empty line error (#10574 )

### What problem does this PR solve? As title. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Fix: Optimize metadata filters, add Ingestion pipeline options to agent templates page #9869 (#10572 )
2026-01-04 03:25:30 +08:00 · 2025-10-15 13:07:16 +08:00 · 2025-10-15 12:31:05 +08:00 · 2025-10-15 12:22:41 +08:00 · 2025-10-15 11:46:24 +08:00 · 2025-10-15 11:22:37 +08:00
253 changed files with 7066 additions and 2180 deletions
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@ -120,3 +120,17 @@ jobs:
          packages-dir: sdk/python/dist/
          password: ${{ secrets.PYPI_API_TOKEN }}
          verbose: true
+
+      - name: Build ragflow-cli
+        if: startsWith(github.ref, 'refs/tags/v')
+        run: |
+          cd admin/client && \
+          uv build
+
+      - name: Publish client package distributions to PyPI
+        if: startsWith(github.ref, 'refs/tags/v')
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          packages-dir: admin/client/dist/
+          password: ${{ secrets.PYPI_API_TOKEN }}
+          verbose: true
--- a/.gitignore
+++ b/.gitignore
@ -149,7 +149,7 @@ out
 # Nuxt.js build / generate output
 .nuxt
 dist
-
+ragflow_cli.egg-info
 # Gatsby files
 .cache/
 # Comment in the public line in if your project uses Gatsby and not Next.js
--- a/1
+++ b/1
@ -191,6 +191,7 @@ ENV PATH="${VIRTUAL_ENV}/bin:${PATH}"
 ENV PYTHONPATH=/ragflow/

 COPY web web
+COPY admin admin
 COPY api api
 COPY conf conf
 COPY deepdoc deepdoc
--- a/README.md
+++ b/README.md
@ -1,6 +1,6 @@
 <div align="center">
 <a href="https://demo.ragflow.io/">
-<img src="web/src/assets/logo-with-text.png" width="520" alt="ragflow logo">
+<img src="web/src/assets/logo-with-text.svg" width="520" alt="ragflow logo">
 </a>
 </div>

@ -22,7 +22,7 @@
        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.5">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.21.0">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -84,8 +84,8 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).

 ## 🔥 Latest Updates

+- 2025-10-15 Supports orchestrable ingestion pipeline.
 - 2025-08-08 Supports OpenAI's latest GPT-5 series models.
- 2025-08-04 Supports new models, including Kimi K2 and Grok 4.
 - 2025-08-01 Supports agentic workflow and MCP.
 - 2025-05-23 Adds a Python/JavaScript code executor component to Agent.
 - 2025-05-05 Supports cross-language query.
@ -187,7 +187,7 @@ releases! 🌟
 > All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
 > If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.

-   > The command below downloads the `v0.20.5-slim` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.20.5-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.5` for the full edition `v0.20.5`.
+   > The command below downloads the `v0.21.0-slim` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.21.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.21.0` for the full edition `v0.21.0`.

   ```bash
   $ cd ragflow/docker
@ -200,8 +200,8 @@ releases! 🌟

   | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
   |-------------------|-----------------|-----------------------|--------------------------|
-   | v0.20.5           | &approx;9       | :heavy_check_mark:    | Stable release           |
-   | v0.20.5-slim      | &approx;2       | ❌                   | Stable release            |
+   | v0.21.0           | &approx;9       | :heavy_check_mark:    | Stable release           |
+   | v0.21.0-slim      | &approx;2       | ❌                   | Stable release            |
   | nightly           | &approx;9       | :heavy_check_mark:    | _Unstable_ nightly build |
   | nightly-slim      | &approx;2       | ❌                   | _Unstable_ nightly build  |

--- a/README_id.md
+++ b/README_id.md
@ -1,6 +1,6 @@
 <div align="center">
 <a href="https://demo.ragflow.io/">
-<img src="web/src/assets/logo-with-text.png" width="520" alt="Logo ragflow">
+<img src="web/src/assets/logo-with-text.svg" width="520" alt="Logo ragflow">
 </a>
 </div>

@ -22,7 +22,7 @@
        <img alt="Lencana Daring" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.5">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.21.0">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Rilis%20Terbaru" alt="Rilis Terbaru">
@ -80,8 +80,8 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).

 ## 🔥 Pembaruan Terbaru

+- 2025-10-15 Dukungan untuk jalur data yang terorkestrasi.
 - 2025-08-08 Mendukung model seri GPT-5 terbaru dari OpenAI.
- 2025-08-04 Mendukung model baru, termasuk Kimi K2 dan Grok 4.
 - 2025-08-01 Mendukung alur kerja agen dan MCP.
 - 2025-05-23 Menambahkan komponen pelaksana kode Python/JS ke Agen.
 - 2025-05-05 Mendukung kueri lintas bahasa.
@ -181,7 +181,7 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
 > Semua gambar Docker dibangun untuk platform x86. Saat ini, kami tidak menawarkan gambar Docker untuk ARM64.
 > Jika Anda menggunakan platform ARM64, [silakan gunakan panduan ini untuk membangun gambar Docker yang kompatibel dengan sistem Anda](https://ragflow.io/docs/dev/build_docker_image).

-> Perintah di bawah ini mengunduh edisi v0.20.5-slim dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.20.5-slim, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server. Misalnya, atur RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.5 untuk edisi lengkap v0.20.5.
+> Perintah di bawah ini mengunduh edisi v0.21.0-slim dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.21.0-slim, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server. Misalnya, atur RAGFLOW_IMAGE=infiniflow/ragflow:v0.21.0 untuk edisi lengkap v0.21.0.

 ```bash
 $ cd ragflow/docker
@ -194,8 +194,8 @@ $ docker compose -f docker-compose.yml up -d

 | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
 | ----------------- | --------------- | --------------------- | ------------------------ |
-| v0.20.5           | &approx;9       | :heavy_check_mark:    | Stable release           |
-| v0.20.5-slim      | &approx;2       | ❌                    | Stable release           |
+| v0.21.0           | &approx;9       | :heavy_check_mark:    | Stable release           |
+| v0.21.0-slim      | &approx;2       | ❌                    | Stable release           |
 | nightly           | &approx;9       | :heavy_check_mark:    | _Unstable_ nightly build |
 | nightly-slim      | &approx;2       | ❌                    | _Unstable_ nightly build |

--- a/README_ja.md
+++ b/README_ja.md
@ -1,6 +1,6 @@
 <div align="center">
 <a href="https://demo.ragflow.io/">
-<img src="web/src/assets/logo-with-text.png" width="350" alt="ragflow logo">
+<img src="web/src/assets/logo-with-text.svg" width="350" alt="ragflow logo">
 </a>
 </div>

@ -22,7 +22,7 @@
        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.5">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.21.0">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -60,8 +60,8 @@

 ## 🔥 最新情報

+- 2025-10-15 オーケストレーションされたデータパイプラインのサポート。
 - 2025-08-08 OpenAI の最新 GPT-5 シリーズモデルをサポートします。
- 2025-08-04 新モデル、キミK2およびGrok 4をサポート。
 - 2025-08-01 エージェントワークフローとMCPをサポート。
 - 2025-05-23 エージェントに Python/JS コードエグゼキュータコンポーネントを追加しました。
 - 2025-05-05 言語間クエリをサポートしました。
@ -160,7 +160,7 @@
 > 現在、公式に提供されているすべての Docker イメージは x86 アーキテクチャ向けにビルドされており、ARM64 用の Docker イメージは提供されていません。
 > ARM64 アーキテクチャのオペレーティングシステムを使用している場合は、[このドキュメント](https://ragflow.io/docs/dev/build_docker_image)を参照して Docker イメージを自分でビルドしてください。

-   > 以下のコマンドは、RAGFlow Docker イメージの v0.20.5-slim エディションをダウンロードします。異なる RAGFlow エディションの説明については、以下の表を参照してください。v0.20.5-slim とは異なるエディションをダウンロードするには、docker/.env ファイルの RAGFLOW_IMAGE 変数を適宜更新し、docker compose を使用してサーバーを起動してください。例えば、完全版 v0.20.5 をダウンロードするには、RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.5 と設定します。
+   > 以下のコマンドは、RAGFlow Docker イメージの v0.21.0-slim エディションをダウンロードします。異なる RAGFlow エディションの説明については、以下の表を参照してください。v0.21.0-slim とは異なるエディションをダウンロードするには、docker/.env ファイルの RAGFLOW_IMAGE 変数を適宜更新し、docker compose を使用してサーバーを起動してください。例えば、完全版 v0.21.0 をダウンロードするには、RAGFLOW_IMAGE=infiniflow/ragflow:v0.21.0 と設定します。

   ```bash
   $ cd ragflow/docker
@ -173,8 +173,8 @@

   | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
   | ----------------- | --------------- | --------------------- | ------------------------ |
-   | v0.20.5           | &approx;9       | :heavy_check_mark:    | Stable release           |
-   | v0.20.5-slim      | &approx;2       | ❌                    | Stable release           |
+   | v0.21.0           | &approx;9       | :heavy_check_mark:    | Stable release           |
+   | v0.21.0-slim      | &approx;2       | ❌                    | Stable release           |
   | nightly           | &approx;9       | :heavy_check_mark:    | _Unstable_ nightly build |
   | nightly-slim      | &approx;2       | ❌                     | _Unstable_ nightly build |

--- a/README_ko.md
+++ b/README_ko.md
@ -1,6 +1,6 @@
 <div align="center">
 <a href="https://demo.ragflow.io/">
-<img src="web/src/assets/logo-with-text.png" width="520" alt="ragflow logo">
+<img src="web/src/assets/logo-with-text.svg" width="520" alt="ragflow logo">
 </a>
 </div>

@ -22,7 +22,7 @@
        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.5">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.21.0">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -60,8 +60,8 @@

 ## 🔥 업데이트

+- 2025-10-15 조정된 데이터 파이프라인 지원.
 - 2025-08-08 OpenAI의 최신 GPT-5 시리즈 모델을 지원합니다.
- 2025-08-04 새로운 모델인 Kimi K2와 Grok 4를 포함하여 지원합니다.
 - 2025-08-01 에이전트 워크플로우와 MCP를 지원합니다.
 - 2025-05-23 Agent에 Python/JS 코드 실행기 구성 요소를 추가합니다.
 - 2025-05-05 언어 간 쿼리를 지원합니다.
@ -160,7 +160,7 @@
 > 모든 Docker 이미지는 x86 플랫폼을 위해 빌드되었습니다. 우리는 현재 ARM64 플랫폼을 위한 Docker 이미지를 제공하지 않습니다.
 > ARM64 플랫폼을 사용 중이라면, [시스템과 호환되는 Docker 이미지를 빌드하려면 이 가이드를 사용해 주세요](https://ragflow.io/docs/dev/build_docker_image).

-   > 아래 명령어는 RAGFlow Docker 이미지의 v0.20.5-slim 버전을 다운로드합니다. 다양한 RAGFlow 버전에 대한 설명은 다음 표를 참조하십시오. v0.20.5-slim과 다른 RAGFlow 버전을 다운로드하려면, docker/.env 파일에서 RAGFLOW_IMAGE 변수를 적절히 업데이트한 후 docker compose를 사용하여 서버를 시작하십시오. 예를 들어, 전체 버전인 v0.20.5을 다운로드하려면 RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.5로 설정합니다.
+   > 아래 명령어는 RAGFlow Docker 이미지의 v0.21.0-slim 버전을 다운로드합니다. 다양한 RAGFlow 버전에 대한 설명은 다음 표를 참조하십시오. v0.21.0-slim과 다른 RAGFlow 버전을 다운로드하려면, docker/.env 파일에서 RAGFLOW_IMAGE 변수를 적절히 업데이트한 후 docker compose를 사용하여 서버를 시작하십시오. 예를 들어, 전체 버전인 v0.21.0을 다운로드하려면 RAGFLOW_IMAGE=infiniflow/ragflow:v0.21.0로 설정합니다.

   ```bash
   $ cd ragflow/docker
@ -173,8 +173,8 @@

   | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
   | ----------------- | --------------- | --------------------- | ------------------------ |
-   | v0.20.5           | &approx;9       | :heavy_check_mark:    | Stable release           |
-   | v0.20.5-slim      | &approx;2       | ❌                    | Stable release           |
+   | v0.21.0           | &approx;9       | :heavy_check_mark:    | Stable release           |
+   | v0.21.0-slim      | &approx;2       | ❌                    | Stable release           |
   | nightly           | &approx;9       | :heavy_check_mark:    | _Unstable_ nightly build |
   | nightly-slim      | &approx;2       | ❌                     | _Unstable_ nightly build |

--- a/README_pt_br.md
+++ b/README_pt_br.md
@ -1,6 +1,6 @@
 <div align="center">
 <a href="https://demo.ragflow.io/">
-<img src="web/src/assets/logo-with-text.png" width="520" alt="ragflow logo">
+<img src="web/src/assets/logo-with-text.svg" width="520" alt="ragflow logo">
 </a>
 </div>

@ -22,7 +22,7 @@
        <img alt="Badge Estático" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.5">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.21.0">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Última%20Relese" alt="Última Versão">
@ -80,8 +80,8 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).

 ## 🔥 Últimas Atualizações

+- 10-15-2025 Suporte para pipelines de dados orquestrados.
 - 08-08-2025 Suporta a mais recente série GPT-5 da OpenAI.
- 04-08-2025 Suporta novos modelos, incluindo Kimi K2 e Grok 4.
 - 01-08-2025 Suporta fluxo de trabalho agente e MCP.
 - 23-05-2025 Adicione o componente executor de código Python/JS ao Agente.
 - 05-05-2025 Suporte a consultas entre idiomas.
@ -180,7 +180,7 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
 > Todas as imagens Docker são construídas para plataformas x86. Atualmente, não oferecemos imagens Docker para ARM64.
 > Se você estiver usando uma plataforma ARM64, por favor, utilize [este guia](https://ragflow.io/docs/dev/build_docker_image) para construir uma imagem Docker compatível com o seu sistema.

-    > O comando abaixo baixa a edição `v0.20.5-slim` da imagem Docker do RAGFlow. Consulte a tabela a seguir para descrições de diferentes edições do RAGFlow. Para baixar uma edição do RAGFlow diferente da `v0.20.5-slim`, atualize a variável `RAGFLOW_IMAGE` conforme necessário no **docker/.env** antes de usar `docker compose` para iniciar o servidor. Por exemplo: defina `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.5` para a edição completa `v0.20.5`.
+    > O comando abaixo baixa a edição `v0.21.0-slim` da imagem Docker do RAGFlow. Consulte a tabela a seguir para descrições de diferentes edições do RAGFlow. Para baixar uma edição do RAGFlow diferente da `v0.21.0-slim`, atualize a variável `RAGFLOW_IMAGE` conforme necessário no **docker/.env** antes de usar `docker compose` para iniciar o servidor. Por exemplo: defina `RAGFLOW_IMAGE=infiniflow/ragflow:v0.21.0` para a edição completa `v0.21.0`.

    ```bash
    $ cd ragflow/docker
@ -193,8 +193,8 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).

    | Tag da imagem RAGFlow | Tamanho da imagem (GB) | Possui modelos de incorporação? | Estável?                 |
    | --------------------- | ---------------------- | ------------------------------- | ------------------------ |
-    | v0.20.5               | ~9                     | :heavy_check_mark:              | Lançamento estável       |
-    | v0.20.5-slim          | ~2                     | ❌                              | Lançamento estável       |
+    | v0.21.0               | ~9                     | :heavy_check_mark:              | Lançamento estável       |
+    | v0.21.0-slim          | ~2                     | ❌                              | Lançamento estável       |
    | nightly               | ~9                     | :heavy_check_mark:              | _Instável_ build noturno |
    | nightly-slim          | ~2                     | ❌                               | _Instável_ build noturno |

--- a/README_tzh.md
+++ b/README_tzh.md
@ -1,6 +1,6 @@
 <div align="center">
 <a href="https://demo.ragflow.io/">
-<img src="web/src/assets/logo-with-text.png" width="350" alt="ragflow logo">
+<img src="web/src/assets/logo-with-text.svg" width="350" alt="ragflow logo">
 </a>
 </div>

@ -22,7 +22,7 @@
        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.5">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.21.0">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -83,8 +83,8 @@

 ## 🔥 近期更新

+- 2025-10-15 支援可編排的資料管道。
 - 2025-08-08 支援 OpenAI 最新的 GPT-5 系列模型。
- 2025-08-04 支援 Kimi K2 和 Grok 4 等模型.
 - 2025-08-01 支援 agentic workflow 和 MCP
 - 2025-05-23 為 Agent 新增 Python/JS 程式碼執行器元件。
 - 2025-05-05 支援跨語言查詢。
@ -183,7 +183,7 @@
 > 所有 Docker 映像檔都是為 x86 平台建置的。目前，我們不提供 ARM64 平台的 Docker 映像檔。
 > 如果您使用的是 ARM64 平台，請使用 [這份指南](https://ragflow.io/docs/dev/build_docker_image) 來建置適合您系統的 Docker 映像檔。

-   > 執行以下指令會自動下載 RAGFlow slim Docker 映像 `v0.20.5-slim`。請參考下表查看不同 Docker 發行版的說明。如需下載不同於 `v0.20.5-slim` 的 Docker 映像，請在執行 `docker compose` 啟動服務之前先更新 **docker/.env** 檔案內的 `RAGFLOW_IMAGE` 變數。例如，你可以透過設定 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.5` 來下載 RAGFlow 鏡像的 `v0.20.5` 完整發行版。
+   > 執行以下指令會自動下載 RAGFlow slim Docker 映像 `v0.21.0-slim`。請參考下表查看不同 Docker 發行版的說明。如需下載不同於 `v0.21.0-slim` 的 Docker 映像，請在執行 `docker compose` 啟動服務之前先更新 **docker/.env** 檔案內的 `RAGFLOW_IMAGE` 變數。例如，你可以透過設定 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.21.0` 來下載 RAGFlow 鏡像的 `v0.21.0` 完整發行版。

   ```bash
   $ cd ragflow/docker
@ -196,8 +196,8 @@

   | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
   | ----------------- | --------------- | --------------------- | ------------------------ |
-   | v0.20.5           | &approx;9       | :heavy_check_mark:    | Stable release           |
-   | v0.20.5-slim      | &approx;2       | ❌                    | Stable release           |
+   | v0.21.0           | &approx;9       | :heavy_check_mark:    | Stable release           |
+   | v0.21.0-slim      | &approx;2       | ❌                    | Stable release           |
   | nightly           | &approx;9       | :heavy_check_mark:    | _Unstable_ nightly build |
   | nightly-slim      | &approx;2       | ❌                     | _Unstable_ nightly build |

--- a/README_zh.md
+++ b/README_zh.md
@ -1,6 +1,6 @@
 <div align="center">
 <a href="https://demo.ragflow.io/">
-<img src="web/src/assets/logo-with-text.png" width="350" alt="ragflow logo">
+<img src="web/src/assets/logo-with-text.svg" width="350" alt="ragflow logo">
 </a>
 </div>

@ -22,7 +22,7 @@
        <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
    </a>
    <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.20.5">
+        <img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.21.0">
    </a>
    <a href="https://github.com/infiniflow/ragflow/releases/latest">
        <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@ -83,8 +83,8 @@

 ## 🔥 近期更新

- 2025-08-08 支持 OpenAI 最新的 GPT-5 系列模型.
- 2025-08-04 新增对 Kimi K2 和 Grok 4 等模型的支持.
+- 2025-10-15 支持可编排的数据管道。
+- 2025-08-08 支持 OpenAI 最新的 GPT-5 系列模型。
 - 2025-08-01 支持 agentic workflow 和 MCP。
 - 2025-05-23 Agent 新增 Python/JS 代码执行器组件。
 - 2025-05-05 支持跨语言查询。
@ -183,7 +183,7 @@
 > 请注意，目前官方提供的所有 Docker 镜像均基于 x86 架构构建，并不提供基于 ARM64 的 Docker 镜像。
 > 如果你的操作系统是 ARM64 架构，请参考[这篇文档](https://ragflow.io/docs/dev/build_docker_image)自行构建 Docker 镜像。

-   > 运行以下命令会自动下载 RAGFlow slim Docker 镜像 `v0.20.5-slim`。请参考下表查看不同 Docker 发行版的描述。如需下载不同于 `v0.20.5-slim` 的 Docker 镜像，请在运行 `docker compose` 启动服务之前先更新 **docker/.env** 文件内的 `RAGFLOW_IMAGE` 变量。比如，你可以通过设置 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.5` 来下载 RAGFlow 镜像的 `v0.20.5` 完整发行版。
+   > 运行以下命令会自动下载 RAGFlow slim Docker 镜像 `v0.21.0-slim`。请参考下表查看不同 Docker 发行版的描述。如需下载不同于 `v0.21.0-slim` 的 Docker 镜像，请在运行 `docker compose` 启动服务之前先更新 **docker/.env** 文件内的 `RAGFLOW_IMAGE` 变量。比如，你可以通过设置 `RAGFLOW_IMAGE=infiniflow/ragflow:v0.21.0` 来下载 RAGFlow 镜像的 `v0.21.0` 完整发行版。

   ```bash
   $ cd ragflow/docker
@ -196,8 +196,8 @@

   | RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?                  |
   | ----------------- | --------------- | --------------------- | ------------------------ |
-   | v0.20.5           | &approx;9       | :heavy_check_mark:    | Stable release           |
-   | v0.20.5-slim      | &approx;2       | ❌                    | Stable release           |
+   | v0.21.0           | &approx;9       | :heavy_check_mark:    | Stable release           |
+   | v0.21.0-slim      | &approx;2       | ❌                    | Stable release           |
   | nightly           | &approx;9       | :heavy_check_mark:    | _Unstable_ nightly build |
   | nightly-slim      | &approx;2       | ❌                     | _Unstable_ nightly build |

--- a/admin/build_cli_release.sh
+++ b/admin/build_cli_release.sh
@ -0,0 +1,47 @@
+#!/bin/bash
+
+set -e
+
+echo "🚀 Start building..."
+echo "================================"
+
+PROJECT_NAME="ragflow-cli"
+
+RELEASE_DIR="release"
+BUILD_DIR="dist"
+SOURCE_DIR="src"
+PACKAGE_DIR="ragflow_cli"
+
+echo "🧹 Clean old build folder..."
+rm -rf release/
+
+echo "📁 Prepare source code..."
+mkdir release/$PROJECT_NAME/$SOURCE_DIR -p
+cp pyproject.toml release/$PROJECT_NAME/pyproject.toml
+cp README.md release/$PROJECT_NAME/README.md
+
+mkdir release/$PROJECT_NAME/$SOURCE_DIR/$PACKAGE_DIR -p
+cp admin_client.py release/$PROJECT_NAME/$SOURCE_DIR/$PACKAGE_DIR/admin_client.py
+
+if [ -d "release/$PROJECT_NAME/$SOURCE_DIR" ]; then
+    echo "✅ source dir: release/$PROJECT_NAME/$SOURCE_DIR"
+else
+    echo "❌ source dir not exist: release/$PROJECT_NAME/$SOURCE_DIR"
+    exit 1
+fi
+
+echo "🔨 Make build file..."
+cd release/$PROJECT_NAME
+export PYTHONPATH=$(pwd)
+python -m build
+
+echo "✅ check build result..."
+if [ -d "$BUILD_DIR" ]; then
+    echo "📦 Package generated:"
+    ls -la $BUILD_DIR/
+else
+    echo "❌ Build Failed: $BUILD_DIR not exist."
+    exit 1
+fi
+
+echo "🎉 Build finished successfully!"
--- a/admin/client/README.md
+++ b/admin/client/README.md
@ -15,22 +15,48 @@ It consists of a server-side Service and a command-line client (CLI), both imple
 - **Admin Service**: A backend service that interfaces with the RAGFlow system to execute administrative operations and monitor its status.
 - **Admin CLI**: A command-line interface that allows users to connect to the Admin Service and issue commands for system management.

+
+
 ### Starting the Admin Service

-1.  Before start Admin Service, please make sure RAGFlow system is already started.
+#### Launching from source code
+
+1. Before start Admin Service, please make sure RAGFlow system is already started.
+
+2. Launch from source code:
+
+   ```bash
+   python admin/server/admin_server.py
+   ```
+   The service will start and listen for incoming connections from the CLI on the configured port. 
+
+#### Using docker image
+
+1. Before startup, please configure the `docker_compose.yml`  file to enable admin server:
+
+   ```bash
+   command:
+     - --enable-adminserver
+   ```
+
+2. Start the containers, the service will start and listen for incoming connections from the CLI on the configured port.
+

-2.  Run the service script:
-    ```bash
-    python admin/admin_server.py
-    ```
-    The service will start and listen for incoming connections from the CLI on the configured port.

 ### Using the Admin CLI

 1.  Ensure the Admin Service is running.
-2.  Launch the CLI client:
+2.  Install ragflow-cli.
    ```bash
-    python admin/admin_client.py -h 0.0.0.0 -p 9381
+    pip install ragflow-cli
+    ```
+3.  Launch the CLI client:
+    ```bash
+    ragflow-cli -h 0.0.0.0 -p 9381
+    ```
+	Enter superuser's password to login. Default password is `admin`.
+
+

 ## Supported Commands

@ -42,12 +68,7 @@ Commands are case-insensitive and must be terminated with a semicolon (`;`).
    -   Lists all available services within the RAGFlow system.
 -   `SHOW SERVICE <id>;`
    -   Shows detailed status information for the service identified by `<id>`.
-   `STARTUP SERVICE <id>;`
-    -   Attempts to start the service identified by `<id>`.
-   `SHUTDOWN SERVICE <id>;`
-    -   Attempts to gracefully shut down the service identified by `<id>`.
-   `RESTART SERVICE <id>;`
-    -   Attempts to restart the service identified by `<id>`.
+

 ### User Management Commands

@ -55,10 +76,17 @@ Commands are case-insensitive and must be terminated with a semicolon (`;`).
    -   Lists all users known to the system.
 -   `SHOW USER '<username>';`
    -   Shows details and permissions for the specified user. The username must be enclosed in single or double quotes.
+
+- `CREATE USER <username> <password>;`
+  - Create user by username and password. The username and password must be enclosed in single or double quotes.
+
 -   `DROP USER '<username>';`
    -   Removes the specified user from the system. Use with caution.
 -   `ALTER USER PASSWORD '<username>' '<new_password>';`
    -   Changes the password for the specified user.
+-   `ALTER USER ACTIVE <username> <on/off>;`
+    -   Changes the user to active or inactive.
+

 ### Data and Agent Commands

--- a/admin/client/admin_client.py
+++ b/admin/client/admin_client.py
@ -16,14 +16,14 @@

 import argparse
 import base64
+from cmd import Cmd

 from Cryptodome.PublicKey import RSA
 from Cryptodome.Cipher import PKCS1_v1_5 as Cipher_pkcs1_v1_5
 from typing import Dict, List, Any
-from lark import Lark, Transformer, Tree
+from lark import Lark, Transformer, Tree, Token
 import requests
 from requests.auth import HTTPBasicAuth
-from api.common.base64 import encode_to_base64

 GRAMMAR = r"""
 start: command
@ -192,12 +192,59 @@ def encrypt(input_string):
    return base64.b64encode(cipher_text).decode("utf-8")


-class AdminCommandParser:
+def encode_to_base64(input_string):
+    base64_encoded = base64.b64encode(input_string.encode('utf-8'))
+    return base64_encoded.decode('utf-8')
+
+
+class AdminCLI(Cmd):
    def __init__(self):
+        super().__init__()
        self.parser = Lark(GRAMMAR, start='start', parser='lalr', transformer=AdminTransformer())
        self.command_history = []
+        self.is_interactive = False
+        self.admin_account = "admin@ragflow.io"
+        self.admin_password: str = "admin"
+        self.host: str = ""
+        self.port: int = 0

-    def parse_command(self, command_str: str) -> Dict[str, Any]:
+    intro = r"""Type "\h" for help."""
+    prompt = "admin> "
+
+    def onecmd(self, command: str) -> bool:
+        try:
+            # print(f"command: {command}")
+            result = self.parse_command(command)
+
+            # if 'type' in result and result.get('type') == 'empty':
+            #     return False
+
+            if isinstance(result, dict):
+                if 'type' in result and result.get('type') == 'empty':
+                    return False
+
+            self.execute_command(result)
+
+            if isinstance(result, Tree):
+                return False
+
+            if result.get('type') == 'meta' and result.get('command') in ['q', 'quit', 'exit']:
+                return True
+
+        except KeyboardInterrupt:
+            print("\nUse '\\q' to quit")
+        except EOFError:
+            print("\nGoodbye!")
+            return True
+        return False
+
+    def emptyline(self) -> bool:
+        return False
+
+    def default(self, line: str) -> bool:
+        return self.onecmd(line)
+
+    def parse_command(self, command_str: str) -> dict[str, str] | Tree[Token]:
        if not command_str.strip():
            return {'type': 'empty'}

@ -209,16 +256,6 @@ class AdminCommandParser:
        except Exception as e:
            return {'type': 'error', 'message': f'Parse error: {str(e)}'}

-
-class AdminCLI:
-    def __init__(self):
-        self.parser = AdminCommandParser()
-        self.is_interactive = False
-        self.admin_account = "admin@ragflow.io"
-        self.admin_password: str = "admin"
-        self.host: str = ""
-        self.port: int = 0
-
    def verify_admin(self, args):

        conn_info = self._parse_connection_args(args)
@ -267,10 +304,25 @@ class AdminCLI:
        columns = list(data[0].keys())
        col_widths = {}

+        def get_string_width(text):
+            half_width_chars = (
+                " !\"#$%&'()*+,-./0123456789:;<=>?@"
+                "ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`"
+                "abcdefghijklmnopqrstuvwxyz{|}~"
+                "\t\n\r"
+            )
+            width = 0
+            for char in text:
+                if char in half_width_chars:
+                    width += 1
+                else:
+                    width += 2
+            return width
+
        for col in columns:
-            max_width = len(str(col))
+            max_width = get_string_width(str(col))
            for item in data:
-                value_len = len(str(item.get(col, '')))
+                value_len = get_string_width(str(item.get(col, '')))
                if value_len > max_width:
                    max_width = value_len
            col_widths[col] = max(2, max_width)
@ -308,7 +360,7 @@ class AdminCLI:
                    continue

                print(f"command: {command}")
-                result = self.parser.parse_command(command)
+                result = self.parse_command(command)
                self.execute_command(result)

                if isinstance(result, Tree):
@ -595,10 +647,17 @@ def main():
        /_/ |_/_/  |_\____/_/   /_/\____/|__/|__/  /_/  |_\__,_/_/ /_/ /_/_/_/ /_/ 
        """)
        if cli.verify_admin(sys.argv):
-            cli.run_interactive()
+            cli.cmdloop()
    else:
+        print(r"""
+            ____  ___   ______________                 ___       __          _     
+           / __ \/   | / ____/ ____/ /___ _      __   /   | ____/ /___ ___  (_)___ 
+          / /_/ / /| |/ / __/ /_  / / __ \ | /| / /  / /| |/ __  / __ `__ \/ / __ \
+         / _, _/ ___ / /_/ / __/ / / /_/ / |/ |/ /  / ___ / /_/ / / / / / / / / / /
+        /_/ |_/_/  |_\____/_/   /_/\____/|__/|__/  /_/  |_\__,_/_/ /_/ /_/_/_/ /_/ 
+        """)
        if cli.verify_admin(sys.argv):
-            cli.run_interactive()
+            cli.cmdloop()
            # cli.run_single_command(sys.argv[1:])


--- a/admin/client/pyproject.toml
+++ b/admin/client/pyproject.toml
@ -0,0 +1,24 @@
+[project]
+name = "ragflow-cli"
+version = "0.21.0.dev5"
+description = "Admin Service's client of [RAGFlow](https://github.com/infiniflow/ragflow). The Admin Service provides user management and system monitoring. "
+authors = [{ name = "Lynn", email = "lynn_inf@hotmail.com" }]
+license = { text = "Apache License, Version 2.0" }
+readme = "README.md"
+requires-python = ">=3.10,<3.13"
+dependencies = [
+    "requests>=2.30.0,<3.0.0",
+    "beartype>=0.18.5,<0.19.0",
+    "pycryptodomex>=3.10.0",
+    "lark>=1.1.0",
+]
+
+[dependency-groups]
+test = [
+    "pytest>=8.3.5",
+    "requests>=2.32.3",
+    "requests-toolbelt>=1.0.0",
+]
+
+[project.scripts]
+ragflow-cli = "admin_client:main"
--- a/admin/pyproject.toml
+++ b/admin/pyproject.toml
@ -0,0 +1,24 @@
+[project]
+name = "ragflow-cli"
+version = "0.21.0.dev2"
+description = "Admin Service's client of [RAGFlow](https://github.com/infiniflow/ragflow). The Admin Service provides user management and system monitoring. "
+authors = [{ name = "Lynn", email = "lynn_inf@hotmail.com" }]
+license = { text = "Apache License, Version 2.0" }
+readme = "README.md"
+requires-python = ">=3.10,<3.13"
+dependencies = [
+    "requests>=2.30.0,<3.0.0",
+    "beartype>=0.18.5,<0.19.0",
+    "pycryptodomex>=3.10.0",
+    "lark>=1.1.0",
+]
+
+[dependency-groups]
+test = [
+    "pytest>=8.3.5",
+    "requests>=2.32.3",
+    "requests-toolbelt>=1.0.0",
+]
+
+[project.scripts]
+ragflow-cli = "ragflow_cli.admin_client:main"
--- a/admin/server/admin_server.py
+++ b/admin/server/admin_server.py
@ -26,7 +26,7 @@ from routes import admin_bp
 from api.utils.log_utils import init_root_logger
 from api.constants import SERVICE_CONF
 from api import settings
-from config import load_configurations, SERVICE_CONFIGS
+from admin.server.config import load_configurations, SERVICE_CONFIGS

 stop_event = threading.Event()

--- a/admin/server/auth.py
+++ b/admin/server/auth.py
--- a/admin/server/config.py
+++ b/admin/server/config.py
--- a/admin/server/exceptions.py
+++ b/admin/server/exceptions.py
--- a/admin/server/models.py
+++ b/admin/server/models.py
--- a/admin/server/responses.py
+++ b/admin/server/responses.py
--- a/admin/server/routes.py
+++ b/admin/server/routes.py
@ -17,7 +17,7 @@

 from flask import Blueprint, request

-from auth import login_verify
+from admin.server.auth import login_verify
 from responses import success_response, error_response
 from services import UserMgr, ServiceMgr, UserServiceMgr
 from api.common.exceptions import AdminException
--- a/admin/server/services.py
+++ b/admin/server/services.py
@ -27,7 +27,7 @@ from api.utils.crypt import decrypt
 from api.utils import health_utils

 from api.common.exceptions import AdminException, UserAlreadyExistsError, UserNotFoundError
-from config import SERVICE_CONFIGS
+from admin.server.config import SERVICE_CONFIGS


 class UserMgr:
@ -177,8 +177,17 @@ class ServiceMgr:
    def get_all_services():
        result = []
        configs = SERVICE_CONFIGS.configs
-        for config in configs:
-            result.append(config.to_dict())
+        for service_id, config in enumerate(configs):
+            config_dict = config.to_dict()
+            try:
+                service_detail = ServiceMgr.get_service_details(service_id)
+                if service_detail['alive']:
+                    config_dict['status'] = 'Alive'
+                else:
+                    config_dict['status'] = 'Timeout'
+            except Exception:
+                config_dict['status'] = 'Timeout'
+            result.append(config_dict)
        return result

    @staticmethod
--- a/agent/canvas.py
+++ b/agent/canvas.py
@ -203,7 +203,6 @@ class Canvas(Graph):
            self.history = []
            self.retrieval = []
            self.memory = []
-
        for k in self.globals.keys():
            if isinstance(self.globals[k], str):
                self.globals[k] = ""
@ -292,7 +291,6 @@ class Canvas(Graph):
                    "thoughts": self.get_component_thoughts(self.path[i])
                })
            _run_batch(idx, to)
-
            # post processing of components invocation
            for i in range(idx, to):
                cpn = self.get_component(self.path[i])
@ -393,7 +391,6 @@ class Canvas(Graph):
                self.path = path
                yield decorate("user_inputs", {"inputs": another_inputs, "tips": tips})
                return
-
        self.path = self.path[:idx]
        if not self.error:
            yield decorate("workflow_finished",
--- a/agent/component/agent_with_tools.py
+++ b/agent/component/agent_with_tools.py
@ -346,7 +346,11 @@ Respond immediately with your final comprehensive answer.

        return "Error occurred."

-    def reset(self):
+    def reset(self, temp=False):
+        """
+        Reset all tools if they have a reset method. This avoids errors for tools like MCPToolCallSession.
+        """
        for k, cpn in self.tools.items():
-            cpn.reset()
+            if hasattr(cpn, "reset") and callable(cpn.reset):
+                cpn.reset()

--- a/agent/templates/advanced_ingestion_pipeline.json
+++ b/agent/templates/advanced_ingestion_pipeline.json
--- a/agent/templates/chunk_summary.json
+++ b/agent/templates/chunk_summary.json
--- a/agent/templates/image_lingo.json
+++ b/agent/templates/image_lingo.json
--- a/agent/templates/stock_research_report.json
+++ b/agent/templates/stock_research_report.json
--- a/agent/templates/title_chunker.json
+++ b/agent/templates/title_chunker.json
--- a/agent/tools/exesql.py
+++ b/agent/tools/exesql.py
@ -53,12 +53,13 @@ class ExeSQLParam(ToolParamBase):
        self.max_records = 1024

    def check(self):
-        self.check_valid_value(self.db_type, "Choose DB type", ['mysql', 'postgres', 'mariadb', 'mssql', 'IBM DB2'])
+        self.check_valid_value(self.db_type, "Choose DB type", ['mysql', 'postgres', 'mariadb', 'mssql', 'IBM DB2', 'trino'])
        self.check_empty(self.database, "Database name")
        self.check_empty(self.username, "database username")
        self.check_empty(self.host, "IP Address")
        self.check_positive_integer(self.port, "IP Port")
-        self.check_empty(self.password, "Database password")
+        if self.db_type != "trino":
+            self.check_empty(self.password, "Database password")
        self.check_positive_integer(self.max_records, "Maximum number of records")
        if self.database == "rag_flow":
            if self.host == "ragflow-mysql":
@ -123,6 +124,45 @@ class ExeSQL(ToolBase, ABC):
                    r'PWD=' + self._param.password
            )
            db = pyodbc.connect(conn_str)
+        elif self._param.db_type == 'trino':
+            try:
+                import trino
+                from trino.auth import BasicAuthentication
+            except Exception:
+                raise Exception("Missing dependency 'trino'. Please install: pip install trino")
+
+            def _parse_catalog_schema(db: str):
+                if not db:
+                    return None, None
+                if "." in db:
+                    c, s = db.split(".", 1)
+                elif "/" in db:
+                    c, s = db.split("/", 1)
+                else:
+                    c, s = db, "default"
+                return c, s
+
+            catalog, schema = _parse_catalog_schema(self._param.database)
+            if not catalog:
+                raise Exception("For Trino, `database` must be 'catalog.schema' or at least 'catalog'.")
+
+            http_scheme = "https" if os.environ.get("TRINO_USE_TLS", "0") == "1" else "http"
+            auth = None
+            if http_scheme == "https" and self._param.password:
+                auth = BasicAuthentication(self._param.username, self._param.password)
+
+            try:
+                db = trino.dbapi.connect(
+                    host=self._param.host,
+                    port=int(self._param.port or 8080),
+                    user=self._param.username or "ragflow",
+                    catalog=catalog,
+                    schema=schema or "default",
+                    http_scheme=http_scheme,
+                    auth=auth
+                )
+            except Exception as e:
+                raise Exception("Database Connection Failed! \n" + str(e))
        elif self._param.db_type == 'IBM DB2':
            import ibm_db
            conn_str = (
--- a/agent/tools/pubmed.py
+++ b/agent/tools/pubmed.py
@ -85,13 +85,7 @@ class PubMed(ToolBase, ABC):
                self._retrieve_chunks(pubmedcnt.findall("PubmedArticle"),
                                      get_title=lambda child: child.find("MedlineCitation").find("Article").find("ArticleTitle").text,
                                      get_url=lambda child: "https://pubmed.ncbi.nlm.nih.gov/" + child.find("MedlineCitation").find("PMID").text,
-                                      get_content=lambda child: child.find("MedlineCitation") \
-                                                                    .find("Article") \
-                                                                    .find("Abstract") \
-                                                                    .find("AbstractText").text \
-                                                                    if child.find("MedlineCitation")\
-                                                                            .find("Article").find("Abstract")  \
-                                                                    else "No abstract available")
+                                      get_content=lambda child: self._format_pubmed_content(child),)
                return self.output("formalized_content")
            except Exception as e:
                last_e = e
@ -104,5 +98,50 @@ class PubMed(ToolBase, ABC):

        assert False, self.output()

+    def _format_pubmed_content(self, child):
+        """Extract structured reference info from PubMed XML"""
+        def safe_find(path):
+            node = child
+            for p in path.split("/"):
+                if node is None:
+                    return None
+                node = node.find(p)
+            return node.text if node is not None and node.text else None
+
+        title = safe_find("MedlineCitation/Article/ArticleTitle") or "No title"
+        abstract = safe_find("MedlineCitation/Article/Abstract/AbstractText") or "No abstract available"
+        journal = safe_find("MedlineCitation/Article/Journal/Title") or "Unknown Journal"
+        volume = safe_find("MedlineCitation/Article/Journal/JournalIssue/Volume") or "-"
+        issue = safe_find("MedlineCitation/Article/Journal/JournalIssue/Issue") or "-"
+        pages = safe_find("MedlineCitation/Article/Pagination/MedlinePgn") or "-"
+
+        # Authors
+        authors = []
+        for author in child.findall(".//AuthorList/Author"):
+            lastname = safe_find("LastName") or ""
+            forename = safe_find("ForeName") or ""
+            fullname = f"{forename} {lastname}".strip()
+            if fullname:
+                authors.append(fullname)
+        authors_str = ", ".join(authors) if authors else "Unknown Authors"
+
+        # DOI
+        doi = None
+        for eid in child.findall(".//ArticleId"):
+            if eid.attrib.get("IdType") == "doi":
+                doi = eid.text
+                break
+
+        return (
+            f"Title: {title}\n"
+            f"Authors: {authors_str}\n"
+            f"Journal: {journal}\n"
+            f"Volume: {volume}\n"
+            f"Issue: {issue}\n"
+            f"Pages: {pages}\n"
+            f"DOI: {doi or '-'}\n"
+            f"Abstract: {abstract.strip()}"
+        )
+
    def thoughts(self) -> str:
        return "Looking for scholarly papers on `{}`,” prioritising reputable sources.".format(self.get_input().get("query", "-_-!"))
--- a/agent/tools/retrieval.py
+++ b/agent/tools/retrieval.py
@ -57,6 +57,7 @@ class RetrievalParam(ToolParamBase):
        self.empty_response = ""
        self.use_kg = False
        self.cross_languages = []
+        self.toc_enhance = False

    def check(self):
        self.check_decimal_float(self.similarity_threshold, "[Retrieval] Similarity threshold")
@ -134,6 +135,11 @@ class Retrieval(ToolBase, ABC):
                rerank_mdl=rerank_mdl,
                rank_feature=label_question(query, kbs),
            )
+            if self._param.toc_enhance:
+                chat_mdl = LLMBundle(self._canvas._tenant_id, LLMType.CHAT)
+                cks = settings.retriever.retrieval_by_toc(query, kbinfos["chunks"], [kb.tenant_id for kb in kbs], chat_mdl, self._param.top_n)
+                if cks:
+                    kbinfos["chunks"] = cks
            if self._param.use_kg:
                ck = settings.kg_retriever.retrieval(query,
                                                       [kb.tenant_id for kb in kbs],
--- a/api/apps/canvas_app.py
+++ b/api/apps/canvas_app.py
@ -51,7 +51,7 @@ from rag.utils.redis_conn import REDIS_CONN
@manager.route('/templates', methods=['GET'])  # noqa: F821
@login_required
 def templates():
-    return get_json_result(data=[c.to_dict() for c in CanvasTemplateService.query(canvas_category=CanvasCategory.Agent)])
+    return get_json_result(data=[c.to_dict() for c in CanvasTemplateService.get_all()])


@manager.route('/rm', methods=['POST'])  # noqa: F821
@ -409,6 +409,49 @@ def test_db_connect():
            ibm_db.fetch_assoc(stmt)
            ibm_db.close(conn)
            return get_json_result(data="Database Connection Successful!")
+        elif req["db_type"] == 'trino':
+            def _parse_catalog_schema(db: str):
+                if not db:
+                    return None, None
+                if "." in db:
+                    c, s = db.split(".", 1)
+                elif "/" in db:
+                    c, s = db.split("/", 1)
+                else:
+                    c, s = db, "default"
+                return c, s
+            try:
+                import trino
+                import os
+                from trino.auth import BasicAuthentication
+            except Exception:
+                return server_error_response("Missing dependency 'trino'. Please install: pip install trino")
+
+            catalog, schema = _parse_catalog_schema(req["database"])
+            if not catalog:
+                return server_error_response("For Trino, 'database' must be 'catalog.schema' or at least 'catalog'.")
+            
+            http_scheme = "https" if os.environ.get("TRINO_USE_TLS", "0") == "1" else "http"
+
+            auth = None
+            if http_scheme == "https" and req.get("password"):
+                auth = BasicAuthentication(req.get("username") or "ragflow", req["password"])
+
+            conn = trino.dbapi.connect(
+                host=req["host"],
+                port=int(req["port"] or 8080),
+                user=req["username"] or "ragflow",
+                catalog=catalog,
+                schema=schema or "default",
+                http_scheme=http_scheme,
+                auth=auth
+            )
+            cur = conn.cursor()
+            cur.execute("SELECT 1")
+            cur.fetchall()
+            cur.close()
+            conn.close()
+            return get_json_result(data="Database Connection Successful!")
        else:
            return server_error_response("Unsupported database type.")
        if req["db_type"] != 'mssql':
--- a/api/apps/document_app.py
+++ b/api/apps/document_app.py
@ -568,7 +568,7 @@ def change_parser():

    def reset_doc():
        nonlocal doc
-        e = DocumentService.update_by_id(doc.id, {"parser_id": req["parser_id"], "progress": 0, "progress_msg": "", "run": TaskStatus.UNSTART.value})
+        e = DocumentService.update_by_id(doc.id, {"pipeline_id": req["pipeline_id"], "parser_id": req["parser_id"], "progress": 0, "progress_msg": "", "run": TaskStatus.UNSTART.value})
        if not e:
            return get_data_error_result(message="Document not found!")
        if doc.token_num > 0:
--- a/api/apps/kb_app.py
+++ b/api/apps/kb_app.py
@ -36,6 +36,7 @@ from api import settings
 from rag.nlp import search
 from api.constants import DATASET_NAME_LIMIT
 from rag.settings import PAGERANK_FLD
+from rag.utils.redis_conn import REDIS_CONN
 from rag.utils.storage_factory import STORAGE_IMPL


@ -187,6 +188,9 @@ def detail():
            return get_data_error_result(
                message="Can't find this knowledgebase!")
        kb["size"] = DocumentService.get_total_size_by_kb_id(kb_id=kb["id"],keywords="", run_status=[], types=[])
+        for key in ["graphrag_task_finish_at", "raptor_task_finish_at", "mindmap_task_finish_at"]:
+            if finish_at := kb.get(key):
+                kb[key] = finish_at.strftime("%Y-%m-%d %H:%M:%S")
        return get_json_result(data=kb)
    except Exception as e:
        return server_error_response(e)
@ -760,18 +764,25 @@ def delete_kb_task():
    match pipeline_task_type:
        case PipelineTaskType.GRAPH_RAG:
            settings.docStoreConn.delete({"knowledge_graph_kwd": ["graph", "subgraph", "entity", "relation"]}, search.index_name(kb.tenant_id), kb_id)
-            kb_task_id = "graphrag_task_id"
+            kb_task_id_field = "graphrag_task_id"
+            task_id = kb.graphrag_task_id
            kb_task_finish_at = "graphrag_task_finish_at"
        case PipelineTaskType.RAPTOR:
-            kb_task_id = "raptor_task_id"
+            kb_task_id_field = "raptor_task_id"
+            task_id = kb.raptor_task_id
            kb_task_finish_at = "raptor_task_finish_at"
        case PipelineTaskType.MINDMAP:
-            kb_task_id = "mindmap_task_id"
+            kb_task_id_field = "mindmap_task_id"
+            task_id = kb.mindmap_task_id
            kb_task_finish_at = "mindmap_task_finish_at"
        case _:
            return get_error_data_result(message="Internal Error: Invalid task type")

-    ok = KnowledgebaseService.update_by_id(kb_id, {kb_task_id: "", kb_task_finish_at: None})
+    def cancel_task(task_id):
+        REDIS_CONN.set(f"{task_id}-cancel", "x")
+    cancel_task(task_id)
+
+    ok = KnowledgebaseService.update_by_id(kb_id, {kb_task_id_field: "", kb_task_finish_at: None})
    if not ok:
        return server_error_response(f"Internal error: cannot delete task {pipeline_task_type}")

--- a/api/apps/sdk/dify_retrieval.py
+++ b/api/apps/sdk/dify_retrieval.py
@ -31,6 +31,89 @@ from api.db.services.dialog_service import meta_filter, convert_conditions
@apikey_required
@validate_request("knowledge_id", "query")
 def retrieval(tenant_id):
+    """
+    Dify-compatible retrieval API
+    ---
+    tags:
+      - SDK
+    security:
+      - ApiKeyAuth: []
+    parameters:
+      - in: body
+        name: body
+        required: true
+        schema:
+          type: object
+          required:
+            - knowledge_id
+            - query
+          properties:
+            knowledge_id:
+              type: string
+              description: Knowledge base ID
+            query:
+              type: string
+              description: Query text
+            use_kg:
+              type: boolean
+              description: Whether to use knowledge graph
+              default: false
+            retrieval_setting:
+              type: object
+              description: Retrieval configuration
+              properties:
+                score_threshold:
+                  type: number
+                  description: Similarity threshold
+                  default: 0.0
+                top_k:
+                  type: integer
+                  description: Number of results to return
+                  default: 1024
+            metadata_condition:
+              type: object
+              description: Metadata filter condition
+              properties:
+                conditions:
+                  type: array
+                  items:
+                    type: object
+                    properties:
+                      name:
+                        type: string
+                        description: Field name
+                      comparison_operator:
+                        type: string
+                        description: Comparison operator
+                      value:
+                        type: string
+                        description: Field value
+    responses:
+      200:
+        description: Retrieval succeeded
+        schema:
+          type: object
+          properties:
+            records:
+              type: array
+              items:
+                type: object
+                properties:
+                  content:
+                    type: string
+                    description: Content text
+                  score:
+                    type: number
+                    description: Similarity score
+                  title:
+                    type: string
+                    description: Document title
+                  metadata:
+                    type: object
+                    description: Metadata info
+      404:
+        description: Knowledge base or document not found
+    """
    req = request.json
    question = req["query"]
    kb_id = req["knowledge_id"]
--- a/api/apps/sdk/doc.py
+++ b/api/apps/sdk/doc.py
@ -458,7 +458,7 @@ def list_docs(dataset_id, tenant_id):
        required: false
        default: true
        description: Order in descending.
-    - in: query
+      - in: query
        name: create_time_from
        type: integer
        required: false
--- a/api/apps/sdk/files.py
+++ b/api/apps/sdk/files.py
@ -62,22 +62,22 @@ def upload(tenant_id):
          type: object
          properties:
            data:
-            type: array
-            items:
-              type: object
-              properties:
-                id:
-                  type: string
-                  description: File ID
-                name:
-                  type: string
-                  description: File name
-                size:
-                  type: integer
-                  description: File size in bytes
-                type:
-                  type: string
-                  description: File type (e.g., document, folder)
+              type: array
+              items:
+                type: object
+                properties:
+                  id:
+                    type: string
+                    description: File ID
+                  name:
+                    type: string
+                    description: File name
+                  size:
+                    type: integer
+                    description: File size in bytes
+                  type:
+                    type: string
+                    description: File type (e.g., document, folder)
    """
    pf_id = request.form.get("parent_id")

--- a/api/db/db_models.py
+++ b/api/db/db_models.py
@ -641,7 +641,7 @@ class TenantLLM(DataBaseModel):
    llm_factory = CharField(max_length=128, null=False, help_text="LLM factory name", index=True)
    model_type = CharField(max_length=128, null=True, help_text="LLM, Text Embedding, Image2Text, ASR", index=True)
    llm_name = CharField(max_length=128, null=True, help_text="LLM name", default="", index=True)
-    api_key = CharField(max_length=2048, null=True, help_text="API KEY", index=True)
+    api_key = TextField(null=True, help_text="API KEY")
    api_base = CharField(max_length=255, null=True, help_text="API Base")
    max_tokens = IntegerField(default=8192, index=True)
    used_tokens = IntegerField(default=0, index=True)
@ -1142,4 +1142,8 @@ def migrate_db():
        migrate(migrator.add_column("knowledgebase", "mindmap_task_finish_at", CharField(null=True)))
    except Exception:
        pass
+    try:
+        migrate(migrator.alter_column_type("tenant_llm", "api_key", TextField(null=True, help_text="API KEY")))
+    except Exception:
+        pass
    logging.disable(logging.NOTSET)
--- a/api/db/services/canvas_service.py
+++ b/api/db/services/canvas_service.py
@ -143,15 +143,12 @@ class UserCanvasService(CommonService):
        ]
        if keywords:
            agents = cls.model.select(*fields).join(User, on=(cls.model.user_id == User.id)).where(
-                cls.model.user_id.in_(joined_tenant_ids),
-                fn.LOWER(cls.model.title).contains(keywords.lower())
-                #(((cls.model.user_id.in_(joined_tenant_ids)) & (cls.model.permission == TenantPermission.TEAM.value)) | (cls.model.user_id == user_id)),
-                #(fn.LOWER(cls.model.title).contains(keywords.lower()))
+                (((cls.model.user_id.in_(joined_tenant_ids)) & (cls.model.permission == TenantPermission.TEAM.value)) | (cls.model.user_id == user_id)),
+                (fn.LOWER(cls.model.title).contains(keywords.lower()))
            )
        else:
            agents = cls.model.select(*fields).join(User, on=(cls.model.user_id == User.id)).where(
-                cls.model.user_id.in_(joined_tenant_ids)
-                #(((cls.model.user_id.in_(joined_tenant_ids)) & (cls.model.permission == TenantPermission.TEAM.value)) | (cls.model.user_id == user_id))
+                (((cls.model.user_id.in_(joined_tenant_ids)) & (cls.model.permission == TenantPermission.TEAM.value)) | (cls.model.user_id == user_id))
            )
        if canvas_category:
            agents = agents.where(cls.model.canvas_category == canvas_category)
--- a/api/db/services/dialog_service.py
+++ b/api/db/services/dialog_service.py
@ -466,6 +466,10 @@ def chat(dialog, messages, stream=True, **kwargs):
                    rerank_mdl=rerank_mdl,
                    rank_feature=label_question(" ".join(questions), kbs),
                )
+                if prompt_config.get("toc_enhance"):
+                    cks = retriever.retrieval_by_toc(" ".join(questions), kbinfos["chunks"], tenant_ids, chat_mdl, dialog.top_n)
+                    if cks:
+                        kbinfos["chunks"] = cks
            if prompt_config.get("tavily_api_key"):
                tav = Tavily(prompt_config["tavily_api_key"])
                tav_res = tav.retrieve_chunks(" ".join(questions))
--- a/api/db/services/knowledgebase_service.py
+++ b/api/db/services/knowledgebase_service.py
@ -397,9 +397,10 @@ class KnowledgebaseService(CommonService):
        else:
            kbs = kbs.order_by(cls.model.getter_by(orderby).asc())

+        total = kbs.count()
        kbs = kbs.paginate(page_number, items_per_page)

-        return list(kbs.dicts()), kbs.count()
+        return list(kbs.dicts()), total

    @classmethod
    @DB.connection_context()
--- a/api/utils/api_utils.py
+++ b/api/utils/api_utils.py
@ -51,9 +51,6 @@ from api import settings
 from api.constants import REQUEST_MAX_WAIT_SEC, REQUEST_WAIT_SEC
 from api.db import ActiveEnum
 from api.db.db_models import APIToken
-from api.db.services import UserService
-from api.db.services.llm_service import LLMService
-from api.db.services.tenant_llm_service import TenantLLMService
 from api.utils.json import CustomJSONEncoder, json_dumps
 from api.utils import get_uuid
 from rag.utils.mcp_tool_call_conn import MCPToolCallSession, close_multiple_mcp_toolcall_sessions
@ -154,10 +151,12 @@ def get_data_error_result(code=settings.RetCode.DATA_ERROR, message="Sorry! Data
 def server_error_response(e):
    logging.exception(e)
    try:
-        if e.code == 401:
-            return get_json_result(code=401, message=repr(e))
-    except BaseException:
-        pass
+        msg = repr(e).lower()
+        if getattr(e, "code", None) == 401 or ("unauthorized" in msg) or ("401" in msg):
+            return get_json_result(code=settings.RetCode.UNAUTHORIZED, message=repr(e))
+    except Exception as ex:
+        logging.warning(f"error checking authorization: {ex}")
+
    if len(e.args) > 1:
        try:
            serialized_data = serialize_for_json(e.args[1])
@ -239,6 +238,7 @@ def not_allowed_parameters(*params):
 def active_required(f):
    @wraps(f)
    def wrapper(*args, **kwargs):
+        from api.db.services import UserService
        user_id = current_user.id
        usr = UserService.filter_by_id(user_id)
        # check is_active
@ -544,6 +544,8 @@ def check_duplicate_ids(ids, id_type="item"):


 def verify_embedding_availability(embd_id: str, tenant_id: str) -> tuple[bool, Response | None]:
+    from api.db.services.llm_service import LLMService
+    from api.db.services.tenant_llm_service import TenantLLMService
    """
    Verifies availability of an embedding model for a specific tenant.

--- a/conf/llm_factories.json
+++ b/conf/llm_factories.json
@ -803,6 +803,12 @@
                    "tags": "TEXT EMBEDDING",
                    "max_tokens": 512,
                    "model_type": "embedding"
+                },
+                {
+                    "llm_name": "glm-asr",
+                    "tags": "SPEECH2TEXT",
+                    "max_tokens": 4096,
+                    "model_type": "speech2text"
                }
            ]
        },
@ -2816,6 +2822,13 @@
            "tags": "LLM,TEXT EMBEDDING,TEXT RE-RANK,IMAGE2TEXT",
            "status": "1",
            "llm": [
+                {
+                    "llm_name":"THUDM/GLM-4.1V-9B-Thinking",
+                    "tags":"LLM,CHAT,IMAGE2TEXT, 64k",
+                    "max_tokens":64000,
+                    "model_type":"chat",
+                    "is_tools": false
+                },
                {
                    "llm_name": "Qwen/Qwen3-Embedding-8B",
                    "tags": "TEXT EMBEDDING,TEXT RE-RANK,32k",
@ -3145,13 +3158,6 @@
                    "model_type": "chat",
                    "is_tools": true
                },
-                {
-                    "llm_name": "Qwen/Qwen2-1.5B-Instruct",
-                    "tags": "LLM,CHAT,32k",
-                    "max_tokens": 32000,
-                    "model_type": "chat",
-                    "is_tools": true
-                },
                {
                    "llm_name": "Pro/Qwen/Qwen2.5-Coder-7B-Instruct",
                    "tags": "LLM,CHAT,32k",
@ -3159,13 +3165,6 @@
                    "model_type": "chat",
                    "is_tools": false
                },
-                {
-                    "llm_name": "Pro/Qwen/Qwen2-VL-7B-Instruct",
-                    "tags": "LLM,CHAT,IMAGE2TEXT,32k",
-                    "max_tokens": 32000,
-                    "model_type": "image2text",
-                    "is_tools": false
-                },
                {
                    "llm_name": "Pro/Qwen/Qwen2.5-7B-Instruct",
                    "tags": "LLM,CHAT,32k",
@ -5147,4 +5146,4 @@
            ]
        }
    ]
-}
+}
--- a/conf/os_mapping.json
+++ b/conf/os_mapping.json
@ -200,6 +200,61 @@
          }
        }
      },
+      {
+        "knn_vector": {
+          "match": "*_2048_vec",
+          "mapping": {
+            "type": "knn_vector",
+            "index": true,
+            "space_type": "cosinesimil",
+            "dimension": 2048
+          }
+        }
+      },
+      {
+        "knn_vector": {
+          "match": "*_4096_vec",
+          "mapping": {
+            "type": "knn_vector",
+            "index": true,
+            "space_type": "cosinesimil",
+            "dimension": 4096
+          }
+        }
+      },
+      {
+        "knn_vector": {
+          "match": "*_6144_vec",
+          "mapping": {
+            "type": "knn_vector",
+            "index": true,
+            "space_type": "cosinesimil",
+            "dimension": 6144
+          }
+        }
+      },
+      {
+        "knn_vector": {
+          "match": "*_8192_vec",
+          "mapping": {
+            "type": "knn_vector",
+            "index": true,
+            "space_type": "cosinesimil",
+            "dimension": 8192
+          }
+        }
+      },
+      {
+        "knn_vector": {
+          "match": "*_10240_vec",
+          "mapping": {
+            "type": "knn_vector",
+            "index": true,
+            "space_type": "cosinesimil",
+            "dimension": 10240
+          }
+        }
+      },
      {
        "binary": {
          "match": "*_bin",
--- a/deepdoc/parser/markdown_parser.py
+++ b/deepdoc/parser/markdown_parser.py
@ -17,7 +17,6 @@

 import re

-import mistune
 from markdown import markdown


@ -117,8 +116,6 @@ class MarkdownElementExtractor:
    def __init__(self, markdown_content):
        self.markdown_content = markdown_content
        self.lines = markdown_content.split("\n")
-        self.ast_parser = mistune.create_markdown(renderer="ast")
-        self.ast_nodes = self.ast_parser(markdown_content)

    def extract_elements(self):
        """Extract individual elements (headers, code blocks, lists, etc.)"""
--- a/deepdoc/parser/pdf_parser.py
+++ b/deepdoc/parser/pdf_parser.py
@ -15,11 +15,13 @@
 #

 import logging
+import math
 import os
 import random
 import re
 import sys
 import threading
+from collections import Counter, defaultdict
 from copy import deepcopy
 from io import BytesIO
 from timeit import default_timer as timer
@ -349,9 +351,78 @@ class RAGFlowPdfParser:
            self.boxes[i]["top"] += self.page_cum_height[self.boxes[i]["page_number"] - 1]
            self.boxes[i]["bottom"] += self.page_cum_height[self.boxes[i]["page_number"] - 1]

-    def _text_merge(self):
+    def _assign_column(self, boxes, zoomin=3):
+        if not boxes:
+            return boxes
+
+        if all("col_id" in b for b in boxes):
+            return boxes
+
+        by_page = defaultdict(list)
+        for b in boxes:
+            by_page[b["page_number"]].append(b)
+
+        page_info = {}  # pg -> dict(page_w, left_edge, cand_cols)
+        counter = Counter()
+
+        for pg, bxs in by_page.items():
+            if not bxs:
+                page_info[pg] = {"page_w": 1.0, "left_edge": 0.0, "cand": 1}
+                counter[1] += 1
+                continue
+
+            if hasattr(self, "page_images") and self.page_images and len(self.page_images) >= pg:
+                page_w = self.page_images[pg - 1].size[0] / max(1, zoomin)
+                left_edge = 0.0
+            else:
+                xs0 = [box["x0"] for box in bxs]
+                xs1 = [box["x1"] for box in bxs]
+                left_edge = float(min(xs0))
+                page_w = max(1.0, float(max(xs1) - left_edge))
+
+            widths = [max(1.0, (box["x1"] - box["x0"])) for box in bxs]
+            median_w = float(np.median(widths)) if widths else 1.0
+
+            raw_cols = int(page_w / max(1.0, median_w))
+
+            # cand = raw_cols if (raw_cols >= 2 and median_w < page_w / raw_cols * 0.8) else 1
+            cand = raw_cols
+
+            page_info[pg] = {"page_w": page_w, "left_edge": left_edge, "cand": cand}
+            counter[cand] += 1
+
+            logging.info(f"[Page {pg}] median_w={median_w:.2f}, page_w={page_w:.2f}, raw_cols={raw_cols}, cand={cand}")
+
+        global_cols = counter.most_common(1)[0][0]
+        logging.info(f"Global column_num decided by majority: {global_cols}")
+
+        for pg, bxs in by_page.items():
+            if not bxs:
+                continue
+
+            page_w = page_info[pg]["page_w"]
+            left_edge = page_info[pg]["left_edge"]
+
+            if global_cols == 1:
+                for box in bxs:
+                    box["col_id"] = 0
+                continue
+
+            for box in bxs:
+                w = box["x1"] - box["x0"]
+                if w >= 0.8 * page_w:
+                    box["col_id"] = 0
+                    continue
+                cx = 0.5 * (box["x0"] + box["x1"])
+                norm_cx = (cx - left_edge) / page_w
+                norm_cx = max(0.0, min(norm_cx, 0.999999))
+                box["col_id"] = int(min(global_cols - 1, norm_cx * global_cols))
+
+        return boxes
+
+    def _text_merge(self, zoomin=3):
        # merge adjusted boxes
-        bxs = self.boxes
+        bxs = self._assign_column(self.boxes, zoomin)

        def end_with(b, txt):
            txt = txt.strip()
@ -367,9 +438,15 @@ class RAGFlowPdfParser:
        while i < len(bxs) - 1:
            b = bxs[i]
            b_ = bxs[i + 1]
+
+            if b["page_number"] != b_["page_number"] or b.get("col_id") != b_.get("col_id"):
+                i += 1
+                continue
+
            if b.get("layoutno", "0") != b_.get("layoutno", "1") or b.get("layout_type", "") in ["table", "figure", "equation"]:
                i += 1
                continue
+
            if abs(self._y_dis(b, b_)) < self.mean_height[bxs[i]["page_number"] - 1] / 3:
                # merge
                bxs[i]["x1"] = b_["x1"]
@ -379,83 +456,108 @@ class RAGFlowPdfParser:
                bxs.pop(i + 1)
                continue
            i += 1
-            continue
-
-            dis_thr = 1
-            dis = b["x1"] - b_["x0"]
-            if b.get("layout_type", "") != "text" or b_.get("layout_type", "") != "text":
-                if end_with(b, "，") or start_with(b_, "（，"):
-                    dis_thr = -8
-                else:
-                    i += 1
-                    continue
-
-            if abs(self._y_dis(b, b_)) < self.mean_height[bxs[i]["page_number"] - 1] / 5 and dis >= dis_thr and b["x1"] < b_["x1"]:
-                # merge
-                bxs[i]["x1"] = b_["x1"]
-                bxs[i]["top"] = (b["top"] + b_["top"]) / 2
-                bxs[i]["bottom"] = (b["bottom"] + b_["bottom"]) / 2
-                bxs[i]["text"] += b_["text"]
-                bxs.pop(i + 1)
-                continue
-            i += 1
        self.boxes = bxs

    def _naive_vertical_merge(self, zoomin=3):
-        import math
-        bxs = Recognizer.sort_Y_firstly(self.boxes, np.median(self.mean_height) / 3)
+        bxs = self._assign_column(self.boxes, zoomin)

-        column_width = np.median([b["x1"] - b["x0"] for b in self.boxes])
-        if not column_width or math.isnan(column_width):
-            column_width = self.mean_width[0]
-        self.column_num = int(self.page_images[0].size[0] / zoomin / column_width)
-        if column_width < self.page_images[0].size[0] / zoomin / self.column_num:
-            logging.info("Multi-column................... {} {}".format(column_width, self.page_images[0].size[0] / zoomin / self.column_num))
-            self.boxes = self.sort_X_by_page(self.boxes, column_width / self.column_num)
+        grouped = defaultdict(list)
+        for b in bxs:
+            grouped[(b["page_number"], b.get("col_id", 0))].append(b)

-        i = 0
-        while i + 1 < len(bxs):
-            b = bxs[i]
-            b_ = bxs[i + 1]
-            if b["page_number"] < b_["page_number"] and re.match(r"[0-9  •一—-]+$", b["text"]):
-                bxs.pop(i)
+        merged_boxes = []
+        for (pg, col), bxs in grouped.items():
+            bxs = sorted(bxs, key=lambda x: (x["top"], x["x0"]))
+            if not bxs:
                continue
-            if not b["text"].strip():
-                bxs.pop(i)
-                continue
-            concatting_feats = [
-                b["text"].strip()[-1] in ",;:'\"，、‘“；：-",
-                len(b["text"].strip()) > 1 and b["text"].strip()[-2] in ",;:'\"，‘“、；：",
-                b_["text"].strip() and b_["text"].strip()[0] in "。；？！?”）),，、：",
-            ]
-            # features for not concating
-            feats = [
-                b.get("layoutno", 0) != b_.get("layoutno", 0),
-                b["text"].strip()[-1] in "。？！?",
-                self.is_english and b["text"].strip()[-1] in ".!?",
-                b["page_number"] == b_["page_number"] and b_["top"] - b["bottom"] > self.mean_height[b["page_number"] - 1] * 1.5,
-                b["page_number"] < b_["page_number"] and abs(b["x0"] - b_["x0"]) > self.mean_width[b["page_number"] - 1] * 4,
-            ]
-            # split features
-            detach_feats = [b["x1"] < b_["x0"], b["x0"] > b_["x1"]]
-            if (any(feats) and not any(concatting_feats)) or any(detach_feats):
-                logging.debug(
-                    "{} {} {} {}".format(
-                        b["text"],
-                        b_["text"],
-                        any(feats),
-                        any(concatting_feats),
+
+            mh = self.mean_height[pg - 1] if self.mean_height else np.median([b["bottom"] - b["top"] for b in bxs]) or 10
+
+            i = 0
+            while i + 1 < len(bxs):
+                b = bxs[i]
+                b_ = bxs[i + 1]
+
+                if b["page_number"] < b_["page_number"] and re.match(r"[0-9  •一—-]+$", b["text"]):
+                    bxs.pop(i)
+                    continue
+
+                if not b["text"].strip():
+                    bxs.pop(i)
+                    continue
+
+                if not b["text"].strip() or b.get("layoutno") != b_.get("layoutno"):
+                    i += 1
+                    continue
+
+                if b_["top"] - b["bottom"] > mh * 1.5:
+                    i += 1
+                    continue
+
+                overlap = max(0, min(b["x1"], b_["x1"]) - max(b["x0"], b_["x0"]))
+                if overlap / max(1, min(b["x1"] - b["x0"], b_["x1"] - b_["x0"])) < 0.3:
+                    i += 1
+                    continue
+
+                concatting_feats = [
+                    b["text"].strip()[-1] in ",;:'\"，、‘“；：-",
+                    len(b["text"].strip()) > 1 and b["text"].strip()[-2] in ",;:'\"，‘“、；：",
+                    b_["text"].strip() and b_["text"].strip()[0] in "。；？！?”）),，、：",
+                ]
+                # features for not concating
+                feats = [
+                    b.get("layoutno", 0) != b_.get("layoutno", 0),
+                    b["text"].strip()[-1] in "。？！?",
+                    self.is_english and b["text"].strip()[-1] in ".!?",
+                    b["page_number"] == b_["page_number"] and b_["top"] - b["bottom"] > self.mean_height[b["page_number"] - 1] * 1.5,
+                    b["page_number"] < b_["page_number"] and abs(b["x0"] - b_["x0"]) > self.mean_width[b["page_number"] - 1] * 4,
+                ]
+                # split features
+                detach_feats = [b["x1"] < b_["x0"], b["x0"] > b_["x1"]]
+                if (any(feats) and not any(concatting_feats)) or any(detach_feats):
+                    logging.debug(
+                        "{} {} {} {}".format(
+                            b["text"],
+                            b_["text"],
+                            any(feats),
+                            any(concatting_feats),
+                        )
                    )
-                )
-                i += 1
-                continue
-            # merge up and down
-            b["bottom"] = b_["bottom"]
-            b["text"] += b_["text"]
-            b["x0"] = min(b["x0"], b_["x0"])
-            b["x1"] = max(b["x1"], b_["x1"])
-            bxs.pop(i + 1)
-        self.boxes = bxs
+                    i += 1
+                    continue
+
+                b["text"] = (b["text"].rstrip() + " " + b_["text"].lstrip()).strip()
+                b["bottom"] = b_["bottom"]
+                b["x0"] = min(b["x0"], b_["x0"])
+                b["x1"] = max(b["x1"], b_["x1"])
+                bxs.pop(i + 1)
+
+            merged_boxes.extend(bxs)
+
+        self.boxes = sorted(merged_boxes, key=lambda x: (x["page_number"], x.get("col_id", 0), x["top"]))
+
+    def _final_reading_order_merge(self, zoomin=3):
+        if not self.boxes:
+            return
+
+        self.boxes = self._assign_column(self.boxes, zoomin=zoomin)
+
+        pages = defaultdict(lambda: defaultdict(list))
+        for b in self.boxes:
+            pg = b["page_number"]
+            col = b.get("col_id", 0)
+            pages[pg][col].append(b)
+
+        for pg in pages:
+            for col in pages[pg]:
+                pages[pg][col].sort(key=lambda x: (x["top"], x["x0"]))
+
+        new_boxes = []
+        for pg in sorted(pages.keys()):
+            for col in sorted(pages[pg].keys()):
+                new_boxes.extend(pages[pg][col])
+
+        self.boxes = new_boxes

    def _concat_downward(self, concat_between_pages=True):
        self.boxes = Recognizer.sort_Y_firstly(self.boxes, 0)
@ -997,7 +1099,7 @@ class RAGFlowPdfParser:
                self.__ocr(i + 1, img, chars, zoomin, id)

            if callback and i % 6 == 5:
-                callback(prog=(i + 1) * 0.6 / len(self.page_images), msg="")
+                callback((i + 1) * 0.6 / len(self.page_images))

        async def __img_ocr_launcher():
            def __ocr_preprocess():
@ -1048,7 +1150,7 @@ class RAGFlowPdfParser:

    def parse_into_bboxes(self, fnm, callback=None, zoomin=3):
        start = timer()
-        self.__images__(fnm, zoomin)
+        self.__images__(fnm, zoomin, callback=callback)
        if callback:
            callback(0.40, "OCR finished ({:.2f}s)".format(timer() - start))

@ -1074,7 +1176,6 @@ class RAGFlowPdfParser:

        def insert_table_figures(tbls_or_figs, layout_type):
            def min_rectangle_distance(rect1, rect2):
-                import math
                pn1, left1, right1, top1, bottom1 = rect1
                pn2, left2, right2, top2, bottom2 = rect2
                if right1 >= left2 and right2 >= left1 and bottom1 >= top2 and bottom2 >= top1:
@ -1091,27 +1192,39 @@ class RAGFlowPdfParser:
                    dy = top1 - bottom2
                else:
                    dy = 0
-                return math.sqrt(dx*dx + dy*dy)# + (pn2-pn1)*10000
+                return math.sqrt(dx * dx + dy * dy)  # + (pn2-pn1)*10000

            for (img, txt), poss in tbls_or_figs:
                bboxes = [(i, (b["page_number"], b["x0"], b["x1"], b["top"], b["bottom"])) for i, b in enumerate(self.boxes)]
-                dists = [(min_rectangle_distance((pn, left, right, top+self.page_cum_height[pn], bott+self.page_cum_height[pn]), rect),i) for i, rect in bboxes for pn, left, right, top, bott in poss]
+                dists = [
+                    (min_rectangle_distance((pn, left, right, top + self.page_cum_height[pn], bott + self.page_cum_height[pn]), rect), i) for i, rect in bboxes for pn, left, right, top, bott in poss
+                ]
                min_i = np.argmin(dists, axis=0)[0]
                min_i, rect = bboxes[dists[min_i][-1]]
                if isinstance(txt, list):
                    txt = "\n".join(txt)
                pn, left, right, top, bott = poss[0]
-                if self.boxes[min_i]["bottom"] < top+self.page_cum_height[pn]:
+                if self.boxes[min_i]["bottom"] < top + self.page_cum_height[pn]:
                    min_i += 1
-                self.boxes.insert(min_i, {
-                    "page_number": pn+1, "x0": left, "x1": right, "top": top+self.page_cum_height[pn], "bottom": bott+self.page_cum_height[pn], "layout_type": layout_type, "text": txt, "image": img,
-                    "positions": [[pn+1, int(left), int(right), int(top), int(bott)]]
-                })
+                self.boxes.insert(
+                    min_i,
+                    {
+                        "page_number": pn + 1,
+                        "x0": left,
+                        "x1": right,
+                        "top": top + self.page_cum_height[pn],
+                        "bottom": bott + self.page_cum_height[pn],
+                        "layout_type": layout_type,
+                        "text": txt,
+                        "image": img,
+                        "positions": [[pn + 1, int(left), int(right), int(top), int(bott)]],
+                    },
+                )

        for b in self.boxes:
            b["position_tag"] = self._line_tag(b, zoomin)
            b["image"] = self.crop(b["position_tag"], zoomin)
-            b["positions"] = [[pos[0][-1]+1, *pos[1:]] for pos in RAGFlowPdfParser.extract_positions(b["position_tag"])]
+            b["positions"] = [[pos[0][-1] + 1, *pos[1:]] for pos in RAGFlowPdfParser.extract_positions(b["position_tag"])]

        insert_table_figures(tbls, "table")
        insert_table_figures(figs, "figure")
--- a/docker/.env
+++ b/docker/.env
@ -37,9 +37,12 @@ OPENSEARCH_PASSWORD=infini_rag_flow_OS_01

 # The port used to expose the Kibana service to the host machine,
 # allowing EXTERNAL access to the service running inside the Docker container.
+# To enable kibana, you need to:
+# 1. Ensure that COMPOSE_PROFILES includes kibana, for example: COMPOSE_PROFILES=${DOC_ENGINE},kibana
+# 2. Comment out or delete the following configurations of the es service in docker-compose-base.yml: xpack.security.enabled、xpack.security.http.ssl.enabled、xpack.security.transport.ssl.enabled (for details: https://www.elastic.co/docs/deploy-manage/security/self-auto-setup#stack-existing-settings-detected)
+# 3. Adjust the es.hosts in conf/service_config.yaml or docker/service_conf.yaml.template to 'https://localhost:1200'
+# 4. After the startup is successful, in the es container, execute the command to generate the kibana token: `bin/elasticsearch-create-enrollment-token -s kibana`, then you can use kibana normally
 KIBANA_PORT=6601
-KIBANA_USER=rag_flow
-KIBANA_PASSWORD=infini_rag_flow

 # The maximum amount of the memory, in bytes, that a specific Docker container can use while running.
 # Update it according to the available memory in the host machine.
@ -91,15 +94,16 @@ REDIS_PASSWORD=infini_rag_flow
 # The port used to expose RAGFlow's HTTP API service to the host machine,
 # allowing EXTERNAL access to the service running inside the Docker container.
 SVR_HTTP_PORT=9380
+ADMIN_SVR_HTTP_PORT=9381

 # The RAGFlow Docker image to download.
-# Defaults to the v0.20.5-slim edition, which is the RAGFlow Docker image without embedding models.
-RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.5-slim
+# Defaults to the v0.21.0-slim edition, which is the RAGFlow Docker image without embedding models.
+RAGFLOW_IMAGE=infiniflow/ragflow:v0.21.0-slim
 #
 # To download the RAGFlow Docker image with embedding models, uncomment the following line instead:
-# RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.5
+# RAGFLOW_IMAGE=infiniflow/ragflow:v0.21.0
 #
-# The Docker image of the v0.20.5 edition includes built-in embedding models:
+# The Docker image of the v0.21.0 edition includes built-in embedding models:
 #   - BAAI/bge-large-zh-v1.5
 #   - maidalun1020/bce-embedding-base_v1
 #
--- a/docker/README.md
+++ b/docker/README.md
@ -79,8 +79,8 @@ The [.env](./.env) file contains important environment variables for Docker.
 - `RAGFLOW-IMAGE`  
  The Docker image edition. Available editions:  
  
-  - `infiniflow/ragflow:v0.20.5-slim` (default): The RAGFlow Docker image without embedding models.  
-  - `infiniflow/ragflow:v0.20.5`: The RAGFlow Docker image with embedding models including:
+  - `infiniflow/ragflow:v0.21.0-slim` (default): The RAGFlow Docker image without embedding models.  
+  - `infiniflow/ragflow:v0.21.0`: The RAGFlow Docker image with embedding models including:
    - Built-in embedding models:
      - `BAAI/bge-large-zh-v1.5` 
      - `maidalun1020/bce-embedding-base_v1`
--- a/docker/docker-compose-base.yml
+++ b/docker/docker-compose-base.yml
@ -77,7 +77,7 @@ services:
    container_name: ragflow-infinity
    profiles:
      - infinity
-    image: infiniflow/infinity:v0.6.0-dev7
+    image: infiniflow/infinity:v0.6.0
    volumes:
      - infinity_data:/var/infinity
      - ./infinity_conf.toml:/infinity_conf.toml
@ -207,6 +207,30 @@ services:
      start_period: 10s    


+  kibana:
+    container_name: ragflow-kibana
+    profiles:
+      - kibana
+    image: kibana:${STACK_VERSION}
+    ports:
+      - ${KIBANA_PORT-5601}:5601
+    env_file: .env
+    environment:
+      - TZ=${TIMEZONE}
+    volumes:
+      - kibana_data:/usr/share/kibana/data
+    depends_on:
+      es01:
+        condition: service_started
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:5601/api/status"]
+      interval: 10s
+      timeout: 10s
+      retries: 120
+    networks:
+      - ragflow
+    restart: on-failure
+

 volumes:
  esdata01:
@ -221,6 +245,8 @@ volumes:
    driver: local
  redis_data:
    driver: local
+  kibana_data:
+    driver: local

 networks:
  ragflow:
--- a/docker/docker-compose.yml
+++ b/docker/docker-compose.yml
@ -22,9 +22,14 @@ services:
    #   - --no-transport-sse-enabled # Disable legacy SSE endpoints (/sse and /messages/)
    #   - --no-transport-streamable-http-enabled #  Disable Streamable HTTP transport (/mcp endpoint)
    #   - --no-json-response # Disable JSON response mode in Streamable HTTP transport (instead of SSE over HTTP)
+
+    # Example configration to start Admin server:
+    # command:
+    #   - --enable-adminserver
    container_name: ragflow-server
    ports:
      - ${SVR_HTTP_PORT}:9380
+      - ${ADMIN_SVR_HTTP_PORT}:9381
      - 80:80
      - 443:443
      - 5678:5678
--- a/docker/entrypoint.sh
+++ b/docker/entrypoint.sh
@ -11,6 +11,7 @@ function usage() {
    echo "  --disable-webserver             Disables the web server (nginx + ragflow_server)."
    echo "  --disable-taskexecutor          Disables task executor workers."
    echo "  --enable-mcpserver              Enables the MCP server."
+    echo "  --enable-adminserver            Enables the Admin server."
    echo "  --consumer-no-beg=<num>         Start range for consumers (if using range-based)."
    echo "  --consumer-no-end=<num>         End range for consumers (if using range-based)."
    echo "  --workers=<num>                 Number of task executors to run (if range is not used)."
@ -21,12 +22,14 @@ function usage() {
    echo "  $0 --disable-webserver --consumer-no-beg=0 --consumer-no-end=5"
    echo "  $0 --disable-webserver --workers=2 --host-id=myhost123"
    echo "  $0 --enable-mcpserver"
+    echo "  $0 --enable-adminserver"
    exit 1
 }

 ENABLE_WEBSERVER=1 # Default to enable web server
 ENABLE_TASKEXECUTOR=1  # Default to enable task executor
 ENABLE_MCP_SERVER=0
+ENABLE_ADMIN_SERVER=0 # Default close admin server
 CONSUMER_NO_BEG=0
 CONSUMER_NO_END=0
 WORKERS=1
@ -70,6 +73,10 @@ for arg in "$@"; do
      ENABLE_MCP_SERVER=1
      shift
      ;;
+    --enable-adminserver)
+      ENABLE_ADMIN_SERVER=1
+      shift
+      ;;
    --mcp-host=*)
      MCP_HOST="${arg#*=}"
      shift
@ -185,6 +192,12 @@ if [[ "${ENABLE_WEBSERVER}" -eq 1 ]]; then
    done &
 fi

+if [[ "${ENABLE_ADMIN_SERVER}" -eq 1 ]]; then
+    echo "Starting admin_server..."
+    while true; do
+        "$PY" admin/server/admin_server.py
+    done &
+fi

 if [[ "${ENABLE_MCP_SERVER}" -eq 1 ]]; then
    start_mcp_server
--- a/docs/configurations.md
+++ b/docs/configurations.md
@ -99,8 +99,8 @@ RAGFlow utilizes MinIO as its object storage solution, leveraging its scalabilit
 - `RAGFLOW-IMAGE`  
  The Docker image edition. Available editions:  
  
-  - `infiniflow/ragflow:v0.20.5-slim` (default): The RAGFlow Docker image without embedding models.  
-  - `infiniflow/ragflow:v0.20.5`: The RAGFlow Docker image with embedding models including:
+  - `infiniflow/ragflow:v0.21.0-slim` (default): The RAGFlow Docker image without embedding models.  
+  - `infiniflow/ragflow:v0.21.0`: The RAGFlow Docker image with embedding models including:
    - Built-in embedding models:
      - `BAAI/bge-large-zh-v1.5` 
      - `maidalun1020/bce-embedding-base_v1`
--- a/docs/develop/build_docker_image.mdx
+++ b/docs/develop/build_docker_image.mdx
@ -77,7 +77,7 @@ After building the infiniflow/ragflow:nightly-slim image, you are ready to launc

 1. Edit Docker Compose Configuration

-Open the `docker/.env` file. Find the `RAGFLOW_IMAGE` setting and change the image reference from `infiniflow/ragflow:v0.20.5-slim` to `infiniflow/ragflow:nightly-slim` to use the pre-built image.
+Open the `docker/.env` file. Find the `RAGFLOW_IMAGE` setting and change the image reference from `infiniflow/ragflow:v0.21.0-slim` to `infiniflow/ragflow:nightly-slim` to use the pre-built image.


 2. Launch the Service
--- a/docs/faq.mdx
+++ b/docs/faq.mdx
@ -30,29 +30,19 @@ The "garbage in garbage out" status quo remains unchanged despite the fact that

 Each RAGFlow release is available in two editions:

- **Slim edition**: excludes built-in embedding models and is identified by a **-slim** suffix added to the version name. Example: `infiniflow/ragflow:v0.20.5-slim`
- **Full edition**: includes built-in embedding models and has no suffix added to the version name. Example: `infiniflow/ragflow:v0.20.5`
+- **Slim edition**: excludes built-in embedding models and is identified by a **-slim** suffix added to the version name. Example: `infiniflow/ragflow:v0.21.0-slim`
+- **Full edition**: includes built-in embedding models and has no suffix added to the version name. Example: `infiniflow/ragflow:v0.21.0`

 ---

 ### Which embedding models can be deployed locally?

-RAGFlow offers two Docker image editions, `v0.20.5-slim` and `v0.20.5`:  
+RAGFlow offers two Docker image editions, `v0.21.0-slim` and `v0.21.0`:  
  
- `infiniflow/ragflow:v0.20.5-slim` (default): The RAGFlow Docker image without embedding models.  
- `infiniflow/ragflow:v0.20.5`: The RAGFlow Docker image with embedding models including:
-  - Built-in embedding models:
-    - `BAAI/bge-large-zh-v1.5`
-    - `maidalun1020/bce-embedding-base_v1`
-  - Embedding models that will be downloaded once you select them in the RAGFlow UI:
-    - `BAAI/bge-base-en-v1.5`
-    - `BAAI/bge-large-en-v1.5`
-    - `BAAI/bge-small-en-v1.5`
-    - `BAAI/bge-small-zh-v1.5`
-    - `jinaai/jina-embeddings-v2-base-en`
-    - `jinaai/jina-embeddings-v2-small-en`
-    - `nomic-ai/nomic-embed-text-v1.5`
-    - `sentence-transformers/all-MiniLM-L6-v2`
+- `infiniflow/ragflow:v0.21.0-slim` (default): The RAGFlow Docker image without embedding models.  
+- `infiniflow/ragflow:v0.21.0`: The RAGFlow Docker image with the following built-in embedding models:
+  - `BAAI/bge-large-zh-v1.5`
+  - `maidalun1020/bce-embedding-base_v1`

 ---

--- a/docs/guides/agent/agent_component_reference/agent.mdx
+++ b/docs/guides/agent/agent_component_reference/agent.mdx
@ -9,7 +9,7 @@ The component equipped with reasoning, tool usage, and multi-agent collaboration

 ---

-An **Agent** component fine-tunes the LLM and sets its prompt. From v0.20.5 onwards, an **Agent** component is able to work independently and with the following capabilities:
+An **Agent** component fine-tunes the LLM and sets its prompt. From v0.21.0 onwards, an **Agent** component is able to work independently and with the following capabilities:

 - Autonomous reasoning with reflection and adjustment based on environmental feedback.
 - Use of tools or subagents to complete tasks.
@ -147,7 +147,7 @@ An **Agent** component relies on keys (variables) to specify its data inputs. It

 #### Advanced usage

-From v0.20.5 onwards, four framework-level prompt blocks are available in the **System prompt** field, enabling you to customize and *override* prompts at the framework level. Type `/` or click **(x)** to view them; they appear under the **Framework** entry in the dropdown menu.
+From v0.21.0 onwards, four framework-level prompt blocks are available in the **System prompt** field, enabling you to customize and *override* prompts at the framework level. Type `/` or click **(x)** to view them; they appear under the **Framework** entry in the dropdown menu.

 - `task_analysis` prompt block
  - This block is responsible for analyzing tasks — either a user task or a task assigned by the lead Agent when the **Agent** component is acting as a Sub-Agent.
--- a/docs/guides/chat/start_chat.md
+++ b/docs/guides/chat/start_chat.md
@ -48,7 +48,7 @@ You start an AI conversation by creating an assistant.
     - If no target language is selected, the system will search only in the language of your query, which may cause relevant information in other languages to be missed.
   - **Variable** refers to the variables (keys) to be used in the system prompt. `{knowledge}` is a reserved variable. Click **Add** to add more variables for the system prompt.
      - If you are uncertain about the logic behind **Variable**, leave it *as-is*.
-      - As of v0.20.5, if you add custom variables here, the only way you can pass in their values is to call:
+      - As of v0.21.0, if you add custom variables here, the only way you can pass in their values is to call:
         - HTTP method [Converse with chat assistant](../../references/http_api_reference.md#converse-with-chat-assistant), or
         - Python method [Converse with chat assistant](../../references/python_api_reference.md#converse-with-chat-assistant).

--- a/docs/guides/dataset/configure_knowledge_base.md
+++ b/docs/guides/dataset/configure_knowledge_base.md
@ -124,7 +124,7 @@ See [Run retrieval test](./run_retrieval_test.md) for details.

 ## Search for dataset

-As of RAGFlow v0.20.5, the search feature is still in a rudimentary form, supporting only dataset search by name.
+As of RAGFlow v0.21.0, the search feature is still in a rudimentary form, supporting only dataset search by name.

 ![search dataset](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/search_datasets.jpg)

--- a/docs/guides/dataset/set_metadata.md
+++ b/docs/guides/dataset/set_metadata.md
@ -21,6 +21,10 @@ Ensure that your metadata is in JSON format; otherwise, your updates will not be

 ![Input metadata](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/input_metadata.jpg)

+## Related APIs
+
+[Retrieve chunks](../../references/http_api_reference.md#retrieve-chunks)
+
 ## Frequently asked questions

 ### Can I set metadata for multiple documents at once?
--- a/docs/guides/manage_files.md
+++ b/docs/guides/manage_files.md
@ -87,4 +87,4 @@ RAGFlow's file management allows you to download an uploaded file:

 ![download_file](https://github.com/infiniflow/ragflow/assets/93570324/cf3b297f-7d9b-4522-bf5f-4f45743e4ed5)

-> As of RAGFlow v0.20.5, bulk download is not supported, nor can you download an entire folder. 
+> As of RAGFlow v0.21.0, bulk download is not supported, nor can you download an entire folder. 
--- a/docs/guides/manage_users_and_services.md
+++ b/docs/guides/manage_users_and_services.md
@ -1,3 +1,9 @@
+---
+sidebar_position: 6
+slug: /manage_users_and_services
+---
+
+
 # Admin CLI and Admin Service


@ -8,31 +14,48 @@ The Admin CLI and Admin Service form a client-server architectural suite for RAG

 ## Starting the Admin Service

+### Launching from source code
+
 1. Before start Admin Service, please make sure RAGFlow system is already started.
-2. Switch to ragflow/ directory and run the service script:

-```bash
-source .venv/bin/activate
-export PYTHONPATH=$(pwd)
-python admin/admin_server.py
-```
+2. Launch from source code:

-The service will start and listen for incoming connections from the CLI on the configured port. Default port is 9381.
+   ```bash
+   python admin/server/admin_server.py
+   ```
+
+   The service will start and listen for incoming connections from the CLI on the configured port. 
+
+### Using docker image
+
+1. Before startup, please configure the `docker_compose.yml`  file to enable admin server:
+
+   ```bash
+   command:
+     - --enable-adminserver
+   ```
+
+2. Start the containers, the service will start and listen for incoming connections from the CLI on the configured port.



 ## Using the Admin CLI

 1. Ensure the Admin Service is running.
-2. Launch the CLI client:

-```bash
-source .venv/bin/activate
-export PYTHONPATH=$(pwd)
-python admin/admin_client.py -h 0.0.0.0 -p 9381
-```
+2. Install ragflow-cli.

-Enter superuser's password to login. Default password is `admin`.
+   ```bash
+   pip install ragflow-cli
+   ```
+
+3. Launch the CLI client:
+
+   ```bash
+   ragflow-cli -h 0.0.0.0 -p 9381
+   ```
+
+	Enter superuser's password to login. Default password is `admin`.



@ -44,13 +67,13 @@ Commands are case-insensitive and must be terminated with a semicolon(;).

 `LIST SERVICES;`

- Lists all available services within the RAGFLow system.
+- Lists all available services within the RAGFlow system.

 - [Example](#example-list-services)

 `SHOW SERVICE <id>;`

- Shows detailed status information for the service identified by <id>.
+- Shows detailed status information for the service identified by **id**.
 - [Example](#example-show-service)

 ### User Management Commands
@ -115,16 +138,16 @@ Commands are case-insensitive and must be terminated with a semicolon(;).
 admin> list services;
 command: list services;
 Listing all services
-+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+
-| extra                                                                                     | host      | id | name          | port  | service_type   |
-+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+
-| {}                                                                                        | 0.0.0.0   | 0  | ragflow_0     | 9380  | ragflow_server |
-| {'meta_type': 'mysql', 'password': 'infini_rag_flow', 'username': 'root'}                 | localhost | 1  | mysql         | 5455  | meta_data      |
-| {'password': 'infini_rag_flow', 'store_type': 'minio', 'user': 'rag_flow'}                | localhost | 2  | minio         | 9000  | file_store     |
-| {'password': 'infini_rag_flow', 'retrieval_type': 'elasticsearch', 'username': 'elastic'} | localhost | 3  | elasticsearch | 1200  | retrieval      |
-| {'db_name': 'default_db', 'retrieval_type': 'infinity'}                                   | localhost | 4  | infinity      | 23817 | retrieval      |
-| {'database': 1, 'mq_type': 'redis', 'password': 'infini_rag_flow'}                        | localhost | 5  | redis         | 6379  | message_queue  |
-+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+---------+
+| extra                                                                                     | host      | id | name          | port  | service_type   | status  |
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+---------+
+| {}                                                                                        | 0.0.0.0   | 0  | ragflow_0     | 9380  | ragflow_server | Timeout |
+| {'meta_type': 'mysql', 'password': 'infini_rag_flow', 'username': 'root'}                 | localhost | 1  | mysql         | 5455  | meta_data      | Alive   |
+| {'password': 'infini_rag_flow', 'store_type': 'minio', 'user': 'rag_flow'}                | localhost | 2  | minio         | 9000  | file_store     | Alive   |
+| {'password': 'infini_rag_flow', 'retrieval_type': 'elasticsearch', 'username': 'elastic'} | localhost | 3  | elasticsearch | 1200  | retrieval      | Alive   |
+| {'db_name': 'default_db', 'retrieval_type': 'infinity'}                                   | localhost | 4  | infinity      | 23817 | retrieval      | Timeout |
+| {'database': 1, 'mq_type': 'redis', 'password': 'infini_rag_flow'}                        | localhost | 5  | redis         | 6379  | message_queue  | Alive   |
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+---------+

 ```

@ -324,7 +347,7 @@ Listing all agents of user: lynn_inf@hotmail.com

 <span id="example-meta-commands"></span>

- Show help infomation.
+- Show help information.

 ```
 admin> \help
--- a/docs/guides/tracing.mdx
+++ b/docs/guides/tracing.mdx
@ -18,7 +18,7 @@ RAGFlow ships with a built-in [Langfuse](https://langfuse.com) integration so th
 Langfuse stores traces, spans and prompt payloads in a purpose-built observability backend and offers filtering and visualisations on top.  

 :::info NOTE
-• RAGFlow **≥ 0.20.5** (contains the Langfuse connector)  
+• RAGFlow **≥ 0.21.0** (contains the Langfuse connector)  
 • A Langfuse workspace (cloud or self-hosted) with a _Project Public Key_ and _Secret Key_
 :::

--- a/docs/guides/upgrade_ragflow.mdx
+++ b/docs/guides/upgrade_ragflow.mdx
@ -66,10 +66,10 @@ To upgrade RAGFlow, you must upgrade **both** your code **and** your Docker imag
   git clone https://github.com/infiniflow/ragflow.git
   ```

-2. Switch to the latest, officially published release, e.g., `v0.20.5`:
+2. Switch to the latest, officially published release, e.g., `v0.21.0`:

   ```bash
-   git checkout -f v0.20.5
+   git checkout -f v0.21.0
   ```

 3. Update **ragflow/docker/.env**:
@ -83,14 +83,14 @@ To upgrade RAGFlow, you must upgrade **both** your code **and** your Docker imag
  <TabItem value="slim">

 ```bash
-RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.5-slim
+RAGFLOW_IMAGE=infiniflow/ragflow:v0.21.0-slim
 ```

  </TabItem>
  <TabItem value="full">

 ```bash
-RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.5
+RAGFLOW_IMAGE=infiniflow/ragflow:v0.21.0
 ```

  </TabItem>
@ -114,10 +114,10 @@ No, you do not need to. Upgrading RAGFlow in itself will *not* remove your uploa
 1. From an environment with Internet access, pull the required Docker image.
 2. Save the Docker image to a **.tar** file.
   ```bash
-   docker save -o ragflow.v0.20.5.tar infiniflow/ragflow:v0.20.5
+   docker save -o ragflow.v0.21.0.tar infiniflow/ragflow:v0.21.0
   ```
 3. Copy the **.tar** file to the target server.
 4. Load the **.tar** file into Docker:
   ```bash
-   docker load -i ragflow.v0.20.5.tar
+   docker load -i ragflow.v0.21.0.tar
   ```
--- a/docs/quickstart.mdx
+++ b/docs/quickstart.mdx
@ -44,7 +44,7 @@ This section provides instructions on setting up the RAGFlow server on Linux. If

   `vm.max_map_count`. This value sets the maximum number of memory map areas a process may have. Its default value is 65530. While most applications require fewer than a thousand maps, reducing this value can result in abnormal behaviors, and the system will throw out-of-memory errors when a process reaches the limitation.

-   RAGFlow v0.20.5 uses Elasticsearch or [Infinity](https://github.com/infiniflow/infinity) for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.
+   RAGFlow v0.21.0 uses Elasticsearch or [Infinity](https://github.com/infiniflow/infinity) for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.

 <Tabs
  defaultValue="linux"
@ -184,13 +184,13 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
   ```bash
   $ git clone https://github.com/infiniflow/ragflow.git
   $ cd ragflow/docker
-   $ git checkout -f v0.20.5
+   $ git checkout -f v0.21.0
   ```

 3. Use the pre-built Docker images and start up the server:

   :::tip NOTE
-   The command below downloads the `v0.20.5-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.20.5-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.20.5` for the full edition `v0.20.5`.
+   The command below downloads the `v0.21.0-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.21.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.21.0` for the full edition `v0.21.0`.
   :::

   ```bash
@ -207,8 +207,8 @@ This section provides instructions on setting up the RAGFlow server on Linux. If

 | RAGFlow image tag   | Image size (GB) | Has embedding models and Python packages? | Stable?                  |
 | ------------------- | --------------- | ----------------------------------------- | ------------------------ |
-| `v0.20.5`           | &approx;9       | :heavy_check_mark:                        | Stable release           |
-| `v0.20.5-slim`      | &approx;2       | ❌                                        | Stable release           |
+| `v0.21.0`           | &approx;9       | :heavy_check_mark:                        | Stable release           |
+| `v0.21.0-slim`      | &approx;2       | ❌                                        | Stable release           |
 | `nightly`           | &approx;9       | :heavy_check_mark:                        | *Unstable* nightly build |
 | `nightly-slim`      | &approx;2       | ❌                                        | *Unstable* nightly build |

@ -217,7 +217,7 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
 ```

 :::danger IMPORTANT
-The embedding models included in `v0.20.5` and `nightly` are:
+The embedding models included in `v0.21.0` and `nightly` are:

 - BAAI/bge-large-zh-v1.5
 - maidalun1020/bce-embedding-base_v1
--- a/docs/references/glossary.mdx
+++ b/docs/references/glossary.mdx
@ -19,7 +19,7 @@ import TOCInline from '@theme/TOCInline';

 ### Cross-language search

-Cross-language search (also known as cross-lingual retrieval) is a feature introduced in version 0.20.5. It enables users to submit queries in one language (for example, English) and retrieve relevant documents written in other languages such as Chinese or Spanish. This feature is enabled by the system’s default chat model, which translates queries to ensure accurate matching of semantic meaning across languages.
+Cross-language search (also known as cross-lingual retrieval) is a feature introduced in version 0.21.0. It enables users to submit queries in one language (for example, English) and retrieve relevant documents written in other languages such as Chinese or Spanish. This feature is enabled by the system’s default chat model, which translates queries to ensure accurate matching of semantic meaning across languages.

 By enabling cross-language search, users can effortlessly access a broader range of information regardless of language barriers, significantly enhancing the system’s usability and inclusiveness.

--- a/docs/references/http_api_reference.md
+++ b/docs/references/http_api_reference.md
@ -1823,7 +1823,21 @@ curl --request POST \
     {
          "question": "What is advantage of ragflow?",
          "dataset_ids": ["b2a62730759d11ef987d0242ac120004"],
-          "document_ids": ["77df9ef4759a11ef8bdd0242ac120004"]
+          "document_ids": ["77df9ef4759a11ef8bdd0242ac120004"],
+          "metadata_condition": {
+            "conditions": [
+              {
+                "name": "author",
+                "comparison_operator": "=",
+                "value": "Toby"
+              },
+              {
+                "name": "url",
+                "comparison_operator": "not contains",
+                "value": "amd"
+              }
+            ]
+          }
     }'
 ```

@ -1858,7 +1872,25 @@ curl --request POST \
 - `"cross_languages"`: (*Body parameter*) `list[string]`  
  The languages that should be translated into, in order to achieve keywords retrievals in different languages.
 - `"metadata_condition"`: (*Body parameter*), `object`  
-  The metadata condition for filtering chunks.
+  The metadata condition used for filtering chunks:  
+  - `"conditions"`: (*Body parameter*), `array`  
+    A list of metadata filter conditions.  
+    - `"name"`: `string` - The metadata field name to filter by, e.g., `"author"`, `"company"`, `"url"`. Ensure this parameter before use. See [Set metadata](../guides/dataset/set_metadata.md) for details.
+    - `comparison_operator`: `string` - The comparison operator. Can be one of: 
+      - `"contains"`
+      - `"not contains"`
+      - `"start with"`
+      - `"empty"`
+      - `"not empty"`
+      - `"="`
+      - `"≠"`
+      - `">"`
+      - `"<"`
+      - `"≥"`
+      - `"≤"`
+    - `"value"`: `string` - The value to compare.
+
+
 #### Response

 Success:
--- a/docs/references/python_api_reference.md
+++ b/docs/references/python_api_reference.md
@ -698,6 +698,58 @@ print("Async bulk parsing initiated.")

 ---

+### Parse documents (with document status)
+
+```python
+DataSet.parse_documents(document_ids: list[str]) -> list[tuple[str, str, int, int]]
+```
+
+*Asynchronously* parses documents in the current dataset.
+
+This method encapsulates `async_parse_documents()`. It awaits the completion of all parsing tasks before returning detailed results, including the parsing status and statistics for each document. If a keyboard interruption occurs (e.g., `Ctrl+C`), all pending parsing tasks will be cancelled gracefully.
+
+#### Parameters
+
+##### document_ids: `list[str]`, *Required*
+
+The IDs of the documents to parse.
+
+#### Returns
+
+A list of tuples with detailed parsing results:
+
+```python
+[
+  (document_id: str, status: str, chunk_count: int, token_count: int),
+  ...
+]
+```
+- `status`: The final parsing state (e.g., `success`, `failed`, `cancelled`).  
+- `chunk_count`: The number of content chunks created from the document.  
+- `token_count`: The total number of tokens processed.  
+
+---
+
+#### Example
+
+```python
+rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
+dataset = rag_object.create_dataset(name="dataset_name")
+documents = dataset.list_documents(keywords="test")
+ids = [doc.id for doc in documents]
+
+try:
+    finished = dataset.parse_documents(ids)
+    for doc_id, status, chunk_count, token_count in finished:
+        print(f"Document {doc_id} parsing finished with status: {status}, chunks: {chunk_count}, tokens: {token_count}")
+except KeyboardInterrupt:
+    print("\nParsing interrupted by user. All pending tasks have been cancelled.")
+except Exception as e:
+    print(f"Parsing failed: {e}")
+```
+
+---
+
 ### Stop parsing documents

 ```python
--- a/docs/references/supported_models.mdx
+++ b/docs/references/supported_models.mdx
@ -33,7 +33,7 @@ A complete list of models supported by RAGFlow, which will continue to expand.
 | Jina                  |                    | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |
 | LeptonAI              | :heavy_check_mark: |                    |                    |                    |                    |                    |
 | LocalAI               | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: |                    |                    |
-| LM-Studio             | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                    |                    |
+| LM-Studio             | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: |                    |                    |
 | MiniMax               | :heavy_check_mark: |                    |                    |                    |                    |                    |
 | Mistral               | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |                    |
 | ModelScope            | :heavy_check_mark: |                    |                    |                    |                    |                    |
--- a/docs/release_notes.md
+++ b/docs/release_notes.md
@ -9,8 +9,8 @@ Key features, improvements and bug fixes in the latest releases.

 :::info
 Each RAGFlow release is available in two editions:
- **Slim edition**: excludes built-in embedding models and is identified by a **-slim** suffix added to the version name. Example: `infiniflow/ragflow:v0.20.5-slim`
- **Full edition**: includes built-in embedding models and has no suffix added to the version name. Example: `infiniflow/ragflow:v0.20.5`
+- **Slim edition**: excludes built-in embedding models and is identified by a **-slim** suffix added to the version name. Example: `infiniflow/ragflow:v0.21.0-slim`
+- **Full edition**: includes built-in embedding models and has no suffix added to the version name. Example: `infiniflow/ragflow:v0.21.0`
 :::

 :::danger IMPORTANT
@ -22,6 +22,34 @@ The embedding models included in a full edition are:
 These two embedding models are optimized specifically for English and Chinese, so performance may be compromised if you use them to embed documents in other languages.
 :::

+## v0.21.0
+
+Released on October 15, 2025.
+
+### New features
+
+- Orchestratable ingestion pipeline: Supports customized data ingestion and cleansing workflows, enabling users to flexibly design their data flows or directly apply the official data flow templates on the canvas.
+- GraphRAG & RAPTOR write process optimized: Replaces the automatic incremental build process with manual batch building, significantly reducing construction overhead.
+- Long-context RAG: Automatically generates document-level table of contents (TOC) structures to mitigate context loss caused by inaccurate or excessive chunking, substantially improving retrieval quality. This feature is now available via a TOC extraction template.
+- Video file parsing: Expands the system's multimodal data processing capabilities by supporting video file parsing.
+- Admin CLI: Introduces a new command-line tool for system administration, allowing users to manage and monitor RAGFlow's service status via command line.
+
+### Improvements
+
+- Redesigns RAGFlow's Login and Registration pages.
+- Upgrades RAGFlow's document engine Infinity to v0.6.0.
+
+### Added models
+
+- Tongyi Qwen 3 series
+- Claude Sonnet 4.5
+- Meituan LongCat-Flash-Thinking
+
+## New agent templates
+
+- Company Research Report Deep Dive Agent: Designed for financial institutions to help analysts quickly organize information, generate research reports, and make investment decisions.
+- Orchestratable Ingestion Pipeline Template: Allows users to apply this template on the canvas to rapidly establish standardized data ingestion and cleansing processes.
+
 ## v0.20.5

 Released on September 10, 2025.
@ -580,7 +608,7 @@ Released on September 30, 2024.

 ### Compatibility changes

-From this release onwards, RAGFlow offers slim editions of its Docker images to improve the experience for users with limited Internet access. A slim edition of RAGFlow's Docker image does not include built-in BGE/BCE embedding models and has a size of about 1GB; a full edition of RAGFlow is approximately 9GB and includes both built-in embedding models and embedding models that will be downloaded once you select them in the RAGFlow UI.
+From this release onwards, RAGFlow offers slim editions of its Docker images to improve the experience for users with limited Internet access. A slim edition of RAGFlow's Docker image does not include built-in BGE/BCE embedding models and has a size of about 1GB; a full edition of RAGFlow is approximately 9GB and includes two built-in embedding models.

 The default Docker image edition is `nightly-slim`. The following list clarifies the differences between various editions:

--- a/download_deps.py
+++ b/download_deps.py
@ -16,7 +16,7 @@ import os
 import urllib.request
 import argparse

-def get_urls(use_china_mirrors=False) -> Union[str, list[str]]:
+def get_urls(use_china_mirrors=False) -> list[Union[str, list[str]]]:
    if use_china_mirrors:
        return [
            "http://mirrors.tuna.tsinghua.edu.cn/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1f-1ubuntu2_amd64.deb",
--- a/graphrag/entity_resolution.py
+++ b/graphrag/entity_resolution.py
@ -60,7 +60,7 @@ class EntityResolution(Extractor):
        self._llm = llm_invoker
        self._resolution_prompt = ENTITY_RESOLUTION_PROMPT
        self._record_delimiter_key = "record_delimiter"
-        self._entity_index_dilimiter_key = "entity_index_delimiter"
+        self._entity_index_delimiter_key = "entity_index_delimiter"
        self._resolution_result_delimiter_key = "resolution_result_delimiter"
        self._input_text_key = "input_text"

@ -77,7 +77,7 @@ class EntityResolution(Extractor):
            **prompt_variables,
            self._record_delimiter_key: prompt_variables.get(self._record_delimiter_key)
                                        or DEFAULT_RECORD_DELIMITER,
-            self._entity_index_dilimiter_key: prompt_variables.get(self._entity_index_dilimiter_key)
+            self._entity_index_delimiter_key: prompt_variables.get(self._entity_index_delimiter_key)
                                              or DEFAULT_ENTITY_INDEX_DELIMITER,
            self._resolution_result_delimiter_key: prompt_variables.get(self._resolution_result_delimiter_key)
                                                   or DEFAULT_RESOLUTION_RESULT_DELIMITER,
@ -185,7 +185,7 @@ class EntityResolution(Extractor):
        result = self._process_results(len(candidate_resolution_i[1]), response,
                                       self.prompt_variables.get(self._record_delimiter_key,
                                                            DEFAULT_RECORD_DELIMITER),
-                                       self.prompt_variables.get(self._entity_index_dilimiter_key,
+                                       self.prompt_variables.get(self._entity_index_delimiter_key,
                                                            DEFAULT_ENTITY_INDEX_DELIMITER),
                                       self.prompt_variables.get(self._resolution_result_delimiter_key,
                                                            DEFAULT_RESOLUTION_RESULT_DELIMITER))
--- a/graphrag/utils.py
+++ b/graphrag/utils.py
@ -92,10 +92,7 @@ def dict_has_keys_with_types(data: dict, expected_fields: list[tuple[str, type]]

 def get_llm_cache(llmnm, txt, history, genconf):
    hasher = xxhash.xxh64()
-    hasher.update(str(llmnm).encode("utf-8"))
-    hasher.update(str(txt).encode("utf-8"))
-    hasher.update(str(history).encode("utf-8"))
-    hasher.update(str(genconf).encode("utf-8"))
+    hasher.update((str(llmnm)+str(txt)+str(history)+str(genconf)).encode("utf-8"))

    k = hasher.hexdigest()
    bin = REDIS_CONN.get(k)
@ -106,11 +103,7 @@ def get_llm_cache(llmnm, txt, history, genconf):

 def set_llm_cache(llmnm, txt, v, history, genconf):
    hasher = xxhash.xxh64()
-    hasher.update(str(llmnm).encode("utf-8"))
-    hasher.update(str(txt).encode("utf-8"))
-    hasher.update(str(history).encode("utf-8"))
-    hasher.update(str(genconf).encode("utf-8"))
-
+    hasher.update((str(llmnm)+str(txt)+str(history)+str(genconf)).encode("utf-8"))
    k = hasher.hexdigest()
    REDIS_CONN.set(k, v.encode("utf-8"), 24 * 3600)

--- a/helm/values.yaml
+++ b/helm/values.yaml
@ -56,7 +56,7 @@ env:
 ragflow:
  image:
    repository: infiniflow/ragflow
-    tag: v0.20.5-slim
+    tag: v0.21.0-slim
    pullPolicy: IfNotPresent
    pullSecrets: []
  # Optional service configuration overrides
@ -96,7 +96,7 @@ ragflow:
 infinity:
  image:
    repository: infiniflow/infinity
-    tag: v0.6.0-dev7
+    tag: v0.6.0
    pullPolicy: IfNotPresent
    pullSecrets: []
  storage:
--- a/pyproject.toml
+++ b/pyproject.toml
@ -1,6 +1,6 @@
 [project]
 name = "ragflow"
-version = "0.20.5"
+version = "0.21.0"
 description = "[RAGFlow](https://ragflow.io/) is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It offers a streamlined RAG workflow for businesses of any scale, combining LLM (Large Language Models) to provide truthful question-answering capabilities, backed by well-founded citations from various complex formatted data."
 authors = [{ name = "Zhichang Yu", email = "yuzhichang@gmail.com" }]
 license-files = ["LICENSE"]
@ -46,7 +46,7 @@ dependencies = [
    "html-text==0.6.2",
    "httpx[socks]==0.27.2",
    "huggingface-hub>=0.25.0,<0.26.0",
-    "infinity-sdk==0.6.0.dev7",
+    "infinity-sdk==0.6.0",
    "infinity-emb>=0.0.66,<0.0.67",
    "itsdangerous==2.1.2",
    "json-repair==0.35.0",
--- a/rag/app/naive.py
+++ b/rag/app/naive.py
@ -328,7 +328,7 @@ class Pdf(PdfParser):
        callback(0.65, "Table analysis ({:.2f}s)".format(timer() - start))

        start = timer()
-        self._text_merge()
+        self._text_merge(zoomin=zoomin)
        callback(0.67, "Text merged ({:.2f}s)".format(timer() - start))

        if separate_tables_figures:
@ -340,6 +340,7 @@ class Pdf(PdfParser):
            tbls = self._extract_table_figure(True, zoomin, True, True)
            self._naive_vertical_merge()
            self._concat_downward()
+            self._final_reading_order_merge()
            # self._filter_forpages()
            logging.info("layouts cost: {}s".format(timer() - first_start))
            return [(b["text"], self._line_tag(b, zoomin)) for b in self.boxes], tbls
--- a/rag/flow/hierarchical_merger/hierarchical_merger.py
+++ b/rag/flow/hierarchical_merger/hierarchical_merger.py
@ -166,7 +166,7 @@ class HierarchicalMerger(ProcessBase):
                img = None
                for i in path:
                    txt += lines[i] + "\n"
-                    concat_img(img, id2image(section_images[i], partial(STORAGE_IMPL.get)))
+                    concat_img(img, id2image(section_images[i], partial(STORAGE_IMPL.get, tenant_id=self._canvas._tenant_id)))
                cks.append(txt)
                images.append(img)

@ -180,7 +180,7 @@ class HierarchicalMerger(ProcessBase):
            ]
            async with trio.open_nursery() as nursery:
                for d in cks:
-                    nursery.start_soon(image2id, d, partial(STORAGE_IMPL.put), get_uuid())
+                    nursery.start_soon(image2id, d, partial(STORAGE_IMPL.put, tenant_id=self._canvas._tenant_id), get_uuid())
            self.set_output("chunks", cks)

        self.callback(1, "Done.")
--- a/rag/flow/parser/parser.py
+++ b/rag/flow/parser/parser.py
@ -184,8 +184,6 @@ class ParserParam(ProcessParamBase):
        audio_config = self.setups.get("audio", "")
        if audio_config:
            self.check_empty(audio_config.get("llm_id"), "Audio VLM")
-            audio_language = audio_config.get("lang", "")
-            self.check_empty(audio_language, "Language")

        email_config = self.setups.get("email", "")
        if email_config:
@ -348,15 +346,13 @@ class Parser(ProcessBase):

        conf = self._param.setups["audio"]
        self.set_output("output_format", conf["output_format"])
-
-        lang = conf["lang"]
        _, ext = os.path.splitext(name)
        with tempfile.NamedTemporaryFile(suffix=ext) as tmpf:
            tmpf.write(blob)
            tmpf.flush()
            tmp_path = os.path.abspath(tmpf.name)

-            seq2txt_mdl = LLMBundle(self._canvas.get_tenant_id(), LLMType.SPEECH2TEXT, lang=lang)
+            seq2txt_mdl = LLMBundle(self._canvas.get_tenant_id(), LLMType.SPEECH2TEXT)
            txt = seq2txt_mdl.transcription(tmp_path)

            self.set_output("text", txt)
@ -366,6 +362,7 @@ class Parser(ProcessBase):

        email_content = {}
        conf = self._param.setups["email"]
+        self.set_output("output_format", conf["output_format"])
        target_fields = conf["fields"]

        _, ext = os.path.splitext(name)
@ -403,8 +400,8 @@ class Parser(ProcessBase):

                _add_content(msg, msg.get_content_type())

-                email_content["text"] = body_text
-                email_content["text_html"] = body_html
+                email_content["text"] = "\n".join(body_text)
+                email_content["text_html"] = "\n".join(body_html)
            # get attachment
            if "attachments" in target_fields:
                attachments = []
@ -414,7 +411,7 @@ class Parser(ProcessBase):
                        dispositions = content_disposition.strip().split(";")
                        if dispositions[0].lower() == "attachment":
                            filename = part.get_filename()
-                            payload = part.get_payload(decode=True)
+                            payload = part.get_payload(decode=True).decode(part.get_content_charset())
                            attachments.append({
                                "filename": filename,
                                "payload": payload,
@ -442,15 +439,16 @@ class Parser(ProcessBase):
            }
            # get body
            if "body" in target_fields:
-                email_content["text"] = msg.body  # usually empty. try text_html instead
-                email_content["text_html"] = msg.htmlBody
+                email_content["text"] = msg.body[0] if isinstance(msg.body, list) and msg.body else msg.body
+                if not email_content["text"] and msg.htmlBody:
+                    email_content["text"] = msg.htmlBody[0] if isinstance(msg.htmlBody, list) and msg.htmlBody else msg.htmlBody
            # get attachments
            if "attachments" in target_fields:
                attachments = []
                for t in msg.attachments:
                    attachments.append({
                        "filename": t.name,
-                        "payload": t.data  # binary
+                        "payload": t.data.decode("utf-8")
                    })
                email_content["attachments"] = attachments

@ -514,4 +512,4 @@ class Parser(ProcessBase):
        outs = self.output()
        async with trio.open_nursery() as nursery:
            for d in outs.get("json", []):
-                nursery.start_soon(image2id, d, partial(STORAGE_IMPL.put), get_uuid())
+                nursery.start_soon(image2id, d, partial(STORAGE_IMPL.put, tenant_id=self._canvas._tenant_id), get_uuid())
--- a/rag/flow/splitter/schema.py
+++ b/rag/flow/splitter/schema.py
@ -25,7 +25,7 @@ class SplitterFromUpstream(BaseModel):
    file: dict | None = Field(default=None)
    chunks: list[dict[str, Any]] | None = Field(default=None)

-    output_format: Literal["json", "markdown", "text", "html"] | None = Field(default=None)
+    output_format: Literal["json", "markdown", "text", "html", "chunks"] | None = Field(default=None)

    json_result: list[dict[str, Any]] | None = Field(default=None, alias="json")
    markdown_result: str | None = Field(default=None, alias="markdown")
--- a/rag/flow/splitter/splitter.py
+++ b/rag/flow/splitter/splitter.py
@ -87,7 +87,7 @@ class Splitter(ProcessBase):
        sections, section_images = [], []
        for o in from_upstream.json_result or []:
            sections.append((o.get("text", ""), o.get("position_tag", "")))
-            section_images.append(id2image(o.get("img_id"), partial(STORAGE_IMPL.get)))
+            section_images.append(id2image(o.get("img_id"), partial(STORAGE_IMPL.get, tenant_id=self._canvas._tenant_id)))

        chunks, images = naive_merge_with_images(
            sections,
@ -106,6 +106,6 @@ class Splitter(ProcessBase):
        ]
        async with trio.open_nursery() as nursery:
            for d in cks:
-                nursery.start_soon(image2id, d, partial(STORAGE_IMPL.put), get_uuid())
+                nursery.start_soon(image2id, d, partial(STORAGE_IMPL.put, tenant_id=self._canvas._tenant_id), get_uuid())
        self.set_output("chunks",  cks)
        self.callback(1, "Done.")
--- a/rag/flow/tokenizer/tokenizer.py
+++ b/rag/flow/tokenizer/tokenizer.py
@ -126,7 +126,7 @@ class Tokenizer(ProcessBase):
                    if ck.get("summary"):
                        ck["content_ltks"] = rag_tokenizer.tokenize(str(ck["summary"]))
                        ck["content_sm_ltks"] = rag_tokenizer.fine_grained_tokenize(ck["content_ltks"])
-                    else:
+                    elif ck.get("text"):
                        ck["content_ltks"] = rag_tokenizer.tokenize(ck["text"])
                        ck["content_sm_ltks"] = rag_tokenizer.fine_grained_tokenize(ck["content_ltks"])
                    if i % 100 == 99:
@ -155,6 +155,8 @@ class Tokenizer(ProcessBase):
                for i, ck in enumerate(chunks):
                    ck["title_tks"] = rag_tokenizer.tokenize(re.sub(r"\.[a-zA-Z]+$", "", from_upstream.name))
                    ck["title_sm_tks"] = rag_tokenizer.fine_grained_tokenize(ck["title_tks"])
+                    if not ck.get("text"):
+                        continue
                    ck["content_ltks"] = rag_tokenizer.tokenize(ck["text"])
                    ck["content_sm_ltks"] = rag_tokenizer.fine_grained_tokenize(ck["content_ltks"])
                    if i % 100 == 99:
--- a/rag/llm/chat_model.py
+++ b/rag/llm/chat_model.py
@ -132,8 +132,7 @@ class Base(ABC):
            "tool_choice",
            "logprobs",
            "top_logprobs",
-            "extra_headers",
-            "enable_thinking"
+            "extra_headers"
        }

        gen_conf = {k: v for k, v in gen_conf.items() if k in allowed_conf}
@ -142,6 +141,22 @@ class Base(ABC):

    def _chat(self, history, gen_conf, **kwargs):
        logging.info("[HISTORY]" + json.dumps(history, ensure_ascii=False, indent=2))
+        if self.model_name.lower().find("qwq") >= 0:
+            logging.info(f"[INFO] {self.model_name} detected as reasoning model, using _chat_streamly")
+
+            final_ans = ""
+            tol_token = 0
+            for delta, tol in self._chat_streamly(history, gen_conf, with_reasoning=False, **kwargs):
+                if delta.startswith("<think>") or delta.endswith("</think>"):
+                    continue
+                final_ans += delta
+                tol_token = tol
+
+            if len(final_ans.strip()) == 0:
+                final_ans = "**ERROR**: Empty response from reasoning model"
+
+            return final_ans.strip(), tol_token
+
        if self.model_name.lower().find("qwen3") >= 0:
            kwargs["extra_body"] = {"enable_thinking": False}

@ -1167,13 +1182,43 @@ class GoogleChat(Base):
        else:
            if "max_tokens" in gen_conf:
                gen_conf["max_output_tokens"] = gen_conf["max_tokens"]
+                del gen_conf["max_tokens"]
            for k in list(gen_conf.keys()):
                if k not in ["temperature", "top_p", "max_output_tokens"]:
                    del gen_conf[k]
        return gen_conf

+    def _get_thinking_config(self, gen_conf):
+        """Extract and create ThinkingConfig from gen_conf.
+
+        Default behavior for Vertex AI Generative Models: thinking_budget=0 (disabled)
+        unless explicitly specified by the user. This does not apply to Claude models.
+
+        Users can override by setting thinking_budget in gen_conf/llm_setting:
+        - 0: Disabled (default)
+        - 1-24576: Manual budget
+        - -1: Auto (model decides)
+        """
+        # Claude models don't support ThinkingConfig
+        if "claude" in self.model_name:
+            gen_conf.pop("thinking_budget", None)
+            return None
+
+        # For Vertex AI Generative Models, default to thinking disabled
+        thinking_budget = gen_conf.pop("thinking_budget", 0)
+
+        if thinking_budget is not None:
+            try:
+                import vertexai.generative_models as glm  # type: ignore
+                return glm.ThinkingConfig(thinking_budget=thinking_budget)
+            except Exception:
+                pass
+        return None
+
    def _chat(self, history, gen_conf={}, **kwargs):
        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
+        thinking_config = self._get_thinking_config(gen_conf)
+        gen_conf = self._clean_conf(gen_conf)
        if "claude" in self.model_name:
            response = self.client.messages.create(
                model=self.model_name,
@ -1206,7 +1251,10 @@ class GoogleChat(Base):
                    }
                ]

-        response = self.client.generate_content(hist, generation_config=gen_conf)
+        if thinking_config:
+            response = self.client.generate_content(hist, generation_config=gen_conf, thinking_config=thinking_config)
+        else:
+            response = self.client.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count

@ -1235,9 +1283,13 @@ class GoogleChat(Base):

            yield total_tokens
        else:
+            response = None
+            total_tokens = 0
            self.client._system_instruction = system
+            thinking_config = self._get_thinking_config(gen_conf)
            if "max_tokens" in gen_conf:
                gen_conf["max_output_tokens"] = gen_conf["max_tokens"]
+                del gen_conf["max_tokens"]
            for k in list(gen_conf.keys()):
                if k not in ["temperature", "top_p", "max_output_tokens"]:
                    del gen_conf[k]
@ -1245,18 +1297,26 @@ class GoogleChat(Base):
                if "role" in item and item["role"] == "assistant":
                    item["role"] = "model"
                if "content" in item:
-                    item["parts"] = item.pop("content")
+                    item["parts"] = [
+                        {
+                            "text": item.pop("content"),
+                        }
+                    ]
            ans = ""
            try:
-                response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
+                if thinking_config:
+                    response = self.client.generate_content(history, generation_config=gen_conf, thinking_config=thinking_config, stream=True)
+                else:
+                    response = self.client.generate_content(history, generation_config=gen_conf, stream=True)
                for resp in response:
                    ans = resp.text
+                    total_tokens += num_tokens_from_string(ans)
                    yield ans

            except Exception as e:
                yield ans + "\n**ERROR**: " + str(e)

-            yield response._chunks[-1].usage_metadata.total_token_count
+            yield total_tokens


 class GPUStackChat(Base):
--- a/rag/llm/sequence2txt_model.py
+++ b/rag/llm/sequence2txt_model.py
@ -234,8 +234,8 @@ class DeepInfraSeq2txt(Base):

        self.client = OpenAI(api_key=key, base_url=base_url)
        self.model_name = model_name
-        
-        
+
+
 class CometAPISeq2txt(Base):
    _FACTORY_NAME = "CometAPI"

@ -244,7 +244,8 @@ class CometAPISeq2txt(Base):
            base_url = "https://api.cometapi.com/v1"
        self.client = OpenAI(api_key=key, base_url=base_url)
        self.model_name = model_name
-        
+
+
 class DeerAPISeq2txt(Base):
    _FACTORY_NAME = "DeerAPI"

@ -253,3 +254,44 @@ class DeerAPISeq2txt(Base):
            base_url = "https://api.deerapi.com/v1"
        self.client = OpenAI(api_key=key, base_url=base_url)
        self.model_name = model_name
+
+
+class ZhipuSeq2txt(Base):
+    _FACTORY_NAME = "ZHIPU-AI"
+
+    def __init__(self, key, model_name="glm-asr", base_url="https://open.bigmodel.cn/api/paas/v4", **kwargs):
+        if not base_url:
+            base_url = "https://open.bigmodel.cn/api/paas/v4"
+        self.base_url = base_url
+        self.api_key = key
+        self.model_name = model_name
+        self.gen_conf = kwargs.get("gen_conf", {})
+        self.stream = kwargs.get("stream", False)
+
+    def transcription(self, audio_path):
+        payload = {
+            "model": self.model_name,
+            "temperature": str(self.gen_conf.get("temperature", 0.75)) or "0.75",
+            "stream": self.stream,
+        }
+
+        headers = {"Authorization": f"Bearer {self.api_key}"}
+        with open(audio_path, "rb") as audio_file:
+            files = {"file": audio_file}
+
+            try:
+                response = requests.post(
+                    url=f"{self.base_url}/audio/transcriptions",
+                    data=payload,
+                    files=files,
+                    headers=headers,
+                )
+                body = response.json()
+                if response.status_code == 200:
+                    full_content = body["text"]
+                    return full_content, num_tokens_from_string(full_content)
+                else:
+                    error = body["error"]
+                    return f"**ERROR**: code: {error['code']}, message: {error['message']}", 0
+            except Exception as e:
+                return "**ERROR**: " + str(e), 0
--- a/rag/nlp/init.py
+++ b/rag/nlp/init.py
@ -613,13 +613,13 @@ def naive_merge(sections: str | list, chunk_token_num=128, delimiter="\n。；
    dels = get_delimiters(delimiter)
    for sec, pos in sections:
        if num_tokens_from_string(sec) < chunk_token_num:
-            add_chunk(sec, pos)
+            add_chunk("\n"+sec, pos)
            continue
        split_sec = re.split(r"(%s)" % dels, sec, flags=re.DOTALL)
        for sub_sec in split_sec:
            if re.match(f"^{dels}$", sub_sec):
                continue
-            add_chunk(sub_sec, pos)
+            add_chunk("\n"+sub_sec, pos)

    return cks

@ -669,13 +669,13 @@ def naive_merge_with_images(texts, images, chunk_token_num=128, delimiter="\n。
            for sub_sec in split_sec:
                if re.match(f"^{dels}$", sub_sec):
                    continue
-                add_chunk(sub_sec, image, text_pos)
+                add_chunk("\n"+sub_sec, image, text_pos)
        else:
            split_sec = re.split(r"(%s)" % dels, text)
            for sub_sec in split_sec:
                if re.match(f"^{dels}$", sub_sec):
                    continue
-                add_chunk(sub_sec, image)
+                add_chunk("\n"+sub_sec, image)

    return cks, result_images

@ -757,7 +757,7 @@ def naive_merge_docx(sections, chunk_token_num=128, delimiter="\n。；！？"):
        for sub_sec in split_sec:
            if re.match(f"^{dels}$", sub_sec):
                continue
-            add_chunk(sub_sec, image,"")
+            add_chunk("\n"+sub_sec, image,"")
        line = ""

    if line:
@ -765,7 +765,7 @@ def naive_merge_docx(sections, chunk_token_num=128, delimiter="\n。；！？"):
        for sub_sec in split_sec:
            if re.match(f"^{dels}$", sub_sec):
                continue
-            add_chunk(sub_sec, image,"")
+            add_chunk("\n"+sub_sec, image,"")

    return cks, images

--- a/rag/nlp/search.py
+++ b/rag/nlp/search.py
@ -13,12 +13,14 @@
 #  See the License for the specific language governing permissions and
 #  limitations under the License.
 #
+import json
 import logging
 import re
 import math
 from collections import OrderedDict
 from dataclasses import dataclass

+from rag.prompts.generator import relevant_chunks_with_toc
 from rag.settings import TAG_FLD, PAGERANK_FLD
 from rag.utils import rmSpace, get_float
 from rag.nlp import rag_tokenizer, query
@ -514,3 +516,63 @@ class Dealer:
        tag_fea = sorted([(a, round(0.1*(c + 1) / (cnt + S) / max(1e-6, all_tags.get(a, 0.0001)))) for a, c in aggs],
                         key=lambda x: x[1] * -1)[:topn_tags]
        return {a.replace(".", "_"): max(1, c) for a, c in tag_fea}
+
+    def retrieval_by_toc(self, query:str, chunks:list[dict], tenant_ids:list[str], chat_mdl, topn: int=6):
+        if not chunks:
+            return []
+        idx_nms = [index_name(tid) for tid in tenant_ids]
+        ranks, doc_id2kb_id = {}, {}
+        for ck in chunks:
+            if ck["doc_id"] not in ranks:
+                ranks[ck["doc_id"]] = 0
+            ranks[ck["doc_id"]] += ck["similarity"]
+            doc_id2kb_id[ck["doc_id"]] = ck["kb_id"]
+        doc_id = sorted(ranks.items(), key=lambda x: x[1]*-1.)[0][0]
+        kb_ids = [doc_id2kb_id[doc_id]]
+        es_res = self.dataStore.search(["content_with_weight"], [], {"doc_id": doc_id, "toc_kwd": "toc"}, [], OrderByExpr(), 0, 128, idx_nms,
+                                       kb_ids)
+        toc = []
+        dict_chunks = self.dataStore.getFields(es_res, ["content_with_weight"])
+        for _, doc in dict_chunks.items():
+            try:
+                toc.extend(json.loads(doc["content_with_weight"]))
+            except Exception as e:
+                logging.exception(e)
+        if not toc:
+            return chunks
+
+        ids = relevant_chunks_with_toc(query, toc, chat_mdl, topn*2)
+        if not ids:
+            return chunks
+        
+        vector_size = 1024
+        id2idx = {ck["chunk_id"]: i for i, ck in enumerate(chunks)}
+        for cid, sim in ids:
+            if cid in id2idx:
+                chunks[id2idx[cid]]["similarity"] += sim
+                continue
+            chunk = self.dataStore.get(cid, idx_nms, kb_ids)
+            d = {
+                "chunk_id": cid,
+                "content_ltks": chunk["content_ltks"],
+                "content_with_weight": chunk["content_with_weight"],
+                "doc_id": doc_id,
+                "docnm_kwd": chunk.get("docnm_kwd", ""),
+                "kb_id": chunk["kb_id"],
+                "important_kwd": chunk.get("important_kwd", []),
+                "image_id": chunk.get("img_id", ""),
+                "similarity": sim,
+                "vector_similarity": sim,
+                "term_similarity": sim,
+                "vector": [0.0] * vector_size,
+                "positions": chunk.get("position_int", []),
+                "doc_type_kwd": chunk.get("doc_type_kwd", "")
+            }
+            for k in chunk.keys():
+                if k[-4:] == "_vec":
+                    d["vector"] = chunk[k]
+                    vector_size = len(chunk[k])
+                    break
+            chunks.append(d)
+
+        return sorted(chunks, key=lambda x:x["similarity"]*-1)[:topn]
--- a/rag/prompts/assign_toc_levels.md
+++ b/rag/prompts/assign_toc_levels.md
@ -1,4 +1,4 @@
-You are given a JSON array of TOC items. Each item has at least {"title": string} and may include an existing structure.
+You are given a JSON array of TOC(tabel of content) items. Each item has at least {"title": string} and may include an existing title hierarchical level.

 Task
 - For each item, assign a depth label using Arabic numerals only: top-level = 1, second-level = 2, third-level = 3, etc.
@ -9,7 +9,7 @@ Task

 Output
 - Return a valid JSON array only (no extra text).
- Each element must be {"structure": "1|2|3", "title": <original title string>}.
+- Each element must be {"level": "1|2|3", "title": <original title string>}.
 - title must be the original title string.

 Examples
@ -20,10 +20,10 @@ Input:

 Output:
 [
-  {"structure":"1","title":"Chapter 1 Methods"},
-  {"structure":"2","title":"Section 1 Definition"},
-  {"structure":"2","title":"Section 2 Process"},
-  {"structure":"1","title":"Chapter 2 Experiment"}
+  {"level":"1","title":"Chapter 1 Methods"},
+  {"level":"2","title":"Section 1 Definition"},
+  {"level":"2","title":"Section 2 Process"},
+  {"level":"1","title":"Chapter 2 Experiment"}
 ]

 Example B (parts with chapters)
@ -32,11 +32,11 @@ Input:

 Output:
 [
-  {"structure":"1","title":"Part I Theory"},
-  {"structure":"2","title":"Chapter 1 Basics"},
-  {"structure":"2","title":"Chapter 2 Methods"},
-  {"structure":"1","title":"Part II Applications"},
-  {"structure":"2","title":"Chapter 3 Case Studies"}
+  {"level":"1","title":"Part I Theory"},
+  {"level":"2","title":"Chapter 1 Basics"},
+  {"level":"2","title":"Chapter 2 Methods"},
+  {"level":"1","title":"Part II Applications"},
+  {"level":"2","title":"Chapter 3 Case Studies"}
 ]

 Example C (plain headings)
@ -45,9 +45,9 @@ Input:

 Output:
 [
-  {"structure":"1","title":"Introduction"},
-  {"structure":"2","title":"Background and Motivation"},
-  {"structure":"2","title":"Related Work"},
-  {"structure":"1","title":"Methodology"},
-  {"structure":"1","title":"Evaluation"}
+  {"level":"1","title":"Introduction"},
+  {"level":"2","title":"Background and Motivation"},
+  {"level":"2","title":"Related Work"},
+  {"level":"1","title":"Methodology"},
+  {"level":"1","title":"Evaluation"}
 ]
--- a/rag/prompts/generator.py
+++ b/rag/prompts/generator.py
@ -21,7 +21,9 @@ from copy import deepcopy
 from typing import Tuple
 import jinja2
 import json_repair
+import trio
 from api.utils import hash_str2int
+from rag.nlp import rag_tokenizer
 from rag.prompts.template import load_prompt
 from rag.settings import TAG_FLD
 from rag.utils import encoder, num_tokens_from_string
@ -122,7 +124,7 @@ def kb_prompt(kbinfos, max_tokens, hash_id=False):

    knowledges = []
    for i, ck in enumerate(kbinfos["chunks"][:chunks_num]):
-        cnt = "\nID: {}".format(i if not hash_id else hash_str2int(get_value(ck, "id", "chunk_id"), 100))
+        cnt = "\nID: {}".format(i if not hash_id else hash_str2int(get_value(ck, "id", "chunk_id"), 500))
        cnt += draw_node("Title", get_value(ck, "docnm_kwd", "document_name"))
        cnt += draw_node("URL", ck['url'])  if "url" in ck else ""
        for k, v in docs.get(get_value(ck, "doc_id", "document_id"), {}).items():
@ -440,11 +442,17 @@ def gen_meta_filter(chat_mdl, meta_data:dict, query: str) -> list:


 def gen_json(system_prompt:str, user_prompt:str, chat_mdl, gen_conf = None):
+    from graphrag.utils import get_llm_cache, set_llm_cache
+    cached = get_llm_cache(chat_mdl.llm_name, system_prompt, user_prompt, gen_conf)
+    if cached:
+        return json_repair.loads(cached)
    _, msg = message_fit_in(form_message(system_prompt, user_prompt), chat_mdl.max_length)
    ans = chat_mdl.chat(msg[0]["content"], msg[1:],gen_conf=gen_conf)
    ans = re.sub(r"(^.*</think>|```json\n|```\n*$)", "", ans, flags=re.DOTALL)
    try:
-        return json_repair.loads(ans)
+        res = json_repair.loads(ans)
+        set_llm_cache(chat_mdl.llm_name, system_prompt, ans, user_prompt, gen_conf)
+        return res
    except Exception:
        logging.exception(f"Loading json failure: {ans}")

@ -651,29 +659,32 @@ def toc_transformer(toc_pages, chat_mdl):

 TOC_LEVELS = load_prompt("assign_toc_levels")
 def assign_toc_levels(toc_secs, chat_mdl, gen_conf = {"temperature": 0.2}):
-    print("\nBegin TOC level assignment...\n")
-
-    ans = gen_json(
+    if not toc_secs:
+        return []
+    return gen_json(
        PROMPT_JINJA_ENV.from_string(TOC_LEVELS).render(),
        str(toc_secs),
        chat_mdl,
        gen_conf
    )
-    
-    return ans


 TOC_FROM_TEXT_SYSTEM = load_prompt("toc_from_text_system")
 TOC_FROM_TEXT_USER = load_prompt("toc_from_text_user")
 # Generate TOC from text chunks with text llms
-def gen_toc_from_text(text, chat_mdl):
-    ans = gen_json(
-        PROMPT_JINJA_ENV.from_string(TOC_FROM_TEXT_SYSTEM).render(),
-        PROMPT_JINJA_ENV.from_string(TOC_FROM_TEXT_USER).render(text=text),
-        chat_mdl,
-        gen_conf={"temperature": 0.0, "top_p": 0.9, "enable_thinking": False, }
-    )
-    return ans
+async def gen_toc_from_text(txt_info: dict, chat_mdl, callback=None):
+    try:
+        ans = gen_json(
+            PROMPT_JINJA_ENV.from_string(TOC_FROM_TEXT_SYSTEM).render(),
+            PROMPT_JINJA_ENV.from_string(TOC_FROM_TEXT_USER).render(text="\n".join([json.dumps(d, ensure_ascii=False) for d in txt_info["chunks"]])),
+            chat_mdl,
+            gen_conf={"temperature": 0.0, "top_p": 0.9}
+        )
+        txt_info["toc"] = ans if ans and not isinstance(ans, str) else []
+        if callback:
+            callback(msg="")
+    except Exception as e:
+        logging.exception(e)


 def split_chunks(chunks, max_length: int):
@ -690,44 +701,96 @@ def split_chunks(chunks, max_length: int):
        if batch_tokens + t > max_length:
            result.append(batch)
            batch, batch_tokens = [], 0
-        batch.append({"id": idx, "text": chunk})    
+        batch.append({idx: chunk})
        batch_tokens += t
    if batch:
        result.append(batch)
    return result


-def run_toc_from_text(chunks, chat_mdl):
+async def run_toc_from_text(chunks, chat_mdl, callback=None):
    input_budget = int(chat_mdl.max_length * INPUT_UTILIZATION) - num_tokens_from_string(
        TOC_FROM_TEXT_USER + TOC_FROM_TEXT_SYSTEM
    )

-    input_budget =  2000 if input_budget > 2000 else input_budget
+    input_budget =  1024 if input_budget > 1024 else input_budget
    chunk_sections = split_chunks(chunks, input_budget)
-    res = []
+    titles = []

-    for chunk in chunk_sections:
-        ans = gen_toc_from_text(chunk, chat_mdl)
-        res.extend(ans)
+    chunks_res = []
+    async with trio.open_nursery() as nursery:
+        for i, chunk in enumerate(chunk_sections):
+            if not chunk:
+                continue
+            chunks_res.append({"chunks": chunk})
+            nursery.start_soon(gen_toc_from_text, chunks_res[-1], chat_mdl, callback)
+
+    for chunk in chunks_res:
+        titles.extend(chunk.get("toc", []))
        
    # Filter out entries with title == -1
-    filtered = [x for x in res if x.get("title") and x.get("title") != "-1"]
+    prune = len(titles) > 512
+    max_len = 12 if prune else 22
+    filtered = []
+    for x in titles:
+        if not isinstance(x, dict) or not x.get("title") or x["title"] == "-1":
+            continue
+        if len(rag_tokenizer.tokenize(x["title"]).split(" ")) > max_len:
+            continue
+        if re.match(r"[0-9,.()/ -]+$", x["title"]):
+            continue
+        filtered.append(x)

-    print("\n\nFiltered TOC sections:\n", filtered)
+    logging.info(f"\n\nFiltered TOC sections:\n{filtered}")
+    if not filtered:
+        return []

-    # Generate initial structure (structure/title)
-    raw_structure = [{"structure": "0", "title": x.get("title", "")} for x in filtered]
+    # Generate initial level (level/title)
+    raw_structure = [x.get("title", "") for x in filtered]

    # Assign hierarchy levels using LLM
-    toc_with_levels = assign_toc_levels(raw_structure, chat_mdl, {"temperature": 0.0, "top_p": 0.9, "enable_thinking": False})
+    toc_with_levels = assign_toc_levels(raw_structure, chat_mdl, {"temperature": 0.0, "top_p": 0.9})
+    if not toc_with_levels:
+        return []

    # Merge structure and content (by index)
+    prune = len(toc_with_levels) > 512
+    max_lvl = sorted([t.get("level", "0") for t in toc_with_levels])[-1]
    merged = []
    for _ , (toc_item, src_item) in enumerate(zip(toc_with_levels, filtered)):
+        if prune and toc_item.get("level", "0") >= max_lvl:
+            continue
        merged.append({
-            "structure": toc_item.get("structure", "0"),
+            "level": toc_item.get("level", "0"),
            "title": toc_item.get("title", ""),
-            "content": src_item.get("content", ""),
+            "chunk_id": src_item.get("chunk_id", ""),
        })

-    return merged
+    return merged
+
+
+TOC_RELEVANCE_SYSTEM = load_prompt("toc_relevance_system")
+TOC_RELEVANCE_USER = load_prompt("toc_relevance_user")
+def relevant_chunks_with_toc(query: str, toc:list[dict], chat_mdl, topn: int=6):
+    import numpy as np
+    try:
+        ans = gen_json(
+            PROMPT_JINJA_ENV.from_string(TOC_RELEVANCE_SYSTEM).render(),
+            PROMPT_JINJA_ENV.from_string(TOC_RELEVANCE_USER).render(query=query, toc_json="[\n%s\n]\n"%"\n".join([json.dumps({"level": d["level"], "title":d["title"]}, ensure_ascii=False) for d in toc])),
+            chat_mdl,
+            gen_conf={"temperature": 0.0, "top_p": 0.9}
+        )
+        id2score = {}
+        for ti, sc in zip(toc, ans):
+            if not isinstance(sc, dict) or sc.get("score", -1) < 1:
+                continue
+            for id in ti.get("ids", []):
+                if id not in id2score:
+                    id2score[id] = []
+                id2score[id].append(sc["score"]/5.)
+        for id in id2score.keys():
+            id2score[id] = np.mean(id2score[id])
+        return [(id, sc) for id, sc in list(id2score.items()) if sc>=0.3][:topn]
+    except Exception as e:
+        logging.exception(e)
+    return []
--- a/rag/prompts/toc_from_text_system.md
+++ b/rag/prompts/toc_from_text_system.md
@ -1,25 +1,25 @@
 You are a robust Table-of-Contents (TOC) extractor.

 GOAL
-Given a dictionary of chunks {chunk_id: chunk_text}, extract TOC-like headings and return a strict JSON array of objects:
+Given a dictionary of chunks {"<chunk_ID>": chunk_text}, extract TOC-like headings and return a strict JSON array of objects:
 [
-  {"title": , "content": ""},
+  {"title": "", "chunk_id": ""},
  ...
 ]

 FIELDS
 - "title": the heading text (clean, no page numbers or leader dots).
  - If any part of a chunk has no valid heading, output that part as {"title":"-1", ...}.
- "content": the chunk_id (string).
+- "chunk_id": the chunk ID (string).
  - One chunk can yield multiple JSON objects in order (unmatched text + one or more headings).

 RULES
 1) Preserve input chunk order strictly.
 2) If a chunk contains multiple headings, expand them in order:
-   - Pre-heading narrative → {"title":"-1","content":chunk_id}
-   - Then each heading → {"title":"...","content":chunk_id}
-3) Do not merge outputs across chunks; each object refers to exactly one chunk_id.
-4) "title" must be non-empty (or exactly "-1"). "content" must be a string (chunk_id).
+   - Pre-heading narrative → {"title":"-1","chunk_id":"<chunk_ID>"}
+   - Then each heading → {"title":"...","chunk_id":"<chunk_ID>"}
+3) Do not merge outputs across chunks; each object refers to exactly one chunk ID.
+4) "title" must be non-empty (or exactly "-1"). "chunk_id" must be a string (chunk ID).
 5) When ambiguous, prefer "-1" unless the text strongly looks like a heading.

 HEADING DETECTION (cues, not hard rules)
@ -51,63 +51,69 @@ EXAMPLES

 Example 1 — No heading
 Input:
-{0: "Copyright page · Publication info (ISBN 123-456). All rights reserved."}
+[{"0": "Copyright page · Publication info (ISBN 123-456). All rights reserved."}, ...]
 Output:
 [
-  {"title":"-1","content":"0"}
+  {"title":"-1","chunk_id":"0"},
+  ...
 ]

 Example 2 — One heading
 Input:
-{1: "Chapter 1: General Provisions This chapter defines the overall rules…"}
+[{"1": "Chapter 1: General Provisions This chapter defines the overall rules…"}, ...]
 Output:
 [
-  {"title":"Chapter 1: General Provisions","content":"1"}
+  {"title":"Chapter 1: General Provisions","chunk_id":"1"},
+  ...
 ]

 Example 3 — Narrative + heading
 Input:
-{2: "This paragraph introduces the background and goals. Section 2: Definitions Key terms are explained…"}
+[{"2": "This paragraph introduces the background and goals. Section 2: Definitions Key terms are explained…"}, ...]
 Output:
 [
-  {"title":"-1","content":"2"},
-  {"title":"Section 2: Definitions","content":"2"}
+  {"title":"Section 2: Definitions","chunk_id":"2"},
+  ...
 ]

 Example 4 — Multiple headings in one chunk
 Input:
-{3: "Declarations and Commitments (I) Party B commits… (II) Party C commits… Appendix A Data Specification"}
+[{"3": "Declarations and Commitments (I) Party B commits… (II) Party C commits… Appendix A Data Specification"}, ...]
 Output:
 [
-  {"title":"Declarations and Commitments (I)","content":"3"},
-  {"title":"(II)","content":"3"},
-  {"title":"Appendix A","content":"3"}
+  {"title":"Declarations and Commitments","chunk_id":"3"},
+  {"title":"(I) Party B commits","chunk_id":"3"},
+  {"title":"(II) Party C commits","chunk_id":"3"},
+  {"title":"Appendix A Data Specification","chunk_id":"3"},
+  ...
 ]

 Example 5 — Numbering styles
 Input:
-{4: "1. Scope: Defines boundaries. 2) Definitions: Terms used. III) Methods Overview."}
+[{"4": "1. Scope: Defines boundaries. 2) Definitions: Terms used. III) Methods Overview."}, ...]
 Output:
 [
-  {"title":"1. Scope","content":"4"},
-  {"title":"2) Definitions","content":"4"},
-  {"title":"III) Methods","content":"4"}
+  {"title":"1. Scope","chunk_id":"4"},
+  {"title":"2) Definitions","chunk_id":"4"},
+  {"title":"III) Methods Overview","chunk_id":"4"},
+  ...
 ]

 Example 6 — Long list (NOT headings)
 Input:
-{5: "Item list: apples, bananas, strawberries, blueberries, mangos, peaches"}
+{"5": "Item list: apples, bananas, strawberries, blueberries, mangos, peaches"}, ...]
 Output:
 [
-  {"title":"-1","content":"5"}
+  {"title":"-1","chunk_id":"5"},
+  ...
 ]

 Example 7 — Mixed Chinese/English
 Input:
-{6: "（出版信息略）This standard follows industry practices. Chapter 1: Overview 摘要… 第2节：术语与缩略语"}
+{"6": "（出版信息略）This standard follows industry practices. Chapter 1: Overview 摘要… 第2节：术语与缩略语"}, ...]
 Output:
 [
-  {"title":"-1","content":"6"},
-  {"title":"Chapter 1: Overview","content":"6"},
-  {"title":"第2节：术语与缩略语","content":"6"}
+  {"title":"Chapter 1: Overview","chunk_id":"6"},
+  {"title":"第2节：术语与缩略语","chunk_id":"6"},
+  ...
 ]
--- a/rag/svr/task_executor.py
+++ b/rag/svr/task_executor.py
@ -12,7 +12,7 @@
 #  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 #  See the License for the specific language governing permissions and
 #  limitations under the License.
-
+import concurrent
 # from beartype import BeartypeConf
 # from beartype.claw import beartype_all  # <-- you didn't sign up for this
 # beartype_all(conf=BeartypeConf(violation_type=UserWarning))    # <-- emit warnings from all code
@ -32,7 +32,7 @@ from api.utils.log_utils import init_root_logger, get_project_base_directory
 from graphrag.general.index import run_graphrag_for_kb
 from graphrag.utils import get_llm_cache, set_llm_cache, get_tags_from_cache, set_tags_to_cache
 from rag.flow.pipeline import Pipeline
-from rag.prompts.generator import keyword_extraction, question_proposal, content_tagging
+from rag.prompts.generator import keyword_extraction, question_proposal, content_tagging, run_toc_from_text
 import logging
 import os
 from datetime import datetime
@ -317,7 +317,7 @@ async def build_chunks(task, progress_callback):
                d["img_id"] = ""
                docs.append(d)
                return
-            await image2id(d, partial(STORAGE_IMPL.put), d["id"], task["kb_id"])
+            await image2id(d, partial(STORAGE_IMPL.put, tenant_id=task["tenant_id"]), d["id"], task["kb_id"])
            docs.append(d)
        except Exception:
            logging.exception(
@ -419,6 +419,39 @@ async def build_chunks(task, progress_callback):
    return docs


+def build_TOC(task, docs, progress_callback):
+    progress_callback(msg="Start to generate table of content ...")
+    chat_mdl = LLMBundle(task["tenant_id"], LLMType.CHAT, llm_name=task["llm_id"], lang=task["language"])
+    docs = sorted(docs, key=lambda d:(
+        d.get("page_num_int", 0)[0] if isinstance(d.get("page_num_int", 0), list) else d.get("page_num_int", 0),
+        d.get("top_int", 0)[0] if isinstance(d.get("top_int", 0), list) else d.get("top_int", 0)
+    ))
+    toc: list[dict] = trio.run(run_toc_from_text, [d["content_with_weight"] for d in docs], chat_mdl, progress_callback)
+    logging.info("------------ T O C -------------\n"+json.dumps(toc, ensure_ascii=False, indent='  '))
+    ii = 0
+    while ii < len(toc):
+        try:
+            idx = int(toc[ii]["chunk_id"])
+            del toc[ii]["chunk_id"]
+            toc[ii]["ids"] = [docs[idx]["id"]]
+            if ii == len(toc) -1:
+                break
+            for jj in range(idx+1, int(toc[ii+1]["chunk_id"])+1):
+                toc[ii]["ids"].append(docs[jj]["id"])
+        except Exception as e:
+            logging.exception(e)
+        ii += 1
+
+    if toc:
+        d = copy.deepcopy(docs[-1])
+        d["content_with_weight"] = json.dumps(toc, ensure_ascii=False)
+        d["toc_kwd"] = "toc"
+        d["available_int"] = 0
+        d["page_num_int"] = 100000000
+        d["id"] = xxhash.xxh64((d["content_with_weight"] + str(d["doc_id"])).encode("utf-8", "surrogatepass")).hexdigest()
+        return d
+
+
 def init_kb(row, vector_size: int):
    idxnm = search.index_name(row["tenant_id"])
    return settings.docStoreConn.createIdx(idxnm, row.get("kb_id", ""), vector_size)
@ -659,7 +692,7 @@ async def run_raptor_for_kb(row, kb_parser_config, chat_mdl, embd_mdl, vector_si
        raptor_config["threshold"],
    )
    original_length = len(chunks)
-    chunks = await raptor(chunks, row["kb_parser_config"]["raptor"]["random_seed"], callback)
+    chunks = await raptor(chunks, kb_parser_config["raptor"]["random_seed"], callback)
    doc = {
        "doc_id": fake_doc_id,
        "kb_id": [str(row["kb_id"])],
@ -721,7 +754,7 @@ async def insert_es(task_id, task_tenant_id, task_dataset_id, chunks, progress_c
    return True


-@timeout(60*60*2, 1)
+@timeout(60*60*3, 1)
 async def do_handle_task(task):
    task_type = task.get("task_type", "")

@ -741,6 +774,8 @@ async def do_handle_task(task):
    task_document_name = task["name"]
    task_parser_config = task["parser_config"]
    task_start_ts = timer()
+    toc_thread = None
+    executor = concurrent.futures.ThreadPoolExecutor()

    # prepare the progress callback function
    progress_callback = partial(set_progress, task_id, task_from_page, task_to_page)
@ -782,8 +817,22 @@ async def do_handle_task(task):

        kb_parser_config = kb.parser_config
        if not kb_parser_config.get("raptor", {}).get("use_raptor", False):
-            progress_callback(prog=-1.0, msg="Internal error: Invalid RAPTOR configuration")
-            return
+            kb_parser_config.update(
+                {
+                    "raptor": {
+                        "use_raptor": True,
+                        "prompt": "Please summarize the following paragraphs. Be careful with the numbers, do not make things up. Paragraphs as following:\n      {cluster_content}\nThe above is the content you need to summarize.",
+                        "max_token": 256,
+                        "threshold": 0.1,
+                        "max_cluster": 64,
+                        "random_seed": 0,
+                    },
+                }
+            )
+            if not KnowledgebaseService.update_by_id(kb.id, {"parser_config":kb_parser_config}):
+                progress_callback(prog=-1.0, msg="Internal error: Invalid RAPTOR configuration")
+                return
+
        # bind LLM for raptor
        chat_model = LLMBundle(task_tenant_id, LLMType.CHAT, llm_name=task_llm_id, lang=task_language)
        # run RAPTOR
@ -806,8 +855,25 @@ async def do_handle_task(task):

        kb_parser_config = kb.parser_config
        if not kb_parser_config.get("graphrag", {}).get("use_graphrag", False):
-            progress_callback(prog=-1.0, msg="Internal error: Invalid GraphRAG configuration")
-            return
+            kb_parser_config.update(
+                {
+                    "graphrag": {
+                        "use_graphrag": True,
+                        "entity_types": [
+                            "organization",
+                            "person",
+                            "geo",
+                            "event",
+                            "category",
+                        ],
+                        "method": "light",
+                    }
+                }
+            )
+            if not KnowledgebaseService.update_by_id(kb.id, {"parser_config":kb_parser_config}):
+                progress_callback(prog=-1.0, msg="Internal error: Invalid GraphRAG configuration")
+                return
+

        graphrag_conf = kb_parser_config.get("graphrag", {})
        start_ts = timer()
@ -842,8 +908,6 @@ async def do_handle_task(task):
        if not chunks:
            progress_callback(1., msg=f"No chunk built from {task_document_name}")
            return
-        # TODO: exception handler
-        ## set_progress(task["did"], -1, "ERROR: ")
        progress_callback(msg="Generate {} chunks".format(len(chunks)))
        start_ts = timer()
        try:
@ -857,6 +921,8 @@ async def do_handle_task(task):
        progress_message = "Embedding chunks ({:.2f}s)".format(timer() - start_ts)
        logging.info(progress_message)
        progress_callback(msg=progress_message)
+        if task["parser_id"].lower() == "naive" and task["parser_config"].get("toc_extraction", False):
+            toc_thread = executor.submit(build_TOC,task, chunks, progress_callback)

    chunk_count = len(set([chunk["id"] for chunk in chunks]))
    start_ts = timer()
@ -871,8 +937,17 @@ async def do_handle_task(task):
    DocumentService.increment_chunk_num(task_doc_id, task_dataset_id, token_count, chunk_count, 0)

    time_cost = timer() - start_ts
+    progress_callback(msg="Indexing done ({:.2f}s).".format(time_cost))
+    if toc_thread:
+        d = toc_thread.result()
+        if d:
+            e = await insert_es(task_id, task_tenant_id, task_dataset_id, [d], progress_callback)
+            if not e:
+                return
+            DocumentService.increment_chunk_num(task_doc_id, task_dataset_id, 0, 1, 0)
+
    task_time_cost = timer() - task_start_ts
-    progress_callback(prog=1.0, msg="Indexing done ({:.2f}s). Task done ({:.2f}s)".format(time_cost, task_time_cost))
+    progress_callback(prog=1.0, msg="Task done ({:.2f}s)".format(task_time_cost))
    logging.info(
        "Chunk doc({}), page({}-{}), chunks({}), token({}), elapsed:{:.2f}".format(task_document_name, task_from_page,
                                                                                   task_to_page, len(chunks),
--- a/rag/utils/minio_conn.py
+++ b/rag/utils/minio_conn.py
@ -60,7 +60,7 @@ class RAGFlowMinio:
                                 )
        return r

-    def put(self, bucket, fnm, binary):
+    def put(self, bucket, fnm, binary, tenant_id=None):
        for _ in range(3):
            try:
                if not self.conn.bucket_exists(bucket):
@ -76,13 +76,13 @@ class RAGFlowMinio:
                self.__open__()
                time.sleep(1)

-    def rm(self, bucket, fnm):
+    def rm(self, bucket, fnm, tenant_id=None):
        try:
            self.conn.remove_object(bucket, fnm)
        except Exception:
            logging.exception(f"Fail to remove {bucket}/{fnm}:")

-    def get(self, bucket, filename):
+    def get(self, bucket, filename, tenant_id=None):
        for _ in range(1):
            try:
                r = self.conn.get_object(bucket, filename)
@ -93,7 +93,7 @@ class RAGFlowMinio:
                time.sleep(1)
        return

-    def obj_exist(self, bucket, filename):
+    def obj_exist(self, bucket, filename, tenant_id=None):
        try:
            if not self.conn.bucket_exists(bucket):
                return False
@ -121,7 +121,7 @@ class RAGFlowMinio:
            logging.exception(f"bucket_exist {bucket} got exception")
            return False

-    def get_presigned_url(self, bucket, fnm, expires):
+    def get_presigned_url(self, bucket, fnm, expires, tenant_id=None):
        for _ in range(10):
            try:
                return self.conn.get_presigned_url("GET", bucket, fnm, expires)
--- a/sdk/python/pyproject.toml
+++ b/sdk/python/pyproject.toml
@ -1,6 +1,6 @@
 [project]
 name = "ragflow-sdk"
-version = "0.20.5"
+version = "0.21.0"
 description = "Python client sdk of [RAGFlow](https://github.com/infiniflow/ragflow). RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding."
 authors = [{ name = "Zhichang Yu", email = "yuzhichang@gmail.com" }]
 license = { text = "Apache License, Version 2.0" }
--- a/sdk/python/ragflow_sdk/modules/dataset.py
+++ b/sdk/python/ragflow_sdk/modules/dataset.py
@ -100,12 +100,51 @@ class DataSet(Base):
        res = res.json()
        if res.get("code") != 0:
            raise Exception(res["message"])
-
+        
+    
+    def _get_documents_status(self, document_ids):
+        import time
+        terminal_states = {"DONE", "FAIL", "CANCEL"}
+        interval_sec = 1
+        pending = set(document_ids)
+        finished = []
+        while pending:
+            for doc_id in list(pending):
+                def fetch_doc(doc_id: str) -> Document | None:
+                    try:
+                        docs = self.list_documents(id=doc_id)
+                        return docs[0] if docs else None
+                    except Exception:
+                        return None
+                doc = fetch_doc(doc_id)
+                if doc is None:
+                    continue
+                if isinstance(doc.run, str) and doc.run.upper() in terminal_states:
+                    finished.append((doc_id, doc.run, doc.chunk_count, doc.token_count))
+                    pending.discard(doc_id)
+                elif float(doc.progress or 0.0) >= 1.0:
+                    finished.append((doc_id, "DONE", doc.chunk_count, doc.token_count))
+                    pending.discard(doc_id)
+            if pending:
+                time.sleep(interval_sec)
+        return finished
+    
    def async_parse_documents(self, document_ids):
        res = self.post(f"/datasets/{self.id}/chunks", {"document_ids": document_ids})
        res = res.json()
        if res.get("code") != 0:
            raise Exception(res.get("message"))
+        
+
+    def parse_documents(self, document_ids):
+        try:
+            self.async_parse_documents(document_ids)
+            self._get_documents_status(document_ids)
+        except KeyboardInterrupt:
+            self.async_cancel_parse_documents(document_ids)
+            
+        return self._get_documents_status(document_ids)
+

    def async_cancel_parse_documents(self, document_ids):
        res = self.rm(f"/datasets/{self.id}/chunks", {"document_ids": document_ids})
--- a/sdk/python/ragflow_sdk/modules/session.py
+++ b/sdk/python/ragflow_sdk/modules/session.py
@ -33,35 +33,52 @@ class Session(Base):
                self.__session_type = "agent"
        super().__init__(rag, res_dict)

-    def ask(self, question="", stream=True, **kwargs):
+
+    def ask(self, question="", stream=False, **kwargs):
+        """
+        Ask a question to the session. If stream=True, yields Message objects as they arrive (SSE streaming).
+        If stream=False, returns a single Message object for the final answer.
+        """
        if self.__session_type == "agent":
            res = self._ask_agent(question, stream)
        elif self.__session_type == "chat":
            res = self._ask_chat(question, stream, **kwargs)
+        else:
+            raise Exception(f"Unknown session type: {self.__session_type}")

        if stream:
-            for line in res.iter_lines():
-                line = line.decode("utf-8")
-                if line.startswith("{"):
-                    json_data = json.loads(line)
-                    raise Exception(json_data["message"])
-                if not line.startswith("data:"):
-                    continue
-                json_data = json.loads(line[5:])
-                if json_data["data"] is True or json_data["data"].get("running_status"):
-                    continue
-                message = self._structure_answer(json_data)
-                yield message
+            for line in res.iter_lines(decode_unicode=True):
+                if not line:
+                    continue  # Skip empty lines
+                line = line.strip()
+
+                if line.startswith("data:"):
+                    content = line[len("data:"):].strip()
+                    if content == "[DONE]":
+                        break  # End of stream
+                else:
+                    content = line
+
+                try:
+                    json_data = json.loads(content)
+                except json.JSONDecodeError:
+                    continue  # Skip lines that are not valid JSON
+
+                event = json_data.get("event")
+                if event == "message":
+                    yield self._structure_answer(json_data)
+                elif event == "message_end":
+                    return  # End of message stream
        else:
            try:
-                json_data = json.loads(res.text)
+                json_data = res.json()
            except ValueError:
                raise Exception(f"Invalid response {res}")
-            return self._structure_answer(json_data)
+            yield self._structure_answer(json_data["data"])
        

    def _structure_answer(self, json_data):
-        answer = json_data["data"]["answer"]
+        answer = json_data["data"]["content"]
        reference = json_data["data"].get("reference", {})
        temp_dict = {
            "content": answer,
--- a/sdk/python/uv.lock
+++ b/sdk/python/uv.lock
@ -342,7 +342,7 @@ wheels = [

 [[package]]
 name = "ragflow-sdk"
-version = "0.20.5"
+version = "0.21.0"
 source = { virtual = "." }
 dependencies = [
    { name = "beartype" },
--- a/uv.lock
+++ b/uv.lock
@ -834,10 +834,10 @@ wheels = [
 [[package]]
 name = "cobble"
 version = "0.1.4"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/54/7a/a507c709be2c96e1bb6102eb7b7f4026c5e5e223ef7d745a17d239e9d844/cobble-0.1.4.tar.gz", hash = "sha256:de38be1539992c8a06e569630717c485a5f91be2192c461ea2b220607dfa78aa" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/54/7a/a507c709be2c96e1bb6102eb7b7f4026c5e5e223ef7d745a17d239e9d844/cobble-0.1.4.tar.gz", hash = "sha256:de38be1539992c8a06e569630717c485a5f91be2192c461ea2b220607dfa78aa", size = 3805, upload-time = "2024-06-01T18:11:09.528Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/d5/e1/3714a2f371985215c219c2a70953d38e3eed81ef165aed061d21de0e998b/cobble-0.1.4-py3-none-any.whl", hash = "sha256:36c91b1655e599fd428e2b95fdd5f0da1ca2e9f1abb0bc871dec21a0e78a2b44" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/d5/e1/3714a2f371985215c219c2a70953d38e3eed81ef165aed061d21de0e998b/cobble-0.1.4-py3-none-any.whl", hash = "sha256:36c91b1655e599fd428e2b95fdd5f0da1ca2e9f1abb0bc871dec21a0e78a2b44", size = 3984, upload-time = "2024-06-01T18:11:07.911Z" },
 ]

 [[package]]
@ -873,10 +873,10 @@ wheels = [
 [[package]]
 name = "colorclass"
 version = "2.2.2"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/d7/1a/31ff00a33569a3b59d65bbdc445c73e12f92ad28195b7ace299f68b9af70/colorclass-2.2.2.tar.gz", hash = "sha256:6d4fe287766166a98ca7bc6f6312daf04a0481b1eda43e7173484051c0ab4366" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/d7/1a/31ff00a33569a3b59d65bbdc445c73e12f92ad28195b7ace299f68b9af70/colorclass-2.2.2.tar.gz", hash = "sha256:6d4fe287766166a98ca7bc6f6312daf04a0481b1eda43e7173484051c0ab4366", size = 16709, upload-time = "2021-12-09T00:41:35.661Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/30/b6/daf3e2976932da4ed3579cff7a30a53d22ea9323ee4f0d8e43be60454897/colorclass-2.2.2-py2.py3-none-any.whl", hash = "sha256:6f10c273a0ef7a1150b1120b6095cbdd68e5cf36dfd5d0fc957a2500bbf99a55" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/30/b6/daf3e2976932da4ed3579cff7a30a53d22ea9323ee4f0d8e43be60454897/colorclass-2.2.2-py2.py3-none-any.whl", hash = "sha256:6f10c273a0ef7a1150b1120b6095cbdd68e5cf36dfd5d0fc957a2500bbf99a55", size = 18995, upload-time = "2021-12-09T00:41:34.653Z" },
 ]

 [[package]]
@ -894,10 +894,10 @@ wheels = [
 [[package]]
 name = "compressed-rtf"
 version = "1.0.7"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/b7/0c/929a4e8ef9d7143f54d77dadb5f370cc7b98534b1bd6e1124d0abe8efb24/compressed_rtf-1.0.7.tar.gz", hash = "sha256:7c30859334839f3cdc7d10796af5b434bb326b9df7cb5a65e95a8eacb2951b0e" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/b7/0c/929a4e8ef9d7143f54d77dadb5f370cc7b98534b1bd6e1124d0abe8efb24/compressed_rtf-1.0.7.tar.gz", hash = "sha256:7c30859334839f3cdc7d10796af5b434bb326b9df7cb5a65e95a8eacb2951b0e", size = 8152, upload-time = "2025-03-24T22:39:32.062Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/07/1d/62f5bf92e12335eb63517f42671ed78512d48bbc69e02a942dd7b90f03f0/compressed_rtf-1.0.7-py3-none-any.whl", hash = "sha256:b7904921d78c67a0a4b7fff9fb361a00ae2b447b6edca010ce321cd98fa0fcc0" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/07/1d/62f5bf92e12335eb63517f42671ed78512d48bbc69e02a942dd7b90f03f0/compressed_rtf-1.0.7-py3-none-any.whl", hash = "sha256:b7904921d78c67a0a4b7fff9fb361a00ae2b447b6edca010ce321cd98fa0fcc0", size = 7968, upload-time = "2025-03-24T23:03:57.433Z" },
 ]

 [[package]]
@ -1352,18 +1352,18 @@ wheels = [
 [[package]]
 name = "easygui"
 version = "0.98.3"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/cc/ad/e35f7a30272d322be09dc98592d2f55d27cc933a7fde8baccbbeb2bd9409/easygui-0.98.3.tar.gz", hash = "sha256:d653ff79ee1f42f63b5a090f2f98ce02335d86ad8963b3ce2661805cafe99a04" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/cc/ad/e35f7a30272d322be09dc98592d2f55d27cc933a7fde8baccbbeb2bd9409/easygui-0.98.3.tar.gz", hash = "sha256:d653ff79ee1f42f63b5a090f2f98ce02335d86ad8963b3ce2661805cafe99a04", size = 85583, upload-time = "2022-04-01T13:15:50.752Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/8e/a7/b276ff776533b423710a285c8168b52551cb2ab0855443131fdc7fd8c16f/easygui-0.98.3-py2.py3-none-any.whl", hash = "sha256:33498710c68b5376b459cd3fc48d1d1f33822139eb3ed01defbc0528326da3ba" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/8e/a7/b276ff776533b423710a285c8168b52551cb2ab0855443131fdc7fd8c16f/easygui-0.98.3-py2.py3-none-any.whl", hash = "sha256:33498710c68b5376b459cd3fc48d1d1f33822139eb3ed01defbc0528326da3ba", size = 92655, upload-time = "2022-04-01T13:15:49.568Z" },
 ]

 [[package]]
 name = "ebcdic"
 version = "1.1.1"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/0d/2f/633031205333bee5f9f93761af8268746aa75f38754823aabb8570eb245b/ebcdic-1.1.1-py2.py3-none-any.whl", hash = "sha256:33b4cb729bc2d0bf46cc1847b0e5946897cb8d3f53520c5b9aa5fa98d7e735f1" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/0d/2f/633031205333bee5f9f93761af8268746aa75f38754823aabb8570eb245b/ebcdic-1.1.1-py2.py3-none-any.whl", hash = "sha256:33b4cb729bc2d0bf46cc1847b0e5946897cb8d3f53520c5b9aa5fa98d7e735f1", size = 128537, upload-time = "2019-08-09T00:54:35.544Z" },
 ]

 [[package]]
@ -1482,7 +1482,7 @@ wheels = [
 [[package]]
 name = "extract-msg"
 version = "0.41.5"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
    { name = "beautifulsoup4" },
    { name = "chardet" },
@ -1494,9 +1494,9 @@ dependencies = [
    { name = "rtfde" },
    { name = "tzlocal" },
 ]
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/ef/fa/67443d9b9f505c32cba96e34745223378b84cd4795c387310788cc8b6d7d/extract_msg-0.41.5.tar.gz", hash = "sha256:99d4fdc0c0912c836370bf9fbb6e77558bb978499c1b5fdd31634684e323885c" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/ef/fa/67443d9b9f505c32cba96e34745223378b84cd4795c387310788cc8b6d7d/extract_msg-0.41.5.tar.gz", hash = "sha256:99d4fdc0c0912c836370bf9fbb6e77558bb978499c1b5fdd31634684e323885c", size = 181877, upload-time = "2023-06-11T17:19:42.931Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/be/e2/f0ed8df3907ad6e90e762d8e90adb4e25d12fea851a8371611fa14405782/extract_msg-0.41.5-py2.py3-none-any.whl", hash = "sha256:ad70dcdab3701b0fae554168c9642ad4ebef7f2ec283313c55e895a6518911e5" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/be/e2/f0ed8df3907ad6e90e762d8e90adb4e25d12fea851a8371611fa14405782/extract_msg-0.41.5-py2.py3-none-any.whl", hash = "sha256:ad70dcdab3701b0fae554168c9642ad4ebef7f2ec283313c55e895a6518911e5", size = 185222, upload-time = "2023-06-11T17:19:40.781Z" },
 ]

 [[package]]
@ -2642,13 +2642,13 @@ wheels = [
 [[package]]
 name = "imapclient"
 version = "2.3.1"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
    { name = "six" },
 ]
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/19/d8/a4a0337d5e39a0569d89793d5053d7535eefd9b8756df4e10dc114caf3c2/IMAPClient-2.3.1.zip", hash = "sha256:26ea995664fae3a88b878ebce2aff7402931697b86658b7882043ddb01b0e6ba" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/19/d8/a4a0337d5e39a0569d89793d5053d7535eefd9b8756df4e10dc114caf3c2/IMAPClient-2.3.1.zip", hash = "sha256:26ea995664fae3a88b878ebce2aff7402931697b86658b7882043ddb01b0e6ba" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/13/9c/b2890e73bc9eee53fe63218e3f3cb774a6beefdb7b5c47928a81cc3b3c13/IMAPClient-2.3.1-py2.py3-none-any.whl", hash = "sha256:057f28025d2987c63e065afb0e4370b0b850b539b0e1494cea0427e88130108c" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/13/9c/b2890e73bc9eee53fe63218e3f3cb774a6beefdb7b5c47928a81cc3b3c13/IMAPClient-2.3.1-py2.py3-none-any.whl", hash = "sha256:057f28025d2987c63e065afb0e4370b0b850b539b0e1494cea0427e88130108c" },
 ]

 [[package]]
@ -2679,7 +2679,7 @@ wheels = [

 [[package]]
 name = "infinity-sdk"
-version = "0.6.0.dev7"
+version = "0.6.0"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
    { name = "numpy" },
@ -2696,7 +2696,7 @@ dependencies = [
    { name = "thrift" },
 ]
 wheels = [
-    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/28/ec/f44f451d588f0d1d729eb1fcf1c0006d9fdeb116a33017e94d181dbee851/infinity_sdk-0.6.0.dev7-py3-none-any.whl", hash = "sha256:be4f51b667154ea407c2964769f10ebc00e362d3788e70e6c79f96df4970a40c", size = 75304, upload-time = "2025-10-10T02:42:08.49Z" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/f4/12/1ce243cbede6da5fc28e5462d90d96b13995446b3a90889287d31255b36e/infinity_sdk-0.6.0-py3-none-any.whl", hash = "sha256:e379853ffc44acba428572d633032e6c9bb842d1f08e9cad690916f52a8c6ba8", size = 75256, upload-time = "2025-10-14T12:05:13.918Z" },
 ]

 [[package]]
@ -2981,10 +2981,10 @@ wheels = [
 [[package]]
 name = "lark-parser"
 version = "0.12.0"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/5a/ee/fd1192d7724419ddfe15b6f17d1c8742800d4de917c0adac3b6aaf22e921/lark-parser-0.12.0.tar.gz", hash = "sha256:15967db1f1214013dca65b1180745047b9be457d73da224fcda3d9dd4e96a138" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/5a/ee/fd1192d7724419ddfe15b6f17d1c8742800d4de917c0adac3b6aaf22e921/lark-parser-0.12.0.tar.gz", hash = "sha256:15967db1f1214013dca65b1180745047b9be457d73da224fcda3d9dd4e96a138", size = 235029, upload-time = "2021-08-30T09:14:44.484Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/76/00/90f05db333fe1aa6b6ffea83a35425b7d53ea95c8bba0b1597f226cf1d5f/lark_parser-0.12.0-py2.py3-none-any.whl", hash = "sha256:0eaf30cb5ba787fe404d73a7d6e61df97b21d5a63ac26c5008c78a494373c675" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/76/00/90f05db333fe1aa6b6ffea83a35425b7d53ea95c8bba0b1597f226cf1d5f/lark_parser-0.12.0-py2.py3-none-any.whl", hash = "sha256:0eaf30cb5ba787fe404d73a7d6e61df97b21d5a63ac26c5008c78a494373c675", size = 103498, upload-time = "2021-08-30T13:01:01.603Z" },
 ]

 [[package]]
@ -3157,13 +3157,13 @@ wheels = [
 [[package]]
 name = "mammoth"
 version = "1.11.0"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
    { name = "cobble" },
 ]
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/ed/3c/a58418d2af00f2da60d4a51e18cd0311307b72d48d2fffec36a97b4a5e44/mammoth-1.11.0.tar.gz", hash = "sha256:a0f59e442f34d5b6447f4b0999306cbf3e67aaabfa8cb516f878fb1456744637" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/ed/3c/a58418d2af00f2da60d4a51e18cd0311307b72d48d2fffec36a97b4a5e44/mammoth-1.11.0.tar.gz", hash = "sha256:a0f59e442f34d5b6447f4b0999306cbf3e67aaabfa8cb516f878fb1456744637", size = 53142, upload-time = "2025-09-19T10:35:20.373Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/ca/54/2e39566a131b13f6d8d193f974cb6a34e81bb7cc2fa6f7e03de067b36588/mammoth-1.11.0-py2.py3-none-any.whl", hash = "sha256:c077ab0d450bd7c0c6ecd529a23bf7e0fa8190c929e28998308ff4eada3f063b" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/ca/54/2e39566a131b13f6d8d193f974cb6a34e81bb7cc2fa6f7e03de067b36588/mammoth-1.11.0-py2.py3-none-any.whl", hash = "sha256:c077ab0d450bd7c0c6ecd529a23bf7e0fa8190c929e28998308ff4eada3f063b", size = 54752, upload-time = "2025-09-19T10:35:18.699Z" },
 ]

 [[package]]
@ -3199,14 +3199,14 @@ wheels = [
 [[package]]
 name = "markdownify"
 version = "1.2.0"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
    { name = "beautifulsoup4" },
    { name = "six" },
 ]
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/83/1b/6f2697b51eaca81f08852fd2734745af15718fea10222a1d40f8a239c4ea/markdownify-1.2.0.tar.gz", hash = "sha256:f6c367c54eb24ee953921804dfe6d6575c5e5b42c643955e7242034435de634c" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/83/1b/6f2697b51eaca81f08852fd2734745af15718fea10222a1d40f8a239c4ea/markdownify-1.2.0.tar.gz", hash = "sha256:f6c367c54eb24ee953921804dfe6d6575c5e5b42c643955e7242034435de634c", size = 18771, upload-time = "2025-08-09T17:44:15.302Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/6a/e2/7af643acb4cae0741dffffaa7f3f7c9e7ab4046724543ba1777c401d821c/markdownify-1.2.0-py3-none-any.whl", hash = "sha256:48e150a1c4993d4d50f282f725c0111bd9eb25645d41fa2f543708fd44161351" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/6a/e2/7af643acb4cae0741dffffaa7f3f7c9e7ab4046724543ba1777c401d821c/markdownify-1.2.0-py3-none-any.whl", hash = "sha256:48e150a1c4993d4d50f282f725c0111bd9eb25645d41fa2f543708fd44161351", size = 15561, upload-time = "2025-08-09T17:44:14.074Z" },
 ]

 [[package]]
@ -3499,14 +3499,14 @@ wheels = [
 [[package]]
 name = "msoffcrypto-tool"
 version = "5.4.2"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
    { name = "cryptography" },
    { name = "olefile" },
 ]
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/d2/b7/0fd6573157e0ec60c0c470e732ab3322fba4d2834fd24e1088d670522a01/msoffcrypto_tool-5.4.2.tar.gz", hash = "sha256:44b545adba0407564a0cc3d6dde6ca36b7c0fdf352b85bca51618fa1d4817370" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/d2/b7/0fd6573157e0ec60c0c470e732ab3322fba4d2834fd24e1088d670522a01/msoffcrypto_tool-5.4.2.tar.gz", hash = "sha256:44b545adba0407564a0cc3d6dde6ca36b7c0fdf352b85bca51618fa1d4817370", size = 41183, upload-time = "2024-08-08T15:50:28.462Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/03/54/7f6d3d9acad083dae8c22d9ab483b657359a1bf56fee1d7af88794677707/msoffcrypto_tool-5.4.2-py3-none-any.whl", hash = "sha256:274fe2181702d1e5a107ec1b68a4c9fea997a44972ae1cc9ae0cb4f6a50fef0e" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/03/54/7f6d3d9acad083dae8c22d9ab483b657359a1bf56fee1d7af88794677707/msoffcrypto_tool-5.4.2-py3-none-any.whl", hash = "sha256:274fe2181702d1e5a107ec1b68a4c9fea997a44972ae1cc9ae0cb4f6a50fef0e", size = 48713, upload-time = "2024-08-08T15:50:27.093Z" },
 ]

 [[package]]
@ -3861,13 +3861,13 @@ wheels = [
 [[package]]
 name = "olefile"
 version = "0.46"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/34/81/e1ac43c6b45b4c5f8d9352396a14144bba52c8fec72a80f425f6a4d653ad/olefile-0.46.zip", hash = "sha256:133b031eaf8fd2c9399b78b8bc5b8fcbe4c31e85295749bb17a87cba8f3c3964" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/34/81/e1ac43c6b45b4c5f8d9352396a14144bba52c8fec72a80f425f6a4d653ad/olefile-0.46.zip", hash = "sha256:133b031eaf8fd2c9399b78b8bc5b8fcbe4c31e85295749bb17a87cba8f3c3964" }

 [[package]]
 name = "oletools"
 version = "0.60.2"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
    { name = "colorclass" },
    { name = "easygui" },
@ -3876,9 +3876,9 @@ dependencies = [
    { name = "pcodedmp" },
    { name = "pyparsing" },
 ]
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/5c/2f/037f40e44706d542b94a2312ccc33ee2701ebfc9a83b46b55263d49ce55a/oletools-0.60.2.zip", hash = "sha256:ad452099f4695ffd8855113f453348200d195ee9fa341a09e197d66ee7e0b2c3" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/5c/2f/037f40e44706d542b94a2312ccc33ee2701ebfc9a83b46b55263d49ce55a/oletools-0.60.2.zip", hash = "sha256:ad452099f4695ffd8855113f453348200d195ee9fa341a09e197d66ee7e0b2c3", size = 3433750, upload-time = "2024-07-02T14:50:38.242Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/ac/ff/05257b7183279b80ecec6333744de23f48f0faeeba46c93e6d13ce835515/oletools-0.60.2-py2.py3-none-any.whl", hash = "sha256:72ad8bd748fd0c4e7b5b4733af770d11543ebb2bf2697455f99f975fcd50cc96" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/ac/ff/05257b7183279b80ecec6333744de23f48f0faeeba46c93e6d13ce835515/oletools-0.60.2-py2.py3-none-any.whl", hash = "sha256:72ad8bd748fd0c4e7b5b4733af770d11543ebb2bf2697455f99f975fcd50cc96", size = 989449, upload-time = "2024-07-02T14:50:29.122Z" },
 ]

 [[package]]
@ -4346,14 +4346,14 @@ wheels = [
 [[package]]
 name = "pcodedmp"
 version = "1.2.6"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
    { name = "oletools" },
    { name = "win-unicode-console", marker = "platform_python_implementation != 'PyPy' and sys_platform == 'win32'" },
 ]
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/3d/20/6d461e29135f474408d0d7f95b2456a9ba245560768ee51b788af10f7429/pcodedmp-1.2.6.tar.gz", hash = "sha256:025f8c809a126f45a082ffa820893e6a8d990d9d7ddb68694b5a9f0a6dbcd955" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/3d/20/6d461e29135f474408d0d7f95b2456a9ba245560768ee51b788af10f7429/pcodedmp-1.2.6.tar.gz", hash = "sha256:025f8c809a126f45a082ffa820893e6a8d990d9d7ddb68694b5a9f0a6dbcd955", size = 35549, upload-time = "2019-07-30T18:05:42.516Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/ba/72/b380fb5c89d89c3afafac8cf02a71a45f4f4a4f35531ca949a34683962d1/pcodedmp-1.2.6-py2.py3-none-any.whl", hash = "sha256:4441f7c0ab4cbda27bd4668db3b14f36261d86e5059ce06c0828602cbe1c4278" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/ba/72/b380fb5c89d89c3afafac8cf02a71a45f4f4a4f35531ca949a34683962d1/pcodedmp-1.2.6-py2.py3-none-any.whl", hash = "sha256:4441f7c0ab4cbda27bd4668db3b14f36261d86e5059ce06c0828602cbe1c4278", size = 30939, upload-time = "2019-07-30T18:05:40.483Z" },
 ]

 [[package]]
@ -5436,7 +5436,7 @@ wheels = [

 [[package]]
 name = "ragflow"
-version = "0.20.5"
+version = "0.21.0"
 source = { virtual = "." }
 dependencies = [
    { name = "akshare" },
@ -5644,7 +5644,7 @@ requires-dist = [
    { name = "httpx", extras = ["socks"], specifier = "==0.27.2" },
    { name = "huggingface-hub", specifier = ">=0.25.0,<0.26.0" },
    { name = "infinity-emb", specifier = ">=0.0.66,<0.0.67" },
-    { name = "infinity-sdk", specifier = "==0.6.0.dev7" },
+    { name = "infinity-sdk", specifier = "==0.6.0" },
    { name = "itsdangerous", specifier = "==2.1.2" },
    { name = "json-repair", specifier = "==0.35.0" },
    { name = "langfuse", specifier = ">=2.60.0" },
@ -5809,8 +5809,8 @@ wheels = [
 [[package]]
 name = "red-black-tree-mod"
 version = "1.20"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/34/12/944f61bc67a1e918953741c0b3b75a28f96d8060d08fd3614233309ced3b/red-black-tree-mod-1.20.tar.gz", hash = "sha256:2448e6fc9cbf1be204c753f352c6ee49aa8156dbf1faa57dfc26bd7705077e0a" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/34/12/944f61bc67a1e918953741c0b3b75a28f96d8060d08fd3614233309ced3b/red-black-tree-mod-1.20.tar.gz", hash = "sha256:2448e6fc9cbf1be204c753f352c6ee49aa8156dbf1faa57dfc26bd7705077e0a", size = 28589, upload-time = "2013-11-04T16:58:20.788Z" }

 [[package]]
 name = "referencing"
@ -6068,14 +6068,14 @@ wheels = [
 [[package]]
 name = "rtfde"
 version = "0.0.2"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
    { name = "lark-parser" },
    { name = "oletools" },
 ]
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/81/ea/28f5ab6b46a072887c8c8fd8c8a1f7b54025fc4bb2e09024668ea6686044/RTFDE-0.0.2.tar.gz", hash = "sha256:b86b5d734950fe8745a5b89133f50554252dbd67c6d1b9265e23ee140e7ea8a2" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/81/ea/28f5ab6b46a072887c8c8fd8c8a1f7b54025fc4bb2e09024668ea6686044/RTFDE-0.0.2.tar.gz", hash = "sha256:b86b5d734950fe8745a5b89133f50554252dbd67c6d1b9265e23ee140e7ea8a2", size = 18891, upload-time = "2020-12-28T15:15:35.981Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/5d/3f/39ba5a72620c43656bc80cb1f7afe0d498df4a48947d75ea0ca0752ffbf4/RTFDE-0.0.2-py3-none-any.whl", hash = "sha256:18386e4f060cee12a2a8035b0acf0cc99689f5dff1bf347bab7e92351860a21d" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/5d/3f/39ba5a72620c43656bc80cb1f7afe0d498df4a48947d75ea0ca0752ffbf4/RTFDE-0.0.2-py3-none-any.whl", hash = "sha256:18386e4f060cee12a2a8035b0acf0cc99689f5dff1bf347bab7e92351860a21d", size = 34626, upload-time = "2020-12-28T15:15:35Z" },
 ]

 [[package]]
@ -7131,13 +7131,13 @@ wheels = [
 [[package]]
 name = "tzlocal"
 version = "5.3.1"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
    { name = "tzdata", marker = "sys_platform == 'win32'" },
 ]
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/8b/2e/c14812d3d4d9cd1773c6be938f89e5735a1f11a9f184ac3639b93cef35d5/tzlocal-5.3.1.tar.gz", hash = "sha256:cceffc7edecefea1f595541dbd6e990cb1ea3d19bf01b2809f362a03dd7921fd" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/8b/2e/c14812d3d4d9cd1773c6be938f89e5735a1f11a9f184ac3639b93cef35d5/tzlocal-5.3.1.tar.gz", hash = "sha256:cceffc7edecefea1f595541dbd6e990cb1ea3d19bf01b2809f362a03dd7921fd", size = 30761, upload-time = "2025-03-05T21:17:41.549Z" }
 wheels = [
-    { url = "https://mirrors.aliyun.com/pypi/packages/c2/14/e2a54fabd4f08cd7af1c07030603c3356b74da07f7cc056e600436edfa17/tzlocal-5.3.1-py3-none-any.whl", hash = "sha256:eb1a66c3ef5847adf7a834f1be0800581b683b5608e74f86ecbcef8ab91bb85d" },
+    { url = "https://pypi.tuna.tsinghua.edu.cn/packages/c2/14/e2a54fabd4f08cd7af1c07030603c3356b74da07f7cc056e600436edfa17/tzlocal-5.3.1-py3-none-any.whl", hash = "sha256:eb1a66c3ef5847adf7a834f1be0800581b683b5608e74f86ecbcef8ab91bb85d", size = 18026, upload-time = "2025-03-05T21:17:39.857Z" },
 ]

 [[package]]
@ -7387,8 +7387,8 @@ sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/67/35/25e68fbc99e672
 [[package]]
 name = "win-unicode-console"
 version = "0.5"
-source = { registry = "https://mirrors.aliyun.com/pypi/simple" }
-sdist = { url = "https://mirrors.aliyun.com/pypi/packages/89/8d/7aad74930380c8972ab282304a2ff45f3d4927108bb6693cabcc9fc6a099/win_unicode_console-0.5.zip", hash = "sha256:d4142d4d56d46f449d6f00536a73625a871cba040f0bc1a2e305a04578f07d1e" }
+source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
+sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/89/8d/7aad74930380c8972ab282304a2ff45f3d4927108bb6693cabcc9fc6a099/win_unicode_console-0.5.zip", hash = "sha256:d4142d4d56d46f449d6f00536a73625a871cba040f0bc1a2e305a04578f07d1e", size = 31420, upload-time = "2016-06-25T19:48:54.05Z" }

 [[package]]
 name = "win32-setctime"
--- a/web/src/app.tsx
+++ b/web/src/app.tsx
@ -104,7 +104,7 @@ const RootProvider = ({ children }: React.PropsWithChildren) => {
    <TooltipProvider>
      <QueryClientProvider client={queryClient}>
        <ThemeProvider
-          defaultTheme={ThemeEnum.Light}
+          defaultTheme={ThemeEnum.Dark}
          storageKey="ragflow-ui-theme"
        >
          <Root>{children}</Root>
--- a/web/src/assets/logo-with-text.png
+++ b/web/src/assets/logo-with-text.png
--- a/web/src/assets/logo-with-text.svg
+++ b/web/src/assets/logo-with-text.svg
@ -0,0 +1,50 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<svg width="1500px" height="500px" viewBox="0 0 1500 500" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+    <title>RAG- logos</title>
+    <defs>
+        <rect id="path-1" x="0" y="0" width="480.282452" height="480"></rect>
+        <linearGradient x1="-19.6945332%" y1="78.7580689%" x2="78.6511106%" y2="-14.5268659%" id="linearGradient-3">
+            <stop stop-color="#43CDE9" offset="0%"></stop>
+            <stop stop-color="#4E40EC" offset="100%"></stop>
+        </linearGradient>
+        <linearGradient x1="-19.8760293%" y1="78.7580689%" x2="78.7257229%" y2="-14.5268659%" id="linearGradient-4">
+            <stop stop-color="#43CDE9" offset="0%"></stop>
+            <stop stop-color="#4E40EC" offset="100%"></stop>
+        </linearGradient>
+        <linearGradient x1="-20.3066254%" y1="78.7580689%" x2="78.902739%" y2="-14.5268659%" id="linearGradient-5">
+            <stop stop-color="#43CDE9" offset="0%"></stop>
+            <stop stop-color="#4E40EC" offset="100%"></stop>
+        </linearGradient>
+    </defs>
+    <g id="logos" stroke="none" stroke-width="1" fill="none" fill-rule="evenodd">
+        <g id="GitHub-bg" transform="translate(-67, -129)">
+            <g id="RAG--logos" transform="translate(67, 129)">
+                <rect id="矩形" fill-opacity="0" fill="#D8D8D8" x="0" y="0" width="1500" height="500"></rect>
+                <g id="rag-logo" transform="translate(22, 10)">
+                    <mask id="mask-2" fill="white">
+                        <use xlink:href="#path-1"></use>
+                    </mask>
+                    <use id="矩形" fill-opacity="0" fill="#D8D8D8" xlink:href="#path-1"></use>
+                    <path d="M91.6742598,285.524755 C100.185587,294.225017 100.18547,308.330398 91.673908,317.029482 L91.2380792,317.474803 C82.7265175,326.175066 68.9266032,326.175066 60.4152408,317.474803 C51.9038397,308.774541 51.903971,294.670338 60.415534,285.970076 L60.8514331,285.524755 C69.3630183,276.824493 83.1628154,276.825671 91.6742598,285.524755 Z" id="路径" fill="#53F3FD" mask="url(#mask-2)"></path>
+                    <path d="M195.695835,291.612434 C204.281754,300.406974 204.256864,314.640147 195.638943,323.402589 L134.628661,385.444627 C126.011688,394.207069 112.065969,394.182105 103.48005,385.387566 C94.8941307,376.593026 94.9193764,362.359853 103.536349,353.597411 L164.546987,291.555373 C173.164315,282.792931 187.109915,282.817895 195.695835,291.612434 Z" id="路径" fill="#43CDE9" mask="url(#mask-2)"></path>
+                    <path d="M278.834919,398.04199 C285.530372,387.681543 299.098865,384.881873 309.142632,391.789235 L310.453967,392.690662 C320.496557,399.598024 323.210961,413.596372 316.515509,423.95801 C309.820057,434.318457 296.250387,437.118127 286.207797,430.210765 L284.896462,429.309338 C274.852695,422.401976 272.139467,408.403628 278.834919,398.04199 Z" id="路径" fill="#53F3FD" mask="url(#mask-2)"></path>
+                    <path d="M423.798774,283.537264 C432.404449,292.254797 432.404449,306.388179 423.798774,315.104532 L393.828923,345.46185 C385.223249,354.179383 371.269415,354.179383 362.663741,345.46185 C354.058067,336.745498 354.058067,322.612116 362.663741,313.894583 L392.633592,283.537264 C401.239266,274.820912 415.1931,274.820912 423.798774,283.537264 Z" id="路径" fill="#43CDE9" mask="url(#mask-2)"></path>
+                    <path d="M423.808132,170.562223 C432.401329,179.313039 432.401329,193.500381 423.808132,202.250012 L202.660892,427.436888 C194.067695,436.187704 180.135277,436.187704 171.54208,427.436888 C162.948764,418.687257 162.948764,404.499915 171.54208,395.749099 L392.689319,170.562223 C401.282517,161.812592 415.214935,161.812592 423.808132,170.562223 Z" id="路径" fill="url(#linearGradient-3)" mask="url(#mask-2)"></path>
+                    <path d="M382.786724,101.550556 C391.376297,110.284631 391.376297,124.445361 382.786724,133.179436 L255.670977,262.449089 C247.082587,271.183637 233.157963,271.183637 224.569573,262.449089 C215.981183,253.714541 215.981183,239.554757 224.569573,230.820209 L351.68532,101.550556 C360.27371,92.8164813 374.198334,92.8164813 382.786724,101.550556 Z" id="路径" fill="url(#linearGradient-4)" mask="url(#mask-2)"></path>
+                    <path d="M315.714683,58.5360848 C324.339423,67.2590474 324.348924,81.4113439 315.736061,90.1460086 L151.753781,256.442347 C143.14068,265.177603 129.166512,265.187059 120.541773,256.463624 C111.917034,247.740189 111.907651,233.588838 120.52087,224.853582 L284.501963,58.5572433 C293.116014,49.8225431 307.089944,49.8130277 315.714683,58.5360848 Z" id="路径" fill="url(#linearGradient-5)" mask="url(#mask-2)"></path>
+                    <path d="M152.674088,111.602489 C161.254247,120.384434 161.230323,134.598332 152.620437,143.350004 L88.574692,208.451766 C79.9649246,217.203675 66.0297171,217.179931 57.4494992,208.397155 C48.8693051,199.615567 48.8932882,185.401431 57.5030675,176.649522 L121.548836,111.547879 C130.158604,102.796089 144.093811,102.820545 152.674088,111.602489 Z" id="路径" fill="#43CDE9" mask="url(#mask-2)"></path>
+                    <path d="M192.112981,46 C204.270268,46 214.125927,56.0203121 214.125927,68.380962 L214.125927,70.6190025 C214.125927,82.9796524 204.270268,93 192.112981,93 C179.955694,93 170.100035,82.9796524 170.100035,70.6190025 L170.100035,68.380962 C170.100035,56.0203121 179.955694,46 192.112981,46 Z" id="路径" fill="#53F3FD" mask="url(#mask-2)"></path>
+                </g>
+                <g id="RAG-Flow" transform="translate(558, 174)" fill="#66686A" fill-rule="nonzero">
+                    <path d="M60.4847896,91.2109375 L29.4118282,91.2109375 L29.4118282,147.65625 L0,147.65625 L0,3.7109375 L70.647016,3.7109375 C80.7440998,3.90625 88.5123402,5.14322917 93.951737,7.421875 C99.3911338,9.70052083 103.999964,13.0533854 107.778228,17.4804688 C110.905067,21.1263021 113.380481,25.1627604 115.20447,29.5898438 C117.028459,34.0169271 117.940454,39.0625 117.940454,44.7265625 C117.940454,51.5625 116.214178,58.2845052 112.761627,64.8925781 C109.309076,71.500651 103.609109,76.171875 95.661727,78.90625 C102.30626,81.5755208 107.012804,85.3678385 109.781359,90.2832031 C112.549914,95.1985677 113.934192,102.701823 113.934192,112.792969 L113.934192,122.460938 C113.934192,129.036458 114.194762,133.496094 114.715901,135.839844 C115.497611,139.550781 117.3216,142.285156 120.187869,144.042969 L120.187869,147.65625 L87.0629201,147.65625 C86.1509254,144.466146 85.4995006,141.894531 85.1086458,139.941406 C84.3269361,135.904948 83.90351,131.770833 83.8383675,127.539063 L83.64294,114.160156 C83.5126551,104.980469 81.8352363,98.8606771 78.6106837,95.8007812 C75.3861311,92.7408854 69.3441664,91.2109375 60.4847896,91.2109375 Z M78.7572543,65.0390625 C84.7503622,62.3046875 87.7469161,56.9010417 87.7469161,48.828125 C87.7469161,40.1041667 84.8480759,34.2447917 79.0503954,31.25 C75.7932716,29.5572917 70.9075859,28.7109375 64.3933382,28.7109375 L29.4118282,28.7109375 L29.4118282,67.3828125 L63.5139148,67.3828125 C70.2887323,67.3828125 75.3698455,66.6015625 78.7572543,65.0390625 Z" id="形状"></path>
+                    <path d="M228.161525,118.066406 L175.102977,118.066406 L165.136178,147.65625 L133.672362,147.65625 L185.069776,3.7109375 L219.074149,3.7109375 L270.080708,147.65625 L237.444327,147.65625 L228.161525,118.066406 Z M219.758145,93.2617188 L201.778821,36.6210938 L183.213216,93.2617188 L219.758145,93.2617188 Z" id="形状"></path>
+                    <path d="M376.686371,144.140625 C368.738989,149.023437 358.967618,151.464844 347.372257,151.464844 C328.285511,151.464844 312.651317,144.856771 300.469674,131.640625 C287.766891,118.359375 281.415499,100.195313 281.415499,77.1484375 C281.415499,53.8411458 287.832033,35.15625 300.665101,21.09375 C313.498169,7.03125 330.467784,0 351.573947,0 C369.878983,0 384.584897,4.63867188 395.691689,13.9160156 C406.798481,23.1933594 413.166158,34.765625 414.79472,48.6328125 L385.187465,48.6328125 C382.907478,38.8020833 377.337796,31.9335937 368.478419,28.0273438 C363.527591,25.8789062 358.023052,24.8046875 351.964801,24.8046875 C340.369441,24.8046875 330.842353,29.1829427 323.38354,37.9394531 C315.924726,46.6959635 312.195319,59.8632813 312.195319,77.4414063 C312.195319,95.1497396 316.234153,107.682292 324.31182,115.039063 C332.389487,122.395833 341.574576,126.074219 351.867088,126.074219 C361.964172,126.074219 370.237266,123.160807 376.686371,117.333984 C383.135477,111.507161 387.109168,103.873698 388.607445,94.4335938 L355.287068,94.4335938 L355.287068,70.4101563 L415.283289,70.4101563 L415.283289,147.65625 L395.349691,147.65625 L392.320566,129.6875 C386.522885,136.523438 381.311487,141.341146 376.686371,144.140625 Z" id="路径"></path>
+                    <polygon id="路径" points="550.729209 29.1992188 478.518773 29.1992188 478.518773 62.3046875 541.739547 62.3046875 541.739547 87.3046875 478.518773 87.3046875 478.518773 147.65625 448.618377 147.65625 448.618377 3.90625 550.729209 3.90625"></polygon>
+                    <polygon id="路径" points="569.197101 3.7109375 597.04551 3.7109375 597.04551 147.65625 569.197101 147.65625"></polygon>
+                    <path d="M713.031689,54.6875 C722.021351,65.9505208 726.516182,79.2643229 726.516182,94.6289063 C726.516182,110.253906 722.021351,123.616536 713.031689,134.716797 C704.042028,145.817057 690.394679,151.367188 672.089643,151.367188 C653.784607,151.367188 640.137258,145.817057 631.147596,134.716797 C622.157935,123.616536 617.663104,110.253906 617.663104,94.6289063 C617.663104,79.2643229 622.157935,65.9505208 631.147596,54.6875 C640.137258,43.4244792 653.784607,37.7929688 672.089643,37.7929688 C690.394679,37.7929688 704.042028,43.4244792 713.031689,54.6875 Z M671.991929,61.328125 C663.84912,61.328125 657.579156,64.2089844 653.182039,69.9707031 C648.784922,75.7324219 646.586363,83.9518229 646.586363,94.6289062 C646.586363,105.30599 648.784922,113.541667 653.182039,119.335938 C657.579156,125.130208 663.84912,128.027344 671.991929,128.027344 C680.134739,128.027344 686.388417,125.130208 690.752962,119.335938 C695.117508,113.541667 697.299781,105.30599 697.299781,94.6289062 C697.299781,83.9518229 695.117508,75.7324219 690.752962,69.9707031 C686.388417,64.2089844 680.134739,61.328125 671.991929,61.328125 Z" id="形状"></path>
+                    <polygon id="路径" points="827.259022 147.65625 810.549977 70.1171875 793.645504 147.65625 764.722245 147.65625 734.821848 41.2109375 764.722245 41.2109375 781.333576 117.578125 796.67463 41.2109375 824.913893 41.2109375 841.13437 117.871094 857.745701 41.2109375 886.766675 41.2109375 855.889141 147.65625"></polygon>
+                </g>
+            </g>
+        </g>
+    </g>
+</svg>
--- a/web/src/components/canvas/background.tsx
+++ b/web/src/components/canvas/background.tsx
@ -0,0 +1,11 @@
+import { Background } from '@xyflow/react';
+
+export function AgentBackground() {
+  return (
+    <Background
+      color="var(--text-primary)"
+      bgColor="rgb(var(--bg-canvas))"
+      className="rounded-lg"
+    />
+  );
+}
--- a/web/src/components/chunk-method-dialog/index.tsx
+++ b/web/src/components/chunk-method-dialog/index.tsx
@ -20,6 +20,7 @@ import { IParserConfig } from '@/interfaces/database/document';
 import { IChangeParserConfigRequestBody } from '@/interfaces/request/document';
 import {
  ChunkMethodItem,
+  EnableTocToggle,
  ParseTypeItem,
 } from '@/pages/dataset/dataset-setting/configuration/common-item';
 import { zodResolver } from '@hookform/resolvers/zod';
@ -113,6 +114,7 @@ export function ChunkMethodDialog({
        auto_keywords: z.coerce.number().optional(),
        auto_questions: z.coerce.number().optional(),
        html4excel: z.boolean().optional(),
+        toc_extraction: z.boolean().optional(),
        // raptor: z
        //   .object({
        //     use_raptor: z.boolean().optional(),
@ -247,7 +249,7 @@ export function ChunkMethodDialog({
  }, [parseType, form]);
  return (
    <Dialog open onOpenChange={hideModal}>
-      <DialogContent className="max-w-[50vw]">
+      <DialogContent className="max-w-[50vw] text-text-primary">
        <DialogHeader>
          <DialogTitle>{t('knowledgeDetails.chunkMethod')}</DialogTitle>
        </DialogHeader>
@ -338,6 +340,7 @@ export function ChunkMethodDialog({
                  show={showAutoKeywords(selectedTag) || showExcelToHtml}
                  className="space-y-3"
                >
+                  <EnableTocToggle />
                  {showAutoKeywords(selectedTag) && (
                    <>
                      <AutoKeywordsFormField></AutoKeywordsFormField>
--- a/Show More
+++ b/Show More