From 9a4cd818918be272f2665e5e0ab1b0a56d8efaee Mon Sep 17 00:00:00 2001
From: writinwaters <93570324+writinwaters@users.noreply.github.com>
Date: Tue, 21 Oct 2025 20:11:23 +0800
Subject: [PATCH] Docs: Added token chunker and title chunker components
 (#10711)

### What problem does this PR solve?

### Type of change

- [x] Documentation Update
---
 .../chunker_title.md                          | 40 +++++++++++++++++++
 .../chunker_token.md                          | 34 ++++++++++++++--
 2 files changed, 70 insertions(+), 4 deletions(-)
 create mode 100644 docs/guides/agent/agent_component_reference/chunker_title.md
diff --git a/docs/guides/agent/agent_component_reference/chunker_title.md b/docs/guides/agent/agent_component_reference/chunker_title.md
new file mode 100644
index 000000000..9ec692db5
--- /dev/null
+++ b/docs/guides/agent/agent_component_reference/chunker_title.md
@@ -0,0 +1,40 @@
+---
+sidebar_position: 31
+slug: /chunker_title_component
+---
+
+# Title chunker component
+
+A component that splits texts into chunks by heading level.
+
+---
+
+A **Title chunker** component is a text splitter that uses a specified heading level as the delimiter to define chunk boundaries and create chunks.
+
+## Scenario
+
+A **Title chunker** component is optional, usually placed immediately after **Parser**.
+
+:::caution WARNING
+Placing a **Title chunker** after a **Token chunker** is invalid and will cause an error. This restriction is not currently enforced by the system, so you must verify the component order yourself.
+:::
+
+## Configurations
+
+### Hierarchy
+
+Specifies the heading level that defines chunk boundaries:
+
+- H1
+- H2
+- H3 (Default)
+- H4
+
+Click **+ Add** to add heading levels here, or update the corresponding **Regular Expressions** fields to match custom heading patterns.
+
+### Output
+
+The global variable name for the output of the **Title chunker** component, which can be referenced by subsequent components in the ingestion pipeline.
+
+- Default: `chunks`
+- Type: `Array<Object>`
\ No newline at end of file
diff --git a/docs/guides/agent/agent_component_reference/chunker_token.md b/docs/guides/agent/agent_component_reference/chunker_token.md
index 8d29d4fa6..bcdc272df 100644
--- a/docs/guides/agent/agent_component_reference/chunker_token.md
+++ b/docs/guides/agent/agent_component_reference/chunker_token.md
@@ -3,15 +3,41 @@ sidebar_position: 32
 slug: /chunker_token_component
 ---
 
-# Parser component
+# Token chunker component
 
-A component that sets the parsing rules for your dataset.
+A component that splits texts into chunks, respecting a maximum token limit and using delimiters to find optimal breakpoints.
 
 ---
 
-A **Parser** component defines how various file types should be parsed, including parsing methods for PDFs , fields to parse for Emails, and OCR methods for images.
+A **Token chunker** component is a text splitter that creates chunks by respecting a recommended maximum token length, using delimiters to ensure logical chunk breakpoints. It splits long texts into appropriately sized, semantically related chunks.
 
 
 ## Scenario
 
-A **Parser** component is auto-populated on the ingestion pipeline canvas and required in all ingestion pipeline workflows.
\ No newline at end of file
+A **Token chunker** component is optional, usually placed immediately after **Parser** or **Title chunker**.
+
+## Configurations
+
+### Recommended chunk size
+
+The recommended maximum token limit for each created chunk. The **Token chunker** component creates chunks at specified delimiters. If this token limit is reached before a delimiter, a chunk is created at that point.
+
+### Overlapped percent (%)
+
+This defines the overlap percentage between chunks. An appropriate degree of overlap ensures semantic coherence without creating excessive, redundant tokens for the LLM.
+
+- Default: 0%
+- Maximum: 30%
+
+
+### Delimiters
+
+Defaults to `\n`. Click the right-hand **Recycle bin** button to remove it, or click **+ Add** to add a delimiter.
+
+
+### Output
+
+The global variable name for the output of the **Token chunker** component, which can be referenced by subsequent components in the ingestion pipeline.
+
+- Default: `chunks`
+- Type: `Array<Object>`
\ No newline at end of file