mirror of https://github.com/infiniflow/ragflow.git synced 2026-01-23 19:46:39 +08:00

Jimmy Ben Klieve 6814ace1aa docs: update docs icons (#12465)

### What problem does this PR solve?

Update icons for docs.
Trailing spaces are automatically truncated by the editor; this does not affect the actual content.

### Type of change

- [x] Documentation Update

2026-01-07 10:00:09 +08:00

1.4 KiB

sidebar_position:
slug: /chunker_token_component
sidebar_custom_props: { categoryIcon: LucideBlocks }

Token chunker component

A component that splits text into chunks, respecting a maximum token limit and using delimiters to find optimal breakpoints.

A Token chunker component is a text splitter that creates chunks within a recommended maximum token length and uses delimiters to place chunk boundaries at logical breakpoints. It splits long texts into appropriately sized, semantically related chunks.

Scenario

The Token chunker component is optional and is usually placed immediately after a Parser or Title chunker component.

Configurations

Recommended chunk size

The recommended maximum token limit for each created chunk. The Token chunker component creates chunks at specified delimiters. If this token limit is reached before a delimiter, a chunk is created at that point.
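The interaction between the token limit and the delimiters can be illustrated with a minimal Python sketch. This is not RAGFlow's implementation: the function name `chunk_by_tokens` and its parameters are hypothetical, whitespace-separated words stand in for real tokens, and the merging rule (pack delimiter-separated segments until the limit would be exceeded, and cut an oversized segment at the limit) is one plausible reading of the behavior described above.

```python
import re


def chunk_by_tokens(text, max_tokens=8, delimiters=("\n",)):
    """Hypothetical sketch: pack delimiter-separated segments into chunks.

    Assumption: a "token" is a whitespace-separated word. A real tokenizer
    would count tokens differently.
    """
    # Split the text at any of the configured delimiters.
    pattern = "|".join(re.escape(d) for d in delimiters)
    segments = [s.strip() for s in re.split(pattern, text) if s.strip()]

    chunks, current = [], []
    for segment in segments:
        words = segment.split()
        # The limit is reached before the next delimiter: cut at the limit.
        while len(words) > max_tokens:
            if current:
                chunks.append(" ".join(current))
                current = []
            chunks.append(" ".join(words[:max_tokens]))
            words = words[max_tokens:]
        # Close the current chunk if this segment would not fit into it.
        if len(current) + len(words) > max_tokens:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "one two three\nfour five\nsix seven eight nine ten eleven twelve"
for chunk in chunk_by_tokens(sample, max_tokens=5):
    print(chunk)
```

In this sketch the first two lines fit together in one chunk, while the third line exceeds the limit and is cut before its delimiter is reached.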

Overlapped percent (%)

This defines the overlap percentage between chunks. An appropriate degree of overlap ensures semantic coherence without creating excessive, redundant tokens for the LLM.

Default: 0%
Maximum: 30%
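As a rough illustration of what an overlap percentage means, the Python sketch below prepends the tail of each chunk to the chunk that follows it. The function `add_overlap` is hypothetical, tokens are approximated by whitespace-separated words, and computing the overlap from the length of the preceding chunk is an assumption; the component may derive it differently.

```python
def add_overlap(chunks, overlap_percent=0):
    """Hypothetical sketch: repeat the tail of each chunk in the next one.

    Assumption: the overlap is a percentage of the previous chunk's length,
    measured in whitespace-separated words.
    """
    if not 0 <= overlap_percent <= 30:
        raise ValueError("overlap_percent must be between 0 and 30")

    result = []
    previous_tokens = []
    for chunk in chunks:
        tokens = chunk.split()
        # Number of trailing tokens carried over from the previous chunk.
        carry = len(previous_tokens) * overlap_percent // 100
        tail = previous_tokens[-carry:] if carry else []
        result.append(" ".join(tail + tokens))
        previous_tokens = tokens
    return result


chunks = ["a b c d e f g h i j", "k l m n o"]
print(add_overlap(chunks, overlap_percent=20))
```

With an overlap of 0, the chunks are returned unchanged; larger values repeat more tokens across chunk boundaries, which is why the setting is capped.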

Delimiters

Defaults to \n (newline). To remove a delimiter, click the Recycle bin button to its right; to add one, click + Add.

Output

The global variable name for the output of the Token chunker component, which can be referenced by subsequent components in the ingestion pipeline.

Default: chunks
Type: Array<Object>
