mirror of
https://github.com/infiniflow/ragflow.git
synced 2025-12-08 20:42:30 +08:00
Docs: minor (#10630)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
@@ -3,15 +3,15 @@ sidebar_position: 30
 slug: /parser_component
 ---

-# Message component
+# Parser component

 A component that sets the parsing rules for your dataset.

 ---

-A **Parser** component sets the parsing rules for various file types, including parsing methods for PDFs, fields to parse for Emails, and OCR methods for images.
+A **Parser** component defines how various file types should be parsed, including parsing methods for PDFs, fields to parse for Emails, and OCR methods for images.


 ## Scenario

-An **parser** component is auto-populated on the ingestion pipeline canvase and always required in an ingestion pipeline workflow.
+A **Parser** component is auto-populated on the ingestion pipeline canvas and required in all ingestion pipeline workflows.
29
docs/guides/agent/indexer.md
Normal file
@@ -0,0 +1,29 @@
+---
+sidebar_position: 30
+slug: /indexer_component
+---
+
+# Indexer component
+
+A component that defines how chunks are indexed.
+
+---
+
+An **Indexer** component indexes chunks and configures their storage formats in the document engine.
+
+## Scenario
+
+An **Indexer** component is the mandatory ending component for all ingestion pipelines.
+
+## Configurations
+
+### Search method
+
+This setting configures how chunks are stored in the document engine: as full-text, embeddings, or both.
+
+### Filename embedding weight
+
+This setting defines the filename's contribution to the final embedding, which is a weighted combination of both the chunk content and the filename. Essentially, a higher value gives the filename more influence in the final *composite* embedding.
+
+- 0.1: Filename contributes 10% (chunk content 90%)
+- 0.5 (maximum): Filename contributes 50% (chunk content 50%)
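
The "Filename embedding weight" setting in the added `indexer.md` describes a weighted combination of two embeddings. The sketch below illustrates that arithmetic only. It assumes a plain linear blend in which the two weights sum to 1, and a lower bound of 0 for the weight; the function name `composite_embedding` and the constant `MAX_FILENAME_WEIGHT` are hypothetical and are not taken from the RAGFlow codebase, which may compute the composite differently.

```python
from typing import List, Sequence

# The documented maximum for the setting; the lower bound of 0 is an assumption.
MAX_FILENAME_WEIGHT = 0.5


def composite_embedding(
    chunk_vec: Sequence[float],
    filename_vec: Sequence[float],
    filename_weight: float = 0.1,
) -> List[float]:
    """Blend a chunk embedding with a filename embedding (hypothetical helper).

    Assumes a linear combination: the filename contributes `filename_weight`
    and the chunk content contributes the remaining `1 - filename_weight`.
    """
    if not 0.0 <= filename_weight <= MAX_FILENAME_WEIGHT:
        raise ValueError("filename_weight must be between 0 and 0.5")
    if len(chunk_vec) != len(filename_vec):
        raise ValueError("embeddings must have the same dimension")

    content_weight = 1.0 - filename_weight
    return [
        filename_weight * f + content_weight * c
        for c, f in zip(chunk_vec, filename_vec)
    ]


if __name__ == "__main__":
    chunk = [1.0, 0.0]      # toy chunk-content embedding
    filename = [0.0, 1.0]   # toy filename embedding
    print(composite_embedding(chunk, filename, 0.1))  # filename 10%, content 90%
    print(composite_embedding(chunk, filename, 0.5))  # filename 50%, content 50%
```

With the toy vectors above, a weight of 0.1 leaves the result dominated by the chunk content, while 0.5 gives both inputs equal influence, matching the two list items in the documentation.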