Restructured guides (#5555)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
8  docs/guides/chat/_category_.json  Normal file

@@ -0,0 +1,8 @@
{
  "label": "Chat",
  "position": 1,
  "link": {
    "type": "generated-index",
    "description": "Chat-specific guides."
  }
}
41  docs/guides/chat/accelerate_question_answering.mdx  Normal file

@@ -0,0 +1,41 @@
---
sidebar_position: 2
slug: /accelerate_question_answering
---

# Accelerate question answering

import APITable from '@site/src/components/APITable';

A checklist to speed up document parsing and question answering.

---

Please note that some of your settings may consume a significant amount of time. If you often find that your question answering is time-consuming, here is a checklist to consider:

- In the **Prompt Engine** tab of your **Chat Configuration** dialogue, disabling **Multi-turn optimization** will reduce the time required to get an answer from the LLM.
- In the **Prompt Engine** tab of your **Chat Configuration** dialogue, leaving the **Rerank model** field empty will significantly decrease retrieval time.
- In the **Assistant Setting** tab of your **Chat Configuration** dialogue, disabling **Keyword analysis** will reduce the time to receive an answer from the LLM.
- When chatting with your chat assistant, click the light bulb icon above the *current* dialogue and scroll down the popup window to view the time taken for each task:

  ![enlighten](https://github.com/user-attachments/assets/fedfa2ee-21a7-451b-be66-20125619923c)
```mdx-code-block
<APITable>
```

| Item name         | Description |
| ----------------- | ----------- |
| Total             | Total time spent on this conversation round, including chunk retrieval and answer generation. |
| Check LLM         | Time to validate the specified LLM. |
| Create retriever  | Time to create a chunk retriever. |
| Bind embedding    | Time to initialize an embedding model instance. |
| Bind LLM          | Time to initialize an LLM instance. |
| Tune question     | Time to optimize the user query using the context of the multi-turn conversation. |
| Bind reranker     | Time to initialize a reranker model instance for chunk retrieval. |
| Generate keywords | Time to extract keywords from the user query. |
| Retrieval         | Time to retrieve the chunks. |
| Generate answer   | Time to generate the answer. |

```mdx-code-block
</APITable>
```
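If you prefer to measure end-to-end latency programmatically rather than through the popup above, here is a minimal sketch using the RAGFlow Python SDK. It assumes `ragflow-sdk` is installed and that a chat assistant named `example_assistant` already exists; the API key, base URL, and assistant name are placeholders, and the method names are written from the Python API reference, so they may need adjusting to your SDK version.

```python
import time

from ragflow_sdk import RAGFlow  # pip install ragflow-sdk

# Placeholders -- substitute your own API key, server address, and assistant name.
rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://127.0.0.1:9380")
assistant = rag.list_chats(name="example_assistant")[0]
session = assistant.create_session("latency-check")

answer = ""
start = time.perf_counter()
for message in session.ask("What is RAGFlow?", stream=True):
    answer = message.content  # content accumulates while streaming
elapsed = time.perf_counter() - start

print(f"Answer: {answer[:80]}...")
print(f"Round-trip time: {elapsed:.2f}s")
```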
94  docs/guides/chat/start_chat.md  Normal file

@@ -0,0 +1,94 @@
---
sidebar_position: 1
slug: /start_chat
---

# Chat

Initiate an AI-powered chat with a configured chat assistant.

---

Knowledge base, hallucination-free chat, and file management are the three pillars of RAGFlow. Chats in RAGFlow are based on a particular knowledge base or multiple knowledge bases. Once you have created your knowledge base and finished file parsing, you can go ahead and start an AI conversation.

## Start an AI chat

You start an AI conversation by creating an assistant.

1. Click the **Chat** tab in the middle top of the page **>** **Create an assistant** to show the **Chat Configuration** dialogue *of your next dialogue*.

   > RAGFlow offers you the flexibility of choosing a different chat model for each dialogue, while allowing you to set the default models in **System Model Settings**.
2. Update **Assistant Setting**:

   - **Assistant name** is the name of your chat assistant. Each assistant corresponds to a dialogue with a unique combination of knowledge bases, prompts, hybrid search configurations, and large model settings.
   - **Empty response**:
     - If you wish to *confine* RAGFlow's answers to your knowledge bases, leave a response here. Then, when it doesn't retrieve an answer, it *uniformly* responds with what you set here.
     - If you wish RAGFlow to *improvise* when it doesn't retrieve an answer from your knowledge bases, leave it blank, which may give rise to hallucinations.
   - **Show quote**: This is a key feature of RAGFlow and is enabled by default. RAGFlow does not work like a black box. Instead, it clearly shows the sources of information that its responses are based on.
   - Select the corresponding knowledge bases. You can select one or multiple knowledge bases, but ensure that they use the same embedding model; otherwise, an error will occur. A toy sketch of this constraint follows this step.
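As a toy illustration of the same-embedding-model constraint mentioned above (RAGFlow enforces this itself; the helper and field names below are hypothetical):

```python
# Hypothetical sketch -- RAGFlow performs this validation itself.
def assert_same_embedding_model(knowledge_bases: list[dict]) -> None:
    """Raise if the selected knowledge bases do not share one embedding model."""
    models = {kb["embedding_model"] for kb in knowledge_bases}
    if len(models) > 1:
        raise ValueError(f"Knowledge bases use different embedding models: {sorted(models)}")

# Passes: both knowledge bases use the same (example) embedding model.
# Mixing models would raise ValueError.
assert_same_embedding_model([
    {"name": "kb_a", "embedding_model": "BAAI/bge-large-zh-v1.5"},
    {"name": "kb_b", "embedding_model": "BAAI/bge-large-zh-v1.5"},
])
```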
3. Update **Prompt Engine**:

   - In **System**, fill in the prompts for your LLM; you can also leave the default prompt as-is to begin with.
   - **Similarity threshold** sets the similarity "bar" for each chunk of text. The default is 0.2. Text chunks with lower similarity scores are filtered out of the final response.
   - **Keyword similarity weight** is set to 0.7 by default. RAGFlow uses a hybrid score system to evaluate the relevance of different text chunks. This value sets the weight assigned to the keyword similarity component in the hybrid score. See the scoring sketch after this list.
     - If **Rerank model** is left empty, the hybrid score system uses keyword similarity and vector similarity, and the default weight assigned to the vector similarity component is 1 - 0.7 = 0.3.
     - If **Rerank model** is selected, the hybrid score system uses keyword similarity and reranker score, and the default weight assigned to the reranker score is 1 - 0.7 = 0.3.
   - **Top N** determines the *maximum* number of chunks to feed to the LLM. In other words, even if more chunks are retrieved, only the top N chunks are provided as input.
   - **Multi-turn optimization** enhances user queries using existing context in a multi-round conversation. It is enabled by default. When enabled, it consumes additional LLM tokens and significantly increases the time to generate answers.
   - **Rerank model** sets the reranker model to use. It is left empty by default.
     - If **Rerank model** is left empty, the hybrid score system uses keyword similarity and vector similarity, and the default weight assigned to the vector similarity component is 1 - 0.7 = 0.3.
     - If **Rerank model** is selected, the hybrid score system uses keyword similarity and reranker score, and the default weight assigned to the reranker score is 1 - 0.7 = 0.3.
   - **Variable** refers to the variables (keys) to be used in the system prompt. `{knowledge}` is a reserved variable. Click **Add** to add more variables for the system prompt.
     - If you are uncertain about the logic behind **Variable**, leave it *as-is*.
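To make the arithmetic above concrete, here is a minimal sketch of hybrid scoring with the defaults mentioned in this step (0.7 keyword weight, 0.2 similarity threshold) plus a Top N cutoff. It illustrates the scoring logic described here, not RAGFlow's actual implementation; the chunk fields and the `top_n` value are assumptions.

```python
def hybrid_score(keyword_sim: float, vector_sim: float, keyword_weight: float = 0.7) -> float:
    """Weighted blend of keyword similarity and vector similarity."""
    return keyword_weight * keyword_sim + (1 - keyword_weight) * vector_sim

def select_chunks(chunks, similarity_threshold=0.2, top_n=8, keyword_weight=0.7):
    """Score chunks, drop those below the threshold, and keep at most the top N."""
    scored = [(hybrid_score(c["keyword_sim"], c["vector_sim"], keyword_weight), c) for c in chunks]
    kept = sorted((p for p in scored if p[0] >= similarity_threshold), key=lambda p: p[0], reverse=True)
    return [c for _, c in kept[:top_n]]

chunks = [
    {"text": "chunk A", "keyword_sim": 0.9, "vector_sim": 0.4},  # score 0.75 -> kept
    {"text": "chunk B", "keyword_sim": 0.1, "vector_sim": 0.2},  # score 0.13 -> filtered out
]
print([c["text"] for c in select_chunks(chunks)])  # ['chunk A']
```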
4. Update **Model Setting**:

   - In **Model**, you select the chat model. Though you have selected a default chat model in **System Model Settings**, RAGFlow allows you to choose an alternative chat model for your dialogue.
   - **Preset configurations** refers to the degree to which the LLM improvises. From **Improvise**, **Precise**, to **Balance**, each preset configuration corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**. A purely illustrative sketch of such a mapping follows this list.
   - **Temperature**: The randomness level of the LLM's predictions. The higher the value, the more creative the LLM's responses.
   - **Top P** is also known as "nucleus sampling". See [here](https://en.wikipedia.org/wiki/Top-p_sampling) for more information.
   - **Max Tokens**: The maximum length of the LLM's responses. Note that responses may be curtailed if this value is set too low.
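For a rough sense of how such presets might map to sampling parameters, consider the sketch below. The numbers are invented for illustration only and are not RAGFlow's actual preset values.

```python
# Illustrative only -- these are NOT RAGFlow's actual preset values.
PRESETS = {
    "Improvise": {"temperature": 0.9, "top_p": 0.95, "presence_penalty": 0.2, "frequency_penalty": 0.2},
    "Balance":   {"temperature": 0.5, "top_p": 0.85, "presence_penalty": 0.3, "frequency_penalty": 0.5},
    "Precise":   {"temperature": 0.1, "top_p": 0.30, "presence_penalty": 0.4, "frequency_penalty": 0.7},
}

print(PRESETS["Precise"])  # lower temperature and top_p -> less improvisation
```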
5. Now, let's start the show:

   ![chat](https://github.com/user-attachments/assets/35efda82-c9c9-4666-b5fb-9ac9b2cdd936)

:::tip NOTE

1. Click the light bulb icon above the answer to view the expanded system prompt:

   ![prompt](https://github.com/user-attachments/assets/515ab187-94e8-412a-82f2-aba52cd79e09)

   *The light bulb icon is available only for the current dialogue.*

2. Scroll down the expanded prompt to view the time consumed for each task:

   ![time](https://github.com/user-attachments/assets/f0dc7a25-c0e2-4fa8-8461-f203d88b077d)

:::
## Update settings of an existing chat assistant

Hover over an intended chat assistant **>** **Edit** to show the chat configuration dialogue:

![edit_chat](https://github.com/user-attachments/assets/5c514cf0-a959-4cfe-abad-5e42a0e23974)

![chat_config](https://github.com/user-attachments/assets/1a4eaed2-5430-4585-8ab6-930549838c5b)
## Integrate chat capabilities into your application or webpage

RAGFlow offers HTTP and Python APIs for you to integrate RAGFlow's capabilities into your applications. Read the following documents for more information:

- [Acquire a RAGFlow API key](./models/llm_api_key_setup.md)
- [HTTP API reference](../references/http_api_reference.md)
- [Python API reference](../references/python_api_reference.md)
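As one example of such an integration, here is a minimal Python sketch that wraps a chat assistant in a callable your application can use. The assistant name, API key, and base URL are placeholders, the SDK calls are written from the Python API reference linked above (which remains authoritative), and `ragflow-sdk` must be installed.

```python
from ragflow_sdk import RAGFlow  # pip install ragflow-sdk

def make_ask_fn(api_key: str, base_url: str, assistant_name: str):
    """Return a callable your application can use to query a RAGFlow chat assistant."""
    rag = RAGFlow(api_key=api_key, base_url=base_url)
    assistant = rag.list_chats(name=assistant_name)[0]
    session = assistant.create_session("app-session")

    def ask(question: str) -> str:
        reply = ""
        for message in session.ask(question, stream=True):
            reply = message.content  # content accumulates while streaming
        return reply

    return ask

# Placeholders -- substitute real values.
ask = make_ask_fn("<YOUR_API_KEY>", "http://127.0.0.1:9380", "example_assistant")
print(ask("What is RAGFlow?"))
```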
You can use an iframe to embed the created chat assistant into a third-party webpage:

1. Before proceeding, you must [acquire an API key](./models/llm_api_key_setup.md); otherwise, an error message will appear.
2. Hover over an intended chat assistant **>** **Edit** to show the **iframe** window:

   ![chat-embed](https://github.com/user-attachments/assets/13ea3021-31c4-4a14-9b32-328cd3318fb5)

3. Copy the iframe and embed it into a specific location on your webpage.