diff --git a/docker/README.md b/docker/README.md
index 6ead9a287..e3fa4381d 100644
--- a/docker/README.md
+++ b/docker/README.md
@@ -1,81 +1,73 @@
+# README
-# Docker Environment Variable
+
+
+## Docker environment variables
-Look into [.env](./.env), there're some important variables.
+The [.env](./.env) file contains important environment variables for the Docker deployment.
-## MYSQL_PASSWORD
+- `STACK_VERSION`
+  The Elasticsearch version. Defaults to `8.11.3`.
-The mysql password could be changed by this variable. But you need to change *mysql.password* in [service_conf.yaml](./service_conf.yaml) at the same time.
+- `ES_PORT`
+  The port used to expose the Elasticsearch HTTP API to the host. Defaults to `1200`.
+
+- `ELASTIC_PASSWORD`
+  The Elasticsearch password.
+
+- `MYSQL_PASSWORD`
+  The MySQL password. When updated, you must also revise the `mysql.password` entry in [service_conf.yaml](./service_conf.yaml) accordingly.
+
+- `MYSQL_PORT`
+  The exported port number of the MySQL Docker container, needed when you access the database from outside the Docker containers.
+
+- `MINIO_USER`
+  The MinIO username. When updated, you must also revise the `minio.user` entry in [service_conf.yaml](./service_conf.yaml) accordingly.
+
+- `MINIO_PASSWORD`
+  The MinIO password. When updated, you must also revise the `minio.password` entry in [service_conf.yaml](./service_conf.yaml) accordingly.
-## MYSQL_PORT
-It refers to exported port number of mysql docker container, it's useful if you want to access the database outside the docker containers.
-## MINIO_USER
-It refers to user name of [Mino](https://github.com/minio/minio). The modification should be synchronous updating at minio.user of [service_conf.yaml](./service_conf.yaml).
+
+- `SVR_HTTP_PORT`
+  The port number on which RAGFlow's backend API server listens.
-## MINIO_PASSWORD
-It refers to user password of [Mino](https://github.com/minio/minio). The modification should be synchronous updating at minio.password of [service_conf.yaml](./service_conf.yaml).
+
+- `RAGFLOW_IMAGE`
+  The Docker image edition. Available options:
+  - `infiniflow/ragflow:dev-slim` (default): The RAGFlow Docker image without embedding models.
+  - `infiniflow/ragflow:dev`: The RAGFlow Docker image with embedding models.
+
+- `TIMEZONE`
+  The local time zone.
-## SVR_HTTP_PORT
-It refers to The API server serving port.
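+
+To illustrate `MYSQL_PORT`, the minimal sketch below connects to the MySQL container from the host. It is an illustration only, assuming `pymysql` is installed; the port, user, and password values are placeholders that must match your [.env](./.env).
+
+```python
+import pymysql  # assumed installed: pip install pymysql
+
+# Connect from outside the containers through the exported MYSQL_PORT.
+# All values below are placeholders; use the ones from your .env file.
+conn = pymysql.connect(
+    host="127.0.0.1",
+    port=5455,                    # MYSQL_PORT from .env (placeholder)
+    user="root",
+    password="<MYSQL_PASSWORD>",  # MYSQL_PASSWORD from .env (placeholder)
+)
+with conn.cursor() as cur:
+    cur.execute("SELECT VERSION()")
+    print(cur.fetchone())
+conn.close()
+```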
+## Service Configuration
+[service_conf.yaml](./service_conf.yaml) defines the system-level configuration for RAGFlow and is used by RAGFlow's *API server* and *task executor*.
-# Service Configuration
-[service_conf.yaml](./service_conf.yaml) is used by the *API server* and *task executor*. It's the most important configuration of the system.
+- `ragflow`
+  - `host`: The IP address of the API server.
+  - `port`: The serving port of the API server.
-## ragflow
+- `mysql`
+  - `name`: The database name in MySQL used by RAGFlow.
+  - `user`: The MySQL username used by RAGFlow.
+  - `password`: The database password. When updated, you must also revise the `MYSQL_PASSWORD` variable in [.env](./.env) accordingly.
+  - `port`: The serving port of MySQL inside the container. When updated, you must also revise the `MYSQL_PORT` variable in [.env](./.env) accordingly.
+  - `max_connections`: The maximum number of database connections.
+  - `stale_timeout`: The timeout duration in seconds.
-### host
-The IP address used by the API server.
+- `minio`
+  - `user`: The MinIO username. When updated, you must also revise the `MINIO_USER` variable in [.env](./.env) accordingly.
+  - `password`: The MinIO password. When updated, you must also revise the `MINIO_PASSWORD` variable in [.env](./.env) accordingly.
+  - `host`: The serving IP and port inside the Docker container. This setting takes effect only after you update the minio section in [docker-compose.yml](./docker-compose.yml).
-### port
-The serving port of API server.
+- `user_default_llm`
+  Newly signed-up users use the LLM configured here by default; alternatively, you can configure your own LLM on the *Settings* page.
+  - `factory`: The LLM supplier. "OpenAI", "Tongyi-Qianwen", "ZHIPU-AI", "Moonshot", "DeepSeek", "Baichuan", and "VolcEngine" are supported.
+  - `api_key`: The API key for the specified LLM.
-## mysql
-
-### name
-The database name in mysql used by this system.
-
-### user
-The database user name.
-
-### password
-The database password. The modification should be synchronous updating at *MYSQL_PASSWORD* in [.env](./.env).
-
-### port
-The serving port of mysql inside the container. The modification should be synchronous updating at [docker-compose.yml](./docker-compose.yml)
-
-### max_connections
-The max database connection.
-
-### stale_timeout
-The timeout duration in seconds.
-
-## minio
-
-### user
-The username of minio. The modification should be synchronous updating at *MINIO_USER* in [.env](./.env).
-
-### password
-The password of minio. The modification should be synchronous updating at *MINIO_PASSWORD* in [.env](./.env).
-
-### host
-The serving IP and port inside the docker container. This is not updating until changing the minio part in [docker-compose.yml](./docker-compose.yml)
-
-## user_default_llm
-Newly signed-up users use LLM configured by this part. Otherwise, user need to configure his own LLM in *setting*.
-
-### factory
-The LLM suppliers. "OpenAI", "Tongyi-Qianwen", "ZHIPU-AI", "Moonshot", "DeepSeek", "Baichuan", and "VolcEngine" are supported.
-
-### api_key
-The corresponding API key of your assigned LLM vendor.
-
-## oauth
-This is OAuth configuration which allows your system using the third-party account to sign-up and sign-in to the system.
-
-### github
-Got to [Github](https://github.com/settings/developers), register new application, the *client_id* and *secret_key* will be given.
+- `oauth`
+  The OAuth configuration for signing up or signing in to RAGFlow using a third-party account.
+  - `github`: Go to [GitHub](https://github.com/settings/developers) and register a new application; the *client_id* and *secret_key* will then be provided.
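+
+To make the structure above concrete, the sketch below loads [service_conf.yaml](./service_conf.yaml) and reads a few of the documented keys. It is a minimal illustration, assuming PyYAML is installed and the script is run from the `docker/` directory.
+
+```python
+import yaml  # assumed installed: pip install pyyaml
+
+# Load the system-level configuration described above.
+with open("service_conf.yaml") as f:
+    conf = yaml.safe_load(f)
+
+# The keys mirror the sections documented in this README.
+print(conf["ragflow"]["host"], conf["ragflow"]["port"])
+print(conf["mysql"]["max_connections"])
+print(conf["user_default_llm"]["factory"])
+```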
- `"email"`: Email - `"parser_config"`: (*Body parameter*), `object` @@ -269,7 +268,9 @@ curl --request PUT \ - `"presentation"`: Presentation - `"picture"`: Picture - `"one"`:One - - `"knowledge_graph"`: Knowledge Graph + - `"email"`: Email + - `"knowledge_graph"`: Knowledge Graph + Ensure your LLM is properly configured on the **Settings** page before selecting this. Please also note that Knowledge Graph consumes a large number of Tokens! ### Response @@ -318,7 +319,7 @@ curl --request GET \ - `page`: (*Filter parameter*) Specifies the page on which the datasets will be displayed. Defaults to `1`. - `page_size`: (*Filter parameter*) - The number of datasets on each page. Defaults to `1024`. + The number of datasets on each page. Defaults to `30`. - `orderby`: (*Filter parameter*) The field by which datasets should be sorted. Available options: - `create_time` (default) @@ -524,7 +525,7 @@ curl --request PUT \ - `"picture"`: Picture - `"one"`: One - `"knowledge_graph"`: Knowledge Graph - Ensure your LLM is properly configured on the **Settings** page before selecting this. Please note that Knowledge Graph consumes a large number of Tokens! + Ensure your LLM is properly configured on the **Settings** page before selecting this. Please also note that Knowledge Graph consumes a large number of Tokens! - `"email"`: Email - `"parser_config"`: (*Body parameter*), `object` The configuration settings for the dataset parser. The attributes in this JSON object vary with the selected `"chunk_method"`: @@ -645,7 +646,7 @@ curl --request GET \ - `page`: (*Filter parameter*), `integer` Specifies the page on which the documents will be displayed. Defaults to `1`. - `page_size`: (*Filter parameter*), `integer` - The maximum number of documents on each page. Defaults to `1024`. + The maximum number of documents on each page. Defaults to `30`. - `orderby`: (*Filter parameter*), `string` The field by which documents should be sorted. Available options: - `create_time` (default) @@ -1245,7 +1246,7 @@ curl --request POST \ - `"page"`: (*Body parameter*), `integer` Specifies the page on which the chunks will be displayed. Defaults to `1`. - `"page_size"`: (*Body parameter*) - The maximum number of chunks on each page. Defaults to `1024`. + The maximum number of chunks on each page. Defaults to `30`. - `"similarity_threshold"`: (*Body parameter*) The minimum similarity score. Defaults to `0.2`. - `"vector_similarity_weight"`: (*Body parameter*), `float` @@ -1628,7 +1629,7 @@ curl --request GET \ - `page`: (*Filter parameter*), `integer` Specifies the page on which the chat assistants will be displayed. Defaults to `1`. - `page_size`: (*Filter parameter*), `integer` - The number of chat assistants on each page. Defaults to `1024`. + The number of chat assistants on each page. Defaults to `30`. - `orderby`: (*Filter parameter*), `string` The attribute by which the results are sorted. Available options: - `create_time` (default) @@ -1860,7 +1861,7 @@ curl --request GET \ - `page`: (*Filter parameter*), `integer` Specifies the page on which the sessions will be displayed. Defaults to `1`. - `page_size`: (*Filter parameter*), `integer` - The number of sessions on each page. Defaults to `1024`. + The number of sessions on each page. Defaults to `30`. - `orderby`: (*Filter parameter*), `string` The field by which sessions should be sorted. 
diff --git a/docs/references/python_api_reference.md b/docs/references/python_api_reference.md
index a6355cd6e..92f6f9898 100644
--- a/docs/references/python_api_reference.md
+++ b/docs/references/python_api_reference.md
@@ -1,5 +1,5 @@
-from Demos.mmapfile_demo import page_sizefrom Demos.mmapfile_demo import page_sizesidebar_position: 1
-
+---
+sidebar_position: 2
 slug: /python_api_reference
 ---
@@ -58,7 +58,7 @@ A brief description of the dataset to create. Defaults to `""`.
 The language setting of the dataset to create. Available options:
 
-- `"English"` (Default)
+- `"English"` (default)
 - `"Chinese"`
 
 #### permission
@@ -80,7 +80,7 @@ The chunking method of the dataset to create. Available options:
 - `"picture"`: Picture
 - `"one"`: One
 - `"knowledge_graph"`: Knowledge Graph
-  Ensure your LLM is properly configured on the **Settings** page before selecting this. Please note that Knowledge Graph consumes a large number of Tokens!
+  Ensure your LLM is properly configured on the **Settings** page before selecting this. Please also note that Knowledge Graph consumes a large number of Tokens!
 - `"email"`: Email
 
 #### parser_config
@@ -160,7 +160,7 @@ rag_object.delete_datasets(ids=["id_1","id_2"])
 ```python
 RAGFlow.list_datasets(
     page: int = 1,
-    page_size: int = 1024,
+    page_size: int = 30,
     orderby: str = "create_time",
     desc: bool = True,
     id: str = None,
@@ -178,7 +178,7 @@ Specifies the page on which the datasets will be displayed. Defaults to `1`.
 
 #### page_size: `int`
 
-The number of datasets on each page. Defaults to `1024`.
+The number of datasets on each page. Defaults to `30`.
 
 #### orderby: `str`
@@ -250,8 +250,9 @@ A dictionary representing the attributes to update, with the following keys:
   - `"presentation"`: Presentation
   - `"picture"`: Picture
   - `"one"`: One
+  - `"email"`: Email
   - `"knowledge_graph"`: Knowledge Graph
-  Ensure your LLM is properly configured on the **Settings** page before selecting this. Please note that Knowledge Graph consumes a large number of Tokens!
+    Ensure your LLM is properly configured on the **Settings** page before selecting this. Please also note that Knowledge Graph consumes a large number of Tokens!
 
 ### Returns
@@ -334,7 +335,7 @@ A dictionary representing the attributes to update, with the following keys:
   - `"picture"`: Picture
   - `"one"`: One
   - `"knowledge_graph"`: Knowledge Graph
-  Ensure your LLM is properly configured on the **Settings** page before selecting this. Please note that Knowledge Graph consumes a large number of Tokens!
+  Ensure your LLM is properly configured on the **Settings** page before selecting this. Please also note that Knowledge Graph consumes a large number of Tokens!
   - `"email"`: Email
 - `"parser_config"`: `dict[str, Any]` The parsing configuration for the document. Its attributes vary based on the selected `"chunk_method"`:
   - `"chunk_method"`=`"naive"`:
@@ -413,7 +414,7 @@ print(doc)
 ## List documents
 
 ```python
-Dataset.list_documents(id:str =None, keywords: str=None, page: int=1, page_size:int = 1024,order_by:str = "create_time", desc: bool = True) -> list[Document]
+Dataset.list_documents(id: str = None, keywords: str = None, page: int = 1, page_size: int = 30, order_by: str = "create_time", desc: bool = True) -> list[Document]
 ```
 
 Lists documents in the current dataset.
@@ -434,7 +435,7 @@ Specifies the page on which the documents will be displayed. Defaults to `1`.
 
 #### page_size: `int`
 
-The maximum number of documents on each page. Defaults to `1024`.
+The maximum number of documents on each page. Defaults to `30`.
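+
+Below is a minimal pagination sketch using the parameters above. The API key, base URL, and dataset name are placeholders, and the import path follows the SDK examples in this reference.
+
+```python
+from ragflow import RAGFlow  # import path may vary with your SDK version
+
+# Placeholders: supply your own API key, server address, and dataset name.
+rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://localhost:9380")
+dataset = rag_object.list_datasets(name="kb_1")[0]
+
+# Fetch the first page of documents, 30 per page (the default).
+for doc in dataset.list_documents(page=1, page_size=30):
+    print(doc.name)
+```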
 
 #### orderby: `str`
 
@@ -689,7 +690,7 @@ chunk = doc.add_chunk(content="xxxxxxx")
 ## List chunks
 
 ```python
-Document.list_chunks(keywords: str = None, page: int = 1, page_size: int = 1024, id : str = None) -> list[Chunk]
+Document.list_chunks(keywords: str = None, page: int = 1, page_size: int = 30, id: str = None) -> list[Chunk]
 ```
 
 Lists chunks in the current document.
@@ -706,7 +707,7 @@ Specifies the page on which the chunks will be displayed. Defaults to `1`.
 
 #### page_size: `int`
 
-The maximum number of chunks on each page. Defaults to `1024`.
+The maximum number of chunks on each page. Defaults to `30`.
 
 #### id: `str`
@@ -811,7 +812,7 @@ chunk.update({"content":"sdfx..."})
 ## Retrieve chunks
 
 ```python
-RAGFlow.retrieve(question:str="", dataset_ids:list[str]=None, document_ids=list[str]=None, page:int=1, page_size:int=1024, similarity_threshold:float=0.2, vector_similarity_weight:float=0.3, top_k:int=1024,rerank_id:str=None,keyword:bool=False,higlight:bool=False) -> list[Chunk]
+RAGFlow.retrieve(question: str = "", dataset_ids: list[str] = None, document_ids: list[str] = None, page: int = 1, page_size: int = 30, similarity_threshold: float = 0.2, vector_similarity_weight: float = 0.3, top_k: int = 1024, rerank_id: str = None, keyword: bool = False, highlight: bool = False) -> list[Chunk]
 ```
 
 Retrieves chunks from specified datasets.
@@ -836,7 +837,7 @@ The starting index for the documents to retrieve. Defaults to `1`.
 
 #### page_size: `int`
 
-The maximum number of chunks to retrieve. Defaults to `1024`.
+The maximum number of chunks to retrieve. Defaults to `30`.
 
 #### Similarity_threshold: `float`
@@ -1078,7 +1079,7 @@ rag_object.delete_chats(ids=["id_1","id_2"])
 ```python
 RAGFlow.list_chats(
     page: int = 1,
-    page_size: int = 1024,
+    page_size: int = 30,
     orderby: str = "create_time",
     desc: bool = True,
     id: str = None,
@@ -1096,7 +1097,7 @@ Specifies the page on which the chat assistants will be displayed. Defaults to `
 
 #### page_size: `int`
 
-The number of chat assistants on each page. Defaults to `1024`.
+The number of chat assistants on each page. Defaults to `30`.
 
 #### orderby: `str`
@@ -1216,7 +1217,7 @@ session.update({"name": "updated_name"})
 ```python
 Chat.list_sessions(
     page: int = 1,
-    page_size: int = 1024,
+    page_size: int = 30,
     orderby: str = "create_time",
     desc: bool = True,
     id: str = None,
@@ -1234,7 +1235,7 @@ Specifies the page on which the sessions will be displayed. Defaults to `1`.
 
 #### page_size: `int`
 
-The number of sessions on each page. Defaults to `1024`.
+The number of sessions on each page. Defaults to `30`.
 
 #### orderby: `str`
diff --git a/docs/references/supported_models.mdx b/docs/references/supported_models.mdx
new file mode 100644
index 000000000..4d624d962
--- /dev/null
+++ b/docs/references/supported_models.mdx
@@ -0,0 +1,66 @@
+---
+sidebar_position: 0
+slug: /supported_models
+---
+
+# Supported models
+
+import APITable from '../../src/components/APITable';
+
+A complete list of models supported by RAGFlow, which will continue to expand.
+
+```mdx-code-block
+<APITable>
+```
+
+| Provider              | Chat               | Embedding          | Rerank             | Multi-modal        | ASR/STT            | TTS                |
+| --------------------- | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ |
+| Anthropic             | :heavy_check_mark: |                    |                    |                    |                    |                    |
+| Azure-OpenAI          | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: | :heavy_check_mark: |                    |
+| BAAI                  |                    | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |
+| BaiChuan              | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |                    |
+| BaiduYiyan            | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                    |                    |
+| Bedrock               | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |                    |
+| cohere                | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                    |                    |
+| DeepSeek              | :heavy_check_mark: |                    |                    |                    |                    |                    |
+| FastEmbed             |                    | :heavy_check_mark: |                    |                    |                    |                    |
+| Fish Audio            |                    |                    |                    |                    |                    | :heavy_check_mark: |
+| Gemini                | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: |                    |                    |
+| Google Cloud          | :heavy_check_mark: |                    |                    |                    |                    |                    |
+| Groq                  | :heavy_check_mark: |                    |                    |                    |                    |                    |
+| HuggingFace           | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |                    |
+| Jina                  |                    | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |
+| LeptonAI              | :heavy_check_mark: |                    |                    |                    |                    |                    |
+| LocalAI               | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: |                    |                    |
+| LM-Studio             | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                    |                    |
+| MiniMax               | :heavy_check_mark: |                    |                    |                    |                    |                    |
+| Mistral               | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |                    |
+| Moonshot              | :heavy_check_mark: |                    |                    | :heavy_check_mark: |                    |                    |
+| novita.ai             | :heavy_check_mark: |                    |                    |                    |                    |                    |
+| NVIDIA                | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                    |                    |
+| Ollama                | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: |                    |                    |
+| OpenAI                | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| OpenAI-API-Compatible | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                    |                    |
+| OpenRouter            | :heavy_check_mark: |                    |                    | :heavy_check_mark: |                    |                    |
+| PerfXCloud            | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |                    |
+| Replicate             | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |                    |
+| SILICONFLOW           | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                    |                    |
+| StepFun               | :heavy_check_mark: |                    |                    |                    |                    |                    |
+| Tencent Hunyuan       | :heavy_check_mark: |                    |                    |                    |                    |                    |
+| Tencent Cloud         |                    |                    |                    |                    | :heavy_check_mark: |                    |
+| TogetherAI            | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                    |                    |
+| Tongyi-Qianwen        | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| Upstage               | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |                    |
+| VolcEngine            | :heavy_check_mark: |                    |                    |                    |                    |                    |
+| Voyage AI             |                    | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                    |                    |
+| Xinference            | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| XunFei Spark          | :heavy_check_mark: |                    |                    |                    |                    | :heavy_check_mark: |
+| Youdao                |                    | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |
+| ZHIPU-AI              | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: |                    |                    |
+| 01.AI                 | :heavy_check_mark: |                    |                    |                    |                    |                    |
+
+```mdx-code-block
+</APITable>
+```