Reorganized docs for docusaurus publish (#860)
### What problem does this PR solve?

Reorganized the docs for publishing with Docusaurus.

### Type of change

- [x] Documentation Update
docs/guides/deploy_local_llm.md (new file, +75 lines)
---
sidebar_position: 5
slug: /deploy_local_llm
---

# Deploy a local LLM

RAGFlow supports deploying LLMs locally using Ollama or Xinference.
## Ollama
[Ollama](https://github.com/ollama/ollama) enables one-click deployment of local LLMs.
### Install
- [Ollama on Linux](https://github.com/ollama/ollama/blob/main/docs/linux.md)
- [Ollama Windows Preview](https://github.com/ollama/ollama/blob/main/docs/windows.md)
- [Docker](https://hub.docker.com/r/ollama/ollama)
### Launch Ollama
Decide which LLM you want to deploy ([here's a list of supported LLMs](https://ollama.com/library)), say, **mistral**:
```bash
$ ollama run mistral
```
Or, if you are running Ollama in Docker:
```bash
$ docker exec -it ollama ollama run mistral
```
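The `docker exec` variant assumes an Ollama container named `ollama` is already running. If it is not, a minimal sketch for starting one, based on the instructions on the Ollama Docker Hub page (adjust the volume and port mapping, and add GPU flags, to suit your environment):

```bash
# Start an Ollama container, persist downloaded models in a named volume,
# and expose the Ollama API on port 11434.
$ docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```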
### Use Ollama in RAGFlow
- Go to 'Settings > Model Providers > Models to be added > Ollama'.

> Base URL: Enter the base URL where the Ollama service is accessible, e.g., `http://<your-ollama-endpoint-domain>:11434`.
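To confirm that the machine running RAGFlow can actually reach this base URL, you can query Ollama's model-listing endpoint first (a quick sanity check; replace the host with your own):

```bash
# A JSON list of locally available models confirms the Ollama endpoint is reachable.
$ curl http://<your-ollama-endpoint-domain>:11434/api/tags
```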
- Use Ollama Models.

## Xinference
Xorbits Inference ([Xinference](https://github.com/xorbitsai/inference)) empowers you to unleash the full potential of cutting-edge AI models.
### Install
- [pip install "xinference[all]"](https://inference.readthedocs.io/en/latest/getting_started/installation.html)
- [Docker](https://inference.readthedocs.io/en/latest/getting_started/using_docker_image.html)
To start a local instance of Xinference, run the following command:
```bash
$ xinference-local --host 0.0.0.0 --port 9997
```
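If you installed Xinference through its Docker image instead, a rough equivalent of the command above is sketched below; the image tag, GPU flag, and port mapping are assumptions, so check the Xinference Docker documentation for the exact invocation on your setup:

```bash
# Run Xinference in a container and expose the service on port 9997.
# Drop --gpus all on a CPU-only machine.
$ docker run -d --name xinference --gpus all -p 9997:9997 \
    xprobe/xinference:latest xinference-local --host 0.0.0.0 --port 9997
```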
### Launch Xinference
Decide which LLM you want to deploy ([here's a list of supported LLMs](https://inference.readthedocs.io/en/latest/models/builtin/)), say, **mistral**.
Execute the following command to launch the model. Remember to replace `${quantization}` with a quantization method supported by your chosen model (see its entry in the built-in model list linked above):
```bash
$ xinference launch -u mistral --model-name mistral-v0.1 --size-in-billions 7 --model-format pytorch --quantization ${quantization}
```
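Before wiring the model into RAGFlow, you can confirm that it launched successfully. A minimal check, assuming the local instance started earlier on port 9997 (the `--endpoint` value is an assumption; adjust it to your deployment):

```bash
# Lists the models currently running on this Xinference instance;
# the launched mistral model should appear in the output.
$ xinference list --endpoint http://127.0.0.1:9997
```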
### Use Xinference in RAGFlow
- Go to 'Settings > Model Providers > Models to be added > Xinference'.

> Base URL: Enter the base URL where the Xinference service is accessible, e.g., `http://<your-xinference-endpoint-domain>:9997/v1`.
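As with Ollama, it helps to confirm that the machine running RAGFlow can reach this endpoint. Xinference serves an OpenAI-compatible API under `/v1`, so listing models is a quick reachability check (replace the host with your own):

```bash
# A JSON list of served models confirms the Xinference base URL is reachable.
$ curl http://<your-xinference-endpoint-domain>:9997/v1/models
```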
- Use Xinference Models.
