Reorganized docs for docusaurus publish (#860)
### What problem does this PR solve?

Reorganized the docs for publishing with Docusaurus.

### Type of change

- [x] Documentation Update
docs/guides/deploy_local_llm.md (new file, +75 lines)
---
sidebar_position: 5
slug: /deploy_local_llm
---

# Deploy a local LLM

RAGFlow supports deploying LLMs locally using Ollama or Xinference.
## Ollama
[Ollama](https://github.com/ollama/ollama) enables one-click deployment of local LLMs.
### Install
- [Ollama on Linux](https://github.com/ollama/ollama/blob/main/docs/linux.md)
- [Ollama Windows Preview](https://github.com/ollama/ollama/blob/main/docs/windows.md)
- [Docker](https://hub.docker.com/r/ollama/ollama)
### Launch Ollama
Decide which LLM you want to deploy ([here's a list of supported LLMs](https://ollama.com/library)), say, **mistral**:
```bash
$ ollama run mistral
```
Or, if you are running Ollama in Docker:
```bash
$ docker exec -it ollama ollama run mistral
```
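The `docker exec` variant assumes an Ollama container named `ollama` is already running. If it is not, a minimal sketch for starting one, based on the instructions on the Ollama Docker Hub page (adjust the volume and port mapping, and add GPU flags, to suit your environment):

```bash
# Start an Ollama container, persist downloaded models in a named volume,
# and expose the Ollama API on port 11434.
$ docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```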
### Use Ollama in RAGFlow
- Go to 'Settings > Model Providers > Models to be added > Ollama'.

> Base URL: Enter the base URL where the Ollama service is accessible, e.g., `http://<your-ollama-endpoint-domain>:11434`.
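To confirm that the machine running RAGFlow can actually reach this base URL, you can query Ollama's model-listing endpoint first (a quick sanity check; replace the host with your own):

```bash
# A JSON list of locally available models confirms the Ollama endpoint is reachable.
$ curl http://<your-ollama-endpoint-domain>:11434/api/tags
```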
- Use Ollama Models.

## Xinference
Xorbits Inference ([Xinference](https://github.com/xorbitsai/inference)) empowers you to unleash the full potential of cutting-edge AI models.
### Install
- [pip install "xinference[all]"](https://inference.readthedocs.io/en/latest/getting_started/installation.html)
- [Docker](https://inference.readthedocs.io/en/latest/getting_started/using_docker_image.html)
To start a local instance of Xinference, run the following command:
```bash
$ xinference-local --host 0.0.0.0 --port 9997
```
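If you installed Xinference through its Docker image instead, a rough equivalent of the command above is sketched below; the image tag, GPU flag, and port mapping are assumptions, so check the Xinference Docker documentation for the exact invocation on your setup:

```bash
# Run Xinference in a container and expose the service on port 9997.
# Drop --gpus all on a CPU-only machine.
$ docker run -d --name xinference --gpus all -p 9997:9997 \
    xprobe/xinference:latest xinference-local --host 0.0.0.0 --port 9997
```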
### Launch Xinference
Decide which LLM you want to deploy ([here's a list of supported LLMs](https://inference.readthedocs.io/en/latest/models/builtin/)), say, **mistral**.
Execute the following command to launch the model. Remember to replace `${quantization}` with a quantization method supported by your chosen model (see its entry in the built-in model list linked above):
```bash
$ xinference launch -u mistral --model-name mistral-v0.1 --size-in-billions 7 --model-format pytorch --quantization ${quantization}
```
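Before wiring the model into RAGFlow, you can confirm that it launched successfully. A minimal check, assuming the local instance started earlier on port 9997 (the `--endpoint` value is an assumption; adjust it to your deployment):

```bash
# Lists the models currently running on this Xinference instance;
# the launched mistral model should appear in the output.
$ xinference list --endpoint http://127.0.0.1:9997
```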
### Use Xinference in RAGFlow
- Go to 'Settings > Model Providers > Models to be added > Xinference'.

> Base URL: Enter the base URL where the Xinference service is accessible, e.g., `http://<your-xinference-endpoint-domain>:9997/v1`.
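As with Ollama, it helps to confirm that the machine running RAGFlow can reach this endpoint. Xinference serves an OpenAI-compatible API under `/v1`, so listing models is a quick reachability check (replace the host with your own):

```bash
# A JSON list of served models confirms the Xinference base URL is reachable.
$ curl http://<your-xinference-endpoint-domain>:9997/v1/models
```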
- Use Xinference Models.
