Feat: add gpustack model provider (#4469)
### What problem does this PR solve?

Add GPUStack as a new model provider. [GPUStack](https://github.com/gpustack/gpustack) is an open-source GPU cluster manager for running LLMs. Currently, models deployed locally in GPUStack cannot integrate well with RAGFlow. GPUStack exposes OpenAI-compatible APIs (Models / Chat Completions / Embeddings / Speech2Text / TTS) as well as other APIs such as Rerank, so we would like to use GPUStack as a model provider in RAGFlow. [GPUStack Docs](https://docs.gpustack.ai/latest/quickstart/)

Related issue: https://github.com/infiniflow/ragflow/issues/4064.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

### Testing Instructions

1. Install GPUStack and deploy the `llama-3.2-1b-instruct` LLM, the `bge-m3` text embedding model, the `bge-reranker-v2-m3` rerank model, the `faster-whisper-medium` Speech-to-Text model, and the `cosyvoice-300m-sft` Text-to-Speech model in GPUStack.
2. Add GPUStack as a provider in the RAGFlow settings.
3. Test the models in RAGFlow.
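For context, the OpenAI-compatible endpoints mentioned above can be exercised directly before the provider is wired into RAGFlow. The sketch below is only illustrative: the base URL, the `/v1-openai` path prefix, and the API key are assumptions about a local GPUStack deployment, so check the GPUStack docs for the exact values.

```ts
// Minimal sketch of calling a GPUStack-deployed chat model through its
// OpenAI-compatible Chat Completions endpoint. The base URL, path prefix
// and API key are placeholders/assumptions, not values from this PR.
const GPUSTACK_BASE_URL = 'http://localhost/v1-openai'; // assumed path prefix
const GPUSTACK_API_KEY = 'your-gpustack-api-key'; // placeholder

async function chatWithGPUStack(prompt: string): Promise<string> {
  const resp = await fetch(`${GPUSTACK_BASE_URL}/chat/completions`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${GPUSTACK_API_KEY}`,
    },
    body: JSON.stringify({
      model: 'llama-3.2-1b-instruct', // model name as deployed in GPUStack
      messages: [{ role: 'user', content: prompt }],
    }),
  });
  if (!resp.ok) {
    throw new Error(`GPUStack chat request failed: ${resp.status}`);
  }
  const data = await resp.json();
  return data.choices[0].message.content;
}
```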
New file: web/src/assets/svg/llm/gpustack.svg (GPUStack provider icon, 14 lines, 20 KiB). File diff suppressed because one or more lines are too long.
@@ -72,6 +72,7 @@ export const IconMap = {
   'nomic-ai': 'nomic-ai',
   jinaai: 'jina',
   'sentence-transformers': 'sentence-transformers',
+  GPUStack: 'gpustack',
 };

 export const TimezoneList = [
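A rough sketch of how the icon key added above could be consumed; the helper name and lookup are assumptions for illustration, not the actual RAGFlow call site.

```ts
// Abridged copy of IconMap for illustration; the full map is the one
// patched in the hunk above.
const IconMap = {
  jinaai: 'jina',
  'sentence-transformers': 'sentence-transformers',
  GPUStack: 'gpustack',
};

// Assumed usage: resolve the SVG asset name for a provider, e.g.
// 'GPUStack' -> 'gpustack' -> web/src/assets/svg/llm/gpustack.svg.
const getLlmIconName = (factory: string): string =>
  IconMap[factory as keyof typeof IconMap] ?? factory.toLowerCase();

getLlmIconName('GPUStack'); // 'gpustack'
```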
@@ -31,6 +31,7 @@ export const LocalLlmFactories = [
   'Replicate',
   'OpenRouter',
   'HuggingFace',
+  'GPUStack',
 ];

 export enum TenantRole {
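The entry added above is what lets GPUStack appear alongside the other locally deployed providers. A minimal, self-contained illustration of the kind of membership check this enables (the list is abridged to the entries visible in this hunk, and the helper is assumed):

```ts
// Abridged factory list for illustration only.
const LocalLlmFactories = ['Replicate', 'OpenRouter', 'HuggingFace', 'GPUStack'];

// Assumed usage: gate the "add a locally deployed model" flow on membership.
const isLocalLlmFactory = (factory: string): boolean =>
  LocalLlmFactories.includes(factory);

isLocalLlmFactory('GPUStack'); // true after this PR
```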
@@ -29,6 +29,7 @@ const llmFactoryToUrlMap = {
   OpenRouter: 'https://openrouter.ai/docs',
   HuggingFace:
     'https://huggingface.co/docs/text-embeddings-inference/quick_tour',
+  GPUStack: 'https://docs.gpustack.ai/latest/quickstart',
 };
 type LlmFactory = keyof typeof llmFactoryToUrlMap;

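A short sketch of how a URL map like this is typically consumed for the provider documentation link; the helper name is an assumption and the map is abridged to the entries visible in this hunk:

```ts
// Abridged map for illustration; the full map lives in the file patched above.
const llmFactoryToUrlMap = {
  OpenRouter: 'https://openrouter.ai/docs',
  HuggingFace: 'https://huggingface.co/docs/text-embeddings-inference/quick_tour',
  GPUStack: 'https://docs.gpustack.ai/latest/quickstart',
};
type LlmFactory = keyof typeof llmFactoryToUrlMap;

// Assumed helper: look up the docs URL for a provider, returning
// undefined when there is no entry.
const getFactoryDocUrl = (factory: string): string | undefined =>
  llmFactoryToUrlMap[factory as LlmFactory];

getFactoryDocUrl('GPUStack'); // 'https://docs.gpustack.ai/latest/quickstart'
```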
@@ -76,6 +77,13 @@ const OllamaModal = ({
     { value: 'speech2text', label: 'sequence2text' },
     { value: 'tts', label: 'tts' },
   ],
+  GPUStack: [
+    { value: 'chat', label: 'chat' },
+    { value: 'embedding', label: 'embedding' },
+    { value: 'rerank', label: 'rerank' },
+    { value: 'speech2text', label: 'sequence2text' },
+    { value: 'tts', label: 'tts' },
+  ],
   Default: [
     { value: 'chat', label: 'chat' },
     { value: 'embedding', label: 'embedding' },
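The per-provider lists above feed the model-type dropdown in the add-model modal. The lookup pattern is roughly the following; the names are assumptions and the map is abridged to what this hunk shows:

```ts
// Abridged options map for illustration; the real one is defined inside
// OllamaModal in the file patched above.
const optionsMap = {
  GPUStack: [
    { value: 'chat', label: 'chat' },
    { value: 'embedding', label: 'embedding' },
    { value: 'rerank', label: 'rerank' },
    { value: 'speech2text', label: 'sequence2text' },
    { value: 'tts', label: 'tts' },
  ],
  Default: [
    { value: 'chat', label: 'chat' },
    { value: 'embedding', label: 'embedding' },
  ],
};

// Assumed lookup: use the provider-specific list when present, otherwise
// fall back to Default.
const getModelTypeOptions = (factory: string) =>
  optionsMap[factory as keyof typeof optionsMap] ?? optionsMap.Default;

getModelTypeOptions('GPUStack').map((o) => o.value);
// -> ['chat', 'embedding', 'rerank', 'speech2text', 'tts']
```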