Fix: default model base URL extraction logic (#11263)

### What problem does this PR solve?

Fixes an issue where default models that used the same factory but
different base URLs were all initialised with the default chat model's
base URL, ignoring the per-model `base_url` config (e.g. the embedding
model's).

For example, with the following service config, the embedding and
rerank models would end up using the default chat model's base URL
(i.e. `https://llm1.example.com/v1`):

```yaml
ragflow:
  service_conf:
    user_default_llm:
      factory: OpenAI-API-Compatible
      api_key: not-used
      default_models:
        chat_model:
          name: llm1
          base_url: https://llm1.example.com/v1
        embedding_model:
          name: llm2
          base_url: https://llm2.example.com/v1
        rerank_model:
          name: llm3
          base_url: https://llm3.example.com/v1/rerank

  llm_factories:
    factory_llm_infos:
    - name: OpenAI-API-Compatible
      logo: ""
      tags: "LLM,TEXT EMBEDDING,SPEECH2TEXT,MODERATION"
      status: "1"
      llm:
        - llm_name: llm1
          base_url: 'https://llm1.example.com/v1'
          api_key: not-used
          tags: "LLM,CHAT,IMAGE2TEXT"
          max_tokens: 100000
          model_type: chat
          is_tools: false

        - llm_name: llm2
          base_url: https://llm2.example.com/v1
          api_key: not-used
          tags: "TEXT EMBEDDING"
          max_tokens: 10000
          model_type: embedding

        - llm_name: llm3
          base_url: https://llm3.example.com/v1/rerank
          api_key: not-used
          tags: "RERANK,1k"
          max_tokens: 10000
          model_type: rerank
```
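
After the fix, each default model resolves its own `base_url` (and `api_key`), falling back to the factory-level values when no per-type entry exists. A minimal sketch of the intended lookup, using plain-dict stand-ins for the parsed config above (the names here are illustrative, not the actual `settings` objects):

```python
# Hypothetical stand-ins for the parsed service_conf above; in RAGFlow these
# come from settings.CHAT_CFG, settings.EMBEDDING_CFG, etc.
factory_config = {"api_key": "not-used", "base_url": "https://llm1.example.com/v1"}
model_configs = {
    "chat": {"base_url": "https://llm1.example.com/v1"},
    "embedding": {"base_url": "https://llm2.example.com/v1"},
    "rerank": {"base_url": "https://llm3.example.com/v1/rerank"},
}

def resolve_base_url(model_type: str) -> str:
    # Prefer the per-model-type base_url; fall back to the factory default.
    return model_configs.get(model_type, {}).get("base_url", factory_config["base_url"])

assert resolve_base_url("embedding") == "https://llm2.example.com/v1"
assert resolve_base_url("rerank") == "https://llm3.example.com/v1/rerank"
assert resolve_base_url("speech2text") == "https://llm1.example.com/v1"  # no entry, falls back
```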

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
Committed by Scott Davidson on 2025-11-17 06:21:27 +00:00 · commit `6b64641042` · parent `9cef3a2625`

The change to `get_init_tenant_llm`:

```diff
@@ -19,6 +19,7 @@ import re
 from common.token_utils import num_tokens_from_string
 from functools import partial
 from typing import Generator
+from common.constants import LLMType
 from api.db.db_models import LLM
 from api.db.services.common_service import CommonService
 from api.db.services.tenant_llm_service import LLM4Tenant, TenantLLMService
@@ -32,6 +33,14 @@ def get_init_tenant_llm(user_id):
     from common import settings

     tenant_llm = []
+    model_configs = {
+        LLMType.CHAT: settings.CHAT_CFG,
+        LLMType.EMBEDDING: settings.EMBEDDING_CFG,
+        LLMType.SPEECH2TEXT: settings.ASR_CFG,
+        LLMType.IMAGE2TEXT: settings.IMAGE2TEXT_CFG,
+        LLMType.RERANK: settings.RERANK_CFG,
+    }
+
     seen = set()
     factory_configs = []
     for factory_config in [
@@ -54,8 +63,8 @@ def get_init_tenant_llm(user_id):
                     "llm_factory": factory_config["factory"],
                     "llm_name": llm.llm_name,
                     "model_type": llm.model_type,
-                    "api_key": factory_config["api_key"],
-                    "api_base": factory_config["base_url"],
+                    "api_key": model_configs.get(llm.model_type, {}).get("api_key", factory_config["api_key"]),
+                    "api_base": model_configs.get(llm.model_type, {}).get("base_url", factory_config["base_url"]),
                     "max_tokens": llm.max_tokens if llm.max_tokens else 8192,
                 }
             )
```
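
The two nested `.get()` calls keep the change backward-compatible: model types without a dedicated config entry, or entries that omit `api_key`/`base_url`, still inherit the factory-level values, so existing single-URL configs behave as before.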