From b091ff2730928756a969dd6cda08725d753fa85b Mon Sep 17 00:00:00 2001
From: Pegasus <42954461+leonace924@users.noreply.github.com>
Date: Wed, 14 Jan 2026 03:35:46 -0500
Subject: [PATCH] Fix enable_thinking parameter for Qwen3 models (#12603)

### Issue

When using Qwen3 models (`qwen3-32b`, `qwen3-max`) through the
Tongyi-Qianwen provider for non-streaming calls (e.g., knowledge graph
generation), the API fails with:

```
parameter.enable_thinking must be set to false for non-streaming calls
```

Closes #12424

### Root Cause

In `LiteLLMBase.async_chat()`, `extra_body={"enable_thinking": False}`
was set in `kwargs`, but only `gen_conf` was forwarded to
`_construct_completion_args()`, so the flag never reached the request.

### Fix

Pass the merged kwargs to `_construct_completion_args()` as
`**{**gen_conf, **kwargs}`. Merging into a single dict first avoids a
`TypeError` when the same key appears in both; the value from `kwargs`
takes precedence.
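
To illustrate why the two dicts are merged before unpacking, here is a
minimal sketch. `construct_completion_args` is a hypothetical stand-in,
not the real `_construct_completion_args()` signature, and the
`gen_conf` contents are made up for the example.

```python
def construct_completion_args(history, stream, tools, **extra):
    """Hypothetical stand-in that collects its arguments into one dict."""
    return {"messages": history, "stream": stream, "tools": tools, **extra}


# Assumed example values; a real gen_conf may not contain extra_body at all.
gen_conf = {"temperature": 0.1, "extra_body": {"enable_thinking": True}}
kwargs = {"extra_body": {"enable_thinking": False}}

# Unpacking both dicts separately fails when a key appears in both.
try:
    construct_completion_args(
        history=[], stream=False, tools=False, **gen_conf, **kwargs
    )
    duplicate_error = None
except TypeError as exc:
    duplicate_error = exc

# Merging first produces a single dict; on conflict, kwargs wins.
merged = construct_completion_args(
    history=[], stream=False, tools=False, **{**gen_conf, **kwargs}
)
```

Note that the merge only deduplicates keys shared by `gen_conf` and
`kwargs`. A key that repeats an explicitly passed argument such as
`stream` would still raise.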

### Changes

- `rag/llm/chat_model.py`: Forward kwargs containing `extra_body` to
`_construct_completion_args()` in `async_chat()`


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Contribution by Gittensor, see my contribution statistics at
https://gittensor.io/miners/details?githubId=42954461
---
 rag/llm/chat_model.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/rag/llm/chat_model.py b/rag/llm/chat_model.py
index dc59e1fb8..eb1a0f826 100644
--- a/rag/llm/chat_model.py
+++ b/rag/llm/chat_model.py
@@ -1263,7 +1263,7 @@ class LiteLLMBase(ABC):
         if self.model_name.lower().find("qwen3") >= 0:
             kwargs["extra_body"] = {"enable_thinking": False}
 
-        completion_args = self._construct_completion_args(history=hist, stream=False, tools=False, **gen_conf)
+        completion_args = self._construct_completion_args(history=hist, stream=False, tools=False, **{**gen_conf, **kwargs})
 
         for attempt in range(self.max_retries + 1):
             try: