fix: prevent redundant retries in async_chat_streamly upon success (#11832)

## What changes were proposed in this pull request?
Added a `return` statement after the `async for` loop in `async_chat_streamly` completes successfully, so the method stops retrying once the response has been fully streamed.

## Why are the changes needed?
Previously, the `try` block had no break/return after a successful stream. As a result, the retry loop (`for attempt in range(...)`) kept executing even after the LLM response had been fully generated and yielded, issuing duplicate requests (up to `max_retries` times).
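For context, here is a minimal, hypothetical sketch of the retry pattern after the fix. The class body, the `_stream_llm` helper, and the simplified signatures are illustrative only; the real method streams from an LLM provider and uses `_exceptions_async` and `max_retries` as shown in the diff below.

```python
import asyncio


class Base:
    """Illustrative sketch only, not the actual RAGFlow implementation."""

    def __init__(self, max_retries: int = 3):
        self.max_retries = max_retries

    async def _stream_llm(self):
        # Hypothetical stand-in for the real streaming LLM call.
        for chunk in ("Hello", ", ", "world"):
            yield chunk, 1  # (delta text, token count for the delta)

    async def _exceptions_async(self, e, attempt):
        # Hypothetical stand-in: return an error string once retries are exhausted.
        return str(e) if attempt + 1 >= self.max_retries else None

    async def async_chat_streamly(self):
        for attempt in range(self.max_retries):
            ans, total_tokens = "", 0
            try:
                async for delta_ans, tol in self._stream_llm():
                    ans += delta_ans
                    total_tokens += tol
                    yield ans
                yield total_tokens
                return  # the fix: stop retrying once streaming finished successfully
            except Exception as e:
                err = await self._exceptions_async(e, attempt)
                if err:
                    yield "**ERROR**: " + err
                    yield total_tokens
                    return


async def main():
    async for out in Base().async_chat_streamly():
        print(out)


asyncio.run(main())
```

Without the `return`, control falls back to the `for attempt ...` loop after a successful stream and the same request is issued again, which is exactly the duplicate-request behavior described above.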

## Does this PR introduce any user-facing change?
No (it fixes an internal logic bug).
Author: N0bodycan
Date: 2025-12-09 17:14:30 +08:00 (committed by GitHub)
Parent: bb6022477e
Commit: 9863862348


```diff
@@ -187,6 +187,9 @@ class Base(ABC):
                     ans = delta_ans
                     total_tokens += tol
                     yield ans
+                yield total_tokens
+                return
             except Exception as e:
                 e = await self._exceptions_async(e, attempt)
                 if e:
@@ -194,8 +197,6 @@ class Base(ABC):
                     yield total_tokens
                     return
-        yield total_tokens
     def _length_stop(self, ans):
         if is_chinese([ans]):
             return ans + LENGTH_NOTIFICATION_CN
```