### What problem does this PR solve?

The OpenAI-compatible chat endpoint (`/chats_openai/<chat_id>/chat/completions`) was not returning accurate token usage in streaming responses. The token counts were either missing or inaccurate because the underlying LLM API responses weren't being properly parsed for usage data.

This PR adds proper token counting using tiktoken (cl100k_base encoding) as a fallback when the LLM API doesn't provide usage data in streaming chunks. This ensures clients always receive token usage information in the response, which is essential for billing and quota management.

**Changes:**

- Add tiktoken-based token counting for streaming responses in the OpenAI-compatible endpoint
- Ensure the `usage` field is always populated in the final streaming chunk
- Add unit tests for token usage calculation

Fixes #7850

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
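To illustrate the fallback idea described above, here is a minimal sketch of counting tokens with tiktoken's `cl100k_base` encoding and building an OpenAI-style `usage` object when a streaming chunk carries none. The helper names (`count_tokens`, `estimate_usage`) and the way the prompt/completion text is assembled are assumptions for illustration, not lines from the actual RAGFlow implementation.

```python
# Sketch only: tiktoken-based fallback for the `usage` field when the
# upstream LLM API does not report usage in its streaming chunks.
import tiktoken

_ENC = tiktoken.get_encoding("cl100k_base")


def count_tokens(text: str) -> int:
    """Count tokens in a string with the cl100k_base encoding."""
    return len(_ENC.encode(text or ""))


def estimate_usage(prompt_text: str, completion_text: str) -> dict:
    """Build an OpenAI-style usage object from raw prompt/answer text.

    `prompt_text` (the concatenated request messages) and `completion_text`
    (the accumulated streamed answer) are hypothetical inputs here.
    """
    prompt_tokens = count_tokens(prompt_text)
    completion_tokens = count_tokens(completion_text)
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
```

Such an estimate would only be used as a fallback; when the LLM API does include usage data in its chunks, those numbers take precedence.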
```diff
@@ -374,3 +374,22 @@ def chat_completions(auth, chat_id, payload=None):
     url = f"{HOST_ADDRESS}/api/{VERSION}/chats/{chat_id}/completions"
     res = requests.post(url=url, headers=HEADERS, auth=auth, json=payload)
     return res.json()
+
+
+def chat_completions_openai(auth, chat_id, payload=None):
+    """
+    Send a request to the OpenAI-compatible chat completions endpoint.
+
+    Args:
+        auth: Authentication object
+        chat_id: Chat assistant ID
+        payload: Dictionary in OpenAI chat completions format containing:
+            - messages: list (required) - List of message objects with 'role' and 'content'
+            - stream: bool (optional) - Whether to stream responses, default False
+
+    Returns:
+        Response JSON in OpenAI chat completions format with usage information
+    """
+    url = f"{HOST_ADDRESS}/api/{VERSION}/chats_openai/{chat_id}/chat/completions"
+    res = requests.post(url=url, headers=HEADERS, auth=auth, json=payload)
+    return res.json()
```
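A test built on the new helper might look roughly like the sketch below, which verifies that the `usage` totals are present and internally consistent. The wrapper function name and the placeholder `auth`/`chat_id` arguments are assumptions; the real test suite's fixtures and assertions may differ.

```python
def assert_usage_is_consistent(auth, chat_id):
    """Hypothetical check: call the new helper and verify the usage totals."""
    payload = {
        "model": "model",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": False,
    }
    res = chat_completions_openai(auth, chat_id, payload)

    # The response should always carry an OpenAI-style usage object.
    usage = res["usage"]
    assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```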