ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-01-24 03:56:37 +08:00

Files

Julien Deveaux 6be197cbb6 Fix: Use tiktoken for proper token counting in OpenAI-compatible endpoint #7850 (#12760 )

### What problem does this PR solve?
The OpenAI-compatible chat endpoint
(`/chats_openai/<chat_id>/chat/completions`) was not returning accurate
token
usage in streaming responses. The token counts were either missing or
inaccurate because the underlying LLM API
responses weren't being properly parsed for usage data.
This PR adds proper token counting using tiktoken (cl100k_base encoding)
as a fallback when the LLM API doesn't provide usage data in streaming
chunks. This ensures clients always receive token usage information in
the
response, which is essential for billing and quota management.
**Changes:**
- Add tiktoken-based token counting for streaming responses in
OpenAI-compatible endpoint
- Ensure `usage` field is always populated in the final streaming chunk
- Add unit tests for token usage calculation
  Fixes #7850

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

2026-01-23 09:36:21 +08:00