Fix: Handle whitespace-only question in /retrieval endpoint (#12831)

## Description

This PR fixes issue #12805 by adding validation to handle
whitespace-only questions in the `/retrieval` endpoint.

## Problem

Sending a single space `" "` as the `question` parameter to `/retrieval`
crashes the request with an `AssertionError`. This happens because:
1. The endpoint doesn't trim or validate the question parameter
2. A whitespace-only string is treated as valid input
3. The retrieval logic only checks for empty strings (which are falsy),
but `" "` is truthy
4. Invalid match expressions are constructed, causing an assertion
failure in the Elasticsearch layer

## Solution

- Trim whitespace from the question parameter before processing
- Return an empty result for whitespace-only or empty questions
- Prevents the AssertionError and provides expected behavior

## Changes

- Added whitespace trimming and validation in `api/apps/sdk/doc.py`
- Returns empty result early if question is empty after trimming

## Testing

- Tested with single space input - now returns empty result instead of
crashing
- Tested with empty string - returns empty result
- Tested with normal questions - works as expected

Fixes #12805

Co-authored-by: Daniel <daniel@example.com>
This commit is contained in:
Angel98518
2026-01-27 15:57:47 +08:00
committed by GitHub
parent 52da81cf9e
commit e77168feba

View File

@ -1514,6 +1514,12 @@ async def retrieval_test(tenant_id):
page = int(req.get("page", 1))
size = int(req.get("page_size", 30))
question = req["question"]
# Trim whitespace and validate question
if isinstance(question, str):
question = question.strip()
# Return empty result if question is empty or whitespace-only
if not question:
return get_result(data={"total": 0, "chunks": [], "doc_aggs": {}})
doc_ids = req.get("document_ids", [])
use_kg = req.get("use_kg", False)
toc_enhance = req.get("toc_enhance", False)