Fix: unexpected truncated Excel files (#9500)

### What problem does this PR solve?

Handle unexpected truncated Excel files.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
This commit is contained in:
Yongteng Lei
2025-08-15 17:00:34 +08:00
committed by GitHub
parent 5a4dfecfbe
commit eef43fa25c
2 changed files with 11 additions and 2 deletions

View File

@ -490,6 +490,7 @@ def chunk(filename, binary=None, from_page=0, to_page=100000,
sections = [(_, "") for _ in excel_parser.html(binary, 12) if _]
else:
sections = [(_, "") for _ in excel_parser(binary) if _]
parser_config["chunk_token_num"] = 12800
elif re.search(r"\.(txt|py|js|java|c|cpp|h|php|go|ts|sh|cs|kt|sql)$", filename, re.IGNORECASE):
callback(0.1, "Start to parse.")