Fix: set default chunk_token_num in html_parser (#10118)

### What problem does this PR solve?

issue:
[Bug]: Agent component (HTTP Request) "'>' not supported between
instances of 'int' and 'NoneType'"
[#10096](https://github.com/infiniflow/ragflow/issues/10096)

Change:
When the Invoke class instantiates HtmlParser without providing the
chunk_token_num parameter, the value defaults to None, leading to a
comparison error with block_token_count.

This change sets the default chunk_token_num to 512 to prevent such
errors.
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: BadwomanCraZY <511528396@qq.com>
This commit is contained in:
buua436
2025-09-17 09:36:31 +08:00
committed by GitHub
parent 152111fd9d
commit c9ea22ef69

View File

@ -37,7 +37,7 @@ TITLE_TAGS = {"h1": "#", "h2": "##", "h3": "###", "h4": "#####", "h5": "#####",
class RAGFlowHtmlParser:
def __call__(self, fnm, binary=None, chunk_token_num=None):
def __call__(self, fnm, binary=None, chunk_token_num=512):
if binary:
encoding = find_codec(binary)
txt = binary.decode(encoding, errors="ignore")