mirror of
https://github.com/infiniflow/ragflow.git
synced 2025-12-08 20:42:30 +08:00
Fix: set default chunk_token_num in html_parser (#10118)
### What problem does this PR solve?

Issue: [Bug]: Agent component (HTTP Request) "'>' not supported between instances of 'int' and 'NoneType'" [#10096](https://github.com/infiniflow/ragflow/issues/10096)

Change: When the Invoke class instantiates HtmlParser without providing the chunk_token_num parameter, the value defaults to None, leading to a comparison error with block_token_count. This change sets the default chunk_token_num to 512 to prevent such errors.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: BadwomanCraZY <511528396@qq.com>
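A minimal sketch of the failure described above, assuming a simplified stand-in for the parser's chunking loop. The function `chunk_blocks` and its whitespace-based token count are hypothetical and are not RAGFlow's implementation; only the `int > None` comparison mirrors the reported error.

```python
def chunk_blocks(blocks, chunk_token_num=None):
    """Hypothetical stand-in: group text blocks into chunks under a token budget."""
    chunks, current, current_tokens = [], [], 0
    for block in blocks:
        block_token_count = len(block.split())  # crude token count (assumption)
        # With chunk_token_num=None this comparison raises TypeError in Python 3.
        if current and current_tokens + block_token_count > chunk_token_num:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(block)
        current_tokens += block_token_count
    if current:
        chunks.append(" ".join(current))
    return chunks


def reproduce_error():
    """Return the TypeError message raised when the budget is left as None."""
    try:
        chunk_blocks(["alpha beta", "gamma"])
    except TypeError as exc:
        return str(exc)
    return None
```

Calling `chunk_blocks` with an integer budget such as `chunk_token_num=512` avoids the error, which is the situation the new default produces for callers that omit the argument.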
@@ -37,7 +37,7 @@ TITLE_TAGS = {"h1": "#", "h2": "##", "h3": "###", "h4": "#####", "h5": "#####",

 class RAGFlowHtmlParser:
-    def __call__(self, fnm, binary=None, chunk_token_num=None):
+    def __call__(self, fnm, binary=None, chunk_token_num=512):
         if binary:
             encoding = find_codec(binary)
             txt = binary.decode(encoding, errors="ignore")
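A note on the default itself: in Python, a signature default applies only when the argument is omitted, so a caller that passes `chunk_token_num=None` explicitly would still reach the comparison with `None`. The sketch below shows one possible guard. The helper `resolve_chunk_token_num` and the constant name are hypothetical and are not part of this commit.

```python
DEFAULT_CHUNK_TOKEN_NUM = 512  # same value as the default set in this commit


def resolve_chunk_token_num(chunk_token_num=DEFAULT_CHUNK_TOKEN_NUM):
    """Hypothetical helper: fall back to the default when a caller passes None."""
    if chunk_token_num is None:
        return DEFAULT_CHUNK_TOKEN_NUM
    return chunk_token_num
```

Whether such a guard is needed depends on how callers such as Invoke pass the argument, which the diff does not show.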