Change default error message to English (#3838)

### What problem does this PR solve?

As the title says: change the default error message to English.

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Jin Hai
2024-12-04 09:34:49 +08:00
committed by GitHub
parent 87455d79e4
commit 6657ca7cde
2 changed files with 63 additions and 27 deletions
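
The `is_chinese` helper that this commit adds to the first changed file can be tried on its own. The following is a minimal sketch, not the committed code: the name `is_chinese_safe`, the `threshold` parameter, and the empty-string guard are assumptions added for illustration (the committed version divides by `len(text)` without a guard, so it would raise `ZeroDivisionError` on an empty string).

```python
def is_chinese_safe(text, threshold=0.2):
    """Return True when more than `threshold` of the characters in `text`
    fall in the CJK Unified Ideographs block (U+4E00 to U+9FFF).

    Hypothetical variant of the committed `is_chinese`; the empty-input
    guard and the `threshold` parameter are not part of the commit.
    """
    if not text:
        # Assumption: treat empty input as "not Chinese" instead of
        # dividing by zero.
        return False
    # Count characters inside the CJK Unified Ideographs range.
    chinese = sum(1 for ch in text if '\u4e00' <= ch <= '\u9fff')
    return chinese / len(text) > threshold


if __name__ == "__main__":
    for sample in ["", "hello world", "你好 world", "你好, hello world"]:
        print(repr(sample), is_chinese_safe(sample))
```

Because the ratio is taken over every character, spaces and punctuation included, a short Chinese phrase embedded in a longer English sentence can fall below the 0.2 cutoff.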

@@ -230,6 +230,14 @@ def is_english(texts):
        return True
    return False

def is_chinese(text):
    chinese = 0
    for ch in text:
        if '\u4e00' <= ch <= '\u9fff':
            chinese += 1
    if chinese / len(text) > 0.2:
        return True
    return False

def tokenize(d, t, eng):
    d["content_with_weight"] = t