Minor: improve doc and rm unused file (#5634)

### What problem does this PR solve?

The `ocr.res` file already ships in the model directory
`rag/res/deepdoc`, and the copy removed here does not appear to be used.

### Type of change

- [x] Documentation Update
This commit is contained in:
非法操作
2025-03-05 12:59:54 +08:00
committed by GitHub
parent b0c21b00d9
commit ca04ae9540
3 changed files with 12 additions and 6635 deletions

@@ -42,6 +42,17 @@ if LOCK_KEY_pdfplumber not in sys.modules:
 class RAGFlowPdfParser:
     def __init__(self):
+        """
+        If you have trouble downloading HuggingFace models, -_^ this might help!!
+        For Linux:
+        export HF_ENDPOINT=https://hf-mirror.com
+        For Windows:
+        Good luck
+        ^_-
+        """
         self.ocr = OCR()
         if hasattr(self, "model_speciess"):
             self.layouter = LayoutRecognizer("layout." + self.model_speciess)
@@ -72,17 +83,6 @@ class RAGFlowPdfParser:
                 model_dir, "updown_concat_xgb.model"))
         self.page_from = 0
-        """
-        If you have trouble downloading HuggingFace models, -_^ this might help!!
-        For Linux:
-        export HF_ENDPOINT=https://hf-mirror.com
-        For Windows:
-        Good luck
-        ^_-
-        """
     def __char_width(self, c):
         return (c["x1"] - c["x0"]) // max(len(c["text"]), 1)
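
Note on the relocated docstring: it only gives a shell command for Linux. A platform-independent alternative is to set `HF_ENDPOINT` from Python before any Hugging Face library is imported. The sketch below is not part of this PR, and `configure_hf_endpoint` is a hypothetical helper name used for illustration. It assumes the Hugging Face client reads `HF_ENDPOINT` from the process environment when it is first imported, so the call has to happen before that import.

```python
import os

# Mirror URL copied from the docstring in this diff.
HF_MIRROR = "https://hf-mirror.com"


def configure_hf_endpoint(endpoint=HF_MIRROR, env=None):
    """Hypothetical helper: set HF_ENDPOINT unless it is already set.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to test without touching the real process environment.
    """
    env = os.environ if env is None else env
    # setdefault keeps an existing value, so an explicit user setting wins.
    return env.setdefault("HF_ENDPOINT", endpoint)
```

Calling `configure_hf_endpoint()` at the top of an entry-point script, before the parser module is imported, would behave the same on Linux and Windows. For a shell-level equivalent on Windows, `set HF_ENDPOINT=https://hf-mirror.com` (cmd) or `$env:HF_ENDPOINT = "https://hf-mirror.com"` (PowerShell) sets the variable for the current session.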