81 Commits

Author SHA1 Message Date
f631073ac2 Fix OCR GPU provider mem limit handling (#10407)
### What problem does this PR solve?

- Running DeepDoc OCR on large PDFs inside the GPU docker-compose setup
  would intermittently fail with
  `[ONNXRuntimeError] ... p2o.Clip.6 ... Available memory of 0 is smaller than requested bytes ...`
- Root cause: `load_model()` in `deepdoc/vision/ocr.py` treated
  `device_id=None` as-is. The comparison
  `torch.cuda.device_count() > device_id` then raised a `TypeError`, the
  helper returned `False`, and ONNXRuntime quietly fell back to
  `CPUExecutionProvider` with the hard-coded 512 MB limit, which then
  triggered the allocator failure.
- Environment where this reproduces: Windows 11, AMD 5900x, 64 GB RAM,
  RTX 3090 (24 GB), `docker-compose-gpu.yml` from upstream, default DeepDoc
  + GraphRAG parser settings, ingesting a heavy PDF such as
  《内科学》(第10版).pdf (~180 MB).

Fixes:

- Normalize `device_id` to 0 when it is `None` before calling any CUDA APIs,
  so the GPU path is considered available.
- Allow configuring the CUDA provider's memory cap via
  `OCR_GPU_MEM_LIMIT_MB` (default 2048 MB) and expose
  `OCR_ARENA_EXTEND_STRATEGY`; the calculated byte limit is logged to
  confirm the effective settings.

After the change, `ragflow_server.log` shows, for example,
`load_model ... uses GPU (device 0, gpu_mem_limit=21474836480, arena_strategy=kNextPowerOfTwo)`
and the same document finishes OCR without allocator errors.
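
The two fixes above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper names `normalize_device_id` and `cuda_provider_options` are hypothetical, the fallback value for `OCR_ARENA_EXTEND_STRATEGY` is assumed from the example log line, and the option keys follow the ONNX Runtime CUDA execution provider options.

```python
import os

DEFAULT_GPU_MEM_LIMIT_MB = 2048  # default stated in the PR description


def normalize_device_id(device_id):
    """Treat a missing device id as GPU 0 so later integer comparisons are safe."""
    return 0 if device_id is None else int(device_id)


def cuda_provider_options(device_id=None, env=None):
    """Build CUDA provider options from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    limit_mb = int(env.get("OCR_GPU_MEM_LIMIT_MB", DEFAULT_GPU_MEM_LIMIT_MB))
    # Assumed fallback; the PR only says the strategy is configurable.
    strategy = env.get("OCR_ARENA_EXTEND_STRATEGY", "kNextPowerOfTwo")
    return {
        "device_id": normalize_device_id(device_id),
        "gpu_mem_limit": limit_mb * 1024 * 1024,  # MB -> bytes
        "arena_extend_strategy": strategy,
    }
```

Under these assumptions, setting `OCR_GPU_MEM_LIMIT_MB=20480` yields `gpu_mem_limit=21474836480`, the byte count in the example log line, while the default of 2048 MB yields 2147483648 bytes.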

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-10-10 11:03:12 +08:00
b0b866c8fd Refactor: move some functions out of api/utils/__init__.py (#10216)
### What problem does this PR solve?

Refactor import modules.

### Type of change

- [x] Refactoring

---------

Signed-off-by: jinhai <haijin.chn@gmail.com>
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-09-25 18:04:49 +08:00
86f6da2f74 Feat: add support for the Ascend table structure recognizer (#10110)
### What problem does this PR solve?

Add support for the Ascend table structure recognizer.

Use the environment variable `TABLE_STRUCTURE_RECOGNIZER_TYPE=ascend` to
enable the Ascend table structure recognizer.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-16 13:57:06 +08:00
bc0281040b Feat: add support for the Ascend layout recognizer (#10105)
### What problem does this PR solve?

Add support for the Ascend layout recognizer.

Use the environment variable `LAYOUT_RECOGNIZER_TYPE=ascend` to enable
the Ascend layout recognizer, and `ASCEND_LAYOUT_RECOGNIZER_DEVICE_ID=n`
(for example, n=0) to specify the Ascend device ID.

Ensure that you have installed the [ais
tools](https://gitee.com/ascend/tools/tree/master/ais-bench_workload/tool/ais_bench)
properly.
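
As a rough illustration of how these two environment variables could be consumed, here is a minimal sketch. The helper name `layout_recognizer_config`, the fallback backend name `"onnx"`, and the fallback device ID `0` are assumptions made for the example and are not taken from the PR.

```python
import os


def layout_recognizer_config(env=None):
    """Read the recognizer settings from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    # "onnx" as the fallback backend name is an assumption, not from the PR.
    backend = env.get("LAYOUT_RECOGNIZER_TYPE", "onnx").lower()
    device_id = None
    if backend == "ascend":
        # Falling back to device 0 is also an assumption.
        device_id = int(env.get("ASCEND_LAYOUT_RECOGNIZER_DEVICE_ID", "0"))
    return {"backend": backend, "device_id": device_id}
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise without touching the process environment.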

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-16 09:51:15 +08:00
341a7b1473 Fix: judge not empty before delete (#10099)
### What problem does this PR solve?

Check that the session is not empty before deleting it.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-15 17:49:52 +08:00
2a88ce6be1 Fix: terminate onnx inference session manually (#10076)
### What problem does this PR solve?

Terminate the ONNX inference session and release its memory manually.

Issue #5050 
Issue #9992 
Issue #8805
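
The PR text does not say how the session is terminated. One generic way to release a heavyweight object deterministically in Python is to drop the last reference and run the garbage collector, as in the sketch below. `SessionHolder` and `DummySession` are hypothetical stand-ins, not names from the code base, and no ONNX Runtime API is used.

```python
import gc


class DummySession:
    """Stand-in for an inference session that owns a large amount of memory."""


class SessionHolder:
    """Hypothetical wrapper that owns a session and can release it on demand."""

    def __init__(self, session):
        self.session = session

    def close(self):
        # Drop the only reference held here, then collect any reference cycles.
        self.session = None
        gc.collect()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
```

Used as `with SessionHolder(DummySession()) as holder: ...`, the session reference is dropped as soon as the block exits rather than whenever the holder itself happens to be collected.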

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-12 17:18:26 +08:00
5abd0bbac1 Fix typo (#9766)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-08-27 18:56:40 +08:00
2ae8f2cf00 Fix: exception layout_type in is_caption (#9028)
### What problem does this PR solve?

Handle the `layout_type` exception in `is_caption`.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-24 17:06:56 +08:00
e470645efd Refactor code (#8341)
### What problem does this PR solve?

1. rename var
2. update if statement

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-18 16:40:30 +08:00
4a2ff633e0 Fix typo in code (#8327)
### What problem does this PR solve?

Fix typo in code

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-06-18 09:41:09 +08:00
e6d36f3a3a Improve image rotation logic for text recognition (#8167)
### What problem does this PR solve?

Enhanced the image rotation handling by evaluating the original
orientation, clockwise 90°, and counter-clockwise 90° rotations. The
image with the highest text recognition score is now selected, improving
accuracy for text detection in images with aspect ratios >= 1.5.

#8166
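
The selection described above can be sketched as follows. This is an illustration under stated assumptions rather than the PR's implementation: images are plain row-major lists, `score_fn` is a hypothetical callable returning a recognition score, and the aspect ratio is read as height divided by width.

```python
def rotate_cw(img):
    """Rotate a row-major 2D list 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def rotate_ccw(img):
    """Rotate a row-major 2D list 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*img)][::-1]


def best_orientation(img, score_fn, min_ratio=1.5):
    """Return (image, score) for the highest-scoring orientation.

    Rotations are only tried when height / width >= min_ratio; reading the
    ratio as height over width is an assumption.
    """
    height, width = len(img), len(img[0])
    candidates = [img]
    if height / width >= min_ratio:
        candidates += [rotate_cw(img), rotate_ccw(img)]
    scores = [score_fn(c) for c in candidates]
    # max() returns the first maximum, so ties keep the original orientation.
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]
```

Keeping the original orientation on ties avoids rotating images for which rotation brings no measurable gain.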

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: wenrui.cao <wenrui.cao@univers.com>
2025-06-11 09:20:30 +08:00
6ba5a4348a set PARALLEL_DEVICES default value= 0 (#7935)
### What problem does this PR solve?


The `OCR` class fails when `PARALLEL_DEVICES` is `None`, because it
passes 0 to the `TextDetector` and `TextRecognizer` init methods.

It is simpler to set 0 as the default value for `PARALLEL_DEVICES`.

### Type of change

- [x] Refactoring
2025-05-29 13:32:16 +08:00
ed5f81b02e Fix: abnormal cell merging. (#6991)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-14 11:00:11 +08:00
3bb1e012e6 Fix: assistant deletion issue. (#6906)
### What problem does this PR solve?

#6875

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-09 20:29:40 +08:00
2caf15b24c Refa: trivial. (#6802)
### What problem does this PR solve?


### Type of change


- [x] Refactoring
2025-04-03 19:01:24 +08:00
b0b4b7ba33 Feat: Improve Recognizer.py performance (#6185)
### What problem does this PR solve?

Replace the for loop in the `create_inputs` method with NumPy operations.

### Type of change

- [x] Performance Improvement
2025-03-18 09:39:49 +08:00
3a99c2b5f4 Refa: PARALLEL_DEVICES is a static parameter. (#6168)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2025-03-17 16:49:54 +08:00
3e19044dee Feat: add OCR's multi-GPU and parallel processing support (#5972)
### What problem does this PR solve?

Add multi-GPU and parallel processing support for OCR.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

@yuzhichang I've tried to resolve the comments in #5697. OCR jobs can
now be done on both CPU and GPU. ( By the way, I've encountered a
“Generate embedding error” issue #5954 that might be due to my outdated
GPUs? idk. ) Please review it and give me suggestions.

GPU:

![gpu_ocr](https://github.com/user-attachments/assets/0ee2ecfb-a665-4e50-8bc7-15941b9cd80e)

![smi](https://github.com/user-attachments/assets/a2312f8c-cf24-443d-bf89-bec50503546d)

CPU:

![cpu_ocr](https://github.com/user-attachments/assets/1ba6bb0b-94df-41ea-be79-790096da4bf1)
2025-03-17 11:58:40 +08:00
4ff609b6a8 Fix: optimize OCR garbage identification to reduce unnecessary filtering (#6027)
### What problem does this PR solve?

Optimize OCR garbage identification to reduce unnecessary filtering.
#5713

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-13 18:48:32 +08:00
4326873af6 refactor: no need to inherit in python3 clean the code (#5659)
### What problem does this PR solve?

As title

### Type of change


- [x] Refactoring

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-05 18:03:53 +08:00
ca04ae9540 Minor: improve doc and rm unused file (#5634)
### What problem does this PR solve?

The `ocr.res` file is already included in the model directory
`rag/res/deepdoc`, but it doesn't seem to be utilized here.

### Type of change

- [x] Documentation Update
2025-03-05 12:59:54 +08:00
c813c1ff4c Made task_executor async to speedup parsing (#5530)
### What problem does this PR solve?

Made task_executor async to speedup parsing

### Type of change

- [x] Performance Improvement
2025-03-03 18:59:49 +08:00
8a2542157f Fix: possible memory leaks close #5277 (#5500)
### What problem does this PR solve?

Close #5277 by making sure the file is closed.

### Type of change

- [x] Performance Improvement

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-03 10:26:45 +08:00
37aacb3960 Refa: drop useless fasttext (#5470)
### What problem does this PR solve?

This patch drops fasttext, which appears to be unused in the code base
and is rather hard to install.
Should close #4498.


### Type of change

- [x] Refactoring

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-28 14:30:56 +08:00
db42d0e0ae Optimize ocr (#5297)
### What problem does this PR solve?

Introduced OCR.recognize_batch

### Type of change

- [x] Performance Improvement
2025-02-24 16:21:55 +08:00
0151d42156 Reuse loaded modules if possible (#5231)
### What problem does this PR solve?

Reuse loaded modules if possible

### Type of change

- [x] Refactoring
2025-02-21 17:21:01 +08:00
c326f14fed Optimized Recognizer.sort_X_firstly and Recognizer.sort_Y_firstly (#5182)
### What problem does this PR solve?

Optimized Recognizer.sort_X_firstly and Recognizer.sort_Y_firstly

### Type of change

- [x] Performance Improvement
2025-02-20 15:41:12 +08:00
b08bb56f6c Display thinking for deepseek r1 (#4904)
### What problem does this PR solve?
#4903
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-02-12 15:43:13 +08:00
6b389e01b5 Remove use of eval() from operators.py (#4888)
Use `np.float32()` instead.

### What problem does this PR solve?

Using `eval()` can lead to code injections.

I think `eval()` is only used to parse a floating point number here.
This change preserves the correct behavior if the string `"None"` is
supplied. But if that behavior isn't intended then this part could be
just deleted instead, since `np.float32()` is parsing strings anyway:

```Python
        if isinstance(scale, str):
            scale = eval(scale)
```
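
A dependency-free sketch of the replacement logic is shown below. The PR uses `np.float32()`; the built-in `float()` is swapped in here only to keep the example free of third-party imports, and the helper name `parse_scale` is hypothetical.

```python
def parse_scale(scale):
    """Parse a scale value without eval() (hypothetical helper)."""
    if isinstance(scale, str):
        # Keep the old eval("None") -> None behaviour explicitly.
        if scale.strip() == "None":
            return None
        return float(scale)  # stand-in for np.float32(scale)
    return scale
```

Unlike `eval()`, `float()` rejects anything that is not a numeric literal, so a string such as `"__import__('os')"` raises `ValueError` instead of being executed. The same holds for arithmetic expressions such as `"1./255."`, which `eval()` would have accepted.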

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-12 12:53:42 +08:00
3411d0a2ce Added cuda_is_available (#4725)
### What problem does this PR solve?

Added cuda_is_available

### Type of change

- [x] Refactoring
2025-02-05 18:01:23 +08:00
e1526846da Fixed GPU detection on CPU only environment (#4711)
### What problem does this PR solve?

Fixed GPU detection on CPU only environment. Close #4692

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-05 12:02:43 +08:00
1bff6b7333 Fix t_ocr.py for PNG image. (#4625)
### What problem does this PR solve?
#4586

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-01-24 11:47:27 +08:00
4230402fbb deepdoc use GPU if possible (#4618)
### What problem does this PR solve?

DeepDoc uses the GPU if possible.

### Type of change

- [x] Refactoring
2025-01-24 09:48:02 +08:00
1a367664f1 Remove usage of eval() from postprocess.py (#4571)
Remove usage of `eval()` from postprocess.py

### What problem does this PR solve?

The use of `eval()` is a potential security risk. While the use of
`eval()` is guarded and thus not normally a security risk, `assert`s
aren't run if `-O` or `-OO` is passed to the interpreter, so the guard
would then not apply. In any case there is no reason to use
`eval()` here at all.
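
To illustrate the point about `assert`, the sketch below resolves a class by name through an explicit registry and raises on unknown names, so the check survives `python -O`. `ExamplePostProcess`, `SUPPORTED`, and `build_by_name` are hypothetical names chosen for the example, not identifiers from `postprocess.py`.

```python
class ExamplePostProcess:
    """Stand-in for a real post-processing class."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


# Explicit registry of the names that may be instantiated.
SUPPORTED = {"ExamplePostProcess": ExamplePostProcess}


def build_by_name(name, **kwargs):
    """Look the class up in the registry instead of calling eval(name)."""
    try:
        cls = SUPPORTED[name]
    except KeyError:
        # Unlike an assert, this check still runs under `python -O`.
        raise ValueError(f"unsupported post process: {name!r}") from None
    return cls(**kwargs)
```

Because the lookup is a dictionary access, a user-controlled name can only ever select one of the registered classes.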

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Other (please describe):

Potential security fix if somehow the passed `modul_name` could be user
controlled.
2025-01-22 19:37:24 +08:00
3894de895b Update comments (#4569)
### What problem does this PR solve?

Add license statement.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-01-21 20:52:28 +08:00
75e1981e13 Remove use of eval() from recognizer.py (#4480)
`eval(op_type)` -> `getattr(operators, op_type)`

### What problem does this PR solve?

Using `eval()` can lead to code injections and is entirely unnecessary
here.
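
The `eval(op_type)` to `getattr(operators, op_type)` change can be illustrated with a small self-contained sketch. The `operators` object below is a stand-in namespace with a made-up `Double` operator, and `resolve_operator` is a hypothetical helper.

```python
import types

# Stand-in for the real `operators` module; the operator below is made up.
operators = types.SimpleNamespace(Double=lambda x: 2 * x)


def resolve_operator(op_type):
    """Resolve an operator by name via attribute lookup, not eval()."""
    try:
        return getattr(operators, op_type)
    except AttributeError:
        raise ValueError(f"unknown operator: {op_type!r}") from None
```

With attribute lookup, a name such as `"__import__('os')"` is treated as a missing attribute and rejected rather than evaluated as code.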

### Type of change

- [x] Other (please describe):

Best practice code improvement, preventing the possibility of code
injection.
2025-01-20 09:52:47 +08:00
4f9f9405b8 Remove use of eval() from ocr.py (#4481)
`eval(op_name)` -> `getattr(operators, op_name)`

### What problem does this PR solve?

Using `eval()` can lead to code injections and is entirely unnecessary
here.

### Type of change

- [x] Other (please describe):

Best practice code improvement, preventing the possibility of code
injection.
2025-01-20 09:52:30 +08:00
c852a6dfbf Accelerate titles' embeddings. (#4492)
### What problem does this PR solve?


### Type of change

- [x] Performance Improvement
2025-01-15 15:20:29 +08:00
b7ce4e7e62 fix:t_recognizer TypeError: 'super' object is not callable (#4404)
### What problem does this PR solve?

[Bug]: layout recognizer failed for wrong boxes class type #4230
(https://github.com/infiniflow/ragflow/issues/4230)

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: youzhiqiang <zhiqiang.you@aminer.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-01-08 10:59:35 +08:00
2e40c2a6f6 Fix t_recognizer issue. (#4387)
### What problem does this PR solve?

#4230

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-01-07 13:17:46 +08:00
983ec0666c Fix param error. (#4355)
### What problem does this PR solve?

#4230

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-01-06 13:54:17 +08:00
59a78408be Fix t_recognizer.py after model updating. (#4330)
### What problem does this PR solve?

#4230

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-01-02 17:00:11 +08:00
2cbe064080 Add Llama3.3 (#4174)
### What problem does this PR solve?

#4168

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-12-23 11:18:01 +08:00
ce1e855328 Upgrades Document Layout Analysis model. (#4054)
### What problem does this PR solve?

#4052

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-12-17 11:27:19 +08:00
1254ecf445 Added static check at PR CI (#3921)
### What problem does this PR solve?

Added static check at PR CI

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2024-12-08 21:23:51 +08:00
0d68a6cd1b Fix errors detected by Ruff (#3918)
### What problem does this PR solve?

Fix errors detected by Ruff

### Type of change

- [x] Refactoring
2024-12-08 14:21:12 +08:00
bc701d7b4c Edit chunk shall update instead of insert it (#3709)
### What problem does this PR solve?

Editing a chunk should update it instead of inserting a new one. Close #3679

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-28 13:00:38 +08:00
2249d5d413 Always open text file for write with UTF-8 (#3688)
### What problem does this PR solve?

Always open text files for writing with UTF-8 encoding. Close #932
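
As a minimal illustration of the practice, the sketch below passes `encoding="utf-8"` explicitly, so the bytes written do not depend on the platform's default encoding. The helper name `write_text` and the sample file name are made up for the example.

```python
import os
import tempfile


def write_text(path, text):
    """Write text with an explicit encoding instead of the platform default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


# Write a non-ASCII string and read the raw bytes back.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
write_text(path, "《内科学》")
with open(path, "rb") as f:
    data = f.read()
```

Here `data` holds the UTF-8 encoding of the string regardless of the operating system's locale settings.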

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-27 16:24:16 +08:00
30f6421760 Use consistent log file names, introduced initLogger (#3403)
### What problem does this PR solve?

Use consistent log file names, introduced initLogger

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2024-11-14 17:13:48 +08:00
a2a5631da4 Rework logging (#3358)
Unified all log files into one.

### What problem does this PR solve?

Unified all log files into one.

### Type of change

- [x] Refactoring
2024-11-12 17:35:13 +08:00