ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-01-26 13:16:34 +08:00

Author	SHA1	Message	Date
apps-lycusinc	678392c040	feat(deepdoc): add configurable ONNX thread counts and GPU memory shrinkage (#12777 ) ### What problem does this PR solve? This PR addresses critical memory and CPU resource management issues in high-concurrency environments (multi-worker setups): GPU Memory Exhaustion (OOM): Currently, onnxruntime-gpu uses an aggressive memory arena that does not effectively release VRAM back to the system after a task completes. In multi-process worker setups ($WS > 4), this leads to BFCArena allocation failures and OOM errors as workers "hoard" VRAM even when idle. This PR introduces an optional GPU Memory Arena Shrinkage toggle to mitigate this issue. CPU Oversubscription: ONNX intra_op and inter_op thread counts are currently hardcoded to 2. When running many workers, this causes significant CPU context-switching overhead and degrades performance. This PR makes these values configurable to match the host's actual CPU core density. Multi-GPU Support: The memory management logic has been improved to dynamically target the correct device_id, ensuring stability on systems with multiple GPUs. Transparency: Added detailed initialization logs to help administrators verify and troubleshoot their ONNX session configurations. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: shakeel <shakeel@lollylaw.com>	2026-01-23 11:36:28 +08:00
Jin Hai	6546f86b4e	Fix errors (#11795 ) ### What problem does this PR solve? - typos - IDE warnings ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-12-08 09:42:10 +08:00
Kevin Hu	ba71160b14	Refa: rm useless code. (#11238 ) ### Type of change - [x] Refactoring	2025-11-13 09:59:55 +08:00
Jin Hai	f98b24c9bf	Move api.settings to common.settings (#11036 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-06 09:36:38 +08:00
Jin Hai	78631a3fd3	Move some functions out of 'api/utils/common.py' (#10948 ) ### What problem does this PR solve? as title. ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-03 12:34:47 +08:00
Jin Hai	44f2d6f5da	Move 'get_project_base_directory' to common directory (#10940 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-02 21:05:28 +08:00
Zhichang Yu	73144e278b	Don't release full image (#10654 ) ### What problem does this PR solve? Introduced gpu profile in .env; added Dockerfile_tei; fixed datrie; removed LIGHTEN flag. ### Type of change - [x] Documentation Update - [x] Refactoring	2025-10-23 23:02:27 +08:00
XIANG LI	f631073ac2	Fix OCR GPU provider mem limit handling (#10407 ) ### What problem does this PR solve? - Running DeepDoc OCR on large PDFs inside the GPU docker-compose setup would intermittently fail with [ONNXRuntimeError] ... p2o.Clip.6 ... Available memory of 0 is smaller than requested bytes ... - Root cause: load_model() in deepdoc/vision/ocr.py treated device_id=None as-is. torch.cuda.device_count() > device_id then raised a TypeError, the helper returned False, and ONNXRuntime quietly fell back to CPUExecutionProvider with the hard-coded 512 MB limit, which then triggered the allocator failure. - Environment where this reproduces: Windows 11, AMD 5900x, 64 GB RAM, RTX 3090 (24 GB), docker-compose-gpu.yml from upstream, default DeepDoc + GraphRAG parser settings, ingesting heavy PDF such as 《内科学》（第10版）.pdf (~180 MB). Fixes: - Normalize device_id to 0 when it is None before calling any CUDA APIs, so the GPU path is considered available. - Allow configuring the CUDA provider’s memory cap via OCR_GPU_MEM_LIMIT_MB (default 2048 MB) and expose OCR_ARENA_EXTEND_STRATEGY; the calculated byte limit is logged to confirm the effective settings. After the change, ragflow_server.log shows for example load_model ... uses GPU (device 0, gpu_mem_limit=21474836480, arena_strategy=kNextPowerOfTwo) and the same document finishes OCR without allocator errors. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-10 11:03:12 +08:00
Jin Hai	b0b866c8fd	Refactor: move some functions out of api/utils/__init__.py (#10216 ) ### What problem does this PR solve? Refactor import modules. ### Type of change - [x] Refactoring --------- Signed-off-by: jinhai <haijin.chn@gmail.com> Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-09-25 18:04:49 +08:00
Lynn	341a7b1473	Fix: check not empty before delete (#10099 ) ### What problem does this PR solve? Check for non-empty before deleting the session. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-09-15 17:49:52 +08:00
Lynn	2a88ce6be1	Fix: terminate onnx inference session manually (#10076 ) ### What problem does this PR solve? terminate onnx inference session and release memory manually. Issue #5050 Issue #9992 Issue #8805 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-09-12 17:18:26 +08:00
cwr31	e6d36f3a3a	Improve image rotation logic for text recognition (#8167 ) ### What problem does this PR solve? Enhanced the image rotation handling by evaluating the original orientation, clockwise 90°, and counter-clockwise 90° rotations. The image with the highest text recognition score is now selected, improving accuracy for text detection in images with aspect ratios >= 1.5. #8166 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: wenrui.cao <wenrui.cao@univers.com>	2025-06-11 09:20:30 +08:00
giiiiiithub	6ba5a4348a	Set PARALLEL_DEVICES default value to 0 (#7935 ) ### What problem does this PR solve? It would fail if PARALLEL_DEVICES = None in the OCR class, because it passes 0 to the TextDetector and TextRecognizer init methods. It would be simpler to set 0 as the default value for PARALLEL_DEVICES. ### Type of change - [x] Refactoring	2025-05-29 13:32:16 +08:00
Kevin Hu	3a99c2b5f4	Refa: PARALLEL_DEVICES is a static parameter. (#6168 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2025-03-17 16:49:54 +08:00
Debug Doctor	3e19044dee	Feat: add multi-GPU and parallel processing support for OCR (#5972 ) ### What problem does this PR solve? Add multi-GPU and parallel processing support for OCR. ### Type of change - [x] New Feature (non-breaking change which adds functionality) @yuzhichang I've tried to resolve the comments in #5697. OCR jobs can now be done on both CPU and GPU. (By the way, I've encountered a “Generate embedding error” issue #5954 that might be due to my outdated GPUs? idk.) Please review it and give me suggestions. GPU: ![gpu_ocr](https://github.com/user-attachments/assets/0ee2ecfb-a665-4e50-8bc7-15941b9cd80e) ![smi](https://github.com/user-attachments/assets/a2312f8c-cf24-443d-bf89-bec50503546d) CPU: ![cpu_ocr](https://github.com/user-attachments/assets/1ba6bb0b-94df-41ea-be79-790096da4bf1)	2025-03-17 11:58:40 +08:00
yihong	4326873af6	refactor: no need to inherit in python3 clean the code (#5659 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring Signed-off-by: yihong0618 <zouzou0208@gmail.com>	2025-03-05 18:03:53 +08:00
Zhichang Yu	db42d0e0ae	Optimize ocr (#5297 ) ### What problem does this PR solve? Introduced OCR.recognize_batch ### Type of change - [x] Performance Improvement	2025-02-24 16:21:55 +08:00
Zhichang Yu	0151d42156	Reuse loaded modules if possible (#5231 ) ### What problem does this PR solve? Reuse loaded modules if possible ### Type of change - [x] Refactoring	2025-02-21 17:21:01 +08:00
Zhichang Yu	3411d0a2ce	Added cuda_is_available (#4725 ) ### What problem does this PR solve? Added cuda_is_available ### Type of change - [x] Refactoring	2025-02-05 18:01:23 +08:00
Zhichang Yu	e1526846da	Fixed GPU detection on CPU only environment (#4711 ) ### What problem does this PR solve? Fixed GPU detection on CPU only environment. Close #4692 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-02-05 12:02:43 +08:00
Zhichang Yu	4230402fbb	deepdoc use GPU if possible (#4618 ) ### What problem does this PR solve? deepdoc use GPU if possible ### Type of change - [x] Refactoring	2025-01-24 09:48:02 +08:00
Jin Hai	3894de895b	Update comments (#4569 ) ### What problem does this PR solve? Add license statement. ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-01-21 20:52:28 +08:00
Mathias Panzenböck	4f9f9405b8	Remove use of eval() from ocr.py (#4481 ) `eval(op_name)` -> `getattr(operators, op_name)` ### What problem does this PR solve? Using `eval()` can lead to code injections and is entirely unnecessary here. ### Type of change - [x] Other (please describe): Best practice code improvement, preventing the possibility of code injection.	2025-01-20 09:52:30 +08:00
Zhichang Yu	1254ecf445	Added static check at PR CI (#3921 ) ### What problem does this PR solve? Added static check at PR CI ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2024-12-08 21:23:51 +08:00
Zhichang Yu	0d68a6cd1b	Fix errors detected by Ruff (#3918 ) ### What problem does this PR solve? Fix errors detected by Ruff ### Type of change - [x] Refactoring	2024-12-08 14:21:12 +08:00
Kevin Hu	99adeabc85	remove dependency (#1536 ) ### What problem does this PR solve? #702 ### Type of change - [x] Refactoring	2024-07-16 16:30:17 +08:00
KevinHuSh	453c29170f	make sure the models will not be loaded twice (#422 ) ### What problem does this PR solve? #381 ### Type of change - [x] Refactoring	2024-04-18 09:37:23 +08:00
KevinHuSh	a5384446e3	let's load model from local (#163 )	2024-03-28 16:10:47 +08:00
KevinHuSh	979b3a5b4b	support snapshot download from local (#153 ) * support snapshot download from local * let snapshot download from local	2024-03-27 09:53:42 +08:00
KevinHuSh	da21320b88	fix plainPdf bugs (#152 )	2024-03-26 15:11:07 +08:00
KevinHuSh	9da671b951	refine manual parser (#131 )	2024-03-19 12:26:04 +08:00
KevinHuSh	675a9f8d9a	add dockerfile for cuda environment. Refine table search strategy (#123 )	2024-03-14 19:45:29 +08:00
KevinHuSh	8f86ab9f7f	refine pdf parser, add time zone to userinfo (#112 )	2024-03-08 11:24:24 +08:00
KevinHuSh	7fd1eca582	init README of deepdoc, add picture processor. (#71 ) * init README of deepdoc, add picture processor. * add resume parsing	2024-02-23 18:28:12 +08:00
KevinHuSh	d32322c081	rename vision, add layout and tsr recognizer (#70 ) * rename vision, add layout and tsr recognizer * trivial fixing	2024-02-22 19:11:37 +08:00
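
Commit f631073ac2 (#10407) describes normalizing device_id to 0 when it is None and reading the CUDA memory cap from OCR_GPU_MEM_LIMIT_MB (default 2048 MB) together with OCR_ARENA_EXTEND_STRATEGY. The following is a minimal sketch of that idea only, not the code in deepdoc/vision/ocr.py. The function name build_cuda_provider_options and its env parameter are hypothetical, and the fallback strategy kNextPowerOfTwo is an assumption taken from the example log line in the commit message.

```python
import os


def build_cuda_provider_options(device_id=None, env=None):
    """Hypothetical helper: build options for ONNX Runtime's CUDA provider."""
    env = os.environ if env is None else env

    # Normalize a missing device id to 0 before it reaches any CUDA check,
    # so that comparisons such as `device_count > device_id` cannot fail on None.
    if device_id is None:
        device_id = 0

    # The limit is configured in megabytes; the provider expects bytes.
    limit_mb = int(env.get("OCR_GPU_MEM_LIMIT_MB", "2048"))

    # Assumed fallback value, matching the strategy shown in the commit's log example.
    strategy = env.get("OCR_ARENA_EXTEND_STRATEGY", "kNextPowerOfTwo")

    return {
        "device_id": device_id,
        "gpu_mem_limit": limit_mb * 1024 * 1024,
        "arena_extend_strategy": strategy,
    }


if __name__ == "__main__":
    print(build_cuda_provider_options(None, {"OCR_GPU_MEM_LIMIT_MB": "20480"}))
```

A dictionary like this would typically be passed to onnxruntime.InferenceSession as providers=[("CUDAExecutionProvider", options)]. With OCR_GPU_MEM_LIMIT_MB=20480 the computed byte limit equals the gpu_mem_limit=21474836480 value quoted in the commit message.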

35 Commits
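
Commit 4f9f9405b8 (#4481) replaces `eval(op_name)` with `getattr(operators, op_name)`. The sketch below illustrates why that lookup is safer, under stated assumptions: the Resize and Normalize classes, the operators namespace, and the create_operators function are hypothetical stand-ins, not the actual contents of ocr.py.

```python
import types


class Resize:
    """Hypothetical operator used only for illustration."""

    def __init__(self, size=0):
        self.size = size


class Normalize:
    """Hypothetical operator used only for illustration."""

    def __init__(self, scale=1.0):
        self.scale = scale


# Stand-in for the module that holds the operator classes.
operators = types.SimpleNamespace(Resize=Resize, Normalize=Normalize)


def create_operators(op_param_list):
    """Instantiate operators from a list of single-key {name: params} dicts."""
    ops = []
    for op_info in op_param_list:
        # Each entry is expected to hold exactly one operator name.
        (op_name, params), = op_info.items()
        # getattr only resolves attribute names on `operators`; unlike eval(),
        # it never executes the string as Python code.
        op_class = getattr(operators, op_name)
        ops.append(op_class(**(params or {})))
    return ops


if __name__ == "__main__":
    built = create_operators([{"Resize": {"size": 32}}, {"Normalize": None}])
    print([type(op).__name__ for op in built])
```

A name that is not an attribute of the namespace, including a string containing arbitrary code, raises AttributeError instead of being evaluated.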