ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-02-02 08:35:08 +08:00

Author	SHA1	Message	Date
aidan	33a189f620	Feat: add TCADP Parser (#10775 ) ### What problem does this PR solve? This PR adds a new TCADP (Tencent Cloud Advanced Document Processing) parser to RAGFlow, enabling users to leverage Tencent Cloud's document parsing capabilities for more accurate and structured document processing. The implementation includes: New TCADP Parser: A complete implementation of Tencent Cloud's document parsing API without SDK dependency Configuration Support: Added configuration options in service_conf.yaml for Tencent Cloud API credentials Frontend Integration: Updated UI components to support the new TCADP parser option Error Handling: Comprehensive error handling and retry mechanisms for API calls Result Processing: Support for both SSE streaming and JSON response formats from Tencent Cloud API ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-10-27 15:14:58 +08:00
Stephen Hu	56def59c2b	Fix:Error retrieving DOCX image (docx.image.exceptions.UnrecognizedImageError) (#10794 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/10776 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-10-27 13:23:16 +08:00
Kevin Hu	3bd0b99495	Fix: gemini cv model chat issue. (#10799 ) ### What problem does this PR solve? #10787 #10781 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-27 11:43:56 +08:00
Stephen Hu	50e93d1528	Fix: Opendal miss tenant id (#10774 ) ### What problem does this PR solve? as https://github.com/infiniflow/ragflow/pull/10712 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-27 10:28:08 +08:00
Billy Bao	501b7d4d01	Fix: prio synonym match than wordnet for english (#10762 ) ### What problem does this PR solve? Fix: prio synonym match than wordnet for english ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-27 09:32:55 +08:00
Stephen Hu	1d57801c0c	Fix:ERROR 20 Method rag.nlp.search.Dealer.search() parameter highlight="None" violates type hint bool \| list, as <class "builtins.NoneType"> "None" not list or bool. (#10743 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/10733 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-27 09:29:39 +08:00
Zhichang Yu	73144e278b	Don't release full image (#10654 ) ### What problem does this PR solve? Introduced gpu profile in .env Added Dockerfile_tei fix datrie Removed LIGHTEN flag ### Type of change - [x] Documentation Update - [x] Refactoring	2025-10-23 23:02:27 +08:00
buua436	0ff2042fc1	Feat: add Docling parser (#10759 ) ### What problem does this PR solve? issue: #3945 change: add Docling parser ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-23 19:44:25 +08:00
Kevin Hu	ea73f13ebf	Fix: infinity rerank error. (#10760 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-23 17:38:54 +08:00
Kevin Hu	5fb5a51b2e	Fix: create KB initial embedding. (#10751 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-23 16:17:43 +08:00
Kevin Hu	f24d464a53	Fix: video file suffix (#10740 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-23 11:13:09 +08:00
Yongteng Lei	f7112acd97	Feat: pipeline supports MinerU PDF parser (#10736 ) ### What problem does this PR solve? Pipeline supports MinerU PDF parser. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-23 09:24:31 +08:00
Kevin Hu	de4f75dcd8	Fix: add video parser (#10735 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-23 09:24:16 +08:00
Stephen Hu	b30f0be858	Refactor: How LiteLLMBase Calculate total count (#10532 ) ### What problem does this PR solve? How LiteLLMBase Calculate total count ### Type of change - [x] Refactoring	2025-10-22 12:25:31 +08:00
Billy Bao	a82e9b3d91	Fix: can't upload image in ollama model #10447 (#10717 ) ### What problem does this PR solve? Fix: can't upload image in ollama model #10447 ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue) ### Change all `image=[]` to `image = None` Changing `image=[]` to `images=None` avoids Python’s mutable default parameter issue. If you keep `images=[]`, all calls share the same list, so modifying it (e.g., images.append()) will affect later calls. Using images=None and creating a new list inside the function ensures each call is independent. This change does not affect current behavior; it simply makes the code safer and more predictable.	2025-10-22 12:24:12 +08:00
Stephen Hu	307cdc62ea	fix:RAGFlowOSS.put() got an unexpected keyword argument 'tenant_id' (#10712 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/10700 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-22 09:30:41 +08:00
Yongteng Lei	2d491188b8	Refa: improve flow of GraphRAG and RAPTOR (#10709 ) ### What problem does this PR solve? Improve flow of GraphRAG and RAPTOR. ### Type of change - [x] Refactoring	2025-10-22 09:29:20 +08:00
buua436	41fade3fe6	Fix:wrong param in manual chunk (#10710 ) ### What problem does this PR solve? change: wrong param in manual chunk ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-21 20:10:54 +08:00
Yongteng Lei	cd77425b87	Fix: potential negative max_tokens in RAPTOR (#10701 ) ### What problem does this PR solve? Fix potential negative max_tokens in RAPTOR. #10235. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-21 15:49:51 +08:00
Billy Bao	863c3e3d9c	Fix: tree merge (#10691 ) ### What problem does this PR solve? Fix: Fix tree merge, solved #10636 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-21 13:02:01 +08:00
Jin Hai	deb81810e9	Update message printout when start ingestion server (#10677 ) ### What problem does this PR solve? [ASCII-art startup banner] ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-10-21 09:38:20 +08:00
buua436	6ab96287c9	Feat:Vision Model Image Enhancement in Manual/Paper/Book/One chunker (#10640 ) ### What problem does this PR solve? issue: [#7472](https://github.com/infiniflow/ragflow/issues/7472) change: Vision Model Image Enhancement in Manual chunker ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-21 09:36:27 +08:00
Yongteng Lei	aaa4776657	Feat: Qwen-VL series supports video parsing (#10676 ) ### What problem does this PR solve? Qwen-VL series supports video parsing. #10617. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-21 09:36:13 +08:00
Yongteng Lei	5b2e5dd334	Feat: Gemini supports video parsing (#10671 ) ### What problem does this PR solve? Gemini supports video parsing. ![img_v3_02r8_adbd5adc-d665-4756-9a00-3ae0f12224fg](https://github.com/user-attachments/assets/30d8d296-c336-4b55-9823-803979e705ca) ![img_v3_02r8_ab60c046-1727-4029-ad2e-66097fd3ccbg](https://github.com/user-attachments/assets/441b1487-a970-427e-98b6-6e1e002f2bad) Close: #10617 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-20 16:49:47 +08:00
Billy Bao	8ee0b6ea54	File: Now parsing support all types of embedded documents, solved #10059 (#10635 ) ### What problem does this PR solve? File: Now parsing support all types of embedded documents, solved #10059 Fix: Incomplete words in chat #10530 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-17 18:46:47 +08:00
buua436	b15643bd80	Feat:VolcEngine Model type add IMAGE2TEXT (#10629 ) ### What problem does this PR solve? issue: [#9004](https://github.com/infiniflow/ragflow/issues/9004) change: VolcEngine Model type add IMAGE2TEXT ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-17 11:43:22 +08:00
Yongteng Lei	387baf858f	Feat: add MinerU parser (#10621 ) ### What problem does this PR solve? Add MinerU parser. #3945, #8092. Set `MINERU_EXECUTABLE` to the MinerU executable path, defaults to `mineru`. Set `MINERU_DELETE_OUTPUT=0` to preserve MinerU's output, default is 1, which deletes temporary output. Set `MINERU_OUTPUT_DIR` to choose the MinerU output directory (uses the temporary directory if unset). ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-17 09:55:39 +08:00
Kevin Hu	43ea312144	Fix: search highlight. (#10616 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-16 18:45:43 +08:00
Liu An	8af769de41	Fix: add toc_kwd field and update page_num_int type (#10596 ) ### What problem does this PR solve? - Added new field 'toc_kwd' to infinity_mapping.json for table of contents keyword support - Changed page_num_int from integer to array type in task_executor.py to handle multiple page numbers ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-16 12:47:24 +08:00
buua436	4e86ee4ff9	Feat: Support Specifying OpenRouter Model Provider (#10550 ) ### What problem does this PR solve? issue: [#5787](https://github.com/infiniflow/ragflow/issues/5787) change: Support Specifying OpenRouter Model Provider ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-16 09:39:59 +08:00
Yongteng Lei	86b254d214	Improve file management (#10577 ) ### What problem does this PR solve? Improve file management. #10287. Passed tests: 1. Create folder `A` and `B`. 2. Upload a file inside `A`, called `file`. 3. Create a KB, called `K`. 3. Link `file` to `K`. 4. Parse `file` inside of `K`. (OK) 5. Move `file` from `A` to `B`. 6. Parse `file` inside of `K`. (OK) 7. Move `file` from `B` to `A`. 8. Parse `file` inside of `K`. (OK) 9. Move entire folder `A` into `B`. (B -> A -> file) 10. Parse `file` inside of `K`. (OK) 11. Delete folder `B`. 12. All clear. (There is no document inside of `K`) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-16 09:38:25 +08:00
Zhichang Yu	e48bec1cbf	Don't rerank for infinity (#10579 ) ### What problem does this PR solve? Don't need rerank for infinity since Infinity normalizes each way score before fusion. ### Type of change - [x] Refactoring	2025-10-15 20:15:49 +08:00
Günter Lukas	5037a28e4d	Fix problem with Google Cloud models with reasoning (like gemini) - Additional fix to issue #10474 (#10502 ) ### What problem does this PR solve? Issue #10474 - Update to PR #10477 ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue)	2025-10-15 14:54:20 +08:00
Kevin Hu	16b5feadb7	Fix: canvas list with team. (#10549 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-14 19:38:54 +08:00
Kevin Hu	f92a45dcc4	Feat: let toc run asynchronously... (#10513 ) ### What problem does this PR solve? #10436 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-14 14:14:52 +08:00
Yongteng Lei	9e73f799b2	Feat: add Zhipu GLM-ASR model (#10529 ) ### What problem does this PR solve? Add Zhipu GLM-ASR model ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-14 09:32:45 +08:00
buua436	21a62130c8	Fix: empty references in agent conversation (#10528 ) ### What problem does this PR solve? issue: #10495 change: fix empty references in agent conversation ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-14 09:32:13 +08:00
Yongteng Lei	65c3f0406c	Fix: maintain backward compatibility for KB tasks (#10508 ) ### What problem does this PR solve? Maintain backward compatibility for KB tasks ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-13 11:53:48 +08:00
Lynn	7fb8b30cc2	fix: decode before format to json (#10506 ) ### What problem does this PR solve? Decode bytes before format to json. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-13 11:11:06 +08:00
Kevin Hu	2828e321bc	Fix: remove lang for audio. (#10496 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-11 19:38:07 +08:00
Yongteng Lei	5200711441	Feat: add support for multi-column PDF parsing (#10475 ) ### What problem does this PR solve? Add support for multi-columns PDF parsing. #9878, #9919. Two-column sample: <img width="1885" height="1020" alt="image" src="https://github.com/user-attachments/assets/0270c028-2db8-4ca6-a4b7-cd5830882d28" /> Three-column sample: <img width="1881" height="992" alt="image" src="https://github.com/user-attachments/assets/9ee88844-d5b1-4927-9e4e-3bd810d6e03a" /> Single-column sample: <img width="1883" height="1042" alt="image" src="https://github.com/user-attachments/assets/e93d3d18-43c3-4067-b5fa-e454ed0ab093" /> ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2025-10-11 18:46:09 +08:00
Kevin Hu	7d2f65671f	Feat: debugging toc part. (#10486 ) ### What problem does this PR solve? #10436 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-11 18:45:21 +08:00
Günter Lukas	fee757eb41	Fix: Disable reasoning on Gemini 2.5 Flash by default (#10477 ) ### What problem does this PR solve? Gemini 2.5 Flash Models use reasoning by default. There is currently no way to disable this behaviour. This leads to very long response times (> 1min). The default behaviour should be, that reasoning is disabled and configurable issue #10474 ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue)	2025-10-11 10:22:51 +08:00
Billy Bao	534fa60b2a	Fix: Agent.reset() argument wrong #10463 & Unable to converse with agent through Python API. #10415 (#10472 ) ### What problem does this PR solve? Fix: Agent.reset() argument wrong #10463 & Unable to converse with agent through Python API. #10415 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-10 20:44:05 +08:00
Günter Lukas	0283e4098f	Fix #10408 (#10471 ) ### What problem does this PR solve? Google Cloud model does not work correctly with gemini-2.5 models Close #10408 ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-10-10 19:18:24 +08:00
Kevin Hu	0d8791936e	Feat: TOC retrieval (#10456 ) ### What problem does this PR solve? #10436 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-10 17:07:55 +08:00
buua436	5d167cd772	feat: support qwq reasoning models with non-stream output (#10468 ) ### What problem does this PR solve? issue: [#6193](https://github.com/infiniflow/ragflow/issues/6193) change: support qwq reasoning models with non-stream output ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-10 16:38:04 +08:00
Billy Bao	c802a6ffdd	Feat: Add prompts for toc relevance check according to #10436 (#10457 ) ### What problem does this PR solve? Feat: Add prompts for toc relevance check according to #10436 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-10 11:44:46 +08:00
Stephen Hu	6ab4c1a6e9	Refactor: improve how NvidiaCV calculate res total token counts (#10455 ) ### What problem does this PR solve? improve how NvidiaCV calculate res total token counts ### Type of change - [x] Refactoring	2025-10-10 11:03:40 +08:00
Yongteng Lei	8aabc2807c	Feat: Pipeline Docx file supports Markdown output (#10439 ) ### What problem does this PR solve? Pipeline Docx file supports Markdown output. <img width="1242" height="755" alt="image" src="https://github.com/user-attachments/assets/63cca75b-20b9-4a90-a01c-c0c2fccf1f2a" /> <img width="1227" height="717" alt="image" src="https://github.com/user-attachments/assets/0dcb94b2-7ba0-48d5-9231-dc6e5c4b4192" /> ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-10 09:39:15 +08:00


1016 Commits
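
Note on a82e9b3d91 (#10717): the commit message describes Python's mutable default argument pitfall. The sketch below illustrates that behavior in isolation. The function names `collect_shared` and `collect_safe` are hypothetical and are not taken from the RAGFlow code base; only the `images=[]` versus `images=None` pattern comes from the commit message.

```python
def collect_shared(item, images=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `images` appends to the same object.
    images.append(item)
    return images


def collect_safe(item, images=None):
    # A fresh list is created on each call when the caller passes nothing.
    if images is None:
        images = []
    images.append(item)
    return images


shared_first = collect_shared("a.png")
shared_second = collect_shared("b.png")  # same list object as shared_first

safe_first = collect_safe("a.png")
safe_second = collect_safe("b.png")      # independent list
```

With the `None` default, callers that pass their own list see no change in behavior, which matches the commit's statement that the fix does not affect current results.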