ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-01-23 03:26:53 +08:00

Author	SHA1	Message	Date
aidan	33a189f620	Feat: add TCADP Parser (#10775 ) ### What problem does this PR solve? This PR adds a new TCADP (Tencent Cloud Advanced Document Processing) parser to RAGFlow, enabling users to leverage Tencent Cloud's document parsing capabilities for more accurate and structured document processing. The implementation includes: New TCADP Parser: A complete implementation of Tencent Cloud's document parsing API without SDK dependency Configuration Support: Added configuration options in service_conf.yaml for Tencent Cloud API credentials Frontend Integration: Updated UI components to support the new TCADP parser option Error Handling: Comprehensive error handling and retry mechanisms for API calls Result Processing: Support for both SSE streaming and JSON response formats from Tencent Cloud API ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-10-27 15:14:58 +08:00
Stephen Hu	56def59c2b	Fix:Error retrieving DOCX image (docx.image.exceptions.UnrecognizedImageError) (#10794 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/10776 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-10-27 13:23:16 +08:00
buua436	0ff2042fc1	Feat: add Docling parser (#10759 ) ### What problem does this PR solve? issue: #3945 change: add Docling parser ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-23 19:44:25 +08:00
Kevin Hu	f24d464a53	Fix: video file suffix (#10740 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-23 11:13:09 +08:00
buua436	41fade3fe6	Fix:wrong param in manual chunk (#10710 ) ### What problem does this PR solve? change: wrong param in manual chunk ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-21 20:10:54 +08:00
buua436	6ab96287c9	Feat:Vision Model Image Enhancement in Manual/Paper/Book/One chunker (#10640 ) ### What problem does this PR solve? issue: [#7472](https://github.com/infiniflow/ragflow/issues/7472) change: Vision Model Image Enhancement in Manual chunker ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-21 09:36:27 +08:00
Yongteng Lei	5b2e5dd334	Feat: Gemini supports video parsing (#10671 ) ### What problem does this PR solve? Gemini supports video parsing. ![img_v3_02r8_adbd5adc-d665-4756-9a00-3ae0f12224fg](https://github.com/user-attachments/assets/30d8d296-c336-4b55-9823-803979e705ca) ![img_v3_02r8_ab60c046-1727-4029-ad2e-66097fd3ccbg](https://github.com/user-attachments/assets/441b1487-a970-427e-98b6-6e1e002f2bad) Close: #10617 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-20 16:49:47 +08:00
Billy Bao	8ee0b6ea54	File: Now parsing support all types of embedded documents, solved #10059 (#10635 ) ### What problem does this PR solve? File: Now parsing support all types of embedded documents, solved #10059 Fix: Incomplete words in chat #10530 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-17 18:46:47 +08:00
Yongteng Lei	387baf858f	Feat: add MinerU parser (#10621 ) ### What problem does this PR solve? Add MinerU parser. #3945, #8092. Set `MINERU_EXECUTABLE` to the MinerU executable path, defaults to `mineru`. Set `MINERU_DELETE_OUTPUT=0` to preserve MinerU's output, default is 1, which deletes temporary output. Set `MINERU_OUTPUT_DIR` to choose the MinerU output directory (uses the temporary directory if unset). ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-17 09:55:39 +08:00
Yongteng Lei	5200711441	Feat: add support for multi-column PDF parsing (#10475 ) ### What problem does this PR solve? Add support for multi-columns PDF parsing. #9878, #9919. Two-column sample: <img width="1885" height="1020" alt="image" src="https://github.com/user-attachments/assets/0270c028-2db8-4ca6-a4b7-cd5830882d28" /> Three-column sample: <img width="1881" height="992" alt="image" src="https://github.com/user-attachments/assets/9ee88844-d5b1-4927-9e4e-3bd810d6e03a" /> Single-column sample: <img width="1883" height="1042" alt="image" src="https://github.com/user-attachments/assets/e93d3d18-43c3-4067-b5fa-e454ed0ab093" /> ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2025-10-11 18:46:09 +08:00
Yongteng Lei	8aabc2807c	Feat: Pipeline Docx file supports Markdown output (#10439 ) ### What problem does this PR solve? Pipeline Docx file supports Markdown output. <img width="1242" height="755" alt="image" src="https://github.com/user-attachments/assets/63cca75b-20b9-4a90-a01c-c0c2fccf1f2a" /> <img width="1227" height="717" alt="image" src="https://github.com/user-attachments/assets/0dcb94b2-7ba0-48d5-9231-dc6e5c4b4192" /> ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-10-10 09:39:15 +08:00
Jin Hai	d931c33ced	Fix typos: retrievaler -> retriever (#10372 ) ### What problem does this PR solve? Fix typos ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-10-10 09:17:36 +08:00
Kevin Hu	cbf04ee470	Feat: Use data pipeline to visualize the parsing configuration of the knowledge base (#10423 ) ### What problem does this PR solve? #9869 ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: jinhai <haijin.chn@gmail.com> Signed-off-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: chanx <1243304602@qq.com> Co-authored-by: balibabu <cike8899@users.noreply.github.com> Co-authored-by: Lynn <lynn_inf@hotmail.com> Co-authored-by: 纷繁下的无奈 <zhileihuang@126.com> Co-authored-by: huangzl <huangzl@shinemo.com> Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com> Co-authored-by: Wilmer <33392318@qq.com> Co-authored-by: Adrian Weidig <adrianweidig@gmx.net> Co-authored-by: Zhichang Yu <yuzhichang@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Yongteng Lei <yongtengrey@outlook.com> Co-authored-by: Liu An <asiro@qq.com> Co-authored-by: buua436 <66937541+buua436@users.noreply.github.com> Co-authored-by: BadwomanCraZY <511528396@qq.com> Co-authored-by: cucusenok <31804608+cucusenok@users.noreply.github.com> Co-authored-by: Russell Valentine <russ@coldstonelabs.org> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Billy Bao <newyorkupperbay@gmail.com> Co-authored-by: Zhedong Cen <cenzhedong2@126.com> Co-authored-by: TensorNull <129579691+TensorNull@users.noreply.github.com> Co-authored-by: TensorNull <tensor.null@gmail.com> Co-authored-by: TeslaZY <TeslaZY@outlook.com> Co-authored-by: Ajay <160579663+aybanda@users.noreply.github.com> Co-authored-by: AB <aj@Ajays-MacBook-Air.local> Co-authored-by: 天海蒼灆 <huangaoqin@tecpie.com> Co-authored-by: He Wang <wanghechn@qq.com> Co-authored-by: Atsushi Hatakeyama <atu729@icloud.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: Mohamed Mathari <155896313+melmathari@users.noreply.github.com> Co-authored-by: Mohamed Mathari <nocodeventure@Mac-mini-van-Mohamed.fritz.box> Co-authored-by: Stephen Hu <stephenhu@seismic.com> Co-authored-by: Shaun Zhang <zhangwfjh@users.noreply.github.com> Co-authored-by: zhimeng123 <60221886+zhimeng123@users.noreply.github.com> Co-authored-by: mxc <mxc@example.com> Co-authored-by: Dominik Novotný <50611433+SgtMarmite@users.noreply.github.com> Co-authored-by: EVGENY M <168018528+rjohny55@users.noreply.github.com> Co-authored-by: mcoder6425 <mcoder64@gmail.com> Co-authored-by: lemsn <lemsn@msn.com> Co-authored-by: lemsn <lemsn@126.com> Co-authored-by: Adrian Gora <47756404+adagora@users.noreply.github.com> Co-authored-by: Womsxd <45663319+Womsxd@users.noreply.github.com> Co-authored-by: FatMii <39074672+FatMii@users.noreply.github.com>	2025-10-09 12:36:19 +08:00
Billy Bao	ca9f30e1a1	Add tree_merge for law parsers, significantly outperforming hierarchical_merge (#10202 ) ### What problem does this PR solve? Add tree_merge for law parsers, significantly outperforming hierarchical_merge, solved: #8637 1. Add tree_merge for law parsers, including build_tree and get_tree by DFS. 2. Add copyright statement for health_utils ### Type of change - [x] Documentation Update - [x] Performance Improvement	2025-09-22 16:33:21 +08:00
He Wang	902703d145	Fix: skip tag query if tag kbs are invalid (#10168 ) ### What problem does this PR solve? Skip `tag_query` step if `tag_kbs` are empty. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-09-19 19:12:18 +08:00
Billy Bao	ea0f1d47a5	Support image recognition for url links in Markdown file, fix log error in code_exec (#10139 ) ### What problem does this PR solve? Support image recognition with image links in markdown files, solved issue: #8755 Fixed log info error in code_exec, solved issue: #10064 ### Type of change (8755) - [x] New Feature (non-breaking change which adds functionality) ### Type of change (10064) - [x] Bug Fix (non-breaking change which fixes an issue)	2025-09-18 09:44:17 +08:00
Stephen Hu	179091b1a4	Fix: In ragflow/rag/app /naive.py, if there are multiple images in one line, the other images will be lost (#9968 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/9966 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-09-11 11:08:31 +08:00
Stephen Hu	0b456a18a3	Refactor: Improve the buffer close for vision_llm_chunk (#9845 ) ### What problem does this PR solve? Improve the buffer close for vision_llm_chunk ### Type of change - [x] Refactoring	2025-09-02 10:31:37 +08:00
pingguoCooler	cf0011be67	Feat: Upgrade html parser (#9675 ) ### What problem does this PR solve? parse more html content. ### Type of change - [x] Other (please describe):	2025-08-27 12:43:55 +08:00
Yongteng Lei	382458ace7	Feat: advanced markdown parsing (#9607 ) ### What problem does this PR solve? Using AST parsing to handle markdown more accurately, preventing components from being cut off by chunking. #9564 <img width="1746" height="993" alt="image" src="https://github.com/user-attachments/assets/4aaf4bf6-5714-4d48-a9cf-864f59633f7f" /> <img width="1739" height="982" alt="image" src="https://github.com/user-attachments/assets/dc00233f-7a55-434f-bbb7-74ce7f57a6cf" /> <img width="559" height="100" alt="image" src="https://github.com/user-attachments/assets/4a556b5b-d9c6-4544-a486-8ac342bd504e" /> ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-08-21 09:36:18 +08:00
Kevin Hu	312f1a0477	Fix: enlarge raptor timeout limits. (#9600 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-20 17:29:15 +08:00
Yongteng Lei	787e0c6786	Refa: OpenAI whisper-1 (#9552 ) ### What problem does this PR solve? Refactor OpenAI to enable audio parsing. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2025-08-19 16:41:18 +08:00
Yongteng Lei	eef43fa25c	Fix: unexpected truncated Excel files (#9500 ) ### What problem does this PR solve? Handle unexpected truncated Excel files. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-15 17:00:34 +08:00
Jay Xu	6d1078b538	fix 'KeyError: "There is no item named 'word/NULL' in the archive"' (#9455 ) ### What problem does this PR solve? Issue referring to: https://github.com/python-openxml/python-docx/issues/797 Fix referring to: https://github.com/python-openxml/python-docx/issues/1105#issuecomment-1298075246 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-14 12:14:03 +08:00
HaiyangP	79399f7f25	Support the case of one cell split by multiple columns. (#9225 ) ### What problem does this PR solve? Support the case of one cell split across multiple columns. The code remains compatible with the common single-cell case. This fixes #8606. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) An example file with one cell split across multiple columns: [test.xlsx](https://github.com/user-attachments/files/21578693/test.xlsx) The resulting chunk: <img width="236" height="57" alt="Screenshot 2025-06-17 16-04-07" src="https://github.com/user-attachments/assets/b0a499ac-349d-4c3d-8c6e-0931c8fc26de" />	2025-08-11 17:17:56 +08:00
Jay Xu	7f08ba47d7	Fix "no `tc` element at grid_offset" (#9375 ) ### What problem does this PR solve? fix "no `tc` element at grid_offset", just log warning and ignore. stacktrace: ``` Traceback (most recent call last): File "/ragflow/rag/svr/task_executor.py", line 620, in handle_task await do_handle_task(task) File "/ragflow/rag/svr/task_executor.py", line 553, in do_handle_task chunks = await build_chunks(task, progress_callback) File "/ragflow/rag/svr/task_executor.py", line 257, in build_chunks cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"], File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync return msg_from_thread.unwrap() File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap raise captured_error File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result return result.unwrap() File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap raise captured_error File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn ret = context.run(sync_fn, *args) File "/ragflow/rag/svr/task_executor.py", line 257, in <lambda> cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"], File "/ragflow/rag/app/naive.py", line 384, in chunk sections, tables = Docx()(filename, binary) File "/ragflow/rag/app/naive.py", line 230, in __call__ while i < len(r.cells): File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 438, in cells return tuple(_iter_row_cells()) File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 436, in _iter_row_cells yield from iter_tc_cells(tc) File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 424, in iter_tc_cells yield from iter_tc_cells(tc._tc_above) # pyright: ignore[reportPrivateUsage] File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 741, in _tc_above return self._tr_above.tc_at_grid_offset(self.grid_offset) File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 98, in tc_at_grid_offset raise ValueError(f"no `tc` element at grid_offset={grid_offset}") ValueError: no `tc` element at grid_offset=10 ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-11 17:13:10 +08:00
yzz	550e65bb22	Fix: PlainParser using fix in presentation (#9239 ) ### What problem does this PR solve? A tiny fix to how `deepdoc.pdf_parser.PlainParser` is used in `rag.app.presentation.chunk`, following the way this class is used elsewhere. The fix is small enough that a separate issue seems unnecessary. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-05 17:48:18 +08:00
Jay Xu	cae11201ef	fix "out of memory" if slide.get_thumbnail() to a huge image (#9211 ) ### What problem does this PR solve? fix "out of memory" if slide.get_thumbnail() to a huge image ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-04 16:08:24 +08:00
Kevin Hu	d9fe279dde	Feat: Redesign and refactor agent module (#9113 ) ### What problem does this PR solve? #9082 #6365 WARNING: it is not compatible with the older version of the `Agent` module, which means that an `Agent` from older versions will no longer work. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-30 19:41:09 +08:00
Yongteng Lei	39ef2ffba9	Feat: parsing supports jsonl or ldjson format (#9087 ) ### What problem does this PR solve? Supports jsonl or ldjson format. Feature request from [discussion](https://github.com/orgs/infiniflow/discussions/8774). ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-30 09:48:20 +08:00
Stephen Hu	92cfbcb382	Fix: when parse markdown support extract image at local (#8906 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8902 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-18 17:06:58 +08:00
Yongteng Lei	e9b14142a5	Fix: fixed invalid save() arguments for slide thumbnails (#8851 ) ### What problem does this PR solve? Fixed invalid save() arguments for slide thumbnails. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-15 17:19:45 +08:00
Yongteng Lei	51a8604dcb	Fix: fixed context loss caused by separating markdown tables from original text (#8844 ) ### What problem does this PR solve? Fix context loss caused by separating markdown tables from original text. #6871, #8804. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-15 13:03:01 +08:00
Stephen Hu	ce140f1393	Fix:Better Support Table Value Type (#8822 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8782 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-14 17:51:26 +08:00
Stephen Hu	2b7adbd2d1	Fix: Improve Memory Usage For Presentation (#8792 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8791 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-11 11:35:25 +08:00
wenxuan.zhang	f586dd0a96	Fix: docx parse error. (#8600 ) ### What problem does this PR solve? Parsing some docx files with the naive chunker raises an error, because `block.style.name` in the function `__get_nearest_title` can be None in some cases. ![image](https://github.com/user-attachments/assets/efbe6d1b-10c8-415e-b693-a86f73e1ffa6) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: wenxuan.zhang <wenxuan.zhang@chinacreator.com>	2025-07-01 17:38:11 +08:00
Tuan Le	6b1221d2f6	Fix parser_config access for layout_recognize in presentation.py (#8492 ) ### What problem does this PR solve? This PR addresses an issue in the presentation parser where the `layout_recognize` configuration was incorrectly retrieved from `kwargs.get("layout_recognize", "DeepDOC")`. Instead, it should be sourced from the `parser_config` parameter, specifically `parser_config.get("layout_recognize", "DeepDOC")`. This mismatch could cause the parser to default to the "DeepDOC" layout recognizer, ignoring any alternative recognition method specified in the parser configuration. As a result, PDF document parsing might use an incorrect recognition engine. The fix ensures the presentation parser consistently uses the `layout_recognize` setting from `parser_config`, aligning with the configuration access patterns used elsewhere in the codebase. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-26 11:54:43 +08:00
liuzhenghua	5256980ffb	Fix: Solve the OOM issue when passing large PDF files while using QA chunking method. (#8464 ) ### What problem does this PR solve? Using the QA chunking method with a large PDF (e.g., 300+ pages) may lead to OOM in the ragflow-worker module. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-25 10:25:45 +08:00
HaiyangP	d6a941ebf5	Fix the bug of long type value overflow (#8313 ) ### What problem does this PR solve? This PR fixes #8271 by widening a column from int type to float type when any value in that column falls outside the long type range. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-24 18:18:30 +08:00
Jin Hai	4a2ff633e0	Fix typo in code (#8327 ) ### What problem does this PR solve? Fix typo in code ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-06-18 09:41:09 +08:00
HaiyangP	baf32ee461	Display only the duplicate column names and corresponding original source. (#8138 ) ### What problem does this PR solve? This PR aims to solve #8120, which requests a better error display for duplicate column names. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-10 10:16:38 +08:00
Kevin Hu	24625e0695	Fix: presentation of PDF using vlm. (#8133 ) ### What problem does this PR solve? #8109 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-09 15:01:52 +08:00
Yongteng Lei	bd4678bca6	Fix: Unnecessary truncation in markdown parser (#7972 ) ### What problem does this PR solve? Fix unnecessary truncation in markdown parser. So that markdown can work perfectly like [this](https://github.com/infiniflow/ragflow/issues/7824#issuecomment-2921312576) in #7824, supporting multiple special delimiters. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-05-30 15:04:21 +08:00
Kevin Hu	bfe97d896d	Fix: docx get image exception. (#7636 ) ### What problem does this PR solve? Close #7631 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-05-14 12:24:48 +08:00
Kevin Hu	321a280031	Feat: add image preview to retrieval test. (#7610 ) ### What problem does this PR solve? #7608 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-05-13 14:30:36 +08:00
alkscr	baa108f5cc	Fix: markdown table conversion error (#7570 ) ### What problem does this PR solve? Since `import markdown.markdown` has been changed to `import markdown` in `rag/app/naive.py`, the previous code for converting markdown tables calls the `markdown` module instead of a callable function, which causes an error. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-05-12 17:16:55 +08:00
WhiteBear	5352bdf4da	Error storing tag in Redis (#7541 ) ### What problem does this PR solve? The parameter positions were incorrect and have been corrected to use keyword argument passing ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-05-09 10:17:09 +08:00
Stephen Hu	1a5608d0f8	Fix: Add title_tks for Pictures (#7365 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/7362 append title_tks ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-04-28 13:35:34 +08:00
Stephen Hu	1662c7eda3	Feat: Markdown add image (#7124 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/6984 1. The Markdown parser supports getting pictures 2. For Native, when handling Markdown, it will handle images 3. improve merge and ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-04-25 18:35:28 +08:00
QuintinTao	1b4016317e	fix bug chunking:expected string or bytes-like object (#7116 ) ### What problem does this PR solve? Fix bug #6990: internal server error while chunking: expected string or bytes-like object ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: unknown <taoshi.ln@chinatelecom.cn>	2025-04-18 14:42:36 +08:00
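
The MinerU parser commit above (387baf858f) describes three environment variables: `MINERU_EXECUTABLE`, `MINERU_DELETE_OUTPUT`, and `MINERU_OUTPUT_DIR`. The sketch below shows one way those documented defaults could be resolved. It is an illustration only, not RAGFlow's code: `MinerUSettings` and `load_mineru_settings` are hypothetical names, and using `tempfile.gettempdir()` for "the temporary directory" is an assumption.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MinerUSettings:
    """Hypothetical container for the three settings named in the commit."""
    executable: str      # path or command name of the MinerU executable
    delete_output: bool  # whether temporary output is deleted after parsing
    output_dir: str      # directory MinerU writes its output to


def load_mineru_settings(env: Optional[Mapping[str, str]] = None) -> MinerUSettings:
    """Resolve the settings using the defaults stated in the commit message."""
    env = os.environ if env is None else env
    # MINERU_EXECUTABLE defaults to `mineru`.
    executable = env.get("MINERU_EXECUTABLE", "mineru")
    # MINERU_DELETE_OUTPUT defaults to 1; setting it to 0 preserves the output.
    delete_output = env.get("MINERU_DELETE_OUTPUT", "1") != "0"
    # MINERU_OUTPUT_DIR falls back to a temporary directory when unset
    # (assumed here to be the system temp dir).
    output_dir = env.get("MINERU_OUTPUT_DIR") or tempfile.gettempdir()
    return MinerUSettings(executable, delete_output, output_dir)
```

Passing a plain dict instead of reading `os.environ` directly keeps the function easy to exercise, for example `load_mineru_settings({"MINERU_DELETE_OUTPUT": "0"})` to keep MinerU's output for inspection.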


205 Commits