### What problem does this PR solve?
Fixed invalid save() arguments for slide thumbnails.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix context loss caused by separating markdown tables from original
text. #6871, #8804.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
docx parse error.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Some docx parse with naive cause error. `block.style.name` in Function
`__get_nearest_title` will be None in some case.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: wenxuan.zhang <wenxuan.zhang@chinacreator.com>
### What problem does this PR solve?
This PR addresses an issue in the presentation parser where the
`layout_recognize` configuration was incorrectly retrieved from
`kwargs.get("layout_recognize", "DeepDOC")`. Instead, it should be
sourced from the `parser_config` parameter, specifically
`parser_config.get("layout_recognize", "DeepDOC")`.
This mismatch could cause the parser to default to the "DeepDOC" layout
recognizer, ignoring any alternative recognition method specified in the
parser configuration. As a result, PDF document parsing might use an
incorrect recognition engine.
The fix ensures the presentation parser consistently uses the
`layout_recognize` setting from `parser_config`, aligning with the
configuration access patterns used elsewhere in the codebase.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Using the QA chunking method with a large PDF (e.g., 300+ pages) may
lead to OOM in the ragflow-worker module.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR will fix the #8271 by extending int type to float type when
there is any value out of long type range in a column.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR aims to slove #8120 which request a better error display of
duplicate column names.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix unnecessary truncation in markdown parser. So that markdown can work
perfectly like
[this](https://github.com/infiniflow/ragflow/issues/7824#issuecomment-2921312576)
in #7824, supporting multiple special delimiters.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Since `import markdown.markdown` has been changed to `import markdown`
in `rag/app/naive.py`, previous code for converting markdown tables
would call a markdown module instead of a callable function. This cause
error.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
The parameter positions were incorrect and have been corrected to use
keyword argument passing
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/6984
1. Markdown parser supports get pictures
2. For Native, when handling Markdown, it will handle images
3. improve merge and
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
… bytes-like object
### What problem does this PR solve?
fix bug #6990 internal server error ehile chunking:expected string or
bytes-like object
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: unknown <taoshi.ln@chinatelecom.cn>
### What problem does this PR solve?
Sometimes a slide may trigger a Proxy error (ArgumentException:
Parameter is not valid) due to issues in the original file, and this
error message can be confusing for users.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [x] Other (please describe):
### What problem does this PR solve?
Adds hierarchical title path tracking for tables in DOCX documents to
improve context association. Previously, extracted tables lacked
positional context within document structure.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add fallback for PDF figure parser
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add VLM-boosted PDF parser if VLM is set.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add vision LLM PDF parser
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Optimize OCR garbage identification to reduce unnecessary filtering.
#5713
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add CSV file parsing support #4552, #5849, #5870
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix some bugs in text2sql.(#4279)(#4281)
### What problem does this PR solve?
- The incorrect results in parsing CSV files of the QA knowledge base in
the text2sql scenario. Process CSV files using the csv library. Decouple
CSV parsing from TXT parsing
- Most llm return results in markdown format ```sql query ```, Fix
execution error caused by LLM output SQLmarkdown format.### Type of
change
- [x] Bug Fix (non-breaking change which fixes an issue)