### What problem does this PR solve?
Fix: opensearch retrieval error #10828
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Allow initializing Redis without a password.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- rename rmSpace to remove_redundant_spaces
- move clean_markdown_block to common module
- add unit tests for remove_redundant_spaces and clean_markdown_block
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: parsing excel with chartsheet #10815
Fix: Clamp begin to a minimum of 0 to prevent negative indexing #10804
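
The negative-indexing fix can be illustrated with a minimal sketch. The helper name `context_window` and its parameters are hypothetical and not taken from the RAGFlow code; the point is only that a negative slice start silently counts from the end of a Python sequence, so `begin` is clamped with `max(0, ...)`.

```python
def context_window(items, center, radius):
    """Return the items within `radius` positions of `center`.

    Hypothetical helper for illustration only; not RAGFlow's actual code.
    """
    # Without the clamp, center - radius could be negative, and a negative
    # slice start counts from the end of the sequence instead of index 0.
    begin = max(0, center - radius)
    end = min(len(items), center + radius + 1)
    return items[begin:end]


words = ["a", "b", "c", "d", "e"]
window = context_window(words, center=1, radius=3)  # clamped start
unclamped = words[1 - 3 : 1 + 3 + 1]                # start of -2 wraps around
```

Here `window` covers the sequence from index 0, while `unclamped` starts two items from the end, which is the kind of silent misbehavior the clamp prevents.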
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
MinerU now supports the VLM-Transformers backend.
Set `MINERU_BACKEND` to choose the backend, for example `MINERU_BACKEND="pipeline"`.
Options: `pipeline` | `vlm-transformers`; the default is `pipeline`.
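
A minimal sketch of how the backend selection might be read. The function `resolve_mineru_backend` is hypothetical, and rejecting unknown values with `ValueError` is an assumption; only the variable name, the two options, and the default come from this description.

```python
import os

# Backend names as listed in this PR description.
ALLOWED_BACKENDS = ("pipeline", "vlm-transformers")


def resolve_mineru_backend(environ=None):
    """Return the backend named by MINERU_BACKEND, defaulting to "pipeline".

    Hypothetical helper; raising on unknown values is an assumption.
    """
    env = os.environ if environ is None else environ
    backend = env.get("MINERU_BACKEND", "pipeline").strip()
    if backend not in ALLOWED_BACKENDS:
        raise ValueError(f"Unsupported MINERU_BACKEND: {backend!r}")
    return backend
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test without touching the process environment.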
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
This PR adds a new TCADP (Tencent Cloud Advanced Document Processing)
parser to RAGFlow, enabling users to leverage Tencent Cloud's document
parsing capabilities for more accurate and structured document
processing. The implementation includes:
- New TCADP Parser: A complete implementation of Tencent Cloud's document
  parsing API without SDK dependency
- Configuration Support: Added configuration options in service_conf.yaml
  for Tencent Cloud API credentials
- Frontend Integration: Updated UI components to support the new TCADP
  parser option
- Error Handling: Comprehensive error handling and retry mechanisms for
  API calls
- Result Processing: Support for both SSE streaming and JSON response
  formats from Tencent Cloud API
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: prioritize synonym matches over WordNet for English
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Introduced gpu profile in .env
- Added Dockerfile_tei
- Fixed datrie
- Removed LIGHTEN flag
### Type of change
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
issue:
#3945
change:
add Docling parser
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Pipeline supports MinerU PDF parser.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: can't upload image in ollama model #10447
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### Change all `images=[]` to `images=None`
Changing `images=[]` to `images=None` avoids Python’s mutable default
parameter issue.
If you keep `images=[]`, all calls share the same list, so modifying it
(e.g., images.append()) will affect later calls.
Using images=None and creating a new list inside the function ensures
each call is independent.
This change does not affect current behavior — it simply makes the code
safer and more predictable.
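
The pitfall can be reproduced with a short sketch. The function names `describe_shared` and `describe_safe` are hypothetical and stand in for any function that takes an `images` parameter; they are not the actual model methods touched by this change.

```python
def describe_shared(prompt, images=[]):
    # Pitfall: the default list is created once, when the function is
    # defined, and reused by every call that omits `images`.
    images.append(prompt)
    return images


def describe_safe(prompt, images=None):
    # Fix: create a fresh list per call when the caller passes nothing.
    if images is None:
        images = []
    images.append(prompt)
    return images


describe_shared("first")
shared_result = describe_shared("second")  # still contains "first"

describe_safe("first")
safe_result = describe_safe("second")      # independent of the earlier call
```

With the shared default, the second call sees the item appended by the first; with `images=None`, each call starts from an empty list.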
### What problem does this PR solve?
Fix potential negative max_tokens in RAPTOR. #10235.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
[#7472](https://github.com/infiniflow/ragflow/issues/7472)
change:
Vision Model Image Enhancement in Manual chunker
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Qwen-VL series supports video parsing. #10617.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
File: parsing now supports all types of embedded documents, resolving #10059
Fix: Incomplete words in chat #10530
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
issue:
[#9004](https://github.com/infiniflow/ragflow/issues/9004)
change:
Add IMAGE2TEXT model type to VolcEngine
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add MinerU parser. #3945, #8092.
Set `MINERU_EXECUTABLE` to the MinerU executable path (defaults to
`mineru`).
Set `MINERU_DELETE_OUTPUT=0` to preserve MinerU's output; the default is 1,
which deletes temporary output.
Set `MINERU_OUTPUT_DIR` to choose the MinerU output directory (uses the
temporary directory if unset).
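
A minimal sketch of how these three variables might be resolved. The function `load_mineru_settings` and the returned dict keys are hypothetical. Treating any value other than `0` as "delete" and using `tempfile.gettempdir()` as the fallback directory are assumptions; the description only says that a temporary directory is used.

```python
import os
import tempfile


def load_mineru_settings(environ=None):
    """Collect MinerU settings from environment variables.

    Hypothetical helper for illustration; not the parser's actual code.
    """
    env = os.environ if environ is None else environ
    return {
        # Path to the MinerU executable, defaulting to "mineru" on PATH.
        "executable": env.get("MINERU_EXECUTABLE", "mineru"),
        # Assumption: only the literal "0" disables cleanup.
        "delete_output": env.get("MINERU_DELETE_OUTPUT", "1") != "0",
        # Assumption: fall back to the system temporary directory.
        "output_dir": env.get("MINERU_OUTPUT_DIR") or tempfile.gettempdir(),
    }
```

For example, `load_mineru_settings({"MINERU_DELETE_OUTPUT": "0"})` keeps the default executable but marks the output as preserved.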
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Added new field 'toc_kwd' to infinity_mapping.json for table of
contents keyword support
- Changed page_num_int from integer to array type in task_executor.py to
handle multiple page numbers
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
issue:
[#5787](https://github.com/infiniflow/ragflow/issues/5787)
change:
Support Specifying OpenRouter Model Provider
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Improve file management. #10287.
Passed tests:
1. Create folder `A` and `B`.
2. Upload a file inside `A`, called `file`.
3. Create a KB, called `K`.
4. Link `file` to `K`.
5. Parse `file` inside of `K`. (OK)
6. Move `file` from `A` to `B`.
7. Parse `file` inside of `K`. (OK)
8. Move `file` from `B` to `A`.
9. Parse `file` inside of `K`. (OK)
10. Move entire folder `A` into `B`. (B -> A -> file)
11. Parse `file` inside of `K`. (OK)
12. Delete folder `B`.
13. All clear. (There is no document inside of `K`)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Skip reranking for Infinity, since Infinity already normalizes the score
of each retrieval way before fusion.
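
As a generic illustration of why per-way normalization makes the fused scores directly comparable, the sketch below applies min-max normalization followed by a weighted sum. This is an assumption chosen for illustration, not Infinity's documented behavior; the function names, weights, and sample scores are all hypothetical.

```python
def min_max_normalize(scores):
    """Scale a {doc: score} mapping into [0, 1]; constant scores map to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(ways, weights):
    """Weighted sum of normalized scores; returns doc ids, best first."""
    fused = {}
    for name, scores in ways.items():
        for doc, s in min_max_normalize(scores).items():
            fused[doc] = fused.get(doc, 0.0) + weights[name] * s
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical scores on very different scales (e.g. full-text vs. vector).
ways = {
    "fulltext": {"d1": 12.0, "d2": 4.0, "d3": 8.0},
    "vector": {"d1": 0.2, "d2": 0.8, "d3": 0.6},
}
ranking = fuse(ways, {"fulltext": 0.5, "vector": 0.5})
```

Once every way is on the same scale, the fused ranking no longer depends on one way's raw score range dominating the others, which is why an extra rerank pass adds little.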
### Type of change
- [x] Refactoring
### What problem does this PR solve?
issue:
#10495
change:
fix empty references in agent conversation
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Maintain backward compatibility for KB tasks
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)