Commit Graph

867 Commits

Author SHA1 Message Date
65d5268439 Feat: implement novitaAI embedding and reranking. (#8250)
### What problem does this PR solve?

Close #8227

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-13 15:42:17 +08:00
6aa0b0819d Fix: unify opendal config key from ‎schema to ‎scheme (#8232)
### What problem does this PR solve?

This PR resolves the inconsistency in the opendal configuration where
both ‎`schema` and ‎`scheme` were used as keys. The code and
configuration file now consistently use ‎`scheme`, which helps prevent
configuration errors and runtime issues. This change improves code
clarity and maintainability.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Additional context
- Updated both ‎`conf/service_conf.yaml` and
‎`rag/utils/opendal_conn.py` to use ‎`scheme` instead of ‎`schema`
- No breaking changes to other configuration fields
2025-06-13 14:56:51 +08:00
3d0b440e9f fix(search.py):remove hard page_size (#8242)
### What problem does this PR solve?

Fix the restriction of forcing similarity_threshold=0 and page_size=30
when doc_ids is not empty

#8228

---------

Co-authored-by: shiqing.wusq <shiqing.wusq@dtzhejiang.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-13 14:56:25 +08:00
24ca4cc6b7 Refa: GraphRAG and explaining GraphRAG stalling behavior on large files (#8223)
### What problem does this PR solve?

This PR investigates the cause of #7957.

TL;DR: Incorrect similarity calculations lead to too many candidates.
Since candidate selection involves interaction with the LLM, this causes
significant delays in the program.

What this PR does:

1. **Fix similarity calculation**:
When processing a 64 pages government document, the corrected similarity
calculation reduces the number of candidates from over 100,000 to around
16,000. With a default batch size of 100 pairs per LLM call, this fix
reduces unnecessary LLM interactions from over 1,000 calls to around
160, a roughly 10x improvement.
2. **Add concurrency and timeout limits**: 
Up to 5 entity types are processed in "parallel", each with a 180-second
timeout. These limits may be configurable in future updates.
3. **Improve logging**:
The candidate resolution process now reports progress in real time.
4. **Mitigates potential concurrency risks**


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2025-06-12 19:09:50 +08:00
d36c8d18b1 Refa: make exception more clear. (#8224)
### What problem does this PR solve?

#8156

### Type of change
- [x] Refactoring
2025-06-12 17:53:59 +08:00
d5236b71f4 Refa: ollama keep alive issue. (#8216)
### What problem does this PR solve?

#8122

### Type of change

- [x] Refactoring
2025-06-12 15:09:40 +08:00
56ee69e9d9 Refa: chat with tools. (#8210)
### What problem does this PR solve?


### Type of change
- [x] Refactoring
2025-06-12 12:31:10 +08:00
44287fb05f Oss support opendal(including mysql) (#8204)
### What problem does this PR solve?

#8074
Oss support opendal(including mysql)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-12 11:37:42 +08:00
1a5f991d86 Fix: auto-keyword and auto-question fail with qwq model (#8190)
### What problem does this PR solve?

Fix auto-keyword and auto-question fail with qwq model. #8189 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-12 11:37:07 +08:00
69e1fc496d Refa: chat models (#8187)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2025-06-11 17:20:12 +08:00
a43adafc6b Refa: Add error handling for JSON decode in embedding models (#8162)
### What problem does this PR solve?

Improve robustness of Jina, Nvidia, and SILICONFLOW embedding models by:
1. Adding try-catch blocks for JSON decode errors
2. Logging error details including response content
3. Raising exceptions with meaningful error messages

### Type of change

- [x] Refactoring
2025-06-10 19:04:17 +08:00
60ab7027c0 fix: allow to do role auth for S3 bucket use. (#8149)
### What problem does this PR solve?

Close #8148 .

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-10 10:50:07 +08:00
baf32ee461 Display only the duplicate column names and corresponding original source. (#8138)
### What problem does this PR solve?
This PR aims to slove #8120 which request a better error display of
duplicate column names.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-10 10:16:38 +08:00
24625e0695 Fix: presentation of PDF using vlm. (#8133)
### What problem does this PR solve?

#8109

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-09 15:01:52 +08:00
1ed0b25910 Fix task_limiter in raptor.py (#8124)
### What problem does this PR solve?

Fix task_limiter in raptor.py

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-09 10:18:03 +08:00
7ed9efcd4e Fix: QWenCV issue. (#8106)
### What problem does this PR solve?

Close #8097

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-06 17:55:13 +08:00
0e03542db5 fix: single task executor getting all tasks from Redis queue (#7330)
### What problem does this PR solve?

Currently, as long as there are tasks in Redis, this loop will keep
getting the tasks. This will lead to a single task executor with many
tasks in the pending state. Then we need to wait for the pending tasks
to get them back in the queue.

In first place, if we set the `MAX_CONCURRENT_TASKS` to X, then only X
tasks should be picked from the queue, and others should be left in the
queue for other `task_executors` or be picked after 1 of the spots in
the current executor gets free. This PR ensures this behavior.

The additional changes were due to the Ruff linting in pre-commit. But I
believe these are expected to keep the coding style.

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2025-06-06 14:32:35 +08:00
31d2b3cb5a Fix: Grammar and clarity improvements in prompt templates (#8023)
## Summary
Fixed grammar errors and improved clarity in prompt templates throughout
`rag/prompts.py`.

## Changes Made
- **Fixed incomplete sentence**: `"If the user's latest question is
completely, don't do anything"` → `"If the user's latest question is
already complete, don't do anything"`
- **Improved phrasing**: `"of like [ID:i]"` → `"such as [ID:i]"`
- **Added missing articles**: `"give top 3"` → `"give the top 3"`
- **Fixed prepositions**: `"in language of"` → `"in the same language
as"`
- **Corrected spelling**: `"Jappanese"` → `"Japanese"`
- **Standardized formatting**: Consistent role descriptions and
punctuation

## Impact
These changes improve prompt readability and should make instructions
clearer for the underlying language models.

## Test Plan
- [x] Verified changes maintain original prompt functionality
- [x] No breaking changes to prompt structure or expected outputs

Co-authored-by: Adrian Altermatt <adrian.altermatt@fgcz.uzh.ch>
2025-06-03 19:41:59 +08:00
156290f8d0 Fix: url path join issue. (#8013)
### What problem does this PR solve?

Close #7980

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-03 14:18:40 +08:00
37998abef3 Update synonym dictionary file (#7997)
### What problem does this PR solve?

Update the synonym dictionary file with relevant time and date to
prevent synonyms from being mistakenly escaped.

### Type of change

- [x] Refactoring
2025-06-03 09:41:53 +08:00
93f5df716f Fix: order chunks from docx by positions. (#7979)
### What problem does this PR solve?

#7934

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-30 17:20:53 +08:00
bd4678bca6 Fix: Unnecessary truncation in markdown parser (#7972)
### What problem does this PR solve?

Fix unnecessary truncation in markdown parser. So that markdown can work
perfectly like
[this](https://github.com/infiniflow/ragflow/issues/7824#issuecomment-2921312576)
in #7824, supporting multiple special delimiters.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-30 15:04:21 +08:00
46963ab1ca Fix: add advanced delimiter detection for naive merge (#7941)
### What problem does this PR solve?

Add advanced delimiter detection for naive merge. #7824

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2025-05-29 16:17:22 +08:00
6ba5a4348a set PARALLEL_DEVICES default value= 0 (#7935)
### What problem does this PR solve?


it would be fail if PARALLEL_DEVICES = None in OCR class , because it
pass 0 to TextDetector and TextRecognizer init method.

and It would be simpler to set 0 as the default value for
PARALLEL_DEVICES.

### Type of change

- [x] Refactoring
2025-05-29 13:32:16 +08:00
0c562f0a9f Refa: change citation mark as [ID:n] (#7923)
### What problem does this PR solve?

Change citation mark as [ID:n], it's easier for LLMs to follow the
instruction :) #7904

### Type of change

- [x] Refactoring
2025-05-29 10:03:51 +08:00
273f36cc54 Perf: reduce upload to minio limiter scope (#7878)
### What problem does this PR solve?
reduce upload_to_minio limter scope

### Type of change
- [x] Performance Improvement
2025-05-27 17:49:37 +08:00
28cb4df127 Fix: raptor overloading (#7889)
### What problem does this PR solve?

#7840

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-27 17:41:35 +08:00
959793e83c Fix: task limiter issue. (#7873)
### What problem does this PR solve?

#7869

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-27 11:16:29 +08:00
5d6bf2224a Fix: Opensearch chunk management (#7802)
### What problem does this PR solve?

This PR solve the problems metioned in the
pr(https://github.com/infiniflow/ragflow/pull/7140) which is also
submitted by me

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):


### Introduction
I fixed the problems when using OpenSearch as the DOC_ENGINE, the
failures of pytest and the wrong API's return.
Mainly about delete chunk, list chunks, update chunk, retrieval chunk.
The pytest comand "cd sdk/python && uv sync --python 3.10 --group test
--frozen && source .venv/bin/activate && cd test/test_http_api &&
DOC_ENGINE=opensearch pytest test_chunk_management_within_dataset -s
--tb=short " is finally successful.

###Others
As some changes between Elasticsearch And Opensearch differ, some pytest
results about OpenSearch are correct and resonable. However, some pytest
params (skipif params) are incompatible. So I changed some pytest params
about skipif.

As a search engine programmer, I will still focus on the usage of vector
databases (especially OpenSearch) for the RAG stuff.
Thanks for your review
2025-05-26 16:57:58 +08:00
be83074131 Fix: restore task limiter. (#7844)
### What problem does this PR solve?

Close #7828

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-26 10:59:01 +08:00
2f4d803db1 Delete Corresponding Minio Bucket When Deleting a Knowledge Base (#7841)
### What problem does this PR solve?

Delete Corresponding Minio Bucket When Deleting a Knowledge Base
[issue #4113 ](https://github.com/infiniflow/ragflow/issues/4113)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2025-05-26 10:02:51 +08:00
Sol
0d7cfce6e1 Update rag/nlp/query.py (#7816)
### What problem does this PR solve?
Fix tokenizer resulting in low recall

![37743d3a495f734aa69f1e173fa77457](https://github.com/user-attachments/assets/1394757e-8fcb-4f87-96af-a92716144884)

![4aba633a17f34269a4e17e84fafb34c4](https://github.com/user-attachments/assets/a1828e32-3e17-4394-a633-ba3f09bd506d)

![image](https://github.com/user-attachments/assets/61308f32-2a4f-44d5-a034-d65bbec554ef)



### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-05-23 17:13:37 +08:00
db4371c745 Fix: Improve First Chunk Size (#7806)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/7790

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-23 14:30:19 +08:00
d4a123d6dd Fix: resolve regex library warnings (#7782)
### What problem does this PR solve?
This small PR resolves the regex library warnings showing in Python3.11:
```python
DeprecationWarning: 'count' is passed as positional argument
```

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2025-05-22 10:06:28 +08:00
ce816edb5f Fix: improve task cancel lag (#7765)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/7761

but it may be difficult to achieve 0 delay (which need to pass the
cancel token to all parts)

Another solution is just 0 delay effect at UI.
And task will stop latter

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-22 09:28:08 +08:00
e3e7c7ddaa Feat: delete useless image blobs when task executor meet edge cases (#7727)
### What problem does this PR solve?

delete useless image blobs when the task executor meets edge cases

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-21 10:22:30 +08:00
5d21cc3660 fix: Fix the problem that concurrent execution limit in task executor fails and causes OOM (issue#7580) (#7700)
### What problem does this PR solve?

## Cause of the bug:
During the execution process, due to improper use of trio
CapacityLimiter, the configuration parameter MAX_CONCURRENT_TASKS is
invalid, causing the executor to take out a large number of tasks from
the Redis queue at one time.

This behavior will cause the task executor to occupy too much memory and
be killed by the OS when a large number of tasks exist at the same time.
As a result, all executing tasks are suspended.

## Fix:
Added the task_manager method to the entry of /rag/svr/task_executor.py
to make CapacityLimiter effective. Deleted the invalid async with
statement.

## Fix result:
After testing, the task executor execution meets expectations, that is:
concurrent execution of up to $MAX_CONCURRENT_TASKS tasks.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-19 10:25:56 +08:00
a1f06a4fdc Feat: Support tool calling in Generate component (#7572)
### What problem does this PR solve?

Hello, our use case requires LLM agent to invoke some tools, so I made a
simple implementation here.

This PR does two things:

1. A simple plugin mechanism based on `pluginlib`:

This mechanism lives in the `plugin` directory. It will only load
plugins from `plugin/embedded_plugins` for now.

A sample plugin `bad_calculator.py` is placed in
`plugin/embedded_plugins/llm_tools`, it accepts two numbers `a` and `b`,
then give a wrong result `a + b + 100`.

In the future, it can load plugins from external location with little
code change.

Plugins are divided into different types. The only plugin type supported
in this PR is `llm_tools`, which must implement the `LLMToolPlugin`
class in the `plugin/llm_tool_plugin.py`.
More plugin types can be added in the future.

2. A tool selector in the `Generate` component:

Added a tool selector to select one or more tools for LLM:


![image](https://github.com/user-attachments/assets/74a21fdf-9333-4175-991b-43df6524c5dc)

And with the `bad_calculator` tool, it results this with the `qwen-max`
model:


![image](https://github.com/user-attachments/assets/93aff9c4-8550-414a-90a2-1a15a5249d94)


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2025-05-16 16:32:19 +08:00
bfe97d896d Fix: docx get image exception. (#7636)
### What problem does this PR solve?

Close #7631

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-14 12:24:48 +08:00
01330fa428 Feat: let image citation being shown. (#7624)
### What problem does this PR solve?

#7623

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-13 19:30:05 +08:00
321a280031 Feat: add image preview to retrieval test. (#7610)
### What problem does this PR solve?

#7608

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-13 14:30:36 +08:00
573d46a4ef FIX:ZeroDivisionError when using large page_size in client.retrieve() (#7595)
### What problem does this PR solve?

Close #7592

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-13 10:46:31 +08:00
4ae8f87754 Fix: missing graph resolution and community extraction in graphrag tasks (#7586)
### What problem does this PR solve?

Info of whether applying graph resolution and community extraction is
storage in `task["kb_parser_config"]`. However, previous code get
`graphrag_conf` from `task["parser_config"]`, making `with_resolution`
and `with_community` are always false.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-13 09:21:03 +08:00
baa108f5cc Fix: markdown table conversion error (#7570)
### What problem does this PR solve?

Since `import markdown.markdown` has been changed to `import markdown`
in `rag/app/naive.py`, previous code for converting markdown tables
would call a markdown module instead of a callable function. This cause
error.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-12 17:16:55 +08:00
5b626870d0 Refa: remove ollama keep alive. (#7560)
### What problem does this PR solve?

#7518

### Type of change

- [x] Refactoring
2025-05-09 17:51:49 +08:00
2ccec93d71 Feat: support cross-lang search. (#7557)
### What problem does this PR solve?

#7376
#4503
#5710 
#7470

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-09 15:32:02 +08:00
a14865e6bb Fix: empty query issue. (#7551)
### What problem does this PR solve?

#5214

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-09 12:20:19 +08:00
332e6ffbd4 Fix:local_es_tag (#7534)
Two Case when local  Es tag search has result which is filtered by score
1: Doc has empty tag,and not visi LLM
2: Code may use empty examples in Prompt for LLM search tag

Co-authored-by: huangfuqunze <huangfuqunze.hfqz@alibaba-inc.com>
2025-05-09 10:17:24 +08:00
5352bdf4da Error storing tag in Redis (#7541)
### What problem does this PR solve?

The parameter positions were incorrect and have been corrected to use
keyword argument passing

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-09 10:17:09 +08:00
2f768b96e8 perf: optimze figure parser (#7392)
### What problem does this PR solve?

When parsing documents containing images, the current code uses a
single-threaded approach to call the VL model, resulting in extremely
slow parsing speed (e.g., parsing a Word document with dozens of images
takes over 20 minutes).

By switching to a multithreaded approach to call the VL model, the
parsing speed can be improved to an acceptable level.

### Type of change

- [x] Performance Improvement

---------

Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
2025-05-06 14:39:45 +08:00