Commit Graph

359 Commits

Author SHA1 Message Date
1936ad82d2 Refactor:Improve BytesIO usage for GeminiCV (#10042)
### What problem does this PR solve?
Improve BytesIO usage for GeminiCV

### Type of change
- [x] Refactoring
2025-09-11 11:07:15 +08:00
127af4e45c Refactor:Improve BytesIO usage for image2base64 (#9997)
### What problem does this PR solve?

Improve BytesIO usage for image2base64

### Type of change

- [x] Refactoring
2025-09-10 15:55:33 +08:00
0d9c1f1c3c Feat: dataflow supports Spreadsheet and Word processor document (#9996)
### What problem does this PR solve?

Dataflow supports Spreadsheet and Word processor document

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-10 13:02:53 +08:00
936f27e9e5 Feat: add LongCat-Flash-Chat (#9973)
### What problem does this PR solve?

Add LongCat-Flash-Chat from Meituan, deepseek v3.1 from SiliconFlow,
kimi-k2-09-05-preview and kimi-k2-turbo-preview from Moonshot.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-08 19:00:52 +08:00
91d6fb8061 Fix miscalculated token count (#9776)
### What problem does this PR solve?

The total token was incorrectly accumulated when using the
OpenAI-API-Compatible api.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-05 19:17:21 +08:00
b58e882eaa Feat: add exponential back-off for Chat LiteLLM (#9880)
### What problem does this PR solve?

Add exponential back-off for Chat LiteLLM. #9858.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-03 13:31:43 +08:00
2e00d8d3d4 Use 'float' explicitly for OpenAI's embedding "encoding_format" (#9838)
### What problem does this PR solve?

The default value for OpenAI '/v1/embeddings' parameter
'encoding_format' is 'base64'. Use 'float' explicitly to avoid base64
encoding & decoding, larger data size.


https://github.com/openai/openai-python/blob/main/src/openai/resources/embeddings.py
        if not is_given(encoding_format):
            params["encoding_format"] = "base64"

### Type of change

- [x] Performance Improvement
2025-09-02 10:31:51 +08:00
56cd576876 Refa: revise the implementation of LightRAG and enable response caching (#9828)
### What problem does this PR solve?

This revision performed a comprehensive check on LightRAG to ensure the
correctness of its implementation. It **did not involve** Entity
Resolution and Community Reports Generation. There is an example using
default entity types and the General chunking method, which shows good
results in both time and effectiveness. Moreover, response caching is
enabled for resuming failed tasks.


[The-Necklace.pdf](https://github.com/user-attachments/files/22042432/The-Necklace.pdf)

After:


![img_v3_02pk_177dbc6a-e7cc-4732-b202-ad4682d171fg](https://github.com/user-attachments/assets/5ef1d93a-9109-4fe9-8a7b-a65add16f82b)


```bash
Begin at:
Fri, 29 Aug 2025 16:48:03 GMT
Duration:
222.31 s
Progress:
16:48:04 Task has been received.
16:48:06 Page(1~7): Start to parse.
16:48:06 Page(1~7): OCR started
16:48:08 Page(1~7): OCR finished (1.89s)
16:48:11 Page(1~7): Layout analysis (3.72s)
16:48:11 Page(1~7): Table analysis (0.00s)
16:48:11 Page(1~7): Text merged (0.00s)
16:48:11 Page(1~7): Finish parsing.
16:48:12 Page(1~7): Generate 7 chunks
16:48:12 Page(1~7): Embedding chunks (0.29s)
16:48:12 Page(1~7): Indexing done (0.04s). Task done (7.84s)
16:48:17 Start processing for f421fb06849e11f0bdd32724b93a52b2: She had no dresses, no je...
16:48:17 Start processing for f421fb06849e11f0bdd32724b93a52b2: Her husband, already half...
16:48:17 Start processing for f421fb06849e11f0bdd32724b93a52b2: And this life lasted ten ...
16:48:17 Start processing for f421fb06849e11f0bdd32724b93a52b2: Then she asked, hesitatin...
16:49:30 Completed processing for f421fb06849e11f0bdd32724b93a52b2: She had no dresses, no je... after 1 gleanings, 21985 tokens.
16:49:30 Entities extraction of chunk 3 1/7 done, 12 nodes, 13 edges, 21985 tokens.
16:49:40 Completed processing for f421fb06849e11f0bdd32724b93a52b2: Finally, she replied, hes... after 1 gleanings, 22584 tokens.
16:49:40 Entities extraction of chunk 5 2/7 done, 19 nodes, 19 edges, 22584 tokens.
16:50:02 Completed processing for f421fb06849e11f0bdd32724b93a52b2: Then she asked, hesitatin... after 1 gleanings, 24610 tokens.
16:50:02 Entities extraction of chunk 0 3/7 done, 16 nodes, 28 edges, 24610 tokens.
16:50:03 Completed processing for f421fb06849e11f0bdd32724b93a52b2: And this life lasted ten ... after 1 gleanings, 24031 tokens.
16:50:04 Entities extraction of chunk 1 4/7 done, 24 nodes, 22 edges, 24031 tokens.
16:50:14 Completed processing for f421fb06849e11f0bdd32724b93a52b2: So they begged the jewell... after 1 gleanings, 24635 tokens.
16:50:14 Entities extraction of chunk 6 5/7 done, 27 nodes, 26 edges, 24635 tokens.
16:50:29 Completed processing for f421fb06849e11f0bdd32724b93a52b2: Her husband, already half... after 1 gleanings, 25758 tokens.
16:50:29 Entities extraction of chunk 2 6/7 done, 25 nodes, 35 edges, 25758 tokens.
16:51:35 Completed processing for f421fb06849e11f0bdd32724b93a52b2: The Necklace By Guy de Ma... after 1 gleanings, 27491 tokens.
16:51:35 Entities extraction of chunk 4 7/7 done, 39 nodes, 37 edges, 27491 tokens.
16:51:35 Entities and relationships extraction done, 147 nodes, 177 edges, 171094 tokens, 198.58s.
16:51:35 Entities merging done, 0.01s.
16:51:35 Relationships merging done, 0.01s.
16:51:35 ignored 7 relations due to missing entities.
16:51:35 generated subgraph for doc f421fb06849e11f0bdd32724b93a52b2 in 198.68 seconds.
16:51:35 run_graphrag f421fb06849e11f0bdd32724b93a52b2 graphrag_task_lock acquired
16:51:35 set_graph removed 0 nodes and 0 edges from index in 0.00s.
16:51:35 Get embedding of nodes: 9/147
16:51:35 Get embedding of nodes: 109/147
16:51:37 Get embedding of edges: 9/170
16:51:37 Get embedding of edges: 109/170
16:51:40 set_graph converted graph change to 319 chunks in 4.21s.
16:51:40 Insert chunks: 4/319
16:51:40 Insert chunks: 104/319
16:51:40 Insert chunks: 204/319
16:51:40 Insert chunks: 304/319
16:51:40 set_graph added/updated 147 nodes and 170 edges from index in 0.53s.
16:51:40 merging subgraph for doc f421fb06849e11f0bdd32724b93a52b2 into the global graph done in 4.79 seconds.
16:51:40 Knowledge Graph done (204.29s)
```

Before:


![img_v3_02pk_63370edf-ecee-4ee8-8ac8-69c8d2c712fg](https://github.com/user-attachments/assets/1162eb0f-68c2-4de5-abe0-cdfa168f71de)

```bash
Begin at:
Fri, 29 Aug 2025 17:00:47 GMT
processDuration:
173.38 s
Progress:
17:00:49 Task has been received.
17:00:51 Page(1~7): Start to parse.
17:00:51 Page(1~7): OCR started
17:00:53 Page(1~7): OCR finished (1.82s)
17:00:57 Page(1~7): Layout analysis (3.64s)
17:00:57 Page(1~7): Table analysis (0.00s)
17:00:57 Page(1~7): Text merged (0.00s)
17:00:57 Page(1~7): Finish parsing.
17:00:57 Page(1~7): Generate 7 chunks
17:00:57 Page(1~7): Embedding chunks (0.31s)
17:00:57 Page(1~7): Indexing done (0.03s). Task done (7.88s)
17:00:57 created task graphrag
17:01:00 Task has been received.
17:02:17 Entities extraction of chunk 1 1/7 done, 9 nodes, 9 edges, 10654 tokens.
17:02:31 Entities extraction of chunk 2 2/7 done, 12 nodes, 13 edges, 11066 tokens.
17:02:33 Entities extraction of chunk 4 3/7 done, 9 nodes, 10 edges, 10433 tokens.
17:02:42 Entities extraction of chunk 5 4/7 done, 11 nodes, 14 edges, 11290 tokens.
17:02:52 Entities extraction of chunk 6 5/7 done, 13 nodes, 15 edges, 11039 tokens.
17:02:55 Entities extraction of chunk 3 6/7 done, 14 nodes, 13 edges, 11466 tokens.
17:03:32 Entities extraction of chunk 0 7/7 done, 19 nodes, 18 edges, 13107 tokens.
17:03:32 Entities and relationships extraction done, 71 nodes, 89 edges, 79055 tokens, 149.66s.
17:03:32 Entities merging done, 0.01s.
17:03:32 Relationships merging done, 0.01s.
17:03:32 ignored 1 relations due to missing entities.
17:03:32 generated subgraph for doc b1d9d3b6848711f0aacd7ddc0714c4d3 in 149.69 seconds.
17:03:32 run_graphrag b1d9d3b6848711f0aacd7ddc0714c4d3 graphrag_task_lock acquired
17:03:32 set_graph removed 0 nodes and 0 edges from index in 0.00s.
17:03:32 Get embedding of nodes: 9/71
17:03:33 Get embedding of edges: 9/88
17:03:34 set_graph converted graph change to 161 chunks in 2.27s.
17:03:34 Insert chunks: 4/161
17:03:34 Insert chunks: 104/161
17:03:34 set_graph added/updated 71 nodes and 88 edges from index in 0.28s.
17:03:34 merging subgraph for doc b1d9d3b6848711f0aacd7ddc0714c4d3 into the global graph done in 2.60 seconds.
17:03:34 Knowledge Graph done (153.18s)

```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
- [x] Performance Improvement
2025-08-29 17:58:36 +08:00
fcd18d7d87 Fix: Ollama chat cannot access remote deployment (#9816)
### What problem does this PR solve?

Fix Ollama chat can only access localhost instance. #9806.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-29 13:35:41 +08:00
ca320a8c30 Refactor: for total_token_count method use if to check first. (#9707)
### What problem does this PR solve?

for total_token_count method use if to check first, to improve the
performance when we need to handle exception cases

### Type of change

- [x] Refactoring
2025-08-26 10:47:20 +08:00
b6c1ca828e Refa: replace Chat Ollama implementation with LiteLLM (#9693)
### What problem does this PR solve?

replace Chat Ollama implementation with LiteLLM.

### Type of change

- [x] Refactoring
2025-08-25 17:56:31 +08:00
3947da10ae Fix: unexpected LLM parameters (#9661)
### What problem does this PR solve?

Remove unexpected LLM parameters.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-22 19:33:09 +08:00
787e0c6786 Refa: OpenAI whisper-1 (#9552)
### What problem does this PR solve?

Refactor OpenAI to enable audio parsing.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2025-08-19 16:41:18 +08:00
a0d630365c Refactor:Improve VoyageRerank not texts handling (#9539)
### What problem does this PR solve?

Improve VoyageRerank not texts handling

### Type of change

- [x] Refactoring
2025-08-19 10:31:04 +08:00
fe32952825 Fix: Gemini parameters error (#9520)
### What problem does this PR solve?

Fix Gemini parameters error.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-18 14:51:10 +08:00
fb77f9917b Refactor: Use Input Length In DefaultRerank (#9516)
### What problem does this PR solve?

1. Use input length to prepare res
2. Adjust torch_empty_cache code location

### Type of change

- [x] Refactoring
- [x] Performance Improvement
2025-08-18 10:00:27 +08:00
762aa4b8c4 fix: preserve correct MIME & unify data URL handling for vision inputs (relates #9248) (#9474)
fix: preserve correct MIME & unify data URL handling for vision inputs
(relates #9248)

- Updated image2base64() to return a full data URL
(data:image/<fmt>;base64,...) with accurate MIME
- Removed hardcoded image/jpeg in Base._image_prompt(); pass through
data URLs and default raw base64 to image/png
- Set AnthropicCV._image_prompt() raw base64 media_type default to
image/png
- Ensures MIME type matches actual image content, fixing “cannot process
base64 image” errors on vLLM/OpenAI-compatible backends

### What problem does this PR solve?

This PR fixes a compatibility issue where base64-encoded images sent to
vision models (e.g., vLLM/OpenAI-compatible backends) were rejected due
to mismatched MIME type or incorrect decoding.
Previously, the backend:
- Always converted raw base64 into data:image/jpeg;base64,... even if
the actual content was PNG.
- In some cases, base64 decoding was attempted on the full data URL
string instead of the pure base64 part.
This caused errors like:
```
cannot process base64 image
failed to decode base64 string: illegal base64 data at input byte 0
```
by strict validators such as vLLM.
With this fix, the MIME type in the request now matches the actual image
content, and data URLs are correctly handled or passed through, ensuring
vision models can decode and process images reliably.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-14 17:00:56 +08:00
f2806a8332 Update cv_model.py (#9472)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9452

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-14 13:45:38 +08:00
da5cef0686 Refactor:Improve the float compare for LocalAIRerank (#9428)
### What problem does this PR solve?
Improve the float compare for LocalAIRerank

### Type of change

- [x] Refactoring
2025-08-13 10:26:42 +08:00
a0c2da1219 Fix: Patch LiteLLM (#9416)
### What problem does this PR solve?

Patch LiteLLM refactor. #9408

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-12 15:54:30 +08:00
83771e500c Refa: migrate chat models to LiteLLM (#9394)
### What problem does this PR solve?

All models pass the mock response tests, which means that if a model can
return the correct response, everything should work as expected.
However, not all models have been fully tested in a real environment,
the real API_KEY. I suggest actively monitoring the refactored models
over the coming period to ensure they work correctly and fixing them
step by step, or waiting to merge until most have been tested in
practical environment.

### Type of change

- [x] Refactoring
2025-08-12 10:59:20 +08:00
7713e14d6a Update chat_model.py (#9318)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/9317
base on
https://discuss.ai.google.dev/t/valueerror-invalid-operation-the-response-text-quick-accessor-requires-the-response-to-contain-a-valid-part-but-none-were-returned/42866
should can be handled by retry 
### Type of change

- [x] Refactoring
2025-08-08 14:13:07 +08:00
a2e1f5618d Fix: bytes style image issue. (#9304)
### What problem does this PR solve?

#9302

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-07 15:20:01 +08:00
35539092d0 Add **kwargs to model base class constructors (#9252)
Updated constructors for base and derived classes in chat, embedding,
rerank, sequence2txt, and tts models to accept **kwargs. This change
improves extensibility and allows passing additional parameters without
breaking existing interfaces.

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: IT: Sop.Son <sop.son@feavn.local>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-07 09:45:37 +08:00
2124329e95 Fix: local variable issue. (#9255)
### What problem does this PR solve?

#9227

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 19:24:34 +08:00
0a303d9ae1 Refactor:Improve the chat stream logic for NvidiaCV (#9242)
### What problem does this PR solve?

Improve the chat stream logic for NvidiaCV

### Type of change


- [x] Refactoring
2025-08-05 17:47:00 +08:00
1deb0a2d42 Fix:local variable 'response' referenced before assignment (#9230)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9227

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-05 11:00:06 +08:00
30ccc4a66c Fix: correct single base64 image handling in image prompt (#9220)
### What problem does this PR solve?

Correct single base64 image handling in image prompt.


![img_v3_02or_ec4757c2-a9d4-4774-9a76-f7c6be633ebg](https://github.com/user-attachments/assets/872a86bf-e2a8-48d1-9b71-2a0c7a35ba9e)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 09:26:42 +08:00
e9cbf4611d Fix:Error when parsing files using Gemini: **ERROR**: GENERIC_ERROR - Unknown field for GenerationConfig: max_tokens (#9195)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/9177
The reason should be due to the gemin internal use a different parameter
name
`
        max_output_tokens (int):
            Optional. The maximum number of tokens to include in a
            response candidate.

            Note: The default value varies by model, see the
            ``Model.output_token_limit`` attribute of the ``Model``
            returned from the ``getModel`` function.

            This field is a member of `oneof`_ ``_max_output_tokens``.
`
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-04 10:06:09 +08:00
5ccdb95008 Refactor:Introduce Image Close For GeminiCV (#9147)
### What problem does this PR solve?

Introduce Image Close For GeminiCV

### Type of change

- [x] Refactoring
- [x] Performance Improvement
2025-08-01 12:38:13 +08:00
aeaeb169e4 Feat/support 302ai provider (#8742)
### What problem does this PR solve?

Support 302.AI provider.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-07-31 14:48:30 +08:00
20b4d88098 Refactor: Improve the try catch logic for XinferenceEmbed (#9128)
### What problem does this PR solve?

Improve the try catch logic for XinferenceEmbed

### Type of change


- [x] Refactoring
2025-07-31 12:14:50 +08:00
d9fe279dde Feat: Redesign and refactor agent module (#9113)
### What problem does this PR solve?

#9082 #6365

<u> **WARNING: it's not compatible with the older version of `Agent`
module, which means that `Agent` from older versions can not work
anymore.**</u>

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-07-30 19:41:09 +08:00
021e8b57ae Fix: fix error 429 api rate limit when building knowledge graph for all chat model and Mistral embedding model (#9106)
### What problem does this PR solve?

fix error 429 api rate limit when building knowledge graph for all chat
model and Mistral embedding model.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-30 11:37:49 +08:00
ba563f8095 Update embedding_model.py (#9083)
### What problem does this PR solve?

Reduce the logic scope for DefaultEmbedding

### Type of change

- [x] Refactoring
2025-07-30 09:44:30 +08:00
86b4da0844 Refactor: Remove Useless split for BedrockEmbed (#9067)
### What problem does this PR solve?

Remove Useless split for BedrockEmbed

### Type of change

- [x] Refactoring
2025-07-28 10:16:38 +08:00
53b0b0e583 get keep alive from env (#9039)
### What problem does this PR solve?

get keepalive from env

### Type of change

- [x] Refactoring
2025-07-25 12:16:33 +08:00
b47dcc9108 Fix issue with keep_alive=-1 for ollama chat model by allowing a user to set an additional configuration option (#9017)
### What problem does this PR solve?

fix issue with `keep_alive=-1` for ollama chat model by allowing a user
to set an additional configuration option. It is no-breaking change
because it still uses a previous default value such as: `keep_alive=-1`

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [X] Performance Improvement
- [X] Other (please describe):
- Additional configuration option has been added to control behavior of
RAGFlow while working with ollama LLM
2025-07-24 11:20:14 +08:00
a2f73af1a4 Fix: typo Bearer token (#8998)
### What problem does this PR solve?

Typo Bearer token. #8960

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-23 18:10:51 +08:00
7ebc1f0943 Feat: add model provider DeepInfra (#9003)
### What problem does this PR solve?

Add model provider DeepInfra. This model list comes from our community. 

NOTE: most endpoints haven't been tested, but they should work as OpenAI
does.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-07-23 18:10:35 +08:00
ec21d9a98f Refactor:remove use less convert for FastEmbed (#8984)
### What problem does this PR solve?

remove use less convert for FastEmbed

### Type of change

- [x] Refactoring
2025-07-23 10:51:48 +08:00
95b9208b13 Fix:Improve float operation when rerank (#8963)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8915

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-22 10:04:00 +08:00
46caf6ae72 Refactor improve codes for ranker (#8936)
### What problem does this PR solve?
Use the normalize method directly

### Type of change

- [x] Refactoring
2025-07-21 10:22:20 +08:00
38b34116dd Refa: Remove useless conver and fix a bug for DefaultRerank (#8887)
### What problem does this PR solve?

1. bug when re-try, we need to reset i.
2. remove useless convert

### Type of change

- [x] Refactoring
2025-07-17 12:09:50 +08:00
9e45fcfdb3 Fix: fix typo in OpenAI error logging message (#8865)
### What problem does this PR solve?

Correct the logging message from "OpenAI cat_with_tools" to "OpenAI
chat_with_tools" in the `_exceptions` method of the `Base` class to
accurately reflect the method name and improve error traceability.

### Type of change

- [x] Typo
2025-07-16 15:31:57 +08:00
5fa6f2f151 Update embedding_model.py (#8836)
### What problem does this PR solve?

Remove useless covert for bge encode_queries

### Type of change

- [x] Performance Improvement
2025-07-15 14:04:58 +08:00
5383e254c4 Perf:Remove Useless Convert When BGE Embedding (#8816)
### What problem does this PR solve?

FlagModel internal support returns as numpy

### Type of change
- [x] Performance Improvement
2025-07-14 14:02:48 +08:00
07208e519b Fix: Wrong_Input_type_for_Gemin (#8783)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8763#issuecomment-3055317110

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-11 11:34:04 +08:00
1895667573 Feat: add xAI provider (#8781)
### What problem does this PR solve?

Add xAI provider (experimental feature, requires user feedback).

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-07-11 10:35:23 +08:00
8281ceb406 Refa: refine retry gap. (#8773)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
- [x] Performance Improvement
2025-07-10 14:28:57 +08:00