419 Commits

Author SHA1 Message Date
1c06ec39ca fix cohere rerank base_url default (#11353)
### What problem does this PR solve?

**Cohere rerank base_url default handling**

- Background: When no rerank base URL is configured, the settings
pipeline was passing an empty string through RERANK_CFG →
TenantLLMService → CoHereRerank, so the Cohere client received
base_url="" and produced “missing protocol” errors during rerank calls.

- What changed: The CoHereRerank constructor now only forwards base_url
to the Cohere client when it isn’t empty/whitespace, causing the client
to fall back to its default API endpoint otherwise.

- Why it matters: This prevents invalid URL construction in the rerank
workflow and keeps tests/sanity checks that rely on the default Cohere
endpoint from failing when no custom base URL is specified.
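
The guard described above can be sketched roughly as follows. This is an illustration only, not the actual `CoHereRerank` constructor: the helper name `build_client_kwargs` and the idea of collecting the arguments in a dictionary are assumptions made for the example.

```python
from typing import Optional


def build_client_kwargs(api_key: str, base_url: Optional[str] = None) -> dict:
    """Return keyword arguments for a rerank client (hypothetical helper).

    base_url is included only when it holds a non-blank value, so a client
    built from these kwargs can fall back to its own default endpoint.
    """
    kwargs = {"api_key": api_key}
    if base_url and base_url.strip():
        kwargs["base_url"] = base_url
    return kwargs


print(build_client_kwargs("key", ""))
print(build_client_kwargs("key", "   "))
print(build_client_kwargs("key", "https://rerank.example.com"))
```

Omitting the key entirely, rather than passing `None` or `""`, leaves the choice of default endpoint to the client library.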

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: Philipp Heyken Soares <philipp.heyken-soares@am.ai>
2025-11-20 09:46:39 +08:00
e8fe580d7a Feat: add Gemini 3 Pro preview (#11361)
### What problem does this PR solve?

Add Gemini 3 Pro preview.

Change `GenerativeModel` to `genai`.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-19 13:17:22 +08:00
0db00f70b2 Fix: add describe_image_with_prompt for ZHIPU AI (#11317)
### What problem does this PR solve?

Fix: add describe_image_with_prompt for ZHIPU AI  #11289 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-18 13:09:39 +08:00
3fcf2ee54c feat: add new LLM provider Jiekou.AI (#11300)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Jason <ggbbddjm@gmail.com>
2025-11-17 19:47:46 +08:00
bd4bc57009 Refactor: move mcp connection utilities to common (#11304)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-17 15:34:17 +08:00
f441f8ffc2 Fix: waitForResponse component. (#11172)
### What problem does this PR solve?

#10056

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2025-11-11 16:58:47 +08:00
dd5b8e2e1a Fix: add auto_parse to kb detail. (#11153)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-11 12:22:43 +08:00
68b952abb1 Don't select vector on infinity (#11151)
### What problem does this PR solve?

Don't select vector on infinity

### Type of change

- [x] Performance Improvement
2025-11-10 18:01:40 +08:00
82ca2e0378 Refactor: QWenCV release temp path (#11122)
### What problem does this PR solve?

QWenCV release temp path

### Type of change
- [x] Refactoring
2025-11-10 10:15:37 +08:00
660386d3b5 Fix: cannot parse images (#11044)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/11043

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-10 09:31:19 +08:00
9fcc4946e2 Feat: add kimi-k2-thinking and moonshot-v1-vision-preview (#11110)
### What problem does this PR solve?

Add kimi-k2-thinking and moonshot-v1-vision-preview.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-07 19:52:57 +08:00
f98b24c9bf Move api.settings to common.settings (#11036)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-06 09:36:38 +08:00
1a9215bc6f Move some vars to globals (#11017)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-05 14:14:38 +08:00
96c015fb85 Fix and refactor imports (#11010)
### What problem does this PR solve?

1. Move EMBEDDING_CFG to common.globals
2. Fix broken imports
3. Move signal handlers to common/signal_utils.py

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-05 11:07:54 +08:00
378bdfccfc Refactor log utils (#10973)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-03 20:25:02 +08:00
2d83c64eed Fix:wrong describe_with_prompt() in ollama (#10963)
### What problem does this PR solve?

Fix the incorrect describe_with_prompt() implementation for Ollama.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-11-03 19:16:41 +08:00
360f5c1179 Move token related functions to common (#10942)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-03 08:50:05 +08:00
fe4852cb71 TEI auto truncate inputs (#10916)
### What problem does this PR solve?

TEI auto truncate inputs

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-10-31 16:46:20 +08:00
0ecccd27eb Refactor:improve the logic for rerank models to cal the total token count (#10882)
### What problem does this PR solve?

Improve the logic that rerank models use to calculate the total token count.

### Type of change

- [x] Refactoring
2025-10-31 09:46:16 +08:00
c0c2a10680 Feat: allow initialize Redis without password (#10856)
### What problem does this PR solve?

Allow initializing Redis without a password.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-29 09:45:28 +08:00
d86d7061ea Refactor: Improve how to get total token count for AnthropicCV (#10658)
### What problem does this PR solve?

Improve how the total token count is obtained for AnthropicCV.

### Type of change

- [x] Refactoring
2025-10-29 09:41:15 +08:00
84d1ffe44c Feature/add new models for token pony and bug fix for use llm (#10823)
new models for token pony and bug fix for use llm

Co-authored-by: huangzl <huangzl@shinemo.com>
2025-10-28 10:04:41 +08:00
3bd0b99495 Fix: gemini cv model chat issue. (#10799)
### What problem does this PR solve?

#10787
#10781

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-10-27 11:43:56 +08:00
73144e278b Don't release full image (#10654)
### What problem does this PR solve?

Introduced gpu profile in .env
Added Dockerfile_tei
fix datrie
Removed LIGHTEN flag

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-10-23 23:02:27 +08:00
b30f0be858 Refactor: How LiteLLMBase Calculate total count (#10532)
### What problem does this PR solve?

Refactor how LiteLLMBase calculates the total count.

### Type of change

- [x] Refactoring
2025-10-22 12:25:31 +08:00
a82e9b3d91 Fix: can't upload image in ollama model #10447 (#10717)
### What problem does this PR solve?

Fix: can't upload image in ollama model #10447

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)


### Change all `images=[]` to `images=None`

Changing `images=[]` to `images=None` avoids Python's mutable default
argument pitfall.
With `images=[]`, every call that omits the argument shares the same list,
so modifying it (e.g., `images.append()`) affects later calls.
Using `images=None` and creating a new list inside the function ensures
each call is independent.
This change does not affect current behavior; it makes the code safer and
more predictable.
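
A minimal sketch of the difference, using two hypothetical functions (`collect_shared` and `collect_fresh` are illustrative names, not code from this PR):

```python
def collect_shared(item, images=[]):
    # The default list is created once, when the function is defined,
    # so every call without an explicit `images` reuses the same object.
    images.append(item)
    return images


def collect_fresh(item, images=None):
    # A new list is created on each call that omits `images`.
    if images is None:
        images = []
    images.append(item)
    return images


first_shared = list(collect_shared("a.png"))
second_shared = list(collect_shared("b.png"))
first_fresh = collect_fresh("a.png")
second_fresh = collect_fresh("b.png")

print(second_shared)  # ['a.png', 'b.png']
print(second_fresh)   # ['b.png']
```

The second call to `collect_shared` still sees the item appended by the first call, while `collect_fresh` starts from an empty list each time.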


2025-10-22 12:24:12 +08:00
aaa4776657 Feat: Qwen-VL series supports video parsing (#10676)
### What problem does this PR solve?

Qwen-VL series supports video parsing. #10617.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-21 09:36:13 +08:00
5b2e5dd334 Feat: Gemini supports video parsing (#10671)
### What problem does this PR solve?

Gemini supports video parsing.


![img_v3_02r8_adbd5adc-d665-4756-9a00-3ae0f12224fg](https://github.com/user-attachments/assets/30d8d296-c336-4b55-9823-803979e705ca)


![img_v3_02r8_ab60c046-1727-4029-ad2e-66097fd3ccbg](https://github.com/user-attachments/assets/441b1487-a970-427e-98b6-6e1e002f2bad)

Close: #10617

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-20 16:49:47 +08:00
b15643bd80 Feat:VolcEngine Model type add IMAGE2TEXT (#10629)
### What problem does this PR solve?
issue:
[#9004](https://github.com/infiniflow/ragflow/issues/9004)
change:
VolcEngine Model type add IMAGE2TEXT

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-17 11:43:22 +08:00
4e86ee4ff9 Feat: Support Specifying OpenRouter Model Provider (#10550)
### What problem does this PR solve?
issue:
[#5787](https://github.com/infiniflow/ragflow/issues/5787)
change:
Support Specifying OpenRouter Model Provider

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-16 09:39:59 +08:00
5037a28e4d Fix problem with Google Cloud models with reasoning (like gemini) - Additional fix to issue #10474 (#10502)
### What problem does this PR solve?

Issue #10474  -  Update to PR #10477 

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
2025-10-15 14:54:20 +08:00
9e73f799b2 Feat: add Zhipu GLM-ASR model (#10529)
### What problem does this PR solve?

Add Zhipu GLM-ASR model

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-14 09:32:45 +08:00
fee757eb41 Fix: Disable reasoning on Gemini 2.5 Flash by default (#10477)
### What problem does this PR solve?

Gemini 2.5 Flash models use reasoning by default, and there is currently
no way to disable it. This leads to very long response times (over
1 minute). Reasoning should be disabled by default and configurable.

issue #10474 

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
2025-10-11 10:22:51 +08:00
0283e4098f Fix #10408 (#10471)
### What problem does this PR solve?

Google Cloud model does not work correctly with gemini-2.5 models
Close #10408

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-10-10 19:18:24 +08:00
0d8791936e Feat: TOC retrieval (#10456)
### What problem does this PR solve?

#10436

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-10 17:07:55 +08:00
5d167cd772 feat: support qwq reasoning models with non-stream output (#10468)
### What problem does this PR solve?
issue:
[#6193](https://github.com/infiniflow/ragflow/issues/6193)
change:
support qwq reasoning models with non-stream output

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-10 16:38:04 +08:00
6ab4c1a6e9 Refactor: improve how NvidiaCV calculate res total token counts (#10455)
### What problem does this PR solve?
Improve how NvidiaCV calculates the total token count of a response.

### Type of change
- [x] Refactoring
2025-10-10 11:03:40 +08:00
1a47e136e3 Feat: Adds a new feature that enables the LLM to extract a structured table of contents (TOC) directly from plain text. (#10428)
### What problem does this PR solve?

**Adds a new feature that enables the LLM to extract a structured table
of contents (TOC) directly from plain text.**
_This implementation prioritizes efficiency over reasoning: the model
runs in a strictly deterministic mode (thinking disabled) to minimize
latency.
As a result, overall result quality may be less than optimal, but
extraction is fast and consistent._

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-09 13:47:31 +08:00
cbf04ee470 Feat: Use data pipeline to visualize the parsing configuration of the knowledge base (#10423)
### What problem does this PR solve?

#9869

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: jinhai <haijin.chn@gmail.com>
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: chanx <1243304602@qq.com>
Co-authored-by: balibabu <cike8899@users.noreply.github.com>
Co-authored-by: Lynn <lynn_inf@hotmail.com>
Co-authored-by: 纷繁下的无奈 <zhileihuang@126.com>
Co-authored-by: huangzl <huangzl@shinemo.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Co-authored-by: Wilmer <33392318@qq.com>
Co-authored-by: Adrian Weidig <adrianweidig@gmx.net>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yongteng Lei <yongtengrey@outlook.com>
Co-authored-by: Liu An <asiro@qq.com>
Co-authored-by: buua436 <66937541+buua436@users.noreply.github.com>
Co-authored-by: BadwomanCraZY <511528396@qq.com>
Co-authored-by: cucusenok <31804608+cucusenok@users.noreply.github.com>
Co-authored-by: Russell Valentine <russ@coldstonelabs.org>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Billy Bao <newyorkupperbay@gmail.com>
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
Co-authored-by: TensorNull <129579691+TensorNull@users.noreply.github.com>
Co-authored-by: TensorNull <tensor.null@gmail.com>
Co-authored-by: TeslaZY <TeslaZY@outlook.com>
Co-authored-by: Ajay <160579663+aybanda@users.noreply.github.com>
Co-authored-by: AB <aj@Ajays-MacBook-Air.local>
Co-authored-by: 天海蒼灆 <huangaoqin@tecpie.com>
Co-authored-by: He Wang <wanghechn@qq.com>
Co-authored-by: Atsushi Hatakeyama <atu729@icloud.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Mohamed Mathari <155896313+melmathari@users.noreply.github.com>
Co-authored-by: Mohamed Mathari <nocodeventure@Mac-mini-van-Mohamed.fritz.box>
Co-authored-by: Stephen Hu <stephenhu@seismic.com>
Co-authored-by: Shaun Zhang <zhangwfjh@users.noreply.github.com>
Co-authored-by: zhimeng123 <60221886+zhimeng123@users.noreply.github.com>
Co-authored-by: mxc <mxc@example.com>
Co-authored-by: Dominik Novotný <50611433+SgtMarmite@users.noreply.github.com>
Co-authored-by: EVGENY M <168018528+rjohny55@users.noreply.github.com>
Co-authored-by: mcoder6425 <mcoder64@gmail.com>
Co-authored-by: lemsn <lemsn@msn.com>
Co-authored-by: lemsn <lemsn@126.com>
Co-authored-by: Adrian Gora <47756404+adagora@users.noreply.github.com>
Co-authored-by: Womsxd <45663319+Womsxd@users.noreply.github.com>
Co-authored-by: FatMii <39074672+FatMii@users.noreply.github.com>
2025-10-09 12:36:19 +08:00
dfc5fa1f4d Feat: add DeerAPI support (#10303)
### Related issues
#10078 

### What problem does this PR solve?
Integrate DeerAPI provider.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update

Co-authored-by: DeerAPI <tensor.null@gmail.com>
2025-10-09 11:14:49 +08:00
4585edc20e Refactor: improve cv model logics (#10414)
Improve how the total token count is obtained.

### Type of change
- [x] Refactoring
2025-10-09 09:47:36 +08:00
17757930a3 Feat: add support for international Dashscope service (#10356)
### What problem does this PR solve?

 Add support for international Dashscope service. #10340 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-29 14:49:45 +08:00
ef59c5bab9 FIX: Rename the CometEmbed and CometSeq2txt classes to CometAPIEmbed and CometAPISeq2txt, and correct supported_models.mdx. (#10298)
### What problem does this PR solve?

Rename the CometEmbed and CometSeq2txt classes to CometAPIEmbed and
CometAPISeq2txt, and correct supported_models.mdx.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-26 10:50:56 +08:00
daea357940 Fix: invalid COMPONENT_EXEC_TIMEOUT (#10278)
### What problem does this PR solve?

Fix invalid COMPONENT_EXEC_TIMEOUT. #10273

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-25 14:11:09 +08:00
193d93d820 Refactor: Improve the logic clean conf for ZhipuChat (#10274)
### What problem does this PR solve?
Improve the logic that cleans the configuration (conf) for ZhipuChat.

### Type of change
- [x] Refactoring
2025-09-25 10:28:03 +08:00
a1f848bfe0 Fix:max_tokens must be at least 1, got -950, BadRequestError (#10252)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/10235

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2025-09-24 10:49:34 +08:00
38be53cf31 fix: prevent list index out of range in chat streaming (#10238)
### What problem does this PR solve?
issue:
[Bug]: ERROR: list index out of range #10188
change:
fix a potential list index out of range error in chat response parsing
by adding explicit checks for empty choices.
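
The kind of check described here might look like the sketch below. It is an assumption-laden illustration, not the code from this PR: `delta_text` is a hypothetical helper, and the chunk objects are stand-ins built with `SimpleNamespace` that mimic a streamed response exposing `choices[0].delta.content`.

```python
from types import SimpleNamespace


def delta_text(chunk) -> str:
    """Return the text carried by one streamed chunk, or "" if there is none."""
    choices = getattr(chunk, "choices", None)
    if not choices:
        # Some chunks arrive with an empty `choices` list; indexing [0]
        # on them would raise "list index out of range".
        return ""
    delta = getattr(choices[0], "delta", None)
    content = getattr(delta, "content", None)
    return content or ""


chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content="Hel"))]),
    SimpleNamespace(choices=[]),
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content="lo"))]),
]
answer = "".join(delta_text(c) for c in chunks)
print(answer)  # Hello
```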

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-23 19:59:39 +08:00
10cbbb76f8 revert gpt5 integration (#10228)
### What problem does this PR solve?

  Revert back to chat.completions.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [x] Other (please describe):
  Revert back to chat.completions.
2025-09-23 16:06:12 +08:00
1c84d1b562 Fix: azure OpenAI retry (#10213)
### What problem does this PR solve?

Currently, Azure OpenAI returns one-minute quota-limit responses when the
chat API is used. Retrying is needed so that almost any document can be
processed with models deployed in Azure Foundry.
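
A generic wait-and-retry pattern for this kind of quota response could be sketched as below. Everything here is hypothetical and not taken from the PR: the `QuotaExceeded` exception, the `call_with_retry` helper, and the 60-second default wait are assumptions chosen to match the one-minute quota window mentioned above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class QuotaExceeded(Exception):
    """Hypothetical stand-in for a rate-limit or quota error from a provider."""


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    wait_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, waiting and retrying when it raises QuotaExceeded."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except QuotaExceeded:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the original error.
            sleep(wait_seconds)
    raise RuntimeError("max_attempts must be at least 1")


# Demo with a fake request that fails twice, and a fake sleep that only
# records the requested waits instead of blocking.
calls = {"n": 0}
waits = []


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise QuotaExceeded("quota window not reset yet")
    return "ok"


result = call_with_retry(flaky, sleep=waits.append)
print(result, waits)
```

Passing `sleep` as a parameter keeps the helper testable without real delays.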

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-23 12:19:28 +08:00
4eb7659499 Fix bug: broken import from rag.prompts.prompts (#10217)
### What problem does this PR solve?

Fix broken imports

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: jinhai <haijin.chn@gmail.com>
2025-09-23 10:19:25 +08:00