Commit Graph

403 Commits

Author SHA1 Message Date
360f5c1179 Move token related functions to common (#10942)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-11-03 08:50:05 +08:00
fe4852cb71 TEI auto truncate inputs (#10916)
### What problem does this PR solve?

TEI auto truncate inputs

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-10-31 16:46:20 +08:00
0ecccd27eb Refactor:improve the logic for rerank models to cal the total token count (#10882)
### What problem does this PR solve?

improve the logic for rerank models to cal the total token count

### Type of change

- [x] Refactoring
2025-10-31 09:46:16 +08:00
c0c2a10680 Feat: allow initialize Redis without password (#10856)
### What problem does this PR solve?

Allow initialize Redis without password.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-29 09:45:28 +08:00
d86d7061ea Refactor: Improve how to get total token count for AnthropicCV (#10658)
### What problem does this PR solve?

 Improve how to get total token count for AnthropicCV

### Type of change

- [x] Refactoring
2025-10-29 09:41:15 +08:00
84d1ffe44c Feature/add new models for token pony and bug fix for use llm (#10823)
new models for token pony and bug fix for use llm

Co-authored-by: huangzl <huangzl@shinemo.com>
2025-10-28 10:04:41 +08:00
3bd0b99495 Fix: gemini cv model chat issue. (#10799)
### What problem does this PR solve?

#10787
#10781

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-10-27 11:43:56 +08:00
73144e278b Don't release full image (#10654)
### What problem does this PR solve?

Introduced gpu profile in .env
Added Dockerfile_tei
fix datrie
Removed LIGHTEN flag

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-10-23 23:02:27 +08:00
b30f0be858 Refactor: How LiteLLMBase Calculate total count (#10532)
### What problem does this PR solve?

How LiteLLMBase Calculate total count

### Type of change

- [x] Refactoring
2025-10-22 12:25:31 +08:00
a82e9b3d91 Fix: can't upload image in ollama model #10447 (#10717)
### What problem does this PR solve?

Fix: can't upload image in ollama model #10447

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)


### Change all `image=[]` to `image = None`

Changing `image=[]` to `images=None` avoids Python’s mutable default
parameter issue.
If you keep `images=[]`, all calls share the same list, so modifying it
(e.g., images.append()) will affect later calls.
Using images=None and creating a new list inside the function ensures
each call is independent.
This change does not affect current behavior — it simply makes the code
safer and more predictable.


把 `images=[]` 改成 `images=None` 是为了避免 Python 默认参数的可变对象问题。
如果保留 `images=[]`,所有调用都会共用同一个列表,一旦修改就会影响后续调用。
改成 None 并在函数内部重新创建列表,可以确保每次调用都是独立的。
这个修改不会影响现有运行结果,只是让代码更安全、更可控。
2025-10-22 12:24:12 +08:00
aaa4776657 Feat: Qwen-VL series supports video parsing (#10676)
### What problem does this PR solve?

Qwen-VL series supports video parsing. #10617.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-21 09:36:13 +08:00
5b2e5dd334 Feat: Gemini supports video parsing (#10671)
### What problem does this PR solve?

Gemini supports video parsing.


![img_v3_02r8_adbd5adc-d665-4756-9a00-3ae0f12224fg](https://github.com/user-attachments/assets/30d8d296-c336-4b55-9823-803979e705ca)


![img_v3_02r8_ab60c046-1727-4029-ad2e-66097fd3ccbg](https://github.com/user-attachments/assets/441b1487-a970-427e-98b6-6e1e002f2bad)

Close: #10617

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-20 16:49:47 +08:00
b15643bd80 Feat:VolcEngine Model type add IMAGE2TEXT (#10629)
### What problem does this PR solve?
issue:
[#9004](https://github.com/infiniflow/ragflow/issues/9004)
change:
VolcEngine Model type add IMAGE2TEXT

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-17 11:43:22 +08:00
4e86ee4ff9 Feat: Support Specifying OpenRouter Model Provider (#10550)
### What problem does this PR solve?
issue:
[#5787](https://github.com/infiniflow/ragflow/issues/5787)
change:
Support Specifying OpenRouter Model Provider

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-16 09:39:59 +08:00
5037a28e4d Fix problem with Google Cloud models with reasoning (like gemini) - Additional fix to issue #10474 (#10502)
### What problem does this PR solve?

Issue #10474  -  Update to PR #10477 

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
2025-10-15 14:54:20 +08:00
9e73f799b2 Feat: add Zhipu GLM-ASR model (#10529)
### What problem does this PR solve?

Add Zhipu GLM-ASR model

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-14 09:32:45 +08:00
fee757eb41 Fix: Disable reasoning on Gemini 2.5 Flash by default (#10477)
### What problem does this PR solve?

Gemini 2.5 Flash Models use reasoning by default. There is currently no
way to disable this behaviour. This leads to very long response times (>
1min). The default behaviour should be, that reasoning is disabled and
configurable

issue #10474 

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
2025-10-11 10:22:51 +08:00
0283e4098f Fix #10408 (#10471)
### What problem does this PR solve?

Google Cloud model does not work correctly with gemini-2.5 models
Close #10408

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-10-10 19:18:24 +08:00
0d8791936e Feat: TOC retrieval (#10456)
### What problem does this PR solve?

#10436

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-10 17:07:55 +08:00
5d167cd772 feat: support qwq reasoning models with non-stream output (#10468)
### What problem does this PR solve?
issue:
[#6193](https://github.com/infiniflow/ragflow/issues/6193)
change:
support qwq reasoning models with non-stream output

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-10 16:38:04 +08:00
6ab4c1a6e9 Refactor: improve how NvidiaCV calculate res total token counts (#10455)
### What problem does this PR solve?
improve how NvidiaCV calculate res total token counts

### Type of change
- [x] Refactoring
2025-10-10 11:03:40 +08:00
1a47e136e3 Feat: Adds a new feature that enables the LLM to extract a structured table of contents (TOC) directly from plain text. (#10428)
### What problem does this PR solve?

**Adds a new feature that enables the LLM to extract a structured table
of contents (TOC) directly from plain text.**
_This implementation prioritizes efficiency over reasoning — the model
runs in a strictly deterministic mode (thinking disabled) to minimize
latency.
As a result, overall performance may be less optimal, but the extraction
speed and consistency are guaranteed._

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-10-09 13:47:31 +08:00
cbf04ee470 Feat: Use data pipeline to visualize the parsing configuration of the knowledge base (#10423)
### What problem does this PR solve?

#9869

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: jinhai <haijin.chn@gmail.com>
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: chanx <1243304602@qq.com>
Co-authored-by: balibabu <cike8899@users.noreply.github.com>
Co-authored-by: Lynn <lynn_inf@hotmail.com>
Co-authored-by: 纷繁下的无奈 <zhileihuang@126.com>
Co-authored-by: huangzl <huangzl@shinemo.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Co-authored-by: Wilmer <33392318@qq.com>
Co-authored-by: Adrian Weidig <adrianweidig@gmx.net>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yongteng Lei <yongtengrey@outlook.com>
Co-authored-by: Liu An <asiro@qq.com>
Co-authored-by: buua436 <66937541+buua436@users.noreply.github.com>
Co-authored-by: BadwomanCraZY <511528396@qq.com>
Co-authored-by: cucusenok <31804608+cucusenok@users.noreply.github.com>
Co-authored-by: Russell Valentine <russ@coldstonelabs.org>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Billy Bao <newyorkupperbay@gmail.com>
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
Co-authored-by: TensorNull <129579691+TensorNull@users.noreply.github.com>
Co-authored-by: TensorNull <tensor.null@gmail.com>
Co-authored-by: TeslaZY <TeslaZY@outlook.com>
Co-authored-by: Ajay <160579663+aybanda@users.noreply.github.com>
Co-authored-by: AB <aj@Ajays-MacBook-Air.local>
Co-authored-by: 天海蒼灆 <huangaoqin@tecpie.com>
Co-authored-by: He Wang <wanghechn@qq.com>
Co-authored-by: Atsushi Hatakeyama <atu729@icloud.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Mohamed Mathari <155896313+melmathari@users.noreply.github.com>
Co-authored-by: Mohamed Mathari <nocodeventure@Mac-mini-van-Mohamed.fritz.box>
Co-authored-by: Stephen Hu <stephenhu@seismic.com>
Co-authored-by: Shaun Zhang <zhangwfjh@users.noreply.github.com>
Co-authored-by: zhimeng123 <60221886+zhimeng123@users.noreply.github.com>
Co-authored-by: mxc <mxc@example.com>
Co-authored-by: Dominik Novotný <50611433+SgtMarmite@users.noreply.github.com>
Co-authored-by: EVGENY M <168018528+rjohny55@users.noreply.github.com>
Co-authored-by: mcoder6425 <mcoder64@gmail.com>
Co-authored-by: lemsn <lemsn@msn.com>
Co-authored-by: lemsn <lemsn@126.com>
Co-authored-by: Adrian Gora <47756404+adagora@users.noreply.github.com>
Co-authored-by: Womsxd <45663319+Womsxd@users.noreply.github.com>
Co-authored-by: FatMii <39074672+FatMii@users.noreply.github.com>
2025-10-09 12:36:19 +08:00
dfc5fa1f4d Feat: add DeerAPI support (#10303)
### Related issues
#10078 

### What problem does this PR solve?
Integrate DeerAPI provider.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update

Co-authored-by: DeerAPI <tensor.null@gmail.com>
2025-10-09 11:14:49 +08:00
4585edc20e Refactor: improve cv model logics (#10414)
1. improve how to get total token count

Improve how to get total token count

### Type of change
- [x] Refactoring
2025-10-09 09:47:36 +08:00
17757930a3 Feat: add support for international Dashscope service (#10356)
### What problem does this PR solve?

 Add support for international Dashscope service. #10340 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-29 14:49:45 +08:00
ef59c5bab9 FIX: Rename the CometEmbed and CometSeq2txt classes to CometAPIEmbed and CometAPISeq2txt, and correct supported_models.mdx. (#10298)
### What problem does this PR solve?

Rename the CometEmbed and CometSeq2txt classes to CometAPIEmbed and
CometAPISeq2txt, and correct supported_models.mdx.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-26 10:50:56 +08:00
daea357940 Fix: invalid COMPONENT_EXEC_TIMEOUT (#10278)
### What problem does this PR solve?

Fix invalid COMPONENT_EXEC_TIMEOUT. #10273

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-25 14:11:09 +08:00
193d93d820 Refactor: Improve the logic clean conf for ZhipuChat (#10274)
### What problem does this PR solve?
Improve the logic clean conf for ZhipuChat

### Type of change
- [x] Refactoring
2025-09-25 10:28:03 +08:00
a1f848bfe0 Fix:max_tokens must be at least 1, got -950, BadRequestError (#10252)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/10235

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2025-09-24 10:49:34 +08:00
38be53cf31 fix: prevent list index out of range in chat streaming (#10238)
### What problem does this PR solve?
issue:
[Bug]: ERROR: list index out of range #10188
change:
fix a potential list index out of range error in chat response parsing
by adding explicit checks for empty choices.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-23 19:59:39 +08:00
10cbbb76f8 revert gpt5 integration (#10228)
### What problem does this PR solve?

  Revert back to chat.completions.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [x] Other (please describe):
  Revert back to chat.completions.
2025-09-23 16:06:12 +08:00
1c84d1b562 Fix: azure OpenAI retry (#10213)
### What problem does this PR solve?

Currently, Azure OpenAI returns one minute Quota limit responses when
chat API is utilized. This change is needed in order to be able to
process almost any documents using models deployed in Azure Foundry.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-23 12:19:28 +08:00
4eb7659499 Fix bug: broken import from rag.prompts.prompts (#10217)
### What problem does this PR solve?

Fix broken imports

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: jinhai <haijin.chn@gmail.com>
2025-09-23 10:19:25 +08:00
da82566304 Fix: resolve hash collisions by switching to UUID &correct logic for always-true statements & Update GPT api integration & Support qianwen-deepresearch (#10208)
### What problem does this PR solve?

Fix: resolve hash collisions by switching to UUID &correct logic for
always-true statements, solved: #10165
Feat: Update GPT api integration, solved: #10204 
Feat: Support qianwen-deepresearch, solved: #10163 
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2025-09-23 09:34:30 +08:00
94dbd4aac9 Refactor: use the same implement for total token count from res (#10197)
### What problem does this PR solve?
use the same implement for total token count from res

### Type of change

- [x] Refactoring
2025-09-22 17:17:06 +08:00
70ce02faf4 Feat: add support for Anthropic third-party API (#10173)
### What problem does this PR solve?
issue:
[Bug]: anthropic model have not baseurl selecting,need add #8546
change:
This PR adds support for using Anthropic models through a third-party
API by allowing a custom base_url.
It ensures compatibility with both the official Anthropic endpoint and
external providers.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-19 19:06:14 +08:00
6c24ad7966 fix: correct rerank_model condition logic (#10174)
### What problem does this PR solve?

fix the rerank_model condition logic by correcting the np.isclose check.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-19 16:02:10 +08:00
4693c5382a Feat: migrate OpenAI-compatible chats to LiteLLM (#10148)
### What problem does this PR solve?

Migrate OpenAI-compatible chats to LiteLLM.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-18 17:16:59 +08:00
91b609447d Fix: embedding model failure in CometAPI (#10137)
### What problem does this PR solve?

Related PR:
Feat: add CometAPI to LLMFactory and update related mappings #10119 

Change:
Fixes the issue where the embedding model in CometAPI was not being
called correctly

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: TensorNull <tensor.null@gmail.com>
2025-09-18 14:49:47 +08:00
f12b9fdcd4 Feat: add CometAPI to LLMFactory and update related mappings (#10119)
### Related issues
#10078

### What problem does this PR solve?
Integrate CometAPI provider.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2025-09-18 09:51:29 +08:00
d353f7f7f8 Feat/parse audio (#10133)
### What problem does this PR solve?

Dataflow support audio.  And fix giteeAI's sequence2text model. 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2025-09-18 09:31:32 +08:00
e1d86cfee3 Feat: add TokenPony model provider (#9932)
### What problem does this PR solve?

Add TokenPony as a LLM provider

Co-authored-by: huangzl <huangzl@shinemo.com>
2025-09-11 17:25:31 +08:00
3d39b96c6f Fix: token num exceed (#10046)
### What problem does this PR solve?

fix text input exceed token num limit when using siliconflow's embedding
model BAAI/bge-large-zh-v1.5 and BAAI/bge-large-en-v1.5, truncate before
input.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-11 12:02:12 +08:00
1936ad82d2 Refactor:Improve BytesIO usage for GeminiCV (#10042)
### What problem does this PR solve?
Improve BytesIO usage for GeminiCV

### Type of change
- [x] Refactoring
2025-09-11 11:07:15 +08:00
127af4e45c Refactor:Improve BytesIO usage for image2base64 (#9997)
### What problem does this PR solve?

Improve BytesIO usage for image2base64

### Type of change

- [x] Refactoring
2025-09-10 15:55:33 +08:00
0d9c1f1c3c Feat: dataflow supports Spreadsheet and Word processor document (#9996)
### What problem does this PR solve?

Dataflow supports Spreadsheet and Word processor document

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-10 13:02:53 +08:00
936f27e9e5 Feat: add LongCat-Flash-Chat (#9973)
### What problem does this PR solve?

Add LongCat-Flash-Chat from Meituan, deepseek v3.1 from SiliconFlow,
kimi-k2-09-05-preview and kimi-k2-turbo-preview from Moonshot.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-08 19:00:52 +08:00
91d6fb8061 Fix miscalculated token count (#9776)
### What problem does this PR solve?

The total token was incorrectly accumulated when using the
OpenAI-API-Compatible api.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-09-05 19:17:21 +08:00
b58e882eaa Feat: add exponential back-off for Chat LiteLLM (#9880)
### What problem does this PR solve?

Add exponential back-off for Chat LiteLLM. #9858.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-03 13:31:43 +08:00