Commit Graph

61 Commits

Author SHA1 Message Date
0fa1a1469e Fix: avoid mixing different embedding models in document parsing (#8260)
### What problem does this PR solve?

Fix mixing different embedding models in document parsing.
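
As a hedged sketch of the kind of guard this implies (the helper and field names such as `kb.embd_id` are assumptions, not the merged code):

```
# Hypothetical guard: refuse to parse a document with an embedding model
# other than the one its knowledge base was built with.
def check_embedding_model(kb, requested_embd_id):
    if kb.embd_id and kb.embd_id != requested_embd_id:
        raise ValueError(
            f"Knowledge base is embedded with {kb.embd_id}; "
            f"refusing to mix in {requested_embd_id}."
        )
```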

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-16 13:40:12 +08:00
a1f06a4fdc Feat: Support tool calling in Generate component (#7572)
### What problem does this PR solve?

Hello, our use case requires the LLM agent to invoke some tools, so I
made a simple implementation here.

This PR does two things:

1. A simple plugin mechanism based on `pluginlib`:

This mechanism lives in the `plugin` directory. It will only load
plugins from `plugin/embedded_plugins` for now.

A sample plugin `bad_calculator.py` is placed in
`plugin/embedded_plugins/llm_tools`; it accepts two numbers `a` and `b`,
then gives a deliberately wrong result: `a + b + 100`.

In the future, it could load plugins from external locations with little
code change.

Plugins are divided into different types. The only plugin type supported
in this PR is `llm_tools`; such plugins must subclass the `LLMToolPlugin`
class in `plugin/llm_tool_plugin.py` (a sketch follows after this list).
More plugin types can be added in the future.

2. A tool selector in the `Generate` component:

Added a tool selector for choosing one or more tools for the LLM:


![image](https://github.com/user-attachments/assets/74a21fdf-9333-4175-991b-43df6524c5dc)

And with the `bad_calculator` tool, the `qwen-max` model produces this
result:


![image](https://github.com/user-attachments/assets/93aff9c4-8550-414a-90a2-1a15a5249d94)
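
For illustration, a minimal sketch of what the `bad_calculator` plugin
from item 1 might look like; the exact `LLMToolPlugin` interface (method
names, metadata schema) is assumed here, not quoted from the merged code:

```
from plugin.llm_tool_plugin import LLMToolPlugin  # interface assumed


class BadCalculatorPlugin(LLMToolPlugin):
    """Deliberately wrong adder used to make tool usage visible."""

    @classmethod
    def get_metadata(cls) -> dict:
        # Tool schema exposed to the LLM; field names are illustrative.
        return {
            "name": "bad_calculator",
            "description": "Calculate the sum of two numbers (badly).",
            "parameters": {
                "a": {"type": "number", "description": "First number"},
                "b": {"type": "number", "description": "Second number"},
            },
        }

    def invoke(self, a: float, b: float) -> str:
        # Off by 100 on purpose, so it is obvious when the model used the tool.
        return str(a + b + 100)
```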


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2025-05-16 16:32:19 +08:00
6bd7d572ec Perf: Increase database connection pool size (#7559)
### What problem does this PR solve?

1. The MySQL instance is configured with `max_connections=1000`, but our
connection pool was limited to `max_connections: 100`. This mismatch
caused connection-pool exhaustion during performance testing.

2. Increase `stale_timeout` to resolve #6548 (the sketch below shows the
relevant settings).
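
As a sketch, the relevant Peewee pool settings might look like this;
everything beyond `max_connections` and `stale_timeout` is an
illustrative assumption:

```
from playhouse.pool import PooledMySQLDatabase

DB = PooledMySQLDatabase(
    "rag_flow",
    host="mysql",            # placeholder connection details
    user="root",
    password="********",
    max_connections=900,     # was 100; MySQL itself allows 1000
    stale_timeout=600,       # recycle idle connections before the server drops them
)
```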

### Type of change

- [x] Performance Improvement
2025-05-09 17:52:03 +08:00
2dbcc0a1bf Fix: Tried to fix the fid mismatch under some cases (#7426)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/7407

Based on this context, there appear to be cases where some LLMs end up
with a mismatched fid (the wrong "@xxx" suffix gets attached). So when
the fid fails to fetch the LLM, we fall back to fetching it by name
alone (see the sketch below).
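
A sketch of that fallback; the `TenantLLMService.query` call and its
parameters are assumptions for illustration:

```
def fetch_llm(tenant_id, fid):
    # Try the full id first (usually "name@factory").
    llm = TenantLLMService.query(tenant_id=tenant_id, llm_name=fid)
    if not llm and "@" in fid:
        # Some LLMs carry a wrong "@factory" suffix; retry with the bare name.
        name = fid.rsplit("@", 1)[0]
        llm = TenantLLMService.query(tenant_id=tenant_id, llm_name=name)
    return llm
```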

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-30 14:55:21 +08:00
5d253e0a34 Fix: pymysql.err.InterfaceError: (0, '') during long-running streaming chat responses (#6548) (#7057)
### Related Issue:
https://github.com/infiniflow/ragflow/issues/6548

### Related PR:
https://github.com/infiniflow/ragflow/pull/6861


### Environment:
Commit version: 48730e00a8

### Bug Description:
Unexpected `pymysql.err.InterfaceError: (0, '')` when using Peewee +
PyMySQL + PooledMySQLDatabase after a long-running `chat_streamly`
operation.

This is a common issue with Peewee + PyMySQL + connection pooling: you
end up using a connection that was silently closed by the server, but
Peewee doesn't realize it's dead.

**I found that the error only occurs during longer streaming outputs**
and is unrelated to the database connection context, so it's likely
because:

- The prolonged streaming response caused the database connection to
time out

- The original database connection might have been disconnected by the
server during the streaming process

### Why This Happens
This error happens even when using `@DB.connection_context()` after the
stream is done. After investigation, I found it is caused by pooled MySQL
connections that appear to be open but are actually dead (expired due to
`wait_timeout`).

1. `@DB.connection_context()` (as a decorator or context manager) pulls
a connection from the pool.

2. If this connection was idle and expired on the MySQL server (e.g.,
due to `wait_timeout`), but not closed in Python, it will still be
considered “open” (`DB.is_closed() == False`).

3. The real error occurs only when a SQL command is executed (such as
`.get_or_none()`) and PyMySQL tries to send it to the server over a
broken socket.


### Changes Made:

1. I implemented manual connection checks before executing SQL:
```
    try:
        # Cheap liveness probe before running the real query.
        DB.execute_sql("SELECT 1")
    except Exception:
        # The pool handed back a connection MySQL already closed; rebuild it.
        print("Connection dead, reconnecting...")
        DB.close()
        DB.connect()
```
2. Delayed the token count update until after the streaming response is
completed to ensure the streaming output isn't interrupted by database
operations.
```
        total_tokens = 0
        for txt in chat_streamly(system, history, gen_conf):
            if isinstance(txt, int):
                # The generator yields the final token count as an int.
                total_tokens = txt
......
                break
......
        # Update usage only after the stream has completed, so a dead DB
        # connection can no longer interrupt the streaming output.
        if total_tokens > 0:
            if not TenantLLMService.increase_usage(self.tenant_id, self.llm_type, txt, self.llm_name):
                logging.error("LLMBundle.chat_streamly can't update token usage for {}/CHAT llm_name: {}, content: {}".format(self.tenant_id, self.llm_name, txt))
```
2025-04-16 19:15:35 +08:00
dc2c74b249 Feat: add primitive support for function calls (#6840)
### What problem does this PR solve?

This PR introduces **primitive support for function calls**, enabling
the system to handle basic function-call capabilities. However, this
feature is currently experimental and **not yet enabled for general
use**, as it is only supported by a subset of models, namely Qwen and
OpenAI models.
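
For reference, the OpenAI-style function-calling shape involved looks
roughly like this; the tool schema and model are placeholders, not
ragflow's actual wiring:

```
from openai import OpenAI

client = OpenAI()
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
# If the model chose to call a tool, its name and arguments arrive here.
print(resp.choices[0].message.tool_calls)
```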

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-04-08 16:09:03 +08:00
df3890827d Refa: change LLM chat output from full to delta (incremental) (#6534)
### What problem does this PR solve?

Change LLM chat output from full to delta (incremental)
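
A sketch of the difference, with illustrative internals: before, each
iteration yielded the whole answer accumulated so far; after, it yields
only the newly generated text:

```
def chat_streamly(self, system, history, gen_conf):
    answer = ""
    for chunk in self._stream(system, history, gen_conf):  # illustrative helper
        delta = chunk.text
        answer += delta
        yield delta  # previously: yield answer (the full text so far)
```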

### Type of change

- [x] Refactoring
2025-03-26 19:33:14 +08:00
bf483fdf02 Fix: describe parameter error. (#6519)
### What problem does this PR solve?
#6228

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 09:02:48 +08:00
85eb367ede Feat: add basic Langfuse support for LLM module (#6443)
### What problem does this PR solve?

#6155

Add basic Langfuse support for LLM module.

A trace example:

<img width="755" alt="image"
src="https://github.com/user-attachments/assets/25c1f852-5116-486c-a47f-6097187142ca"
/>
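
A minimal sketch of wiring a chat call to Langfuse (v2-style Python SDK);
the trace layout and the surrounding variable names are assumptions:

```
from langfuse import Langfuse

langfuse = Langfuse()  # reads LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST
trace = langfuse.trace(name="chat", user_id=tenant_id)        # tenant_id assumed
generation = trace.generation(model=llm_name, input=history)  # names assumed
answer = chat_model.chat(system, history, gen_conf)           # illustrative call
generation.end(output=answer)
```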


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-24 13:18:47 +08:00
5cf610af40 Feat: add vision LLM PDF parser (#6173)
### What problem does this PR solve?

Add vision LLM PDF parser
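
A hedged sketch of the idea: rasterize each page and let a vision model
transcribe it. The PyMuPDF calls are real; `describe_image` stands in
for whichever vision-LLM binding is configured:

```
import fitz  # PyMuPDF

def vision_parse(pdf_path, describe_image):
    texts = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            # Render the page to PNG bytes and have the vision LLM read it.
            png_bytes = page.get_pixmap(dpi=150).tobytes("png")
            texts.append(describe_image(png_bytes))
    return texts
```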

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-18 14:52:20 +08:00
2d4a60cae6 Fix: Reduce excessive IO operations by loading LLM factory configurations (#6047)

### What problem does this PR solve?

This PR fixes an issue where the application was repeatedly reading the
llm_factories.json file from disk in multiple places, which could lead
to "Too many open files" errors under high load conditions. The fix
centralizes the file reading operation in the settings.py module and
stores the data in a global variable that can be accessed by other
modules.
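
A sketch of that centralization: read the file once at import time in
`settings.py` and let other modules import the parsed result. The
variable name is an assumption:

```
import json
import os

# Read llm_factories.json exactly once, at module import.
_factory_file = os.path.join(os.path.dirname(__file__), "llm_factories.json")
with open(_factory_file, encoding="utf-8") as f:
    FACTORY_LLM_INFOS = json.load(f)

# Other modules do `from settings import FACTORY_LLM_INFOS`
# instead of re-opening the file on every call.
```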

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
2025-03-14 09:54:38 +08:00
4326873af6 refactor: no need to inherit in Python 3; clean up the code (#5659)
### What problem does this PR solve?

As title

### Type of change


- [x] Refactoring

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-05 18:03:53 +08:00
00c7ddbc9b Fix: The max tokens defined by the tenant are not used (#4297) (#2817) (#5066)
### What problem does this PR solve?

Fix: The max tokens defined by the tenant are not used (#4297) (#2817)


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-02-18 13:42:22 +08:00
849d9eb463 Ignore tenant not found error while increasing token usage. (#4950)
### What problem does this PR solve?

#4940

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-14 11:10:49 +08:00
588207d7c1 optimize TenantLLMService.increase_usage for "can't update token usage error" message (#4755)

### What problem does this PR solve?

Optimize `TenantLLMService.increase_usage` performance.

### Type of change

- [x] Performance Improvement

Co-authored-by: che_shuai <che_shuai@massclouds.com>
2025-02-07 12:16:17 +08:00
213218a094 Refactor ask decorator (#4116)
### What problem does this PR solve?

Refactor ask decorator

### Type of change

- [x] Refactoring

---------

Signed-off-by: jinhai <haijin.chn@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-12-19 18:13:33 +08:00
0d68a6cd1b Fix errors detected by Ruff (#3918)
### What problem does this PR solve?

Fix errors detected by Ruff

### Type of change

- [x] Refactoring
2024-12-08 14:21:12 +08:00
311a475b6f Fix: Fixed the issue that the agent list page failed to load #3827 (#3902)
### What problem does this PR solve?

Fix the issue where the agent list page failed to load (#3827).

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-06 17:05:40 +08:00
92ab7ef659 Refactor embedding batch_size (#3825)
### What problem does this PR solve?

Refactor embedding batch_size. Close #3657
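
For illustration, the batching pattern usually looks like this sketch;
the `encode` callable and the default batch size are assumptions:

```
def encode_in_batches(encode, texts, batch_size=16):
    # Embed texts in fixed-size slices so a single oversized request
    # cannot exceed the provider's per-call limits.
    vectors = []
    for i in range(0, len(texts), batch_size):
        vectors.extend(encode(texts[i:i + batch_size]))
    return vectors
```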

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2024-12-03 16:22:39 +08:00
74b28ef1b0 Add pagerank to KB. (#3809)
### What problem does this PR solve?

#3794

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-12-03 14:30:35 +08:00
7543047de3 Fix @ in model name issue. (#3821)
### What problem does this PR solve?

#3814

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-03 12:41:39 +08:00
c5f13629af Set Log level by env (#3798)
### What problem does this PR solve?

Set the log level via an environment variable.
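
A minimal sketch of the pattern, with an assumed variable name:

```
import logging
import os

# Pick the level from the environment, defaulting to INFO.
level_name = os.environ.get("LOG_LEVELS", "INFO").upper()  # env var name assumed
logging.basicConfig(level=getattr(logging, level_name, logging.INFO))
```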

### Type of change

- [x] Refactoring
2024-12-02 17:24:39 +08:00
381219aa41 Fixed increase_usage for builtin models (#3748)
### What problem does this PR solve?

Fixed increase_usage for builtin models. Close #1803

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-29 17:02:49 +08:00
30f6421760 Use consistent log file names, introduced initLogger (#3403)
### What problem does this PR solve?

Use consistent log file names, introduced initLogger
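
A hedged sketch of an `initLogger`-style helper writing one consistently
named file per service; the paths, rotation policy, and format are
assumptions:

```
import logging
import os
from logging.handlers import RotatingFileHandler

def init_logger(service_name, log_dir="logs"):
    os.makedirs(log_dir, exist_ok=True)
    handler = RotatingFileHandler(
        os.path.join(log_dir, f"{service_name}.log"),  # one file per service
        maxBytes=10 * 1024 * 1024,
        backupCount=5,
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s"))
    logging.getLogger().addHandler(handler)
```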

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2024-11-14 17:13:48 +08:00
a2a5631da4 Rework logging (#3358)
### What problem does this PR solve?

Unified all log files into one.

### Type of change

- [x] Refactoring
2024-11-12 17:35:13 +08:00
b2524eec49 fix sequence2txt error and total token usage issue (#2961)
### What problem does this PR solve?

#1363

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-10-22 11:38:37 +08:00
ac26d09a59 Feature/feat1017 (#2872)
### What problem does this PR solve?

1. fix: mind map display error in knowledge graph, caused by an
```@antv/g6``` version change
2. feat: concurrent-threads configuration support in graph extractor
3. fix: used-tokens update failed for tenant
4. feat: timeout configuration support for LLM
5. fix: regex error in graph extractor
6. feat: qwen rerank (```gte-rerank```) support
7. fix: timeout handling in the knowledge graph indexing process; chat
now streams its output, and this is configurable
8. feat: ```qwen-long``` model configuration

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: chongchuanbing <chongchuanbing@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-10-21 12:11:08 +08:00
a3ab5ba9ac support sequence2txt and tts model in Xinference (#2696)
### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-10-08 10:43:18 +08:00
34abcf7704 style: fix typo and format code (#2618)
### What problem does this PR solve?

- Fix typo
- Remove unused import
- Format code

### Type of change

- [x] Other (please describe): typo and format
2024-09-27 13:17:25 +08:00
01acc3fd5a fix duplicated llm name between different suppliers (#2477)
### What problem does this PR solve?

#2465

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-09-18 16:09:22 +08:00
333608a1d4 add search TAB backend api (#2375)
### What problem does this PR solve?
 #2247

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-09-11 19:49:18 +08:00
ad09d4bb24 fix tts interface error (#2197)
### What problem does this PR solve?

fix tts interface error

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-09-02 18:40:57 +08:00
6b7c028578 add support for TTS model (#2095)
### What problem does this PR solve?

add support for TTS model
#1853

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-08-26 15:19:43 +08:00
6b3a40be5c Format files from Windows/DOS to Unix (#1949)
### What problem does this PR solve?

The related source files were in Windows/DOS (CRLF) format; they have
been converted to Unix (LF) format.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2024-08-15 09:17:36 +08:00
ac7a0d4fbf Add ParserType Audio (#1637)
### What problem does this PR solve?

#1514 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-07-22 19:17:30 +08:00
58df013722 Chat: use CV model (#1607)
### What problem does this PR solve?

#1230 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-07-19 18:36:34 +08:00
5d2f7136dd fix chunk modification bug (#1011)
### What problem does this PR solve?

As title.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-31 15:45:11 +08:00
758eb03ccb fix Jina adding issue and term weight refinement (#974)
### What problem does this PR solve?

#724 #162

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2024-05-29 19:38:57 +08:00
614defec21 add rerank model (#969)
### What problem does this PR solve?

feat: add rerank models to the project #724 #162

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-29 16:50:02 +08:00
6f99bbbb08 add raptor (#899)
### What problem does this PR solve?

#882 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-23 14:31:16 +08:00
e73ce39b66 Add 2 embedding models from OpenAI (#812)
### What problem does this PR solve?

#810 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-17 08:51:29 +08:00
95f809187e add stream chat (#811)
### What problem does this PR solve?

#709 
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-16 20:14:53 +08:00
aa1c915d6e support gpt-4o (#773)
### What problem does this PR solve?
#771 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-15 11:16:08 +08:00
8d6d7f6887 fix task losing issue (#665)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-07 20:46:45 +08:00
66f8d35632 Refactor (#537)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2024-04-25 14:14:28 +08:00
03f8b01b3b fix bug for fastembed (#392)
### What problem does this PR solve?

Issue link: #325

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-16 19:12:12 +08:00
890561703b Add bce-embedding and fastembed (#383)
### What problem does this PR solve?


Issue link: #326

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-04-16 16:42:19 +08:00
e20207101a fix wrong log printing (#330)
### What problem does this PR solve?

Issue link: #325

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-12 09:08:08 +08:00
e876f58b4c refine readme (#170) 2024-03-29 14:38:15 +08:00
38e5737067 add base url for OpenAI (#166) 2024-03-28 19:15:16 +08:00