350 Commits

Author SHA1 Message Date
a4be6c50cf [BREAKING CHANGE] GET to POST: enhance document list capability (#7349)
### What problem does this PR solve?

Enhance capability of `list_docs`.

Breaking change: change method from `GET` to `POST`.

### Type of change

- [x] Refactoring
- [x] Enhancement with breaking change
2025-04-27 16:48:27 +08:00
eead838353 Fix pymysql interface error (#7295)
### What problem does this PR solve?

Following [Rucongzhang](https://github.com/Rucongzhang)'s
[comment](https://github.com/infiniflow/ragflow/pull/7057#issuecomment-2827410047),
I added a DB reconnection strategy to the `update_by_id` function.
2025-04-25 13:29:47 +08:00
2c62652ea8 <think> tag is missing. (#7256)
### What problem does this PR solve?

Some models force thinking, so the `<think>` tag is absent from the
returned content.
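
One possible way to handle such replies, as a hypothetical sketch rather than the change made in this PR: if the content contains a closing `</think>` but no opening `<think>`, prepend the opening tag. The helper name `ensure_think_tag` is an assumption for illustration.

```python
def ensure_think_tag(content: str) -> str:
    """Prepend "<think>" when a reply has "</think>" but no opening tag."""
    if "</think>" in content and "<think>" not in content:
        return "<think>" + content
    return content
```

Replies that already carry both tags, or neither, pass through unchanged.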

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-24 11:44:10 +08:00
f35ff65c36 [BREAKING CHANGE] GET to POST: enhance kb list capability (#7205)
### What problem does this PR solve?

Enhance capability of `list_kbs`.

Breaking change: change method from `GET` to `POST`.

### Type of change

- [x] Refactoring
- [x] Enhancement with breaking change
2025-04-22 17:54:12 +08:00
5d253e0a34 Fix: pymysql.err.InterfaceError: (0, '') during long time streaming chat responses (#6548) (#7057)
### Related Issue:
https://github.com/infiniflow/ragflow/issues/6548

### Related PR:
https://github.com/infiniflow/ragflow/pull/6861


### Environment:
Commit version: [48730e0](48730e00a8)

### Bug Description:
An unexpected `pymysql.err.InterfaceError: (0, '')` is raised when using
Peewee + PyMySQL + PooledMySQLDatabase after a long-running
`chat_streamly` operation.

This is a common issue with Peewee + PyMySQL + connection pooling: you
end up using a connection that was silently closed by the server, but
Peewee doesn't realize it's dead.

**I found that the error only occurs during longer streaming outputs**
and is unrelated to the database connection context, so it's likely
because:

- The prolonged streaming response caused the database connection to
time out

- The original database connection might have been disconnected by the
server during the streaming process

### Why This Happens
This error occurs even when `@DB.connection_context()` is used after the
stream is done. After investigation, I found it is caused by pooled
MySQL connections that appear to be open but are actually dead (expired
due to `wait_timeout`).

1. `@DB.connection_context()` (as a decorator or context manager) pulls
a connection from the pool.

2. If this connection was idle and expired on the MySQL server (e.g.,
due to `wait_timeout`), but not closed in Python, it will still be
considered “open” (`DB.is_closed() == False`).

3. The real error occurs only when I execute a SQL command (such as
`.get_or_none()`) and PyMySQL tries to send it to the server over a
broken socket.


### Changes Made:

1. I implemented manual connection checks before executing SQL:
```
    try:
        DB.execute_sql("SELECT 1")
    except Exception:
        print("Connection dead, reconnecting...")
        DB.close()
        DB.connect()
```
2. Delayed the token count update until after the streaming response is
completed to ensure the streaming output isn't interrupted by database
operations.
```
        total_tokens = 0 
        for txt in chat_streamly(system, history, gen_conf):
            if isinstance(txt, int):
                total_tokens = txt
......
                break
......
        if total_tokens > 0:
            if not TenantLLMService.increase_usage(self.tenant_id, self.llm_type, txt, self.llm_name):
                logging.error("LLMBundle.chat_streamly can't update token usage for {}/CHAT llm_name: {}, content: {}".format(self.tenant_id, self.llm_name, txt))
```
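
The connection check from step 1 could be wrapped in a reusable helper. This is a minimal sketch, not the code merged in this PR: the name `ensure_connection` is hypothetical, and `db` is only assumed to expose `execute_sql()`, `close()` and `connect()`, as a Peewee database object does.

```python
import logging


def ensure_connection(db) -> bool:
    """Ping the database and reopen the connection if the ping fails.

    Returns True if a reconnect happened, False if the connection was alive.
    """
    try:
        db.execute_sql("SELECT 1")
        return False
    except Exception:
        logging.warning("Connection dead, reconnecting...")
        try:
            db.close()
        except Exception:
            pass  # closing a broken socket may itself fail; ignore it
        db.connect()
        return True
```

Calling such a helper right before the token usage update, after the stream has finished, keeps the check close to the first query that would otherwise hit the dead socket.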
2025-04-16 19:15:35 +08:00
5af2d57086 Refa. (#7022)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2025-04-15 10:20:33 +08:00
7a34159737 Fix: add fallback for bad citation output (#7014)
### What problem does this PR solve?

Add fallback for bad citation output. #6948

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-15 09:33:53 +08:00
98670c3755 Fix: KB update_time changed whenever system relaunched (#6959)
### What problem does this PR solve?

Fix KB update_time changed whenever system relaunched. #6953 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-11 20:10:49 +08:00
dc2c74b249 Feat: add primitive support for function calls (#6840)
### What problem does this PR solve?

This PR introduces **primitive support for function calls**,
enabling the system to handle basic function call capabilities.
However, this feature is currently experimental and **not yet enabled
for general use**, as it is only supported by a subset of models,
namely, Qwen and OpenAI models.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-04-08 16:09:03 +08:00
a20439bf81 fix: add exception handling for get_by_id method (#6861)
### What problem does this PR solve?

Fixes #6548 

Add exception handling to prevent exceptions from propagating back to
the web, which may lead to failure in displaying conversation content.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: cm <caiming@sict.ac.cn>
2025-04-08 16:06:57 +08:00
cded812b97 Feat: add OpenAI compatible API for agent (#6329)
### What problem does this PR solve?
Add an OpenAI-compatible API for agents.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-04-03 16:51:37 +08:00
9ecc78feeb Refa: copywriting refinement. (#6779)
### What problem does this PR solve?

Close #6762

### Type of change

- [x] Refactoring
2025-04-03 11:38:02 +08:00
d4a3e9a7cc Fix table migration on non-exist-yet indexed columns. (#6666)
### What problem does this PR solve?

Fix #6334

I ran into the same problem as #6334. In `api/db/db_models.py`,
`init_database_tables` calls `obj.create_table()` unconditionally,
before `migrate_db()` runs. The `permission` field of the `user_canvas`
table is declared with `index=True`, so `peewee` issues SQL to create
the index even though the column does not exist yet (the `user_canvas`
table itself already exists), which raises
`psycopg2.errors.UndefinedColumn: column "permission" does not exist`.

I've added a check so that `create_table()` is called only when the
table does not exist; otherwise the migration is delegated to
`migrate_db()`.

This exposed a second problem: `migrate_db()` actually did nothing,
because it failed on the first migration. `playhouse` issues DDL
statements blindly, without guards such as `IF NOT EXISTS`, so the
statement fails, and even though the exception is swallowed with `pass`,
the transaction is still rolled back. I removed the transaction in
`migrate_db()` to make it work.
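
The two fixes can be sketched together as follows. This is an illustration under assumptions, not the merged code: `models` are only assumed to expose `table_exists()` and `create_table()` as Peewee model classes do, and `migrations` is a hypothetical list of zero-argument callables standing in for the individual `playhouse` migration steps.

```python
def init_database_tables(models, migrations):
    """Create only the missing tables, then run each migration on its own.

    Returns the list of migration steps that raised an exception.
    """
    for model in models:
        # Existing tables are left to the migration steps below.
        if not model.table_exists():
            model.create_table()

    failed = []
    for step in migrations:
        # No enclosing transaction: a failing step (for example, a column
        # that already exists) must not undo the steps that succeeded.
        try:
            step()
        except Exception:
            failed.append(step)
    return failed
```

Running the steps independently means an already-applied migration fails harmlessly while later ones still take effect.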

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-03-31 11:27:20 +08:00
65a8cd1772 Fix knowledge_graph_kwd on infinity. Close #6476 and #6624 (#6651)
### What problem does this PR solve?

Fix knowledge_graph_kwd on infinity. Close #6476 and #6624

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-28 22:05:40 +08:00
ecc9605a32 Fix: team doc deletion issue. (#6589)
### What problem does this PR solve?

#6557

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-27 13:26:38 +08:00
df3890827d Refa: change LLM chat output from full to delta (incremental) (#6534)
### What problem does this PR solve?

Change LLM chat output from full to delta (incremental)
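
To illustrate the difference between the two output styles, here is a small sketch that converts a cumulative ("full") stream into increments ("delta"). It is not the code from this PR; the function name `to_deltas` is an assumption.

```python
from typing import Iterable, Iterator


def to_deltas(full_outputs: Iterable[str]) -> Iterator[str]:
    """Turn cumulative outputs ("He", "Hello") into increments ("He", "llo")."""
    previous = ""
    for full in full_outputs:
        if full.startswith(previous):
            delta = full[len(previous):]
        else:
            # The new text does not extend the previous one; emit it whole.
            delta = full
        previous = full
        if delta:
            yield delta
```

A consumer of delta output concatenates the pieces to rebuild the full answer, instead of replacing its buffer on every chunk.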

### Type of change

- [x] Refactoring
2025-03-26 19:33:14 +08:00
735d9dd949 Feat: add "tools" to llm_factories.json (#6552)
### What problem does this PR solve?



### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Chenzy <chenzy901@gmail.com>
2025-03-26 17:31:18 +08:00
bf483fdf02 Fix: describe parameter error. (#6519)
### What problem does this PR solve?
#6228

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 09:02:48 +08:00
814a210f5d Fix: failed to acquire lock exception with retry mechanism for postgres and mysql (#6483)
Added the with_retry decorator in db_models.py to add a retry mechanism
for database operations. Applied the retry mechanism to the lock and
unlock methods of the PostgresDatabaseLock and MysqlDatabaseLock classes
to enhance the reliability of lock operations.
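
A retry decorator of this kind might look like the sketch below. Only the name `with_retry` comes from the description above; the parameters `max_retries` and `retry_delay` and the logging are assumptions, and the actual implementation in `db_models.py` may differ.

```python
import functools
import logging
import time


def with_retry(max_retries=3, retry_delay=1.0):
    """Retry the wrapped function, sleeping between failed attempts."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    if attempt == max_retries:
                        raise  # out of attempts: surface the last error
                    logging.warning("%s failed (attempt %d/%d): %s",
                                    func.__name__, attempt, max_retries, exc)
                    time.sleep(retry_delay)
        return wrapper
    return decorator
```

Applied to a lock method, a transient failure is retried a few times before the exception reaches the caller.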

### What problem does this PR solve?
Resolves the "failed to acquire lock" exception by adding a retry mechanism.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
2025-03-25 15:09:56 +08:00
5e0a77df2b Feat: add Langfuse APIs (#6460)
### What problem does this PR solve?

Add Langfuse APIs

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-24 18:25:43 +08:00
66e557b6c0 Fix: Langfuse update model has no fields attribute (#6453)
### What problem does this PR solve?

Langfuse update model has no fields attribute

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-24 15:37:14 +08:00
200b6f55c6 Fix: NameError: free variable 'langfuse_generation' referenced before assignment in enclosing scope (#6451)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: lizheng@ssc-hn.com <lizheng@ssc-hn.com>
2025-03-24 15:14:36 +08:00
85eb367ede Feat: add basic Langfuse support for LLM module (#6443)
### What problem does this PR solve?

#6155

Add basic Langfuse support for LLM module.

A trace example:

<img width="755" alt="image"
src="https://github.com/user-attachments/assets/25c1f852-5116-486c-a47f-6097187142ca"
/>


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-24 13:18:47 +08:00
7f80d7304d Fix: Optimized the get_by_id method to resolve the issue of missing exceptions and improve query performance (#6320)
### What problem does this PR solve?

Optimized the get_by_id method to resolve the issue of missing
exceptions and improve query performance.
Optimization details:
1. The original method used a custom query method that required
concatenating SQL, which impacted performance.
2. The query method returned a list that had to be accessed by index,
risking index out-of-bounds errors.
3. The original method used `except Exception` to catch all errors, which
is not good practice in Python and can hide unexpected exceptions. The
`get_or_none` method handles `DoesNotExist` errors precisely while
letting other errors propagate normally.
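
Point 3 can be illustrated with a self-contained sketch. The `DoesNotExist` class here is a stand-in for the ORM's "row not found" error, `fetch` is a hypothetical lookup callable, and the `(found, object)` return shape is an assumption; none of this is the PR's actual code.

```python
class DoesNotExist(Exception):
    """Stand-in for the ORM's "row not found" error."""


def get_by_id_broad(fetch, pk):
    # Old style: every failure, including real bugs, looks like "not found".
    try:
        return True, fetch(pk)
    except Exception:
        return False, None


def get_by_id_narrow(fetch, pk):
    # New style: only a missing row maps to (False, None);
    # any other error propagates to the caller.
    try:
        return True, fetch(pk)
    except DoesNotExist:
        return False, None
```

With the narrow version, a dropped database connection surfaces as an error instead of being reported as a missing record.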

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement
2025-03-20 23:23:48 +08:00
344727f9ba Feat: add agent share team viewer (#6222)
### What problem does this PR solve?
Allow team members to view agents.
#  Canvas editor

![image](https://github.com/user-attachments/assets/042af36d-5fd1-43e2-acf7-05869220a1c1)
# List agent

![image](https://github.com/user-attachments/assets/8b9c7376-780b-47ff-8f5c-6c0e7358158d)
# Setting 

![image](https://github.com/user-attachments/assets/6cb7d12a-7a66-4dd7-9acc-5b53ff79a10a)
 
### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-19 19:04:13 +08:00
53ac27c3ff Feat: support agent version history. (#6130)
### What problem does this PR solve?
Save agent version history.
- Allows users to view and download agent files from the version
history

![image](https://github.com/user-attachments/assets/c300375d-8b97-4230-9fc4-83d148137132)

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-19 15:22:53 +08:00
e689532e6e Fix: long api key issue. (#6267)
### What problem does this PR solve?

#6248

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 13:30:40 +08:00
5cf610af40 Feat: add vision LLM PDF parser (#6173)
### What problem does this PR solve?

Add vision LLM PDF parser

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-18 14:52:20 +08:00
89a69eed72 Introduced task priority (#6118)
### What problem does this PR solve?

Introduced task priority

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-14 23:43:46 +08:00
d7774cf049 Fix: fix document concurrent upload issue (#6095)
### What problem does this PR solve?

Resolve document concurrent upload issue. #6039 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-14 16:31:44 +08:00
b77e844fc3 Fix: none parse_config updating. (#6092)
### What problem does this PR solve?

#6081

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-14 16:06:16 +08:00
42eb99554f Feat: add token consumption & speed to little lamp. (#6077)
### What problem does this PR solve?

#6059

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-14 13:37:31 +08:00
7463241896 Fix: empty doc id validation. (#6064)
### What problem does this PR solve?

#6031

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-14 11:45:44 +08:00
2d4a60cae6 Fix: Reduce excessive IO operations by loading LLM factory configurations (#6047)
### What problem does this PR solve?

This PR fixes an issue where the application was repeatedly reading the
llm_factories.json file from disk in multiple places, which could lead
to "Too many open files" errors under high load conditions. The fix
centralizes the file reading operation in the settings.py module and
stores the data in a global variable that can be accessed by other
modules.
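
The load-once pattern described above can be sketched as follows. This is an assumption-laden illustration, not the code in `settings.py`: the names `_FACTORY_CONFIG` and `get_factory_config` are hypothetical.

```python
import json

# Module-level cache: populated on first use, shared by all importers.
_FACTORY_CONFIG = None


def get_factory_config(path):
    """Read and parse the JSON file once; later calls reuse the parsed data."""
    global _FACTORY_CONFIG
    if _FACTORY_CONFIG is None:
        with open(path, encoding="utf-8") as f:
            _FACTORY_CONFIG = json.load(f)
    return _FACTORY_CONFIG
```

Because the file is opened only on the first call, repeated lookups under load no longer consume file descriptors.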

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
2025-03-14 09:54:38 +08:00
47926f7d21 Improve API Documentation, Standardize Error Handling, and Enhance Comments (#5990)
### What problem does this PR solve?  
- The API documentation lacks detailed error code explanations. Added
error code tables to `python_api_reference.md` and
`http_api_reference.md` to clarify possible error codes and their
meanings.
- Error handling in the codebase is inconsistent. Standardized error
handling logic in `sdk/python/ragflow_sdk/modules/chunk.py`.
- Improved API comments by adding standardized docstrings to enhance
code readability and maintainability.

### Type of change  
- [x] Documentation Update  
- [x] Refactoring
2025-03-13 19:06:50 +08:00
3c43a7aee8 For an Agent with an Input Begin value, on the first call the return … (#5957)
For an Agent with an Input Begin value, the session_id returned on the
first call does not exist in the session.

### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-12 17:01:44 +08:00
e3ea4b7ec2 Fix: Add Knowledge Base Document Parsing Status Check (#5966)
When creating and updating chats, add a check for the parsing status of
knowledge base documents. Ensure that all documents have been parsed
before allowing chat creation to improve user experience and system
stability.

**Main Changes:**

- Add document parsing status check logic in `chat.py`.
- Implement the `is_parsed_done` method in `knowledgebase_service.py`.
- Prevent chat creation when documents are being parsed or parsing has
failed.
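
A check along these lines might look like the sketch below. Only the name `is_parsed_done` comes from the list above; the document shape (dicts with `name` and `status` keys) and the status values are assumptions for illustration.

```python
def is_parsed_done(documents):
    """Return (ok, message); ok is False if any document is still being
    parsed or has failed parsing.

    Each document is assumed to be a dict with hypothetical "name" and
    "status" keys, where status is "done", "running" or "failed".
    """
    for doc in documents:
        if doc["status"] == "running":
            return False, f"Document '{doc['name']}' is still being parsed."
        if doc["status"] == "failed":
            return False, f"Document '{doc['name']}' failed to parse."
    return True, None
```

The chat creation and update handlers can then reject the request with the returned message when `ok` is False.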

### What problem does this PR solve?

Fixes https://github.com/infiniflow/ragflow/issues/5960

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
2025-03-12 16:07:45 +08:00
caecaa7562 Feat: apply LLM to optimize citations. (#5935)
### What problem does this PR solve?

#5905

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-11 19:56:21 +08:00
90d18143ba Refa: add prompt to empty retrieved answer. (#5892)
### What problem does this PR solve?

#5883

### Type of change

- [x] Refactoring
2025-03-11 13:11:14 +08:00
8e965040ce Fix: rm <think> for ES sql generation. (#5881)
### What problem does this PR solve?

#5850

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-11 10:41:19 +08:00
b1a46d5adc Fix:when start with source code not in docker env report 'UnicodeDec… (#5802)
### What problem does this PR solve?

Fix: on Windows, starting from source code outside the Docker
environment reports `UnicodeDecodeError: 'gbk' codec can't decode byte
0xad in position 5: illegal multibyte sequence`.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: tangyu <1@1.com>
2025-03-10 11:22:06 +08:00
66938e0b68 Feat(api): Add dsl parameters to control whether dsl fields are included (#5769)
1. **Issue**: When calling `list_agent_session` via the HTTP API, users
may only need the conversation messages and not the associated DSL,
which can be very large. This PR adds a control option that determines
whether the DSL is returned; by default it is returned.

2. **Documentation Discrepancy**: In the HTTP API documentation, under
"List agent sessions," the "Response" section states that the "data"
field is a dictionary when "success" is returned. However, the actual
returned data is a list. This discrepancy has been corrected.
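
The effect of such an option on the response can be sketched as below. The function name `format_sessions` and the `include_dsl` flag are hypothetical; the sketch only assumes that each session is a dict with an optional `dsl` key.

```python
def format_sessions(sessions, include_dsl=True):
    """Return copies of the session dicts, dropping "dsl" unless requested."""
    if include_dsl:
        return [dict(s) for s in sessions]
    return [{k: v for k, v in s.items() if k != "dsl"} for s in sessions]
```

Keeping the default at `True` preserves the previous behavior for existing callers.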
2025-03-07 16:58:00 +08:00
0e3e129a83 Fix: Resolve inconsistency in APIToken dialog_id field definition (#5749)
The `dialog_id` field was inconsistently defined:
- In the `migrate_db()` function, it was set to `null=True`.
- In the model class, it was defined as `null=False`.

This inconsistency caused an issue during the initial deployment where
the database table did not allow `dialog_id` to be null. As a result,
calling `APITokenService.save(**obj)` in `system_app.py` raised the
following error:

```
peewee.IntegrityError: null value in column "dialog_id" violates not-null constraint
```

### What problem does this PR solve?

Error: peewee.IntegrityError: null value in column "dialog_id" violates
not-null constraint

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-07 13:26:08 +08:00
9fc7174612 Fix: too long context during KG issue. (#5723)
### What problem does this PR solve?

#5088

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-06 19:21:07 +08:00
4326873af6 refactor: no need to inherit in python3 clean the code (#5659)
### What problem does this PR solve?

As title

### Type of change


- [x] Refactoring

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-05 18:03:53 +08:00
ec68ab1c8c Fix: search citation issue. (#5657)
### What problem does this PR solve?
#5649
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-05 17:25:47 +08:00
f65c3ae62b Refactored DocumentService.update_progress (#5642)
### What problem does this PR solve?

Refactored DocumentService.update_progress

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-05 14:48:03 +08:00
f6dd2cd1af Fix: fix may lose part of information of last stream chunk (#5584)
### What problem does this PR solve?

Fix possible loss of part of the information in the last stream chunk.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-04 11:58:10 +08:00
c813c1ff4c Made task_executor async to speedup parsing (#5530)
### What problem does this PR solve?

Made task_executor async to speed up parsing

### Type of change

- [x] Performance Improvement
2025-03-03 18:59:49 +08:00
7a81fa00e9 Optimize prompt. (#5541)
### What problem does this PR solve?

#5526

### Type of change

- [x] Performance Improvement
2025-03-03 13:12:38 +08:00