### What problem does this PR solve?
Fix: Optimize Prompts and Regex for use_sql() #11127
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
```
admin> show version;
show_version
+-----------------------+
| version |
+-----------------------+
| v0.21.0-241-gc6cf58d5 |
+-----------------------+
admin> \q
Goodbye!
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. Move EMBEDDING_CFG to common.globals
2. Fix error imports
3. Move signal handles to common/signal_utils.py
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. Rename identifier name
2. Fix some return statement
3. Fix some typos
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: Create dataset performance unmatched between HTTP api and web ui
#10925
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: parsing hyperlinks in docx and pdf #10848
Fix: default parser config of toc extraction
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add get_uuid, download_img and hash_str2int into misc_utils.py
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fix: the input length exceeds the context length #10750
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Add time utilities and unit tests
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
- rename rmSpace to remove_redundant_spaces
- move clean_markdown_block to common module
- add unit tests for remove_redundant_spaces and clean_markdown_block
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Introduced gpu profile in .env
Added Dockerfile_tei
fix datrie
Removed LIGHTEN flag
### Type of change
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Fix: can't upload image in ollama model #10447
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### Change all `image=[]` to `image = None`
Changing `image=[]` to `images=None` avoids Python’s mutable default
parameter issue.
If you keep `images=[]`, all calls share the same list, so modifying it
(e.g., images.append()) will affect later calls.
Using images=None and creating a new list inside the function ensures
each call is independent.
This change does not affect current behavior — it simply makes the code
safer and more predictable.
把 `images=[]` 改成 `images=None` 是为了避免 Python 默认参数的可变对象问题。
如果保留 `images=[]`,所有调用都会共用同一个列表,一旦修改就会影响后续调用。
改成 None 并在函数内部重新创建列表,可以确保每次调用都是独立的。
这个修改不会影响现有运行结果,只是让代码更安全、更可控。
### What problem does this PR solve?
Feat: Support attribute filtering #8703
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Co-authored-by: writinwaters <cai.keith@gmail.com>
### What problem does this PR solve?
File: Now parsing support all types of embedded documents, solved #10059
Fix: Incomplete words in chat #10530
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## What problem does this PR solve?
Fixes the PostgreSQL connection error that prevents RAGFlow from
starting:
peewee.ProgrammingError: invalid dsn: invalid connection option
"max_retries"
## Problem Analysis
The `BaseDataBase` class in `api/db/db_models.py` adds `max_retries` and
`retry_delay` to the database configuration dict before passing it to
the database connection constructor.
- **MySQL**: Has `RetryingPooledMySQLDatabase` class that properly
extracts these custom parameters using `kwargs.pop()` before calling the
parent constructor
- **PostgreSQL**: Was using the base `PooledPostgresqlDatabase` class
which passes all parameters directly to `psycopg2.connect()`, which
doesn't recognize `max_retries` as a valid connection option
## Solution
Created `RetryingPooledPostgresqlDatabase` class that:
- Extracts `max_retries` and `retry_delay` parameters before
initialization
- Implements retry logic with exponential backoff for connection
failures
- Handles PostgreSQL-specific connection errors (connection refused,
server closed, etc.)
- Mirrors the existing `RetryingPooledMySQLDatabase` implementation
Updated the `PooledDatabase` enum to use the new retrying class for
PostgreSQL.
## Benefits
✅ Prevents invalid connection parameters from being passed to psycopg2
✅ Adds automatic retry logic for PostgreSQL connection failures
✅ Provides better error logging for PostgreSQL-specific issues
✅ Maintains consistency between MySQL and PostgreSQL database handling
## Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
## Testing
Tested with PostgreSQL database configuration and verified:
- Server starts without the "invalid dsn" error
- Database connections are established successfully
- Retry logic works correctly on connection failures
Co-authored-by: Andrea Bugeja <andrea.bugeja@gig.com>
### What problem does this PR solve?
Improve file management. #10287.
Passed tests:
1. Create folder `A` and `B`.
2. Upload a file inside `A`, called `file`.
3. Create a KB, called `K`.
3. Link `file` to `K`.
4. Parse `file` inside of `K`. (OK)
5. Move `file` from `A` to `B`.
6. Parse `file` inside of `K`. (OK)
7. Move `file` from `B` to `A`.
8. Parse `file` inside of `K`. (OK)
9. Move entire folder `A` into `B`. (B -> A -> file)
10. Parse `file` inside of `K`. (OK)
11. Delete folder `B`.
12. All clear. (There is no document inside of `K`)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Google Cloud model does not work correctly with gemini-2.5 models
Close#10408
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: add total in List dataset API, solved #10360
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Unexpected operation of document management.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Admin client support drop user.
Issue: #10241
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Move base64 related function to api/common/base64.py
### Type of change
- [x] Refactoring
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Refactor import modules.
### Type of change
- [x] Refactoring
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Fixed the issue where database connections were interrupted under high
concurrency
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: lemsn <lemsn@126.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>