Commit Graph

210 Commits

Author SHA1 Message Date
0b487dee43 Fix: support cross language for API. (#8946)
### What problem does this PR solve?

Close #8943

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-21 17:25:28 +08:00
4d7bfd2ba3 Fix: typo process_duration (#8696)
### What problem does this PR solve?

Fix typo process_duration.

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-07-07 14:11:47 +08:00
7353070f49 Adds retrieval result fields to Chunk (#8478)
### What problem does this PR solve?

This PR adds fields to the `Chunk` class to store retrieval results like
similarity scores, term similarity, vector similarity, positions, and
document type. This allows the chunk object to hold all the information
needed when returning search results from the vector database.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-25 16:53:15 +08:00
dac5bcdf17 Fix: Enforce default embedding model in create_dataset / update_dataset (#8486)
### What problem does this PR solve?

Previous:
- Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI'
- Did not respect user-configured default embedding_model

Now:
- Correctly prioritizes user-configured default embedding_model

Other:
- Make embedding_model optional in CreateDatasetReq with proper None
handling
- Add default embedding model fallback in dataset update when empty
- Enhance validation utils to handle None values and string
normalization
- Update SDK default embedding model to None to match API changes
- Adjust related test cases to reflect new validation rules

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-25 16:41:32 +08:00
7e87eb2e23 Docs: Update version references to v0.19.1 in READMEs and docs (#8366)
### What problem does this PR solve?

- Update Docker image version badges and references from v0.19.0 to
v0.19.1
- Modify version mentions in all localized README files (id, ja, ko,
pt_br, tzh, zh)
- Update version in docker/README.md and related documentation files
- Includes updates to Helm values and Python SDK dependencies

### Type of change

- [x] Documentation Update
2025-06-19 14:39:27 +08:00
e470645efd Refactor code (#8341)
### What problem does this PR solve?

1. rename var
2. update if statement

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-18 16:40:30 +08:00
545ea229b6 Refa: Structure Ask Message (#8276)
### What problem does this PR solve?

Refactoring codes for SDK

### Type of change

- [x] Refactoring
2025-06-16 10:17:21 +08:00
7fbbc9650d Fix: Move pagerank field from create to update dataset API (#8217)
### What problem does this PR solve?

- Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq
- Add pagerank update logic in dataset update endpoint
- Update API documentation to reflect changes
- Modify related test cases and SDK references

#8208

This change makes pagerank a mutable property that can only be set after
dataset creation, and only when using elasticsearch as the doc engine.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-12 15:47:49 +08:00
ad1f89fea0 Fix: chat module update LLM defaults (#8125)
### What problem does this PR solve?

Previously when LLM.model_name was not configured:
- System incorrectly defaulted to 'deepseek-chat' model
- This caused permission errors for unauthorized tenants

Now:
- Use tenant's default chat_model configuration first

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-09 11:44:02 +08:00
2ff911b08c Fix: Set default rerank_model to empty string in Chat class (#8130)
### What problem does this PR solve?

Previously when LLM.rerank_model was not configured:
- SDK would pass None as the value
- Database field with null=False constraint would reject it
- Caused storage failures for unset rerank_model cases

Now:
- SDK checks for None value before database operations
- Provides empty string as default when rerank_model is unset

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-09 11:43:42 +08:00
cc1b2c8f09 Test: add sdk Document test cases (#8094)
### What problem does this PR solve?

Add sdk document test cases

### Type of change

- [x] Add test cases
2025-06-06 09:47:06 +08:00
100ea574a7 Fix(python-sdk): Add name filtering support to Dataset.list_documents() (#8090)
### What problem does this PR solve?

Added name filtering capability for Dataset.list_documents()

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-05 19:04:35 +08:00
f007c1c772 Fix: Resolve JSON download errors in Document.download() (#8084)
### What problem does this PR solve?

An exception is thrown only when the json file has only two keys, `code`
and `message`. In other cases, response.content is returned normally.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-05 18:03:51 +08:00
8b7c424617 Fix: Document.update() now refreshes object data (#8068)
### What problem does this PR solve?

#8067 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-05 12:46:29 +08:00
4f3abb855a Fix: remove zhipu ai api key (#8066)
### What problem does this PR solve?

- Removed hardcoded Zhipu API key from codebase
- New requirement: Tests now require ZHIPU_AI_API_KEY environment
variable
  Example: export ZHIPU_AI_API_KEY=your_api_key_here

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-05 12:04:09 +08:00
a374816fb2 Don't use ',' (U+FF0C) but ', ' (U+2C U+20) (#8063)
The Unicode codepoint ',' (U+FF0C) is meant to be used in Chinese text,
but this is English text. It looks like a comma followed by a space, but
isn't. Of course I didn't change actual Chinese text.

### What problem does this PR solve?

Mixup of Unicode characters. This is probably unnoticed by most users,
but I wonder if screen readers would read it out differently or if LLMs
would trip up on it.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-06-05 09:29:07 +08:00
ab5e3ded68 Fix: DataSet.update() now refreshes object data (#8058)
### What problem does this PR solve?

#8057 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-05 09:26:19 +08:00
73f9c226d3 Fix: Allow None value for parser_config in create_dataset SDK method (#8041)
### What problem does this PR solve?

Fix parser_config=None handling in create_dataset

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-04 13:16:32 +08:00
31f4d44c73 Update upload filename length limit from 128 to 256, which is aligned with os (#7971)
### What problem does this PR solve?

Change filename length limit from 128 to 256

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-05-30 14:25:59 +08:00
5d6bf2224a Fix: Opensearch chunk management (#7802)
### What problem does this PR solve?

This PR solve the problems metioned in the
pr(https://github.com/infiniflow/ragflow/pull/7140) which is also
submitted by me

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):


### Introduction
I fixed the problems when using OpenSearch as the DOC_ENGINE, the
failures of pytest and the wrong API's return.
Mainly about delete chunk, list chunks, update chunk, retrieval chunk.
The pytest comand "cd sdk/python && uv sync --python 3.10 --group test
--frozen && source .venv/bin/activate && cd test/test_http_api &&
DOC_ENGINE=opensearch pytest test_chunk_management_within_dataset -s
--tb=short " is finally successful.

###Others
As some changes between Elasticsearch And Opensearch differ, some pytest
results about OpenSearch are correct and resonable. However, some pytest
params (skipif params) are incompatible. So I changed some pytest params
about skipif.

As a search engine programmer, I will still focus on the usage of vector
databases (especially OpenSearch) for the RAG stuff.
Thanks for your review
2025-05-26 16:57:58 +08:00
590b9dabab Docs: update for v0.19.0 (#7823)
### What problem does this PR solve?

update for v0.19.0

### Type of change

- [x] Documentation Update
2025-05-23 18:25:47 +08:00
e166f132b3 Feat: change default models (#7777)
### What problem does this PR solve?

change default models to buildin models
https://github.com/infiniflow/ragflow/issues/7774

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-23 18:21:25 +08:00
fed1221302 Refa: HTTP API list datasets / test cases / docs (#7720)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the list datasets HTTP
API, improving code clarity and robustness. Key changes include:

Pydantic Validation
Error Handling
Test Updates
Documentation Updates

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-05-20 09:58:26 +08:00
59705a1c1d Test: change variable for ZHIPU_AI_API_KEY (#7684)
### What problem does this PR solve?

change variable for ZHIPU_AI_API_KEY

### Type of change

- [x] Update test case
2025-05-16 15:58:54 +08:00
04edf9729f Test: use environment variable for ZHIPU_AI_API_KEY (#7680)
### What problem does this PR solve?

use environment variable for ZHIPU_AI_API_KEY

### Type of change

- [x] Test update
2025-05-16 13:51:21 +08:00
ae8b628f0a Refa: HTTP API delete dataset / test cases / docs (#7657)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the delete dataset HTTP
API, improving code clarity and robustness. Key changes include:

1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation Updates

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-05-16 10:16:43 +08:00
f8cc557892 Fix(api): correct default value handling in dataset parser config (#7589)
### What problem does this PR solve?

Fix  HTTP API Create/Update dataset parser config default value error

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-12 19:39:18 +08:00
992398bca3 Feat: Add http api to create, update, or delete agents. (#7515)
### What problem does this PR solve?

Hello, we are using ragflow as a backend service, so we need to manage
agents from our own frontend. So adding these http APIs to manage
agents.

The code logic is copied and modified from the `rm` and `save` methods
in `api/apps/canvas_app.py`.

btw, I found that the `save` method in `canvas_app.py` actually allows
to modify an agent to an existing title, so I kept the behavior in the
http api. I'm not sure if this is intentional.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-12 17:59:53 +08:00
ad412380cb Fix:Discrepancy between Document.list_chunks() API documentation and implementation (#7575)
### What problem does this PR solve?


Close #7567

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-12 11:05:32 +08:00
ef0c4b134d Test: skip unstable test cases (#7578)
### What problem does this PR solve?

Skip unstable test cases to ensure daily testing stability

### Type of change

- [x] Update test cases
2025-05-12 09:49:14 +08:00
35e36cb945 Refa: HTTP API update dataset / test cases / docs (#7564)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the update dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. ​​Error Handling
3. Test Updates
4. Documentation Updates
5. fix bug: #5915

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
2025-05-09 19:17:08 +08:00
0fbca63e9d Test: Configure test case priorities to reduce CI execution time (#7532)
### What problem does this PR solve?

Configure test case priorities to reduce CI execution time

### Type of change

- [x] Test cases update
2025-05-08 19:22:52 +08:00
c98933499a refa: Optimize create dataset validation (#7451)
### What problem does this PR solve?

Optimize dataset validation and add function docs

### Type of change

- [x] Refactoring
2025-05-06 17:38:06 +08:00
fc379e90d1 Fix: change create dataset htto api delimiter default value to r'\n' (#7434)
### What problem does this PR solve?

change create dataset delimiter default value to r'\n'

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-04-30 17:43:42 +08:00
1f82889001 Fix: create dataset remove unnecessary parameter constraints (#7432)
### What problem does this PR solve?

Remove unnecessary parameter restrictions in dataset creation API

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-30 14:50:23 +08:00
e6c824e606 Test: Update tests to use new fixture instead of deprecated one (#7431)
### What problem does this PR solve?

Deprecate get_dataset_id_and_document_id fixture, use add_document
instead

### Type of change

- [x] Update test cases
2025-04-30 14:49:26 +08:00
78380fa181 Refa: http API create dataset and test cases (#7393)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the create dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. ​​Error Handling
3. Test Updates
4. Documentation

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
2025-04-29 16:53:57 +08:00
c7310f7fb2 Refa: similarity calculations. (#7381)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2025-04-28 19:17:11 +08:00
a4be6c50cf [BREAKING CHANGE] GET to POST: enhance document list capability (#7349)
### What problem does this PR solve?

Enhance capability of `list_docs`.

Breaking change: change method from `GET` to `POST`.

### Type of change

- [x] Refactoring
- [x] Enhancement with breaking change
2025-04-27 16:48:27 +08:00
94181a990b Refa: knowledge_graph chunk method is deprecated (#7220)
### What problem does this PR solve?

The knowledge_graph chunk method is deprecated and should no longer be
used. #7184.

### Type of change

- [x] Refactoring
2025-04-23 13:01:46 +08:00
03672df691 Docs: update for v0.18.0 (#7223)
### What problem does this PR solve?

update for v0.18.0

### Type of change

- [x] Documentation Update
2025-04-23 12:02:50 +08:00
f35ff65c36 [BREAKING CHANGE] GET to POST: enhance kb list capability (#7205)
### What problem does this PR solve?

Enhance capability of `list_kbs`.

Breaking change: change method from `GET` to `POST`.

### Type of change

- [x] Refactoring
- [x] Enhancement with breaking change
2025-04-22 17:54:12 +08:00
67dee2d74e Fix: fix retrieval tesing wrong pagination (#7174)
### What problem does this PR solve?

Fix retrieval testing wrong pagination. #7171 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-04-22 15:16:04 +08:00
e5f9d148e7 Test: Added test cases for Delete Sessions With Chat Assistant HTTP API (#7025)
### What problem does this PR solve?

cover [Delete chat assistant's
sessions](https://ragflow.io/docs/dev/http_api_reference#delete-chat-assistants-sessions)
endpoints

### Type of change

- [x] Add test cases
2025-04-15 14:54:26 +08:00
9b789c2ae9 Test: Added test cases for Update Session With Chat Assistant HTTP API (#6968)
### What problem does this PR solve?

cover [Update chat assistant's
sessions](https://ragflow.io/docs/dev/http_api_reference#update-chat-assistants-session)
endpoints

### Type of change

- [x] Update test cases
2025-04-11 20:10:24 +08:00
ffb9f01bea Test: Update test cases for PR 6906 ISSUE 6875 (#6971)
### What problem does this PR solve?

PR #6906 ISSUE #6875

### Type of change

- [ ] Update test cases
2025-04-11 20:09:44 +08:00
dc59aba132 Test: Added test cases for List Sessions With Chat Assistant HTTP API (#6938)
### What problem does this PR solve?

cover [List chat assistant's
sessions](https://ragflow.io/docs/dev/http_api_reference#list-chat-assistants-sessions)
endpoints

### Type of change

- [x] Update test cases
2025-04-10 17:31:01 +08:00
8fb5edd927 Test: Update test cases for PR 6906 (#6929)
### What problem does this PR solve?

PR #6906

### Type of change

- [x] Update test cases
2025-04-10 12:28:56 +08:00
3bb1e012e6 Fix: assistant deleteion issue. (#6906)
### What problem does this PR solve?

#6875

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-09 20:29:40 +08:00
22758a2763 Test: Update test cases for PR 6888 ISSUE 6876 (#6907)
### What problem does this PR solve?

PR #6888 ISSUE #6876

### Type of change

- [x] Update test case
2025-04-09 20:29:29 +08:00