fix bugs in test (#3196)

### What problem does this PR solve?

fix bugs in test

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: liuhua <10215101452@stu.ecun.edu.cn>
Author: liuhua
Date: 2024-11-04 20:03:14 +08:00 (committed by GitHub)
Parent: a9344e6838
Commit: cbca7dfce6
10 changed files with 60 additions and 62 deletions

File: http_api_reference.md

@@ -1,5 +1,6 @@
---
sidebar_position: 0
slug: /http_api_reference
---
@@ -615,14 +616,14 @@ Failure:
## List documents
-**GET** `/api/v1/datasets/{dataset_id}/documents?offset={offset}&limit={limit}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}`
+**GET** `/api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}`
Lists documents in a specified dataset.
### Request
- Method: GET
-- URL: `/api/v1/datasets/{dataset_id}/documents?offset={offset}&limit={limit}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}`
+- URL: `/api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}`
- Headers:
  - `'content-Type: application/json'`
  - `'Authorization: Bearer <YOUR_API_KEY>'`
@@ -631,7 +632,7 @@ Lists documents in a specified dataset.
```bash
curl --request GET \
-     --url http://{address}/api/v1/datasets/{dataset_id}/documents?offset={offset}&limit={limit}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name} \
+     --url http://{address}/api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name} \
     --header 'Authorization: Bearer <YOUR_API_KEY>'
```
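
For a concrete sense of the renamed query parameters, here is a minimal Python `requests` sketch of the new call. The address, dataset ID, and API key are placeholders, not values from this commit:

```python
import requests

# Placeholder values -- substitute your own deployment details.
BASE_URL = "http://localhost:9380"   # assumed RAGFlow address
DATASET_ID = "<DATASET_ID>"
API_KEY = "<YOUR_API_KEY>"

# List documents with the new page/page_size pagination parameters.
response = requests.get(
    f"{BASE_URL}/api/v1/datasets/{DATASET_ID}/documents",
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={
        "page": 1,          # 1-based page index (replaces offset)
        "page_size": 30,    # documents per page (replaces limit)
        "orderby": "create_time",
        "desc": "true",
    },
)
print(response.json())
```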
@@ -641,10 +642,10 @@ curl --request GET \
  The associated dataset ID.
- `keywords`: (*Filter parameter*), `string`
  The keywords used to match document titles.
-- `offset`: (*Filter parameter*), `integer`
-  The starting index for the documents to retrieve. Typically used in conjunction with `limit`. Defaults to `1`.
-- `limit`: (*Filter parameter*), `integer`
-  The maximum number of documents to retrieve. Defaults to `1024`.
+- `page`: (*Filter parameter*), `integer`
+  Specifies the page on which the documents will be displayed. Defaults to `1`.
+- `page_size`: (*Filter parameter*), `integer`
+  The maximum number of documents on each page. Defaults to `1024`.
- `orderby`: (*Filter parameter*), `string`
  The field by which documents should be sorted. Available options:
  - `create_time` (default)
@@ -958,14 +959,14 @@ Failure:
## List chunks
-**GET** `/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks?keywords={keywords}&offset={offset}&limit={limit}&id={id}`
+**GET** `/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks?keywords={keywords}&page={page}&page_size={page_size}&id={id}`
Lists chunks in a specified document.
### Request
- Method: GET
-- URL: `/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks?keywords={keywords}&offset={offset}&limit={limit}&id={chunk_id}`
+- URL: `/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks?keywords={keywords}&page={page}&page_size={page_size}&id={chunk_id}`
- Headers:
  - `'Authorization: Bearer <YOUR_API_KEY>'`
@@ -973,7 +974,7 @@ Lists chunks in a specified document.
```bash
curl --request GET \
-     --url http://{address}/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks?keywords={keywords}&offset={offset}&limit={limit}&id={chunk_id} \
+     --url http://{address}/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks?keywords={keywords}&page={page}&page_size={page_size}&id={chunk_id} \
     --header 'Authorization: Bearer <YOUR_API_KEY>'
```
@@ -985,10 +986,10 @@ curl --request GET \
  The associated document ID.
- `keywords`(*Filter parameter*), `string`
  The keywords used to match chunk content.
-- `offset`(*Filter parameter*), `string`
-  The starting index for the chunks to retrieve. Defaults to `1`.
-- `limit`(*Filter parameter*), `integer`
-  The maximum number of chunks to retrieve. Default: `1024`
+- `page`(*Filter parameter*), `integer`
+  Specifies the page on which the chunks will be displayed. Defaults to `1`.
+- `page_size`(*Filter parameter*), `integer`
+  The maximum number of chunks on each page. Defaults to `1024`.
- `id`(*Filter parameter*), `string`
  The ID of the chunk to retrieve.
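
The rename is also a semantic shift: `offset` pointed at a starting item, while `page` selects a 1-based page. Assuming the old `offset` was 1-based (its documented default was `1`), an old `offset`/`limit` pair maps onto the new parameters roughly like this:

```python
def offset_to_page(offset: int, limit: int) -> tuple[int, int]:
    """Map an old offset/limit pair to the new page/page_size pair.

    Assumes offset was 1-based (matching its documented default of 1)
    and fell on a page boundary.
    """
    page_size = limit
    page = (offset - 1) // page_size + 1
    return page, page_size

print(offset_to_page(1, 1024))     # (1, 1024) -- the old defaults
print(offset_to_page(1025, 1024))  # (2, 1024) -- the second page
```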
@@ -1209,8 +1210,8 @@ Retrieves chunks from specified datasets.
- `"question"`: `string`
- `"dataset_ids"`: `list[string]`
- `"document_ids"`: `list[string]`
-- `"offset"`: `integer`
-- `"limit"`: `integer`
+- `"page"`: `integer`
+- `"page_size"`: `integer`
- `"similarity_threshold"`: `float`
- `"vector_similarity_weight"`: `float`
- `"top_k"`: `integer`
@@ -1241,10 +1242,10 @@ curl --request POST \
  The IDs of the datasets to search. If you do not set this argument, ensure that you set `"document_ids"`.
- `"document_ids"`: (*Body parameter*), `list[string]`
  The IDs of the documents to search. Ensure that all selected documents use the same embedding model. Otherwise, an error will occur. If you do not set this argument, ensure that you set `"dataset_ids"`.
-- `"offset"`: (*Body parameter*), `integer`
-  The starting index for the documents to retrieve. Defaults to `1`.
-- `"limit"`: (*Body parameter*)
-  The maximum number of chunks to retrieve. Defaults to `1024`.
+- `"page"`: (*Body parameter*), `integer`
+  Specifies the page on which the chunks will be displayed. Defaults to `1`.
+- `"page_size"`: (*Body parameter*)
+  The maximum number of chunks on each page. Defaults to `1024`.
- `"similarity_threshold"`: (*Body parameter*)
  The minimum similarity score. Defaults to `0.2`.
- `"vector_similarity_weight"`: (*Body parameter*), `float`

File: python_api_reference.md

@@ -1,5 +1,5 @@
---
-sidebar_position: 1
+from Demos.mmapfile_demo import page_sizefrom Demos.mmapfile_demo import page_sizesidebar_position: 1
slug: /python_api_reference
---
@@ -58,7 +58,7 @@ A brief description of the dataset to create. Defaults to `""`.
The language setting of the dataset to create. Available options:
-- `"English"` (default)
+- `"English"` (Default)
- `"Chinese"`
#### permission
@@ -413,7 +413,7 @@ print(doc)
## List documents
```python
-Dataset.list_documents(id:str =None, keywords: str=None, offset: int=1, limit:int = 1024,order_by:str = "create_time", desc: bool = True) -> list[Document]
+Dataset.list_documents(id:str =None, keywords: str=None, page: int=1, page_size:int = 1024,order_by:str = "create_time", desc: bool = True) -> list[Document]
```
Lists documents in the current dataset.
@@ -428,13 +428,13 @@ The ID of the document to retrieve. Defaults to `None`.
The keywords used to match document titles. Defaults to `None`.
-#### offset: `int`
+#### page: `int`
-The starting index for the documents to retrieve. Typically used in conjunction with `limit`. Defaults to `0`.
+Specifies the page on which the documents will be displayed. Defaults to `1`.
-#### limit: `int`
+#### page_size: `int`
-The maximum number of documents to retrieve. Defaults to `1024`.
+The maximum number of documents on each page. Defaults to `1024`.
#### orderby: `str`
@@ -513,7 +513,7 @@ dataset = rag_object.create_dataset(name="kb_1")
filename1 = "~/ragflow.txt"
blob = open(filename1 , "rb").read()
dataset.upload_documents([{"name":filename1,"blob":blob}])
-for doc in dataset.list_documents(keywords="rag", offset=0, limit=12):
+for doc in dataset.list_documents(keywords="rag", page=0, page_size=12):
    print(doc)
```
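
With the new signature, walking an entire dataset is a matter of stepping through pages until one comes back empty. A minimal sketch, assuming pages are 1-based as the `page` default of `1` suggests:

```python
page = 1
page_size = 50
while True:
    # Fetch one page of matching documents at a time.
    docs = dataset.list_documents(keywords="rag", page=page, page_size=page_size)
    if not docs:
        break   # an empty page means we have seen everything
    for doc in docs:
        print(doc)
    page += 1
```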
@@ -689,7 +689,7 @@ chunk = doc.add_chunk(content="xxxxxxx")
## List chunks
```python
-Document.list_chunks(keywords: str = None, offset: int = 1, limit: int = 1024, id : str = None) -> list[Chunk]
+Document.list_chunks(keywords: str = None, page: int = 1, page_size: int = 1024, id : str = None) -> list[Chunk]
```
Lists chunks in the current document.
@@ -700,13 +700,13 @@ Lists chunks in the current document.
The keywords used to match chunk content. Defaults to `None`
-#### offset: `int`
+#### page: `int`
-The starting index for the chunks to retrieve. Defaults to `1`.
+Specifies the page on which the chunks will be displayed. Defaults to `1`.
-#### limit: `int`
+#### page_size: `int`
-The maximum number of chunks to retrieve. Default: `1024`
+The maximum number of chunks on each page. Defaults to `1024`.
#### id: `str`
@@ -726,7 +726,7 @@ rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:
dataset = rag_object.list_datasets("123")
dataset = dataset[0]
dataset.async_parse_documents(["wdfxb5t547d"])
-for chunk in doc.list_chunks(keywords="rag", offset=0, limit=12):
+for chunk in doc.list_chunks(keywords="rag", page=0, page_size=12):
    print(chunk)
```
@@ -811,7 +811,7 @@ chunk.update({"content":"sdfx..."})
## Retrieve chunks
```python
-RAGFlow.retrieve(question:str="", dataset_ids:list[str]=None, document_ids=list[str]=None, offset:int=1, limit:int=1024, similarity_threshold:float=0.2, vector_similarity_weight:float=0.3, top_k:int=1024,rerank_id:str=None,keyword:bool=False,higlight:bool=False) -> list[Chunk]
+RAGFlow.retrieve(question:str="", dataset_ids:list[str]=None, document_ids=list[str]=None, page:int=1, page_size:int=1024, similarity_threshold:float=0.2, vector_similarity_weight:float=0.3, top_k:int=1024,rerank_id:str=None,keyword:bool=False,higlight:bool=False) -> list[Chunk]
```
Retrieves chunks from specified datasets.
@@ -830,11 +830,11 @@ The IDs of the datasets to search. Defaults to `None`. If you do not set this ar
The IDs of the documents to search. Defaults to `None`. You must ensure all selected documents use the same embedding model. Otherwise, an error will occur. If you do not set this argument, ensure that you set `dataset_ids`.
-#### offset: `int`
+#### page: `int`
The starting index for the documents to retrieve. Defaults to `1`.
-#### limit: `int`
+#### page_size: `int`
The maximum number of chunks to retrieve. Defaults to `1024`.
@@ -889,7 +889,7 @@ doc = doc[0]
dataset.async_parse_documents([doc.id])
for c in rag_object.retrieve(question="What's ragflow?",
                             dataset_ids=[dataset.id], document_ids=[doc.id],
-                             offset=1, limit=30, similarity_threshold=0.2,
+                             page=1, page_size=30, similarity_threshold=0.2,
                             vector_similarity_weight=0.3,
                             top_k=1024
                             ):