mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-02-06 10:35:06 +08:00
feat: Add optional document metadata in OpenAI-compatible response references (#12950)
### What problem does this PR solve?

This PR adds an opt-in way to include document-level metadata in OpenAI-compatible reference chunks. Until now, metadata could be used for filtering but wasn't returned in responses. The change enables clients to show richer citations (author, year, source, etc.) while keeping payload size and privacy under control via an explicit request flag and an optional field allowlist.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Contribution during my time at RAGcon GmbH.
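The request flag and field allowlist described above both travel in the `extra_body` object. A minimal sketch of a helper that assembles it, assuming only the keys documented in this change (`reference`, `reference_metadata.include`, `reference_metadata.fields`). The helper name `build_extra_body` is hypothetical and not part of RAGFlow:

```python
from typing import Any, Optional


def build_extra_body(
    reference: bool = True,
    include_metadata: bool = False,
    fields: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Assemble the `extra_body` dict (hypothetical helper, not a RAGFlow API)."""
    extra_body: dict[str, Any] = {"reference": reference}
    if include_metadata:
        reference_metadata: dict[str, Any] = {"include": True}
        # Leave `fields` out entirely to request all metadata keys;
        # an explicit list (even an empty one) is sent as the allowlist.
        if fields is not None:
            reference_metadata["fields"] = list(fields)
        extra_body["reference_metadata"] = reference_metadata
    return extra_body
```

Keeping metadata off by default mirrors the opt-in intent of the change: nothing extra is requested unless the caller asks for it.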
@@ -65,6 +65,10 @@ curl --request POST \
         "stream": true,
         "extra_body": {
           "reference": true,
+          "reference_metadata": {
+            "include": true,
+            "fields": ["author", "year", "source"]
+          },
           "metadata_condition": {
             "logic": "and",
             "conditions": [
@@ -93,6 +97,9 @@ curl --request POST \
 - `extra_body` (*Body parameter*) `object`
   Extra request parameters:
   - `reference`: `boolean` - include reference in the final chunk (stream) or in the final message (non-stream).
+  - `reference_metadata`: `object` - include document metadata in each reference chunk.
+    - `include`: `boolean` - enable document metadata in reference chunks.
+    - `fields`: `list[string]` - optional allowlist of metadata keys. Omit to include all. Use an empty list to include none.
   - `metadata_condition`: `object` - metadata filter conditions applied to retrieval results.

 #### Response
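The `fields` semantics documented in this hunk (omitted means all keys, an empty list means none) can be sketched as a small filter. This illustrates the documented behavior only and is not the server's actual code; the function name `select_metadata_fields` is an assumption:

```python
from typing import Any, Optional


def select_metadata_fields(
    metadata: dict[str, Any],
    fields: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Apply an optional allowlist to a document's metadata (illustrative only)."""
    if fields is None:
        # `fields` omitted: return a copy with every key.
        return dict(metadata)
    allowed = set(fields)
    # Keep only allowlisted keys; an empty allowlist yields an empty dict.
    return {key: value for key, value in metadata.items() if key in allowed}


metadata = {"author": "bob", "year": "2023", "source": "internal"}
print(select_metadata_fields(metadata))              # all three keys
print(select_metadata_fields(metadata, ["author"]))  # {'author': 'bob'}
print(select_metadata_fields(metadata, []))          # {}
```

How the server treats allowlisted keys that a document lacks is not stated here; the sketch simply skips them.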
@@ -275,6 +282,11 @@ data: {
             "content": "```cd /usr/ports/editors/neovim/ && make install```## Android[Termux](https://github.com/termux/termux-app) offers a Neovim package.",
             "document_id": "4bdd2ff65e1511f0907f09f583941b45",
             "document_name": "INSTALL22.md",
+            "document_metadata": {
+              "author": "bob",
+              "year": "2023",
+              "source": "internal"
+            },
             "dataset_id": "456ce60c5e1511f0907f09f583941b45",
             "image_id": "",
             "positions": [
@@ -345,6 +357,11 @@ Non-stream:
             "doc_type": "",
             "document_id": "4bdd2ff65e1511f0907f09f583941b45",
             "document_name": "INSTALL22.md",
+            "document_metadata": {
+              "author": "bob",
+              "year": "2023",
+              "source": "internal"
+            },
             "id": "4b8935ac0a22deb1",
             "image_id": "",
             "positions": [
@@ -3948,6 +3965,8 @@ data: {
 data:[DONE]
 ```

+When `extra_body.reference_metadata.include` is `true`, each reference chunk may include a `document_metadata` object.
+
 Non-stream:

 ```json
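Because `document_metadata` is optional, client code should not assume it is present. A minimal sketch of building a citation string from one reference chunk, assuming the chunk has already been parsed into a plain dict shaped like the response examples in this diff; `format_citation` is a hypothetical helper:

```python
from typing import Any


def format_citation(chunk: dict[str, Any]) -> str:
    """Render a short citation from a reference chunk (hypothetical helper)."""
    name = chunk.get("document_name", "unknown document")
    # `document_metadata` may be missing when metadata was not requested.
    metadata = chunk.get("document_metadata") or {}
    details = [str(metadata[key]) for key in ("author", "year", "source") if key in metadata]
    return f"{name} ({', '.join(details)})" if details else name


chunk = {
    "document_id": "4bdd2ff65e1511f0907f09f583941b45",
    "document_name": "INSTALL22.md",
    "document_metadata": {"author": "bob", "year": "2023", "source": "internal"},
}
print(format_citation(chunk))  # INSTALL22.md (bob, 2023, internal)
```

Falling back to the bare document name keeps the same rendering path usable for responses requested without `reference_metadata`.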
@@ -83,7 +83,13 @@ completion = client.chat.completions.create(
         {"role": "user", "content": "Can you tell me how to install neovim"},
     ],
     stream=stream,
-    extra_body={"reference": reference}
+    extra_body={
+        "reference": reference,
+        "reference_metadata": {
+            "include": True,
+            "fields": ["author", "year", "source"],
+        },
+    }
 )

 if stream:
@@ -98,6 +104,8 @@ else:
     print(completion.choices[0].message.reference)
 ```

+When `extra_body.reference_metadata.include` is `true`, each reference chunk may include a `document_metadata` object in both streaming and non-streaming responses.
+
 ## DATASET MANAGEMENT

 ---
@@ -1518,6 +1526,8 @@ A list of `Chunk` objects representing references to the message, each containin
   The ID of the referenced document.
 - `document_name` `str`
   The name of the referenced document.
+- `document_metadata` `dict`
+  Optional document metadata, returned only when `extra_body.reference_metadata.include` is `true`.
 - `position` `list[str]`
   The location information of the chunk within the referenced document.
 - `dataset_id` `str`
@@ -1643,6 +1653,8 @@ A list of `Chunk` objects representing references to the message, each containin
   The ID of the referenced document.
 - `document_name` `str`
   The name of the referenced document.
+- `document_metadata` `dict`
+  Optional document metadata, returned only when `extra_body.reference_metadata.include` is `true`.
 - `position` `list[str]`
   The location information of the chunk within the referenced document.
 - `dataset_id` `str`
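More than one reference chunk may come from the same document, so a client that lists sources will likely want one entry per `document_id`. A sketch of that grouping, assuming each chunk has been converted to a plain dict with the fields listed in this hunk; the function name `unique_sources` is hypothetical:

```python
from typing import Any


def unique_sources(chunks: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return one source entry per document, in first-seen order."""
    sources: dict[str, dict[str, Any]] = {}
    for chunk in chunks:
        doc_id = chunk["document_id"]
        if doc_id not in sources:
            sources[doc_id] = {
                "document_id": doc_id,
                "document_name": chunk.get("document_name", ""),
                # Empty dict when the response carries no metadata.
                "document_metadata": chunk.get("document_metadata") or {},
            }
    return list(sources.values())
```

Since the metadata is document-level, taking it from the first chunk seen for each document should be enough under this assumption.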
@@ -2596,4 +2608,3 @@ memory_object.get_message_content(message_id)
 ```

 ---
-