Feat: ​​OpenSearch's support for newly embedding models​​ (#10494)

### What problem does this PR solve?

fix issues:https://github.com/infiniflow/ragflow/issues/10402

As the newly distributed embedding models support vector dimensions max
to 4096, while current OpenSearch's max dimension support is 1536.
As I tested, the 4096-dimensions vector will be treated as a float type
which is unacceptable in OpenSearch.

Besides, OpenSearch supports max to 16000 dimensions by defalut with the
vector engine(Faiss). According to:
https://docs.opensearch.org/2.19/field-types/supported-field-types/knn-methods-engines/

I added max to 10240 dimensions support for OpenSearch, as I think will
be sufficient in the future.

As I tested , it worked well on my own server (treated as knn_vector)by
using qwen3-embedding:8b as the embedding model:
<img width="1338" height="790" alt="image"
src="https://github.com/user-attachments/assets/a9b2d284-fcf6-4cea-859a-6aadccf36ace"
/>


### Type of change

- [x] New Feature (non-breaking change which adds functionality)


By the way, I will still focus on the stuff about
Elasticsearch/Opensearch as search engines and vector databases.

Co-authored-by: 张雨豪 <zhangyh80@chinatelecom.cn>
This commit is contained in:
pyyuhao
2025-10-11 19:58:12 +08:00
committed by GitHub
parent 2828e321bc
commit ad56137a59

View File

@ -200,6 +200,61 @@
} }
} }
}, },
{
"knn_vector": {
"match": "*_2048_vec",
"mapping": {
"type": "knn_vector",
"index": true,
"space_type": "cosinesimil",
"dimension": 2048
}
}
},
{
"knn_vector": {
"match": "*_4096_vec",
"mapping": {
"type": "knn_vector",
"index": true,
"space_type": "cosinesimil",
"dimension": 4096
}
}
},
{
"knn_vector": {
"match": "*_6144_vec",
"mapping": {
"type": "knn_vector",
"index": true,
"space_type": "cosinesimil",
"dimension": 6144
}
}
},
{
"knn_vector": {
"match": "*_8192_vec",
"mapping": {
"type": "knn_vector",
"index": true,
"space_type": "cosinesimil",
"dimension": 8192
}
}
},
{
"knn_vector": {
"match": "*_10240_vec",
"mapping": {
"type": "knn_vector",
"index": true,
"space_type": "cosinesimil",
"dimension": 10240
}
}
},
{ {
"binary": { "binary": {
"match": "*_bin", "match": "*_bin",