From ad56137a59f155150fac7e9f146438b8e4b3d9a4 Mon Sep 17 00:00:00 2001
From: pyyuhao
Date: Sat, 11 Oct 2025 19:58:12 +0800
Subject: [PATCH] Feat: OpenSearch support for new embedding models (#10494)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

### What problem does this PR solve?

Fixes issue: https://github.com/infiniflow/ragflow/issues/10402

Newly released embedding models produce vectors with up to 4096 dimensions, while the current OpenSearch mapping supports at most 1536. In my testing, a 4096-dimension vector matched none of the existing `knn_vector` templates and was mapped as a float type, which is unacceptable in OpenSearch.

OpenSearch itself supports up to 16,000 dimensions by default with the Faiss vector engine. See:
https://docs.opensearch.org/2.19/field-types/supported-field-types/knn-methods-engines/

I added support for up to 10240 dimensions to the OpenSearch mapping, which I think will be sufficient for the future. In my testing it worked well on my own server (the field was treated as `knn_vector`) with qwen3-embedding:8b as the embedding model.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

By the way, I will keep focusing on Elasticsearch/OpenSearch as search engines and vector databases.
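For illustration, here is a minimal sketch of how name-based templates like the ones in this patch pick a mapping for a field. It is not RAGFlow or OpenSearch code: `build_templates`, `resolve_mapping`, and the field names `q_4096_vec` and `q_3072_vec` are hypothetical, and Python's `fnmatchcase` only approximates OpenSearch's `*` wildcard matching.

```python
from fnmatch import fnmatchcase

# Dimensions covered by the templates this patch adds.
NEW_DIMENSIONS = [2048, 4096, 6144, 8192, 10240]


def build_templates(dimensions):
    """Build knn_vector dynamic templates shaped like those in this patch."""
    return [
        {
            "knn_vector": {
                "match": f"*_{dim}_vec",
                "mapping": {
                    "type": "knn_vector",
                    "index": True,
                    "space_type": "cosinesimil",
                    "dimension": dim,
                },
            }
        }
        for dim in dimensions
    ]


def resolve_mapping(field_name, templates):
    """Return the mapping of the first template whose glob matches, else None."""
    for template in templates:
        for rule in template.values():
            # Approximation: fnmatchcase stands in for OpenSearch's "*" matching.
            if fnmatchcase(field_name, rule["match"]):
                return rule["mapping"]
    return None


templates = build_templates(NEW_DIMENSIONS)

# Hypothetical field names, used only to exercise the lookup.
mapping = resolve_mapping("q_4096_vec", templates)
unmatched = resolve_mapping("q_3072_vec", templates)
```

In this sketch `mapping` is the 4096-dimension `knn_vector` mapping. `unmatched` is `None` because none of these five patterns covers that name, which mirrors the fallback to a non-vector type described above.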
Co-authored-by: 张雨豪
---
 conf/os_mapping.json | 55 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/conf/os_mapping.json b/conf/os_mapping.json
index a8663e069..47b7c24b0 100644
--- a/conf/os_mapping.json
+++ b/conf/os_mapping.json
@@ -200,6 +200,61 @@
           }
         }
       },
+      {
+        "knn_vector": {
+          "match": "*_2048_vec",
+          "mapping": {
+            "type": "knn_vector",
+            "index": true,
+            "space_type": "cosinesimil",
+            "dimension": 2048
+          }
+        }
+      },
+      {
+        "knn_vector": {
+          "match": "*_4096_vec",
+          "mapping": {
+            "type": "knn_vector",
+            "index": true,
+            "space_type": "cosinesimil",
+            "dimension": 4096
+          }
+        }
+      },
+      {
+        "knn_vector": {
+          "match": "*_6144_vec",
+          "mapping": {
+            "type": "knn_vector",
+            "index": true,
+            "space_type": "cosinesimil",
+            "dimension": 6144
+          }
+        }
+      },
+      {
+        "knn_vector": {
+          "match": "*_8192_vec",
+          "mapping": {
+            "type": "knn_vector",
+            "index": true,
+            "space_type": "cosinesimil",
+            "dimension": 8192
+          }
+        }
+      },
+      {
+        "knn_vector": {
+          "match": "*_10240_vec",
+          "mapping": {
+            "type": "knn_vector",
+            "index": true,
+            "space_type": "cosinesimil",
+            "dimension": 10240
+          }
+        }
+      },
       {
         "binary": {
           "match": "*_bin",