mirror of
https://github.com/infiniflow/ragflow.git
synced 2025-12-08 20:42:30 +08:00
Feat: Add OpenSearch 2.19.1 support as a vector database (#7140)
### What problem does this PR solve?
This PR adds support for OpenSearch 2.19.1 as a storage engine and
search engine option for RAGFlow.
### Main Benefit
1. OpenSearch 2.19.1 is licensed under the [Apache v2.0 License], which
is more permissive than Elasticsearch's license
2. For search, OpenSearch 2.19.1 supports full-text search, vector
search, and hybrid search, with a schema similar to Elasticsearch's
3. For storage, OpenSearch 2.19.1 stores text and vectors with a schema
quite similar to Elasticsearch's
### Changes
- Support the OpenSearch Python connector. I made many adaptations
because the schema and API/methods differ between ES and OpenSearch in
many ways (knn_search in particular has a significant gap):
rag/utils/opensearch_coon.py
- Support static config adaptations by changing:
conf/service_conf.yaml, api/settings.py, rag/settings.py
- Support some storage and search schema changes between OpenSearch and
ES: conf/os_mapping.json
- Support the OpenSearch Python SDK: pyproject.toml
- Support Docker config for OpenSearch 2.19.1:
docker/.env, docker/docker-compose-base.yml, docker/service_conf.yaml.template
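
To illustrate the kind of kNN gap mentioned above, here is a minimal
sketch of how the two request bodies typically differ. It is not the code
in rag/utils/opensearch_coon.py; the helper names and the `embedding`
field name are hypothetical, and the body shapes assume Elasticsearch 8.x
top-level `knn` search and the OpenSearch k-NN plugin's `knn` query clause.

```python
from __future__ import annotations

from typing import Any


def es_knn_body(field: str, vector: list[float], k: int, num_candidates: int) -> dict[str, Any]:
    """Elasticsearch 8.x style (assumed): kNN is a top-level section of the search body."""
    return {
        "knn": {
            "field": field,
            "query_vector": vector,
            "k": k,
            "num_candidates": num_candidates,
        }
    }


def opensearch_knn_body(field: str, vector: list[float], k: int) -> dict[str, Any]:
    """OpenSearch k-NN plugin style (assumed): a `knn` query clause keyed by the field name."""
    return {
        "size": k,
        "query": {"knn": {field: {"vector": vector, "k": k}}},
    }


if __name__ == "__main__":
    # "embedding" is a placeholder field name, not taken from conf/os_mapping.json.
    print(es_knn_body("embedding", [0.1, 0.2], k=5, num_candidates=50))
    print(opensearch_knn_body("embedding", [0.1, 0.2], k=5))
```

Because the vector clause lives in a different place in each body, a
connector cannot simply forward an ES-style request to OpenSearch.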
### How to use
- I didn't change the default: ES remains the default doc/search engine.
OpenSearch is used only if docker/.env sets
DOC_ENGINE=${DOC_ENGINE:-opensearch}.
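
A minimal sketch of how a settings module might resolve the engine from
`DOC_ENGINE`, assuming ES stays the default when the variable is unset.
The function name, the `elasticsearch` default string, and the set of
accepted values are assumptions for illustration, not the actual code in
api/settings.py or rag/settings.py.

```python
from __future__ import annotations

import os
from typing import Mapping

# Assumed set of engine names; only "opensearch" is confirmed by this PR.
SUPPORTED_ENGINES = {"elasticsearch", "opensearch", "infinity"}


def resolve_doc_engine(env: Mapping[str, str] | None = None) -> str:
    """Return the configured doc engine, falling back to Elasticsearch."""
    env = os.environ if env is None else env
    engine = env.get("DOC_ENGINE", "elasticsearch").strip().lower()
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"Unsupported DOC_ENGINE: {engine!r}")
    return engine


if __name__ == "__main__":
    print(resolve_doc_engine())
```

Passing the environment as a mapping keeps the lookup easy to test
without touching the real process environment.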
### Others
Our team tested many documents in our environment using OpenSearch as
the vector database, and it works very well.
All of the OpenSearch config is required.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Yongteng Lei <yongtengrey@outlook.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
@@ -35,6 +35,44 @@ services:
       - ragflow
     restart: on-failure

+  opensearch01:
+    container_name: ragflow-opensearch-01
+    profiles:
+      - opensearch
+    image: hub.icert.top/opensearchproject/opensearch:2.19.1
+    volumes:
+      - osdata01:/usr/share/opensearch/data
+    ports:
+      - ${OS_PORT}:9201
+    env_file: .env
+    environment:
+      - node.name=opensearch01
+      - OPENSEARCH_PASSWORD=${OPENSEARCH_PASSWORD}
+      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_PASSWORD}
+      - bootstrap.memory_lock=false
+      - discovery.type=single-node
+      - plugins.security.disabled=false
+      - plugins.security.ssl.http.enabled=false
+      - plugins.security.ssl.transport.enabled=true
+      - cluster.routing.allocation.disk.watermark.low=5gb
+      - cluster.routing.allocation.disk.watermark.high=3gb
+      - cluster.routing.allocation.disk.watermark.flood_stage=2gb
+      - TZ=${TIMEZONE}
+      - http.port=9201
+    mem_limit: ${MEM_LIMIT}
+    ulimits:
+      memlock:
+        soft: -1
+        hard: -1
+    healthcheck:
+      test: ["CMD-SHELL", "curl http://localhost:9201"]
+      interval: 10s
+      timeout: 10s
+      retries: 120
+    networks:
+      - ragflow
+    restart: on-failure
+
   infinity:
     container_name: ragflow-infinity
     profiles:
@@ -133,6 +171,8 @@ services:
 volumes:
   esdata01:
     driver: local
+  osdata01:
+    driver: local
   infinity_data:
     driver: local
   mysql_data:
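
The watermark values in the diff look inverted at first glance (low=5gb,
high=3gb, flood_stage=2gb). As far as I understand the setting, absolute
byte values are read as minimum free disk space, so the thresholds
descend. The sketch below illustrates that reading; the function name and
stage labels are hypothetical, and the strict `<` comparison is an
assumption.

```python
GB = 1024 ** 3

# Free-space thresholds copied from the compose environment above.
WATERMARKS_FREE_BYTES = {
    "low": 5 * GB,
    "high": 3 * GB,
    "flood_stage": 2 * GB,
}


def watermark_stage(free_bytes: int) -> str:
    """Classify free disk space against the configured watermarks."""
    if free_bytes < WATERMARKS_FREE_BYTES["flood_stage"]:
        return "flood_stage"  # indices on the node become read-only
    if free_bytes < WATERMARKS_FREE_BYTES["high"]:
        return "high"  # shards start being relocated away from the node
    if free_bytes < WATERMARKS_FREE_BYTES["low"]:
        return "low"  # no new shards are allocated to the node
    return "ok"


if __name__ == "__main__":
    for free in (10 * GB, 4 * GB, 2 * GB + GB // 2, 1 * GB):
        print(free // GB, "GiB free ->", watermark_stage(free))
```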