## What's changed
fix: unify embedding model fallback logic for both TEI and non-TEI
Docker deployments
> This fix targets **Docker / `docker-compose` deployments**, ensuring a
valid default embedding model is always set—regardless of the compose
profile used.
## Changes
| Scenario | New Behavior |
|--------|--------------|
| **Non-`tei-` profile** (e.g., default deployment) | `EMBEDDING_MDL` is now correctly initialized from `EMBEDDING_CFG` (derived from `user_default_llm`), ensuring custom defaults like `bge-m3@Ollama` are properly applied to new tenants. |
| **`tei-` profile** (`COMPOSE_PROFILES` contains `tei-`) | Still respects the `TEI_MODEL` environment variable. If unset, falls back to `EMBEDDING_CFG`. Only when both are empty does it use the built-in default (`BAAI/bge-small-en-v1.5`), preventing an empty embedding model. |
## Why This Change?
- **In non-TEI mode**: The previous logic would reset `EMBEDDING_MDL` to
an empty string, causing pre-configured defaults (e.g., `bge-m3@Ollama`
in the Docker image) to be ignored—leading to tenant initialization
failures or silent misconfigurations.
- **In TEI mode**: Users need the ability to override the model via
`TEI_MODEL`, but without a safe fallback, missing configuration could
break the system. The new logic adopts a **“config-first,
env-var-override”** strategy for robustness in containerized
environments.
## Implementation
- Updated the assignment logic for `EMBEDDING_MDL` in
`rag/common/settings.py` to follow a unified fallback chain:
`EMBEDDING_CFG` → `TEI_MODEL` (if a `tei-` profile is active) → built-in default
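
The precedence described in the Changes table can be sketched roughly as follows. This is a minimal illustration, not the code in `settings.py`: the function name `resolve_embedding_model` and its plain-string arguments are hypothetical, and the real `EMBEDDING_CFG` may have a different shape.

```python
from typing import Optional

# Built-in default named in the PR description.
BUILTIN_DEFAULT = "BAAI/bge-small-en-v1.5"


def resolve_embedding_model(
    compose_profiles: str,
    embedding_cfg: str,
    tei_model: Optional[str],
) -> str:
    """Hypothetical helper: pick the default embedding model name.

    compose_profiles: value of COMPOSE_PROFILES (may be empty).
    embedding_cfg:    model name taken from EMBEDDING_CFG (assumed to be a string here).
    tei_model:        value of TEI_MODEL, or None / "" when unset.
    """
    if "tei-" in compose_profiles:
        # TEI profile: the env var wins, then the configured default,
        # then the built-in default so the result is never empty.
        return tei_model or embedding_cfg or BUILTIN_DEFAULT
    # Non-TEI profile: use the configured default as-is.
    return embedding_cfg


if __name__ == "__main__":
    print(resolve_embedding_model("", "bge-m3@Ollama", None))
    print(resolve_embedding_model("tei-gpu", "", None))
    print(resolve_embedding_model("tei-gpu", "bge-m3@Ollama", "my-model"))
```

Using `or` treats an empty string the same as an unset value, which matches the stated goal of never ending up with an empty embedding model in TEI mode.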
## Testing
Verified in Docker deployments:
1. **`COMPOSE_PROFILES=`** (no TEI)
→ New tenants get `bge-m3@Ollama` as the default embedding model
2. **`COMPOSE_PROFILES=tei-gpu` with no `TEI_MODEL` set**
→ Falls back to `BAAI/bge-small-en-v1.5`
3. **`COMPOSE_PROFILES=tei-gpu` with `TEI_MODEL=my-model`**
→ New tenants use `my-model` as the embedding model

Closes #8916, fix #11522, fix #11306
### What problem does this PR solve?
This Pull Request introduces native support for Google Cloud Storage
(GCS) as an optional object storage backend.
Currently, RAGFlow relies on a limited set of storage options. This
feature addresses the need for seamless integration with GCP
environments, allowing users to leverage a fully managed, highly
durable, and scalable storage service (GCS) instead of needing to deploy
and maintain third-party object storage solutions. This simplifies
deployment, especially for users running on GCP infrastructure like GKE
or Cloud Run.
The implementation uses a single GCS bucket defined via configuration,
mapping RAGFlow's internal logical storage units (or "buckets") to
folder prefixes within that GCS bucket to maintain data separation.
This architectural choice avoids the operational complexities associated
with dynamically creating and managing unique GCS buckets for every
logical unit.
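
The prefix mapping can be illustrated with a small sketch. The helper name `gcs_object_name` and the example identifiers are hypothetical and are not taken from the PR. The sketch only shows how a logical bucket plus a file name could become one object name inside the single configured GCS bucket.

```python
def gcs_object_name(logical_bucket: str, filename: str) -> str:
    """Hypothetical helper: map (logical bucket, file name) to a GCS object name.

    GCS has a flat namespace, so a "folder" is just a shared name prefix
    ending in "/". Each logical bucket gets its own prefix.
    """
    prefix = logical_bucket.strip("/")
    name = filename.lstrip("/")
    if not prefix or not name:
        raise ValueError("logical bucket and file name must be non-empty")
    return f"{prefix}/{name}"


if __name__ == "__main__":
    # Two logical buckets holding a file with the same name do not collide.
    print(gcs_object_name("kb-123", "report.pdf"))
    print(gcs_object_name("kb-456", "report.pdf"))
```

Because every object name starts with its logical bucket's prefix, listing or deleting one logical unit reduces to a prefix-scoped operation on the shared bucket.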
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add OceanBase doc engine. Close #5350
### Type of change
- [x] New Feature (non-breaking change which adds functionality)