Fix: Enforce default embedding model in create_dataset / update_dataset (#8486)

### What problem does this PR solve?

Previous:
- Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI'
- Did not respect user-configured default embedding_model

Now:
- Correctly prioritizes user-configured default embedding_model

Other:
- Make embedding_model optional in CreateDatasetReq with proper None
handling
- Add default embedding model fallback in dataset update when empty
- Enhance validation utils to handle None values and string
normalization
- Update SDK default embedding model to None to match API changes
- Adjust related test cases to reflect new validation rules

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
This commit is contained in:
Liu An
2025-06-25 16:41:32 +08:00
committed by GitHub
parent 340354b79c
commit dac5bcdf17
7 changed files with 52 additions and 30 deletions

View File

@ -53,7 +53,7 @@ class RAGFlow:
name: str,
avatar: Optional[str] = None,
description: Optional[str] = None,
embedding_model: Optional[str] = "BAAI/bge-large-zh-v1.5@BAAI",
embedding_model: Optional[str] = None,
permission: str = "me",
chunk_method: str = "naive",
parser_config: Optional[DataSet.ParserConfig] = None,