refactor: optimize agent list payload and improve multimodal detection logic (#12942)

## Description

This PR focuses on API performance optimization and refining the model capability detection logic in the Agent/Canvas module.

### 1. Performance Optimization (Backend)

- **Changes**: Removed `cls.model.dsl` from query fields in `UserCanvasService.get_by_tenant_ids`.
- **Reasoning**: The `dsl` object is large and unnecessary for the Agent list view. Excluding it reduces the payload size of the `/v1/canvas/list` API, leading to faster serialization and reduced network latency.
- **Consistency**: Full DSL data remains accessible via the individual `/v1/canvas/get/<id>` endpoint used in the detail view.

### 2. Multimodal Detection Refinement (Frontend)

- **Changes**: Replaced `model_type === LlmModelType.Image2text` with `tags?.includes('IMAGE2TEXT')`.
- **Reasoning**: In RAGFlow, `model_type` defines the primary role of a model (e.g., `chat`). However, many advanced Chat models are also vision-capable. Since `model_type` is a single-value field, it cannot represent these multiple capabilities.
- **Solution**: Utilizing the `tags` field (which supports multiple attributes) to check for `IMAGE2TEXT` ensures that models like `gpt-5.2-pro` correctly display multimodal input options.

## Type of Change

- [x] Bug fix (logic correction for multimodal detection)
- [x] Optimization (performance improvement for list API)

## Main Changes

- `api/db/services/canvas_service.py`: Optimized DB query by excluding heavy DSL fields.
- `web/src/pages/agent/form/agent-form/index.tsx`: Enhanced capability detection using the tags system.

## Verification

- [x] Verified Agent list loads faster with reduced response payload.
- [x] Confirmed that `chat` models with the `IMAGE2TEXT` tag now correctly enable the multimodal input UI.
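
The list/detail split from "1. Performance Optimization" can be pictured from the client side with a minimal sketch. The type names, the `toListItem` helper, and the sample data below are hypothetical and not part of this PR; only the idea that list rows omit `dsl` while the detail view keeps it comes from the description.

```typescript
// Hypothetical shapes for illustration; only `dsl` being excluded from
// list rows reflects the change described in this PR.
interface CanvasDetail {
  id: string;
  title: string;
  description: string;
  dsl: Record<string, unknown>; // large graph definition, needed only in the detail view
}

// A list row is a detail record without the heavy `dsl` field.
type CanvasListItem = Omit<CanvasDetail, 'dsl'>;

// Drop `dsl` and keep every other field.
function toListItem(detail: CanvasDetail): CanvasListItem {
  const { dsl: _dsl, ...rest } = detail;
  return rest;
}

// Length of the JSON text, used here as a rough proxy for payload size.
function serializedLength(value: unknown): number {
  return JSON.stringify(value).length;
}

// Made-up sample record with a moderately sized DSL.
const sampleDetail: CanvasDetail = {
  id: 'canvas-1',
  title: 'Demo agent',
  description: 'Example record',
  dsl: {
    nodes: Array.from({ length: 50 }, (_, i) => ({ id: `node-${i}` })),
  },
};

const sampleListItem = toListItem(sampleDetail);
console.log(serializedLength(sampleListItem) < serializedLength(sampleDetail));
```

Any list-view code that previously read `dsl` from a list row would, under this split, need to fetch the full record from the detail endpoint instead.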
2026-02-03 00:55:10 +08:00 · 2026-02-02 17:35:54 +08:00
parent 0121866ce4
commit 2e5a18602b
2 changed files with 1 addition and 2 deletions
--- a/api/db/services/canvas_service.py
+++ b/api/db/services/canvas_service.py
@@ -146,7 +146,6 @@ class UserCanvasService(CommonService):
            cls.model.id,
            cls.model.avatar,
            cls.model.title,
-            cls.model.dsl,
            cls.model.description,
            cls.model.permission,
            cls.model.user_id.alias("tenant_id"),
--- a/web/src/pages/agent/form/agent-form/index.tsx
+++ b/web/src/pages/agent/form/agent-form/index.tsx
@@ -162,7 +162,7 @@ function AgentForm({ node }: INextOperatorForm) {
        <FormWrapper>
          {isSubAgent && <DescriptionField></DescriptionField>}
          <LargeModelFormField showSpeech2TextModel></LargeModelFormField>
-          {findLlmByUuid(llmId)?.model_type === LlmModelType.Image2text && (
+          {findLlmByUuid(llmId)?.tags?.includes('IMAGE2TEXT') && (
            <QueryVariable
              name="visual_files_var"
              label="Visual Input File"
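
The difference between the old and new checks in this hunk can be illustrated with a small standalone sketch. It assumes `tags` is either a comma-separated string or an array of strings, and that `LlmModelType.Image2text` corresponds to the value `'image2text'`; the `LlmOption` type, the `hasTag` helper, and the sample models are hypothetical and do not appear in the codebase.

```typescript
// Hypothetical model shape; the real type in the codebase may differ.
interface LlmOption {
  model_type: string;
  tags?: string | string[]; // assumed: comma-separated string or array
}

// Exact-token tag check that works for both assumed representations.
function hasTag(model: LlmOption | undefined, tag: string): boolean {
  const raw = model?.tags;
  if (!raw) return false;
  const tokens = Array.isArray(raw) ? raw : raw.split(',');
  const wanted = tag.toUpperCase();
  return tokens.some((t) => t.trim().toUpperCase() === wanted);
}

// A chat model that is also vision-capable (made-up sample data).
const visionChat: LlmOption = { model_type: 'chat', tags: 'LLM,CHAT,IMAGE2TEXT' };
// A chat model without the vision tag.
const textOnlyChat: LlmOption = { model_type: 'chat', tags: 'LLM,CHAT,128K' };

// Old check: fails for the vision-capable chat model because its primary role is `chat`.
const oldCheck = visionChat.model_type === 'image2text';
// New check: passes because the capability is expressed as a tag.
const newCheck = hasTag(visionChat, 'IMAGE2TEXT');

console.log(oldCheck, newCheck, hasTag(textOnlyChat, 'IMAGE2TEXT'));
```

If `tags` is a plain string, `tags?.includes('IMAGE2TEXT')` is a substring test, so splitting into tokens first, as in the sketch, would avoid matching a longer tag that merely contains `IMAGE2TEXT`. Whether that matters depends on the tag vocabulary actually in use.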