杨耀飞 2664522168 v0.2.1: sync thinking mode / compatibility_mode / reasoning_effort features from the official plugin
## Problem
The official OpenAI-API-compatible plugin already supports thinking mode, compatibility_mode, and reasoning_effort. Our vLLM plugin declares an agent_thought_support field in its yaml, but the corresponding logic is entirely missing from the code, so configuring thinking mode has no effect.

## Discovery
Comparing models/openai_api_compatible/models/llm/llm.py in the official langgenius/dify-official-plugins repository against our implementation revealed the following missing features:
- P0: the enable_thinking toggle and _wrap_thinking_by_reasoning_content (adapting vLLM's reasoning field)
- P1: compatibility_mode, reasoning_effort, and the _filter_thinking_stream/result thinking-content filters
We also confirmed that vLLM's OpenAI-compatible API supports all of the above (reasoning_effort is a top-level parameter; enable_thinking is passed via chat_template_kwargs).

## Implementation plan
1. Port the P0 and P1 features from the official plugin, adapting for vLLM's differences (vLLM returns reasoning rather than reasoning_content)
2. Keep extra_body as this plugin's distinctive support for vLLM extra parameters
3. Remove the old OpenAILargeLanguageModel intermediate class; have VllmLargeLanguageModel inherit directly from OAICompatLargeLanguageModel
4. Add a compatibility_mode yaml configuration field (strict/extended)
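Step 1's field adaptation might look roughly like this — a hypothetical simplification of _wrap_thinking_by_reasoning_content, not the plugin's actual code:

```python
def wrap_thinking(delta: dict, is_reasoning: bool) -> tuple[str, bool]:
    """Fold vLLM's `reasoning` delta field into <think> tags.

    The official plugin reads `reasoning_content`; vLLM streams the
    same information under `reasoning`, so both keys are checked.
    Returns the text chunk to emit and the updated reasoning state.
    """
    reasoning = delta.get("reasoning") or delta.get("reasoning_content")
    content = delta.get("content") or ""
    if reasoning:
        if not is_reasoning:
            return "<think>\n" + reasoning, True  # open the think block
        return reasoning, True                    # continue inside it
    if is_reasoning and content:
        return "\n</think>" + content, False      # close it on first real content
    return content, is_reasoning
```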

## Implementation
- models/llm/llm.py: rewritten; merged the official plugin's thinking mode, structured output, and reasoning_effort implementations, and added _validate_extra_body and the extra_body parameter rule
- provider/vllm.yaml: added the compatibility_mode option
- manifest.yaml: version bumped to 0.2.1
- README.md: added 0.2.1 release notes
- requirements.txt: dify_plugin bumped to 0.7.4
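The _validate_extra_body rule mentioned above could plausibly look like this sketch (the real implementation lives in models/llm/llm.py and may differ):

```python
import json

def validate_extra_body(value) -> dict:
    """Accept extra_body as a dict or a JSON object string.

    Dify passes model parameters as strings, so a JSON-encoded
    object must be parsed before it is merged into the request.
    """
    if isinstance(value, dict):
        return value
    if isinstance(value, str):
        parsed = json.loads(value)
        if not isinstance(parsed, dict):
            raise ValueError("extra_body must be a JSON object")
        return parsed
    raise ValueError(f"unsupported extra_body type: {type(value).__name__}")
```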

## Summary
Version bumped to 0.2.1. Core changes: end-to-end thinking mode support (enable_thinking → chat_template_kwargs → reasoning response → think-tag filtering), compatibility_mode to control extra-parameter injection, and reasoning_effort to control reasoning depth.
2026-04-22 11:52:08 +08:00

vllm-openai dify provider plugin to support extra parameters

NOTE!!!

This plugin is an extension of the official OpenAI-API-compatible plugin, adding support for the extra parameters of vLLM's OpenAI-Compatible Server. If you do not need any of these extra parameters, please use the official OpenAI-API-compatible plugin instead.


version 0.2.0 notice

Since Dify's OpenAI-compatible provider and vLLM's OpenAI-compatible server now both fully support structured output, the plugin's original main feature is no longer necessary. In 0.2.0, all of the original parameters are removed; instead, extra_body is used to pass extra parameters directly to the vLLM backend.
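For illustration, passing one of vLLM's extra parameters (guided_json, for schema-constrained decoding) through extra_body might look like the following; the model name and schema are made up:

```python
# Hypothetical JSON schema to constrain the model's output.
schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}, "temp_c": {"type": "number"}},
    "required": ["city", "temp_c"],
}

# Everything under extra_body is forwarded verbatim to the vLLM server,
# so extra parameters like guided_json need no per-parameter plugin support.
request_kwargs = {
    "model": "my-vllm-model",  # placeholder
    "messages": [{"role": "user", "content": "Weather in Oslo as JSON"}],
    "extra_body": {"guided_json": schema},
}
# client.chat.completions.create(**request_kwargs)  # with an openai client
```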


version 0.2.1 notice

Based on the latest official OpenAI-API-compatible plugin implementation, the following features have been added:

  • Thinking Mode: Support enable_thinking toggle, compatible with vLLM's chat_template_kwargs and reasoning response field
  • Compatibility Mode: New compatibility_mode config (strict/extended), controls whether to inject extra parameters
  • Reasoning Effort: Support reasoning_effort parameter (none/low/medium/high), natively supported by vLLM
  • Structured Output: Support response_format, json_schema, reasoning_format parameters
  • Thinking Content Filter: Automatically filter <think/> content when thinking is disabled
  • Keep extra_body as the generic way to pass vLLM extra parameters
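The thinking-content filter can be sketched as a single regex pass — a simplified stand-in for the plugin's filter, assuming the reasoning is wrapped in <think>…</think>:

```python
import re

# Matches a self-closing <think/> tag or a complete <think>…</think>
# block; DOTALL lets the reasoning span multiple lines.
_THINK_RE = re.compile(r"<think\s*/>|<think>.*?</think>\s*", re.DOTALL)

def filter_thinking(text: str) -> str:
    """Strip reasoning content when thinking mode is disabled."""
    return _THINK_RE.sub("", text)

print(filter_thinking("<think>chain of thought</think>The answer is 42."))
# → The answer is 42.
```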

Repo

https://github.com/yangyaofei/dify-vllm-provider

Description

vLLM's OpenAI-compatible server accepts extra parameters that Dify's built-in OpenAI-compatible provider cannot pass through.

This plugin provides a vllm-openai provider built on top of Dify's OpenAI-compatible provider, exposing those extra parameters through extra_body.

Add a model the same way as with the OpenAI-compatible provider, pointing at a vLLM-openai backend:

(screenshot: add model)

Configure the model's guided decoding with extra_body:

(screenshot: use guided)
