Refactor: enhance next_step prompt (#12117)

### What problem does this PR solve?

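Change: enhance the `next_step` prompt. The short reason/act guideline is replaced with a private reasoning step, a structured reflection that is mandatory before `complete_task`, and a final-action section.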

### Type of change

- [x] Refactoring
This commit is contained in:
buua436
2025-12-23 15:57:55 +08:00
committed by GitHub
parent 00bb6fbd28
commit 2369be7244


@@ -8,9 +8,9 @@ Your job is:
{{ task_analysis }}
# ========== TOOLS (JSON-Schema) ==========
You may invoke only the tools listed below.
Return a JSON array of objects in which item is with exactly two top-level keys:
• "name": the tool to call
You may invoke only the tools listed below.
Return a JSON array of objects in which item is with exactly two top-level keys:
• "name": the tool to call
• "arguments": an object whose keys/values satisfy the schema
{{ desc }}
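The output contract in this hunk (a JSON array whose items carry exactly the keys "name" and "arguments") can be checked mechanically before any tool is dispatched. The following is a minimal sketch, not the project's actual validator: `validate_tool_calls` and the tool name `search` are hypothetical, and the check covers only the shape of each item, not whether the arguments satisfy the tool's JSON Schema.

```python
import json

# The two top-level keys the prompt requires on every tool call.
REQUIRED_KEYS = {"name", "arguments"}


def validate_tool_calls(raw: str, allowed_tools: set[str]) -> list[dict]:
    """Parse model output and enforce the shape described in the prompt.

    Raises ValueError on invalid JSON (json.JSONDecodeError is a subclass
    of ValueError), on a non-array payload, on missing or extra keys, on an
    unknown tool name, or on arguments that are not a JSON object.
    """
    calls = json.loads(raw)
    if not isinstance(calls, list):
        raise ValueError("expected a JSON array of tool calls")
    for i, call in enumerate(calls):
        if not isinstance(call, dict) or set(call) != REQUIRED_KEYS:
            raise ValueError(
                f"item {i} must have exactly the keys 'name' and 'arguments'"
            )
        if call["name"] not in allowed_tools:
            raise ValueError(f"item {i} names an unlisted tool: {call['name']!r}")
        if not isinstance(call["arguments"], dict):
            raise ValueError(f"item {i}: 'arguments' must be a JSON object")
    return calls


# Hypothetical usage with a made-up tool name.
calls = validate_tool_calls(
    '[{"name": "search", "arguments": {"query": "ragflow"}}]',
    allowed_tools={"search"},
)
```

Rejecting extra fields up front mirrors the warning later in the prompt that any output containing extra fields will be rejected.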
@@ -82,11 +82,57 @@ If you encounter issues:
⚠️ Any output that is not valid JSON or that contains extra fields will be rejected.
# ========== REASONING & REFLECTION ==========
You may think privately (not shown to the user) before producing each JSON object.
Internal guideline:
1. **Reason**: Analyse the user question; decide which tools (if any) are needed.
2. **Act**: Emit the JSON object to call the tool.
# ========== PRIVATE REASONING & REFLECTION ==========
You may think privately inside `<think>` tags.
This content will NOT be shown to the user.
## Step 1: Core Reasoning
- Analyze the task requirements
- Decide whether tools are required
- Decide if parallel execution is appropriate
## Step 2: Structured Reflection (MANDATORY before `complete_task`)
### Context
- Goal: {{ task_analysis }}
- Executed tool calls so far (if any): reflect from conversation history
### Task Complexity Assessment
Evaluate the task along these dimensions:
- Scope Breadth: Single-step (1) | Multi-step (2) | Multi-domain (3)
- Data Dependency: Self-contained (1) | External inputs (2) | Multiple sources (3)
- Decision Points: Linear (1) | Few branches (2) | Complex logic (3)
- Risk Level: Low (1) | Medium (2) | High (3)
Compute the **Complexity Score (4–12)**.
### Reflection Depth Control
- 4–5: Brief sanity check
- 6–8: Check completeness + risks
- 9–12: Full reflection with alternatives
### Reflection Checklist
- Goal alignment: Is the objective truly satisfied?
- Step completion: Any planned step missing?
- Information adequacy: Is evidence sufficient?
- Errors or uncertainty: Any low-confidence result?
- Tool misuse risk: Wrong tool / missing tool?
### Decision Gate
Ask yourself explicitly:
> “If I stop now and call `complete_task`, would a downstream agent or user reasonably say something is missing or wrong?”
If YES → continue with tools
If NO → safe to call `complete_task`
---
# ========== FINAL ACTION ==========
After reflection, emit ONLY ONE of the following:
- A JSON array of tool calls
- OR a single `complete_task` call
Today is {{ today }}. Remember that success in answering questions accurately is paramount - take all necessary steps to ensure your answer is correct.
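The complexity assessment added in this hunk is simple arithmetic: four dimensions, each rated 1 to 3, sum to a score between 4 and 12, and the score selects a reflection depth. In the prompt the model performs this in its private reasoning, not in code; the sketch below only makes the arithmetic explicit. The names `Complexity` and `reflection_depth` are hypothetical, and the band boundaries (4–5, 6–8, 9–12) are read from the "Reflection Depth Control" list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Complexity:
    """One rating (1, 2, or 3) per dimension named in the prompt."""

    scope_breadth: int
    data_dependency: int
    decision_points: int
    risk_level: int

    def score(self) -> int:
        ratings = (
            self.scope_breadth,
            self.data_dependency,
            self.decision_points,
            self.risk_level,
        )
        if any(r not in (1, 2, 3) for r in ratings):
            raise ValueError("each dimension must be rated 1, 2, or 3")
        return sum(ratings)  # always in the range 4..12


def reflection_depth(score: int) -> str:
    """Map a complexity score to the depth listed in the prompt."""
    if not 4 <= score <= 12:
        raise ValueError("score must be between 4 and 12")
    if score <= 5:
        return "Brief sanity check"
    if score <= 8:
        return "Check completeness + risks"
    return "Full reflection with alternatives"


# A multi-step task with external inputs, linear logic, and medium risk.
task = Complexity(scope_breadth=2, data_dependency=2, decision_points=1, risk_level=2)
depth = reflection_depth(task.score())
```

Because the lowest possible sum is 4, a score below that indicates a missing or mis-rated dimension, which is why the sketch raises instead of falling back to the brief check.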