Feat: Redesign and refactor agent module (#9113)

### What problem does this PR solve?

#9082 #6365

<u>**WARNING: this change is not compatible with the older version of the `Agent` module, which means that agents built with older versions can no longer work.**</u>

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
Commit d9fe279dde (parent 07e37560fc), authored by Kevin Hu on 2025-07-30 19:41:09 +08:00 and committed via GitHub.
124 changed files with 7744 additions and 18226 deletions.

rag/prompts/__init__.py Normal file

@@ -0,0 +1,6 @@
# Re-export every public name from rag.prompts.prompts at the package level.
from . import prompts

__all__ = [name for name in dir(prompts) if not name.startswith('_')]
globals().update({name: getattr(prompts, name) for name in __all__})
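With the shim above, everything defined in `rag/prompts/prompts.py` becomes importable from the package root. A minimal usage sketch (the imported names are functions that appear later in this diff):

```python
# Sketch only: these names are defined in rag/prompts/prompts.py below and
# re-exported here, so callers no longer need the full module path.
from rag.prompts import citation_prompt, kb_prompt

system_suffix = citation_prompt()  # renders citation_prompt.md through Jinja2
```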

rag/prompts/analyze_task_system.md Normal file

@@ -0,0 +1,8 @@
Your responsibility is to execute assigned tasks to a high standard. Please:
1. Carefully analyze the task requirements.
2. Develop a reasonable execution plan.
3. Execute step-by-step and document the reasoning process.
4. Provide clear and accurate results.
If difficulties are encountered, clearly state the problem and explore alternative approaches.

rag/prompts/analyze_task_user.md Normal file

@@ -0,0 +1,20 @@
Please analyze the following task:
Task: {{ task }}
Context: {{ context }}
**Analysis Requirements:**
1. Is it just small talk? (If yes, no further plan or analysis is needed.)
2. What is the core objective of the task?
3. What is the complexity level of the task?
4. What types of specialized skills are required?
5. Does the task need to be decomposed into subtasks? (If yes, propose the subtask structure)
6. How can you tell that the task or its subtasks cannot succeed even after a few rounds of interaction?
7. What are the expected success criteria?
**Available Sub-Agents and Their Specializations:**
{{ tools_desc }}
Provide a detailed analysis of the task based on the above requirements.
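The double-braced placeholders above are Jinja2 variables; `rag/prompts/prompts.py` (later in this diff) renders them before the text is sent to the model. A minimal, self-contained sketch of that rendering step, with placeholder values standing in for the real template and inputs:

```python
import jinja2

# Sketch: render the task-analysis template the same way prompts.py does.
# TEMPLATE_TEXT stands in for the markdown shown above.
TEMPLATE_TEXT = "Task: {{ task }}\nContext: {{ context }}\n{{ tools_desc }}"

env = jinja2.Environment(autoescape=False, trim_blocks=True, lstrip_blocks=True)
user_prompt = env.from_string(TEMPLATE_TEXT).render(
    task="Summarize last quarter's sales figures",  # placeholder task
    context="",                                     # prior conversation, if any
    tools_desc="(JSON-Schema descriptions of the available sub-agents)",
)
print(user_prompt)
```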

rag/prompts/citation_plus.md Normal file

@@ -0,0 +1,13 @@
You are an agent that adds correct citations to the text given by the user.
You are given a piece of text that was generated based on the provided sources, each of which is marked with an [ID:<ID>] tag.
However, those sources are not yet cited in the text.
Your task is to enhance user trust by generating correct, appropriate citations for this report.
{{ example }}
<context>
{{ sources }}
</context>

rag/prompts/citation_prompt.md

@@ -1,46 +1,108 @@
-## Citation Requirements
-Based on the provided document or chat history, add citations to the input text using the format specified later.
-- Use a uniform citation format such as [ID:i] [ID:j], where "i" and "j" are document IDs enclosed in square brackets. Separate multiple IDs with spaces (e.g., [ID:0] [ID:1]).
-- Citation markers must be placed at the end of a sentence, separated by a space from the final punctuation (e.g., period, question mark).
-- A maximum of 4 citations are allowed per sentence.
-- DO NOT insert citations if the content is not from retrieved chunks.
-- DO NOT use standalone Document IDs (e.g., #ID#).
-- Citations MUST always follow the [ID:i] format.
-- STRICTLY prohibit the use of strikethrough symbols (e.g., ~~) or any other non-standard formatting syntax.
-- Any violation of the above rules — including incorrect formatting, prohibited styles, or unsupported citations — will result in no citation being added for that sentence.
+# Citation Requirements:
+---
+## Technical Rules:
+- Use format: [ID:i] or [ID:i] [ID:j] for multiple sources
+- Place citations at the end of sentences, before punctuation
+- Maximum 4 citations per sentence
+- DO NOT cite content not from <context></context>
+- DO NOT modify whitespace or original text
+- STRICTLY prohibit non-standard formatting (~~, etc.)
-## Example START
+## What MUST Be Cited:
+1. **Quantitative data**: Numbers, percentages, statistics, measurements
+2. **Temporal claims**: Dates, timeframes, sequences of events
+3. **Causal relationships**: Claims about cause and effect
+4. **Comparative statements**: Rankings, comparisons, superlatives
+5. **Technical definitions**: Specialized terms, concepts, methodologies
+6. **Direct attributions**: What someone said, did, or believes
+7. **Predictions/forecasts**: Future projections, trend analyses
+8. **Controversial claims**: Disputed facts, minority opinions
-<SYSTEM>: Here is the knowledge base:
+## What Should NOT Be Cited:
+- Common knowledge (e.g., "The sun rises in the east")
+- Transitional phrases
+- General introductions
+- Your own analysis or synthesis (unless directly from source)
-Document: Elon Musk Breaks Silence on Crypto, Warns Against Dogecoin ...
-URL: https://blockworks.co/news/elon-musk-crypto-dogecoin
-ID: 0
-The Tesla co-founder advised against going all-in on dogecoin, but Elon Musk said its still his favorite crypto...
+# Comprehensive Examples:
-Document: Elon Musk's Dogecoin tweet sparks social media frenzy
-ID: 1
-Musk said he is 'willing to serve' D.O.G.E. shorthand for Dogecoin.
+## Example 1: Data and Statistics
+<context>
+ID: 45
+└── Content: The global smartphone market grew by 7.8% in Q3 2024, with Samsung holding 20.6% market share and Apple at 15.8%.
-Document: Causal effect of Elon Musk tweets on Dogecoin price
-ID: 2
-If you think of Dogecoin — the cryptocurrency based on a meme — you cant help but also think of Elon Musk...
+ID: 46
+└── Content: 5G adoption reached 1.5 billion users worldwide by October 2024.
+</context>
-Document: Elon Musk's Tweet Ignites Dogecoin's Future In Public Services
-ID: 3
-The market is heating up after Elon Musk's announcement about Dogecoin. Is this a new era for crypto?...
+USER: How is the smartphone market performing?
-The above is the knowledge base.
+ASSISTANT:
+The smartphone industry is showing strong recovery. The global smartphone market grew by 7.8% in Q3 2024 [ID:45]. Samsung leads with 20.6% market share, followed by Apple at 15.8% [ID:45]. This growth is partly driven by 5G technology, which reached 1.5 billion users worldwide by October 2024 [ID:46].
-<USER>: What's Elon's view on dogecoin?
+## Example 2: Causal Relationships and Predictions
+<context>
+ID: 78
+└── Content: Rising interest rates have led to a 15% decline in housing starts. Economists predict this trend will continue through 2025.
-<ASSISTANT>:
-Musk has consistently expressed his fondness for Dogecoin, often citing its humor and the inclusion of dogs in its branding. He has referred to it as his favorite cryptocurrency [ID:0] [ID:1].
+ID: 79
+└── Content: The Federal Reserve's aggressive monetary policy is the primary driver behind the real estate slowdown.
+</context>
-Recently, Musk has hinted at potential future roles for Dogecoin. His tweets have sparked speculation about Dogecoin's potential integration into public services [ID:3].
+USER: What's happening in the real estate market?
-Overall, while Musk enjoys Dogecoin and often promotes it, he also warns against over-investing in it, reflecting both his personal amusement and caution regarding its speculative nature.
+ASSISTANT:
+The real estate market is experiencing significant challenges. Rising interest rates have led to a 15% decline in housing starts [ID:78]. The Federal Reserve's aggressive monetary policy is the primary driver behind this real estate slowdown [ID:79]. Economists predict this trend will continue through 2025 [ID:78]. These factors suggest a prolonged period of market adjustment ahead.
-## Example END
+## Example 3: Technical Concepts and Comparisons
+<context>
+ID: 101
+└── Content: Quantum computing uses quantum bits (qubits) that can exist in superposition, unlike classical bits which are either 0 or 1.
+ID: 102
+└── Content: Current quantum computers can solve certain problems 158 million times faster than classical supercomputers.
+</context>
+USER: Explain quantum computing advantages.
+ASSISTANT:
+Quantum computing represents a fundamental shift in computational capability. Unlike classical computers, quantum computing uses quantum bits (qubits) that can exist in superposition, rather than being limited to just 0 or 1 [ID:101]. This unique property enables extraordinary performance gains. Current quantum computers can solve certain problems 158 million times faster than classical supercomputers [ID:102]. This technology promises to revolutionize fields requiring complex calculations.
+## Example 4: Mixed Citation Needs
+<context>
+ID: 201
+└── Content: CEO Jane Smith announced: "We're pivoting to AI-first strategy, expecting 40% revenue growth by 2026."
+ID: 202
+└── Content: Industry analysts remain skeptical, with Morgan Stanley rating the stock as "underperform."
+ID: 203
+└── Content: The company's AI division currently represents only 5% of total revenue.
+</context>
+USER: What's the company's new strategy?
+ASSISTANT:
+The company is undergoing a major strategic transformation. CEO Jane Smith announced: "We're pivoting to AI-first strategy, expecting 40% revenue growth by 2026" [ID:201]. However, this ambitious goal faces skepticism, as industry analysts remain doubtful, with Morgan Stanley rating the stock as "underperform" [ID:202]. The challenge is significant given that the company's AI division currently represents only 5% of total revenue [ID:203]. Success will depend on execution of this bold vision.
+## Example 5: When NOT to Cite
+<context>
+ID: 301
+└── Content: Paris is the capital of France. The Eiffel Tower was completed in 1889.
+</context>
+USER: Tell me about Paris.
+ASSISTANT:
+Paris is the capital of France. It's known for its rich history, culture, and architecture. The Eiffel Tower was completed in 1889 [ID:301]. The city attracts millions of tourists annually. Paris remains one of the world's most visited destinations.
+(Note: Only the specific date needs citation, not common knowledge about Paris)
+--- Examples END ---
+REMEMBER:
+- Cite FACTS, not opinions or transitions
+- Each citation supports the ENTIRE sentence
+- When in doubt, ask: "Would a fact-checker need to verify this?"
+- Place citations at sentence end, before punctuation
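Since downstream code has to locate these markers in the generated answer, a small illustration of matching the `[ID:i]` format may help; this is a sketch for clarity, not code from this PR:

```python
import re

# Sketch: pull citation IDs in the [ID:i] format described above out of a sentence.
CITATION_RE = re.compile(r"\[ID:(\d+)\]")

sentence = "Samsung leads with 20.6% market share, followed by Apple at 15.8% [ID:45]."
print([int(i) for i in CITATION_RE.findall(sentence)])  # -> [45]
```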

rag/prompts/next_step.md Normal file

@@ -0,0 +1,63 @@
You are an expert Planning Agent tasked with solving problems efficiently through structured plans.
Your job is:
1. Based on the task analysis, choose the right tools to execute.
2. Track progress and adapt the plan (tool calls) when necessary.
3. Use `complete_task` when there is no further step to take with tools (either all necessary steps are done, or there is little hope of success).
# ========== TASK ANALYSIS =============
{{ task_analisys }}
# ========== TOOLS (JSON-Schema) ==========
You may invoke only the tools listed below.
Return a JSON array of objects, each with exactly two top-level keys:
• "name": the tool to call
• "arguments": an object whose keys/values satisfy the schema
{{ desc }}
# ========== RESPONSE FORMAT ==========
**When you need a tool**
Return ONLY the JSON (no additional keys, no commentary; end with `<|stop|>`), for example:
[{
"name": "<tool_name1>",
"arguments": { /* tool arguments matching its schema */ }
},{
"name": "<tool_name2>",
"arguments": { /* tool arguments matching its schema */ }
}...]<|stop|>
**When you are certain the task is solved OR no further information can be obtained**
Return ONLY:
[{
"name": "complete_task",
"arguments": { "answer": "<final answer text>" }
}]<|stop|>
<verification_steps>
Before providing a final answer:
1. Double-check all gathered information
2. Verify calculations and logic
3. Ensure answer matches exactly what was asked
4. Confirm answer format meets requirements
5. Run additional verification if confidence is not 100%
</verification_steps>
<error_handling>
If you encounter issues:
1. Try alternative approaches before giving up
2. Use different tools or combinations of tools
3. Break complex problems into simpler sub-tasks
4. Verify intermediate results frequently
5. Never return "I cannot answer" without exhausting all options
</error_handling>
⚠️ Any output that is not valid JSON or that contains extra fields will be rejected.
# ========== REASONING & REFLECTION ==========
You may think privately (not shown to the user) before producing each JSON object.
Internal guideline:
1. **Reason**: Analyse the user question; decide which tools (if any) are needed.
2. **Act**: Emit the JSON object to call the tool.
Today is {{ today }}. Remember that success in answering questions accurately is paramount - take all necessary steps to ensure your answer is correct.
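The contract above is plain JSON terminated by `<|stop|>`. For illustration only (the PR's own parsing lives in the agent module), a reply following this format could be consumed roughly like so:

```python
import json_repair

# Sketch: parse a planner reply that follows the response format above.
reply = '[{"name": "complete_task", "arguments": {"answer": "42"}}]<|stop|>'
calls = json_repair.loads(reply.split("<|stop|>")[0])

for call in calls:
    if call["name"] == "complete_task":
        print("final answer:", call["arguments"]["answer"])
    else:
        # otherwise dispatch call["name"] with call["arguments"] to the matching tool
        ...
```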

rag/prompts/prompt_template.py Normal file

@@ -0,0 +1,20 @@
import os
PROMPT_DIR = os.path.dirname(__file__)
_loaded_prompts = {}
def load_prompt(name: str) -> str:
if name in _loaded_prompts:
return _loaded_prompts[name]
path = os.path.join(PROMPT_DIR, f"{name}.md")
if not os.path.isfile(path):
raise FileNotFoundError(f"Prompt file '{name}.md' not found in prompts/ directory.")
with open(path, "r", encoding="utf-8") as f:
content = f.read().strip()
_loaded_prompts[name] = content
return content
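For reference, this loader is what `rag/prompts/prompts.py` (next file) uses to pull each markdown template by name; repeated calls are served from the `_loaded_prompts` cache:

```python
from rag.prompts.prompt_template import load_prompt

# Usage as in rag/prompts/prompts.py below: each <name>.md in this directory
# is read once, cached in _loaded_prompts, and later rendered with Jinja2.
NEXT_STEP = load_prompt("next_step")
CITATION_PROMPT_TEMPLATE = load_prompt("citation_prompt")
```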

rag/prompts/prompts.py Normal file

@@ -0,0 +1,415 @@
#
# Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import datetime
import json
import logging
import re
from copy import deepcopy
from typing import Tuple
import jinja2
import json_repair
from api.utils import hash_str2int
from rag.prompts.prompt_template import load_prompt
from rag.settings import TAG_FLD
from rag.utils import encoder, num_tokens_from_string
STOP_TOKEN="<|STOP|>"
COMPLETE_TASK="complete_task"
def get_value(d, k1, k2):
return d.get(k1, d.get(k2))
def chunks_format(reference):
return [
{
"id": get_value(chunk, "chunk_id", "id"),
"content": get_value(chunk, "content", "content_with_weight"),
"document_id": get_value(chunk, "doc_id", "document_id"),
"document_name": get_value(chunk, "docnm_kwd", "document_name"),
"dataset_id": get_value(chunk, "kb_id", "dataset_id"),
"image_id": get_value(chunk, "image_id", "img_id"),
"positions": get_value(chunk, "positions", "position_int"),
"url": chunk.get("url"),
"similarity": chunk.get("similarity"),
"vector_similarity": chunk.get("vector_similarity"),
"term_similarity": chunk.get("term_similarity"),
"doc_type": chunk.get("doc_type_kwd"),
}
for chunk in reference.get("chunks", [])
]
def message_fit_in(msg, max_length=4000):
def count():
nonlocal msg
tks_cnts = []
for m in msg:
tks_cnts.append({"role": m["role"], "count": num_tokens_from_string(m["content"])})
total = 0
for m in tks_cnts:
total += m["count"]
return total
c = count()
if c < max_length:
return c, msg
msg_ = [m for m in msg if m["role"] == "system"]
if len(msg) > 1:
msg_.append(msg[-1])
msg = msg_
c = count()
if c < max_length:
return c, msg
ll = num_tokens_from_string(msg_[0]["content"])
ll2 = num_tokens_from_string(msg_[-1]["content"])
if ll / (ll + ll2) > 0.8:
m = msg_[0]["content"]
m = encoder.decode(encoder.encode(m)[: max_length - ll2])
msg[0]["content"] = m
return max_length, msg
m = msg_[-1]["content"]
m = encoder.decode(encoder.encode(m)[: max_length - ll2])
msg[-1]["content"] = m
return max_length, msg
def kb_prompt(kbinfos, max_tokens, hash_id=False):
from api.db.services.document_service import DocumentService
knowledges = [get_value(ck, "content", "content_with_weight") for ck in kbinfos["chunks"]]
kwlg_len = len(knowledges)
used_token_count = 0
chunks_num = 0
for i, c in enumerate(knowledges):
if not c:
continue
used_token_count += num_tokens_from_string(c)
chunks_num += 1
if max_tokens * 0.97 < used_token_count:
knowledges = knowledges[:i]
logging.warning(f"Not all the retrieval into prompt: {len(knowledges)}/{kwlg_len}")
break
docs = DocumentService.get_by_ids([get_value(ck, "doc_id", "document_id") for ck in kbinfos["chunks"][:chunks_num]])
docs = {d.id: d.meta_fields for d in docs}
def draw_node(k, line):
if not line:
return ""
return f"\n├── {k}: " + re.sub(r"\n+", " ", line, flags=re.DOTALL)
knowledges = []
for i, ck in enumerate(kbinfos["chunks"][:chunks_num]):
cnt = "\nID: {}".format(i if not hash_id else hash_str2int(get_value(ck, "id", "chunk_id"), 100))
cnt += draw_node("Title", get_value(ck, "docnm_kwd", "document_name"))
cnt += draw_node("URL", ck['url']) if "url" in ck else ""
for k, v in docs.get(get_value(ck, "doc_id", "document_id"), {}).items():
cnt += draw_node(k, v)
cnt += "\n└── Content:\n"
cnt += get_value(ck, "content", "content_with_weight")
knowledges.append(cnt)
return knowledges
CITATION_PROMPT_TEMPLATE = load_prompt("citation_prompt")
CITATION_PLUS_TEMPLATE = load_prompt("citation_plus")
CONTENT_TAGGING_PROMPT_TEMPLATE = load_prompt("content_tagging_prompt")
CROSS_LANGUAGES_SYS_PROMPT_TEMPLATE = load_prompt("cross_languages_sys_prompt")
CROSS_LANGUAGES_USER_PROMPT_TEMPLATE = load_prompt("cross_languages_user_prompt")
FULL_QUESTION_PROMPT_TEMPLATE = load_prompt("full_question_prompt")
KEYWORD_PROMPT_TEMPLATE = load_prompt("keyword_prompt")
QUESTION_PROMPT_TEMPLATE = load_prompt("question_prompt")
VISION_LLM_DESCRIBE_PROMPT = load_prompt("vision_llm_describe_prompt")
VISION_LLM_FIGURE_DESCRIBE_PROMPT = load_prompt("vision_llm_figure_describe_prompt")
ANALYZE_TASK_SYSTEM = load_prompt("analyze_task_system")
ANALYZE_TASK_USER = load_prompt("analyze_task_user")
NEXT_STEP = load_prompt("next_step")
REFLECT = load_prompt("reflect")
SUMMARY4MEMORY = load_prompt("summary4memory")
RANK_MEMORY = load_prompt("rank_memory")
PROMPT_JINJA_ENV = jinja2.Environment(autoescape=False, trim_blocks=True, lstrip_blocks=True)
def citation_prompt() -> str:
template = PROMPT_JINJA_ENV.from_string(CITATION_PROMPT_TEMPLATE)
return template.render()
def citation_plus(sources: str) -> str:
template = PROMPT_JINJA_ENV.from_string(CITATION_PLUS_TEMPLATE)
return template.render(example=citation_prompt(), sources=sources)
def keyword_extraction(chat_mdl, content, topn=3):
template = PROMPT_JINJA_ENV.from_string(KEYWORD_PROMPT_TEMPLATE)
rendered_prompt = template.render(content=content, topn=topn)
msg = [{"role": "system", "content": rendered_prompt}, {"role": "user", "content": "Output: "}]
_, msg = message_fit_in(msg, chat_mdl.max_length)
kwd = chat_mdl.chat(rendered_prompt, msg[1:], {"temperature": 0.2})
if isinstance(kwd, tuple):
kwd = kwd[0]
kwd = re.sub(r"^.*</think>", "", kwd, flags=re.DOTALL)
if kwd.find("**ERROR**") >= 0:
return ""
return kwd
def question_proposal(chat_mdl, content, topn=3):
template = PROMPT_JINJA_ENV.from_string(QUESTION_PROMPT_TEMPLATE)
rendered_prompt = template.render(content=content, topn=topn)
msg = [{"role": "system", "content": rendered_prompt}, {"role": "user", "content": "Output: "}]
_, msg = message_fit_in(msg, chat_mdl.max_length)
kwd = chat_mdl.chat(rendered_prompt, msg[1:], {"temperature": 0.2})
if isinstance(kwd, tuple):
kwd = kwd[0]
kwd = re.sub(r"^.*</think>", "", kwd, flags=re.DOTALL)
if kwd.find("**ERROR**") >= 0:
return ""
return kwd
def full_question(tenant_id=None, llm_id=None, messages=[], language=None, chat_mdl=None):
from api.db import LLMType
from api.db.services.llm_service import LLMBundle
from api.db.services.llm_service import TenantLLMService
if not chat_mdl:
if TenantLLMService.llm_id2llm_type(llm_id) == "image2text":
chat_mdl = LLMBundle(tenant_id, LLMType.IMAGE2TEXT, llm_id)
else:
chat_mdl = LLMBundle(tenant_id, LLMType.CHAT, llm_id)
conv = []
for m in messages:
if m["role"] not in ["user", "assistant"]:
continue
conv.append("{}: {}".format(m["role"].upper(), m["content"]))
conversation = "\n".join(conv)
today = datetime.date.today().isoformat()
yesterday = (datetime.date.today() - datetime.timedelta(days=1)).isoformat()
tomorrow = (datetime.date.today() + datetime.timedelta(days=1)).isoformat()
template = PROMPT_JINJA_ENV.from_string(FULL_QUESTION_PROMPT_TEMPLATE)
rendered_prompt = template.render(
today=today,
yesterday=yesterday,
tomorrow=tomorrow,
conversation=conversation,
language=language,
)
ans = chat_mdl.chat(rendered_prompt, [{"role": "user", "content": "Output: "}])
ans = re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
return ans if ans.find("**ERROR**") < 0 else messages[-1]["content"]
def cross_languages(tenant_id, llm_id, query, languages=[]):
from api.db import LLMType
from api.db.services.llm_service import LLMBundle
from api.db.services.llm_service import TenantLLMService
if llm_id and TenantLLMService.llm_id2llm_type(llm_id) == "image2text":
chat_mdl = LLMBundle(tenant_id, LLMType.IMAGE2TEXT, llm_id)
else:
chat_mdl = LLMBundle(tenant_id, LLMType.CHAT, llm_id)
rendered_sys_prompt = PROMPT_JINJA_ENV.from_string(CROSS_LANGUAGES_SYS_PROMPT_TEMPLATE).render()
rendered_user_prompt = PROMPT_JINJA_ENV.from_string(CROSS_LANGUAGES_USER_PROMPT_TEMPLATE).render(query=query, languages=languages)
ans = chat_mdl.chat(rendered_sys_prompt, [{"role": "user", "content": rendered_user_prompt}], {"temperature": 0.2})
ans = re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
if ans.find("**ERROR**") >= 0:
return query
return "\n".join([a for a in re.sub(r"(^Output:|\n+)", "", ans, flags=re.DOTALL).split("===") if a.strip()])
def content_tagging(chat_mdl, content, all_tags, examples, topn=3):
template = PROMPT_JINJA_ENV.from_string(CONTENT_TAGGING_PROMPT_TEMPLATE)
for ex in examples:
ex["tags_json"] = json.dumps(ex[TAG_FLD], indent=2, ensure_ascii=False)
rendered_prompt = template.render(
topn=topn,
all_tags=all_tags,
examples=examples,
content=content,
)
msg = [{"role": "system", "content": rendered_prompt}, {"role": "user", "content": "Output: "}]
_, msg = message_fit_in(msg, chat_mdl.max_length)
kwd = chat_mdl.chat(rendered_prompt, msg[1:], {"temperature": 0.5})
if isinstance(kwd, tuple):
kwd = kwd[0]
kwd = re.sub(r"^.*</think>", "", kwd, flags=re.DOTALL)
if kwd.find("**ERROR**") >= 0:
raise Exception(kwd)
try:
obj = json_repair.loads(kwd)
except json_repair.JSONDecodeError:
try:
result = kwd.replace(rendered_prompt[:-1], "").replace("user", "").replace("model", "").strip()
result = "{" + result.split("{")[1].split("}")[0] + "}"
obj = json_repair.loads(result)
except Exception as e:
logging.exception(f"JSON parsing error: {result} -> {e}")
raise e
res = {}
for k, v in obj.items():
try:
if int(v) > 0:
res[str(k)] = int(v)
except Exception:
pass
return res
def vision_llm_describe_prompt(page=None) -> str:
template = PROMPT_JINJA_ENV.from_string(VISION_LLM_DESCRIBE_PROMPT)
return template.render(page=page)
def vision_llm_figure_describe_prompt() -> str:
template = PROMPT_JINJA_ENV.from_string(VISION_LLM_FIGURE_DESCRIBE_PROMPT)
return template.render()
def tool_schema(tools_description: list[dict], complete_task=False):
if not tools_description:
return ""
desc = {}
if complete_task:
desc[COMPLETE_TASK] = {
"type": "function",
"function": {
"name": COMPLETE_TASK,
"description": "When you have the final answer and are ready to complete the task, call this function with your answer",
"parameters": {
"type": "object",
"properties": {"answer":{"type":"string", "description": "The final answer to the user's question"}},
"required": ["answer"]
}
}
}
for tool in tools_description:
desc[tool["function"]["name"]] = tool
return "\n\n".join([f"## {i+1}. {fnm}\n{json.dumps(des, ensure_ascii=False, indent=4)}" for i, (fnm, des) in enumerate(desc.items())])
def form_history(history, limit=-6):
context = ""
for h in history[limit:]:
if h["role"] == "system":
continue
role = "USER"
if h["role"].upper()!= role:
role = "AGENT"
context += f"\n{role}: {h['content'][:2048] + ('...' if len(h['content'])>2048 else '')}"
return context
def analyze_task(chat_mdl, task_name, tools_description: list[dict]):
tools_desc = tool_schema(tools_description)
context = ""
template = PROMPT_JINJA_ENV.from_string(ANALYZE_TASK_USER)
kwd = chat_mdl.chat(ANALYZE_TASK_SYSTEM,[{"role": "user", "content": template.render(task=task_name, context=context, tools_desc=tools_desc)}], {})
if isinstance(kwd, tuple):
kwd = kwd[0]
kwd = re.sub(r"^.*</think>", "", kwd, flags=re.DOTALL)
if kwd.find("**ERROR**") >= 0:
return ""
return kwd
def next_step(chat_mdl, history:list, tools_description: list[dict], task_desc):
if not tools_description:
return ""
desc = tool_schema(tools_description)
template = PROMPT_JINJA_ENV.from_string(NEXT_STEP)
user_prompt = "\nWhat's the next tool to call? If ready OR IMPOSSIBLE TO BE READY, then call `complete_task`."
hist = deepcopy(history)
if hist[-1]["role"] == "user":
hist[-1]["content"] += user_prompt
else:
hist.append({"role": "user", "content": user_prompt})
json_str = chat_mdl.chat(template.render(task_analisys=task_desc, desc=desc, today=datetime.datetime.now().strftime("%Y-%m-%d")),
hist[1:], stop=["<|stop|>"])
tk_cnt = num_tokens_from_string(json_str)
json_str = re.sub(r"^.*</think>", "", json_str, flags=re.DOTALL)
return json_str, tk_cnt
def reflect(chat_mdl, history: list[dict], tool_call_res: list[Tuple]):
tool_calls = [{"name": p[0], "result": p[1]} for p in tool_call_res]
goal = history[1]["content"]
template = PROMPT_JINJA_ENV.from_string(REFLECT)
user_prompt = template.render(goal=goal, tool_calls=tool_calls)
hist = deepcopy(history)
if hist[-1]["role"] == "user":
hist[-1]["content"] += user_prompt
else:
hist.append({"role": "user", "content": user_prompt})
_, msg = message_fit_in(hist, chat_mdl.max_length)
ans = chat_mdl.chat(msg[0]["content"], msg[1:])
ans = re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
return """
**Observation**
{}
**Reflection**
{}
""".format(json.dumps(tool_calls, ensure_ascii=False, indent=2), ans)
def form_message(system_prompt, user_prompt):
return [{"role": "system", "content": system_prompt},{"role": "user", "content": user_prompt}]
def tool_call_summary(chat_mdl, name: str, params: dict, result: str) -> str:
template = PROMPT_JINJA_ENV.from_string(SUMMARY4MEMORY)
system_prompt = template.render(name=name,
params=json.dumps(params, ensure_ascii=False, indent=2),
result=result)
user_prompt = "→ Summary: "
_, msg = message_fit_in(form_message(system_prompt, user_prompt), chat_mdl.max_length)
ans = chat_mdl.chat(msg[0]["content"], msg[1:])
return re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
def rank_memories(chat_mdl, goal:str, sub_goal:str, tool_call_summaries: list[str]):
template = PROMPT_JINJA_ENV.from_string(RANK_MEMORY)
system_prompt = template.render(goal=goal, sub_goal=sub_goal, results=[{"i": i, "content": s} for i,s in enumerate(tool_call_summaries)])
user_prompt = " → rank: "
_, msg = message_fit_in(form_message(system_prompt, user_prompt), chat_mdl.max_length)
ans = chat_mdl.chat(msg[0]["content"], msg[1:], stop="<|stop|>")
return re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)

rag/prompts/rank_memory.md Normal file

@@ -0,0 +1,30 @@
**Task**: Sort the tool call results based on relevance to the overall goal and current sub-goal. Return ONLY a sorted list of indices (0-indexed).
**Rules**:
1. Analyze each result's contribution to both:
- The overall goal (primary priority)
- The current sub-goal (secondary priority)
2. Sort from MOST relevant (highest impact) to LEAST relevant
3. Output format: Strictly a Python-style list of integers. Example: [2, 0, 1]
🔹 Overall Goal: {{ goal }}
🔹 Sub-goal: {{ sub_goal }}
**Examples**:
🔹 Tool Response:
- index: 0
> Tokyo temperature is 78°F.
- index: 1
> Error: Authentication failed (expired API key).
- index: 2
> Available: 12 widgets in stock (max 5 per customer).
→ rank: [1,2,0]<|stop|>
**Your Turn**:
🔹 Tool Response:
{% for f in results %}
- index: {{ f.i }}
> {{ f.content }}
{% endfor %}
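The expected reply is a bare Python-style list of indices followed by `<|stop|>`. A minimal parsing sketch (for illustration; not code from this PR):

```python
import ast

# Sketch: recover the ranking from a reply such as "[2, 0, 1]<|stop|>".
reply = "[2, 0, 1]<|stop|>"
order = ast.literal_eval(reply.split("<|stop|>")[0].strip())
print(order)  # [2, 0, 1]
```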

rag/prompts/reflect.md Normal file

@@ -0,0 +1,34 @@
**Context**:
- To achieve the goal: {{ goal }}.
- You have executed the following tool calls:
{% for call in tool_calls %}
Tool call: `{{ call.name }}`
Results: {{ call.result }}
{% endfor %}
**Reflection Instructions:**
Analyze the current state of the overall task ({{ goal }}), then provide structured responses to the following:
## 1. Goal Achievement Status
- Does the current outcome align with the original purpose of this task phase?
- If not, what critical gaps exist?
## 2. Step Completion Check
- Which planned steps were completed? (List verified items)
- Which steps are pending/incomplete? (Specify exactly what's missing)
## 3. Information Adequacy
- Is the collected data sufficient to proceed?
- What key information is still needed? (e.g., metrics, user input, external data)
## 4. Critical Observations
- Unexpected outcomes: [Flag anomalies/errors]
- Risks/blockers: [Identify immediate obstacles]
- Accuracy concerns: [Highlight unreliable results]
## 5. Next-Step Recommendations
- Proposed immediate action: [Concrete next step]
- Alternative strategies if blocked: [Workaround solution]
- Tools/inputs required for next phase: [Specify resources]

rag/prompts/summary4memory.md Normal file

@@ -0,0 +1,35 @@
**Role**: AI Assistant
**Task**: Summarize tool call responses
**Rules**:
1. Context: You've executed a tool (API/function) and received a response.
2. Condense the response into 1-2 short sentences.
3. Never omit:
- Success/error status
- Core results (e.g., data points, decisions)
- Critical constraints (e.g., limits, conditions)
4. Exclude technical details like timestamps/request IDs unless crucial.
5. Use the same language as the main content of the tool response.
**Response Template**:
"[Status] + [Key Outcome] + [Critical Constraints]"
**Examples**:
🔹 Tool Response:
{"status": "success", "temperature": 78.2, "unit": "F", "location": "Tokyo", "timestamp": 16923456}
→ Summary: "Success: Tokyo temperature is 78°F."
🔹 Tool Response:
{"error": "invalid_api_key", "message": "Authentication failed: expired key"}
→ Summary: "Error: Authentication failed (expired API key)."
🔹 Tool Response:
{"available": true, "inventory": 12, "product": "widget", "limit": "max 5 per customer"}
→ Summary: "Available: 12 widgets in stock (max 5 per customer)."
**Your Turn**:
- Tool call: {{ name }}
- Tool inputs as following:
{{ params }}
- Tool Response:
{{ result }}

@@ -0,0 +1,19 @@
**Task Instruction:**
You are tasked with reading and analyzing tool call results based on the following inputs: **Inputs for current call** and **Results**. Your objective is to extract relevant and helpful information for the **Inputs for current call** from the **Results**, and to seamlessly integrate this information into the previous steps to continue reasoning toward the original question.
**Guidelines:**
1. **Analyze the Results:**
- Carefully review the content of each tool call result.
- Identify factual information that is relevant to the **Inputs for current call** and can aid in the reasoning process for the original question.
2. **Extract Relevant Information:**
- Select the information from the **Results** that directly contributes to advancing the previous reasoning steps.
- Ensure that the extracted information is accurate and relevant.
- **Inputs for current call:**
{{ inputs }}
- **Results:**
{{ results }}