mirror of
https://github.com/infiniflow/ragflow.git
synced 2025-12-08 20:42:30 +08:00
Feat: Redesign and refactor agent module (#9113)
### What problem does this PR solve?

#9082 #6365

<u>**WARNING: This change is not compatible with older versions of the `Agent` module. Agents created with older versions will no longer work.**</u>

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
6
rag/prompts/__init__.py
Normal file
@ -0,0 +1,6 @@
from . import prompts

__all__ = [name for name in dir(prompts)
           if not name.startswith('_')]

globals().update({name: getattr(prompts, name) for name in __all__})
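The `__init__.py` above re-exports every public name of `prompts` at package level. A minimal sketch of the same pattern follows, using a stand-in module built with `types.ModuleType`. The names `demo_prompts`, `greet`, `_private`, `public_names`, and `reexport` are hypothetical and are not part of RAGFlow.

```python
import types

# Hypothetical stand-in for the real `prompts` submodule.
demo_prompts = types.ModuleType("demo_prompts")
demo_prompts.greet = lambda name: f"hello {name}"
demo_prompts._private = "hidden"


def public_names(module):
    """Return the attribute names of `module` that do not start with an underscore."""
    return [name for name in dir(module) if not name.startswith("_")]


def reexport(module, namespace):
    """Copy the public attributes of `module` into `namespace` and return their names."""
    names = public_names(module)
    namespace.update({name: getattr(module, name) for name in names})
    return names


# In a real package the namespace would be globals(); a plain dict is used here.
namespace = {}
exported = reexport(demo_prompts, namespace)
```

Because `dir()` also lists names that the submodule itself imported, this pattern appears to re-export those as well, which may or may not be intended.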
8
rag/prompts/analyze_task_system.md
Normal file
@ -0,0 +1,8 @@
Your responsibility is to execute assigned tasks to a high standard. Please:
1. Carefully analyze the task requirements.
2. Develop a reasonable execution plan.
3. Execute step-by-step and document the reasoning process.
4. Provide clear and accurate results.

If difficulties are encountered, clearly state the problem and explore alternative approaches.

20
rag/prompts/analyze_task_user.md
Normal file
@ -0,0 +1,20 @@
Please analyze the following task:

Task: {{ task }}

Context: {{ context }}

**Analysis Requirements:**
1. Is it just small talk? (If yes, no further plan or analysis is needed)
2. What is the core objective of the task?
3. What is the complexity level of the task?
4. What types of specialized skills are required?
5. Does the task need to be decomposed into subtasks? (If yes, propose the subtask structure)
6. How can you tell, after a few rounds of interaction, that the task or its subtasks cannot succeed?
7. What are the expected success criteria?

**Available Sub-Agents and Their Specializations:**

{{ tools_desc }}

Provide a detailed analysis of the task based on the above requirements.
13
rag/prompts/citation_plus.md
Normal file
@ -0,0 +1,13 @@
You are an agent that adds correct citations to the text provided by the user.
You are given a piece of text within [ID:<ID>] tags, which was generated based on the provided sources.
However, the sources are not cited in the [ID:<ID>].
Your task is to enhance user trust by generating correct, appropriate citations for this report.

{{ example }}

<context>

{{ sources }}

</context>

@ -1,46 +1,108 @@
-## Citation Requirements
 Based on the provided document or chat history, add citations to the input text using the format specified later.

-- Use a uniform citation format such as [ID:i] [ID:j], where "i" and "j" are document IDs enclosed in square brackets. Separate multiple IDs with spaces (e.g., [ID:0] [ID:1]).
-- Citation markers must be placed at the end of a sentence, separated by a space from the final punctuation (e.g., period, question mark).
-- A maximum of 4 citations are allowed per sentence.
-- DO NOT insert citations if the content is not from retrieved chunks.
-- DO NOT use standalone Document IDs (e.g., #ID#).
-- Citations MUST always follow the [ID:i] format.
-- STRICTLY prohibit the use of strikethrough symbols (e.g., ~~) or any other non-standard formatting syntax.
-- Any violation of the above rules — including incorrect formatting, prohibited styles, or unsupported citations — will result in no citation being added for that sentence.
+# Citation Requirements:

----
+## Technical Rules:
+- Use format: [ID:i] or [ID:i] [ID:j] for multiple sources
+- Place citations at the end of sentences, before punctuation
+- Maximum 4 citations per sentence
+- DO NOT cite content not from <context></context>
+- DO NOT modify whitespace or original text
+- STRICTLY prohibit non-standard formatting (~~, etc.)

-## Example START
+## What MUST Be Cited:
+1. **Quantitative data**: Numbers, percentages, statistics, measurements
+2. **Temporal claims**: Dates, timeframes, sequences of events
+3. **Causal relationships**: Claims about cause and effect
+4. **Comparative statements**: Rankings, comparisons, superlatives
+5. **Technical definitions**: Specialized terms, concepts, methodologies
+6. **Direct attributions**: What someone said, did, or believes
+7. **Predictions/forecasts**: Future projections, trend analyses
+8. **Controversial claims**: Disputed facts, minority opinions

-<SYSTEM>: Here is the knowledge base:
+## What Should NOT Be Cited:
+- Common knowledge (e.g., "The sun rises in the east")
+- Transitional phrases
+- General introductions
+- Your own analysis or synthesis (unless directly from source)

-Document: Elon Musk Breaks Silence on Crypto, Warns Against Dogecoin ...
-URL: https://blockworks.co/news/elon-musk-crypto-dogecoin
-ID: 0
-The Tesla co-founder advised against going all-in on dogecoin, but Elon Musk said it’s still his favorite crypto...
+# Comprehensive Examples:

-Document: Elon Musk's Dogecoin tweet sparks social media frenzy
-ID: 1
-Musk said he is 'willing to serve' D.O.G.E. – shorthand for Dogecoin.
+## Example 1: Data and Statistics
+<context>
+ID: 45
+└── Content: The global smartphone market grew by 7.8% in Q3 2024, with Samsung holding 20.6% market share and Apple at 15.8%.

-Document: Causal effect of Elon Musk tweets on Dogecoin price
-ID: 2
-If you think of Dogecoin — the cryptocurrency based on a meme — you can’t help but also think of Elon Musk...
+ID: 46
+└── Content: 5G adoption reached 1.5 billion users worldwide by October 2024.
+</context>

-Document: Elon Musk's Tweet Ignites Dogecoin's Future In Public Services
-ID: 3
-The market is heating up after Elon Musk's announcement about Dogecoin. Is this a new era for crypto?...
+USER: How is the smartphone market performing?

-The above is the knowledge base.
+ASSISTANT:
+The smartphone industry is showing strong recovery. The global smartphone market grew by 7.8% in Q3 2024 [ID:45]. Samsung leads with 20.6% market share, followed by Apple at 15.8% [ID:45]. This growth is partly driven by 5G technology, which reached 1.5 billion users worldwide by October 2024 [ID:46].

-<USER>: What's Elon's view on dogecoin?
+## Example 2: Causal Relationships and Predictions
+<context>
+ID: 78
+└── Content: Rising interest rates have led to a 15% decline in housing starts. Economists predict this trend will continue through 2025.

-<ASSISTANT>:
-Musk has consistently expressed his fondness for Dogecoin, often citing its humor and the inclusion of dogs in its branding. He has referred to it as his favorite cryptocurrency [ID:0] [ID:1].
+ID: 79
+└── Content: The Federal Reserve's aggressive monetary policy is the primary driver behind the real estate slowdown.
+</context>

-Recently, Musk has hinted at potential future roles for Dogecoin. His tweets have sparked speculation about Dogecoin's potential integration into public services [ID:3].
+USER: What's happening in the real estate market?

-Overall, while Musk enjoys Dogecoin and often promotes it, he also warns against over-investing in it, reflecting both his personal amusement and caution regarding its speculative nature.
+ASSISTANT:
+The real estate market is experiencing significant challenges. Rising interest rates have led to a 15% decline in housing starts [ID:78]. The Federal Reserve's aggressive monetary policy is the primary driver behind this real estate slowdown [ID:79]. Economists predict this trend will continue through 2025 [ID:78]. These factors suggest a prolonged period of market adjustment ahead.

-## Example END
+## Example 3: Technical Concepts and Comparisons
+<context>
+ID: 101
+└── Content: Quantum computing uses quantum bits (qubits) that can exist in superposition, unlike classical bits which are either 0 or 1.
+
+ID: 102
+└── Content: Current quantum computers can solve certain problems 158 million times faster than classical supercomputers.
+</context>
+
+USER: Explain quantum computing advantages.
+
+ASSISTANT:
+Quantum computing represents a fundamental shift in computational capability. Unlike classical computers, quantum computing uses quantum bits (qubits) that can exist in superposition, rather than being limited to just 0 or 1 [ID:101]. This unique property enables extraordinary performance gains. Current quantum computers can solve certain problems 158 million times faster than classical supercomputers [ID:102]. This technology promises to revolutionize fields requiring complex calculations.
+
+## Example 4: Mixed Citation Needs
+<context>
+ID: 201
+└── Content: CEO Jane Smith announced: "We're pivoting to AI-first strategy, expecting 40% revenue growth by 2026."
+
+ID: 202
+└── Content: Industry analysts remain skeptical, with Morgan Stanley rating the stock as "underperform."
+
+ID: 203
+└── Content: The company's AI division currently represents only 5% of total revenue.
+</context>
+
+USER: What's the company's new strategy?
+
+ASSISTANT:
+The company is undergoing a major strategic transformation. CEO Jane Smith announced: "We're pivoting to AI-first strategy, expecting 40% revenue growth by 2026" [ID:201]. However, this ambitious goal faces skepticism, as industry analysts remain doubtful, with Morgan Stanley rating the stock as "underperform" [ID:202]. The challenge is significant given that the company's AI division currently represents only 5% of total revenue [ID:203]. Success will depend on execution of this bold vision.
+
+## Example 5: When NOT to Cite
+<context>
+ID: 301
+└── Content: Paris is the capital of France. The Eiffel Tower was completed in 1889.
+</context>
+
+USER: Tell me about Paris.
+
+ASSISTANT:
+Paris is the capital of France. It's known for its rich history, culture, and architecture. The Eiffel Tower was completed in 1889 [ID:301]. The city attracts millions of tourists annually. Paris remains one of the world's most visited destinations.
+(Note: Only the specific date needs citation, not common knowledge about Paris)
+
+--- Examples END ---
+
+REMEMBER:
+- Cite FACTS, not opinions or transitions
+- Each citation supports the ENTIRE sentence
+- When in doubt, ask: "Would a fact-checker need to verify this?"
+- Place citations at sentence end, before punctuation
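The technical rules in the prompt above (the `[ID:i]` format, at most 4 citations per sentence, and no citations to IDs outside `<context>`) can be checked mechanically after generation. The following is a minimal sketch of such a check, not something this commit contains. The function name `check_citations`, its parameters, and the naive sentence splitter are all assumptions made for illustration.

```python
import re

# Matches the citation format required by the prompt: [ID:<integer>]
CITATION_RE = re.compile(r"\[ID:(\d+)\]")


def check_citations(answer, known_ids, max_per_sentence=4):
    """Return a list of human-readable problems found in `answer`.

    `known_ids` is the set of chunk IDs that were present in <context>.
    Sentences are split naively on '.', '?' or '!' followed by whitespace,
    which is an assumption of this sketch, not a rule from the prompt.
    """
    problems = []
    sentences = re.split(r"(?<=[.?!])\s+", answer.strip())
    for sentence in sentences:
        ids = [int(i) for i in CITATION_RE.findall(sentence)]
        if len(ids) > max_per_sentence:
            problems.append(f"too many citations: {sentence!r}")
        for i in ids:
            if i not in known_ids:
                problems.append(f"unknown ID {i}: {sentence!r}")
    return problems
```

An empty list means the answer passed these three checks. Whether a citation actually supports its sentence still has to be judged separately.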
63
rag/prompts/next_step.md
Normal file
@ -0,0 +1,63 @@
You are an expert Planning Agent tasked with solving problems efficiently through structured plans.
Your job is:
1. Based on the task analysis, choose the right tools to execute.
2. Track progress and adapt plans (tool calls) when necessary.
3. Use `complete_task` when no further tool calls are needed (all necessary steps are done, or there is little hope of completing them).

# ========== TASK ANALYSIS =============
{{ task_analisys }}


# ========== TOOLS (JSON-Schema) ==========
You may invoke only the tools listed below.
Return a JSON array of objects in which each item has exactly two top-level keys:
• "name": the tool to call
• "arguments": an object whose keys/values satisfy the schema

{{ desc }}

# ========== RESPONSE FORMAT ==========
✦ **When you need a tool**
Return ONLY the JSON (no additional keys, no commentary, end with `<|stop|>`), as in the following:
[{
  "name": "<tool_name1>",
  "arguments": { /* tool arguments matching its schema */ }
},{
  "name": "<tool_name2>",
  "arguments": { /* tool arguments matching its schema */ }
}...]<|stop|>

✦ **When you are certain the task is solved OR no further information can be obtained**
Return ONLY:
[{
  "name": "complete_task",
  "arguments": { "answer": "<final answer text>" }
}]<|stop|>

<verification_steps>
Before providing a final answer:
1. Double-check all gathered information
2. Verify calculations and logic
3. Ensure answer matches exactly what was asked
4. Confirm answer format meets requirements
5. Run additional verification if confidence is not 100%
</verification_steps>

<error_handling>
If you encounter issues:
1. Try alternative approaches before giving up
2. Use different tools or combinations of tools
3. Break complex problems into simpler sub-tasks
4. Verify intermediate results frequently
5. Never return "I cannot answer" without exhausting all options
</error_handling>

⚠️ Any output that is not valid JSON or that contains extra fields will be rejected.

# ========== REASONING & REFLECTION ==========
You may think privately (not shown to the user) before producing each JSON object.
Internal guideline:
1. **Reason**: Analyse the user question; decide which tools (if any) are needed.
2. **Act**: Emit the JSON object to call the tool.

Today is {{ today }}. Remember that success in answering questions accurately is paramount - take all necessary steps to ensure your answer is correct.
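The response format described in the prompt above (a JSON array whose items have exactly the keys `name` and `arguments`, optionally followed by `<|stop|>`) could be validated on the caller side roughly as follows. This is a sketch under assumptions: the helper name `parse_tool_calls` is hypothetical, and it uses the standard-library `json` module, whereas the repository code imports `json_repair` and may parse more leniently.

```python
import json

STOP = "<|stop|>"


def parse_tool_calls(raw):
    """Parse a planner reply into a list of {"name", "arguments"} dicts.

    Raises ValueError when the reply is not a JSON array or when an item
    has missing or extra top-level keys (json.JSONDecodeError is itself a
    subclass of ValueError).
    """
    text = raw.strip()
    if text.endswith(STOP):
        # The stop marker is part of the protocol, not of the JSON payload.
        text = text[: -len(STOP)].strip()
    calls = json.loads(text)
    if not isinstance(calls, list):
        raise ValueError("expected a JSON array of tool calls")
    for call in calls:
        if not isinstance(call, dict) or set(call) != {"name", "arguments"}:
            raise ValueError(f"each item needs exactly 'name' and 'arguments': {call!r}")
        if not isinstance(call["arguments"], dict):
            raise ValueError("'arguments' must be a JSON object")
    return calls
```

A caller would then look for an item whose `name` is `complete_task` to decide whether the loop is finished.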
20
rag/prompts/prompt_template.py
Normal file
@ -0,0 +1,20 @@
import os


PROMPT_DIR = os.path.dirname(__file__)

_loaded_prompts = {}


def load_prompt(name: str) -> str:
    if name in _loaded_prompts:
        return _loaded_prompts[name]

    path = os.path.join(PROMPT_DIR, f"{name}.md")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Prompt file '{name}.md' not found in prompts/ directory.")

    with open(path, "r", encoding="utf-8") as f:
        content = f.read().strip()
        _loaded_prompts[name] = content
        return content
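`load_prompt` above caches each file's content in a module-level dictionary. The sketch below reproduces that behaviour against a temporary directory to show one consequence: once a prompt has been read, later calls return the cached text even if the file changes or disappears. The function `load_prompt_from` and its `directory` parameter are hypothetical and exist only so the example is self-contained.

```python
import os
import tempfile

_cache = {}


def load_prompt_from(directory, name):
    """Read `<directory>/<name>.md` once and serve later calls from the cache."""
    key = (directory, name)
    if key in _cache:
        return _cache[key]
    path = os.path.join(directory, f"{name}.md")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Prompt file '{name}.md' not found in {directory}.")
    with open(path, "r", encoding="utf-8") as f:
        content = f.read().strip()
    _cache[key] = content
    return content


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.md")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("Hello {{ name }}\n")
    first = load_prompt_from(tmp, "demo")
    os.remove(demo_path)
    # The file is gone, but the cached copy is still returned.
    second = load_prompt_from(tmp, "demo")
```

If the real loader behaves the same way, edits to a `.md` prompt would presumably only take effect after the process restarts.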
415
rag/prompts/prompts.py
Normal file
@ -0,0 +1,415 @@
#
# Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import datetime
import json
import logging
import re
from copy import deepcopy
from typing import Tuple
import jinja2
import json_repair
from api.utils import hash_str2int
from rag.prompts.prompt_template import load_prompt
from rag.settings import TAG_FLD
from rag.utils import encoder, num_tokens_from_string


STOP_TOKEN = "<|STOP|>"
COMPLETE_TASK = "complete_task"


def get_value(d, k1, k2):
    return d.get(k1, d.get(k2))


def chunks_format(reference):

    return [
        {
            "id": get_value(chunk, "chunk_id", "id"),
            "content": get_value(chunk, "content", "content_with_weight"),
            "document_id": get_value(chunk, "doc_id", "document_id"),
            "document_name": get_value(chunk, "docnm_kwd", "document_name"),
            "dataset_id": get_value(chunk, "kb_id", "dataset_id"),
            "image_id": get_value(chunk, "image_id", "img_id"),
            "positions": get_value(chunk, "positions", "position_int"),
            "url": chunk.get("url"),
            "similarity": chunk.get("similarity"),
            "vector_similarity": chunk.get("vector_similarity"),
            "term_similarity": chunk.get("term_similarity"),
            "doc_type": chunk.get("doc_type_kwd"),
        }
        for chunk in reference.get("chunks", [])
    ]


def message_fit_in(msg, max_length=4000):
    def count():
        nonlocal msg
        tks_cnts = []
        for m in msg:
            tks_cnts.append({"role": m["role"], "count": num_tokens_from_string(m["content"])})
        total = 0
        for m in tks_cnts:
            total += m["count"]
        return total

    c = count()
    if c < max_length:
        return c, msg

    # Keep only the system messages plus the most recent message.
    msg_ = [m for m in msg if m["role"] == "system"]
    if len(msg) > 1:
        msg_.append(msg[-1])
    msg = msg_
    c = count()
    if c < max_length:
        return c, msg

    ll = num_tokens_from_string(msg_[0]["content"])
    ll2 = num_tokens_from_string(msg_[-1]["content"])
    if ll / (ll + ll2) > 0.8:
        # The system prompt dominates: truncate it, leaving room for the last message.
        m = msg_[0]["content"]
        m = encoder.decode(encoder.encode(m)[: max_length - ll2])
        msg[0]["content"] = m
        return max_length, msg

    # Otherwise truncate the last message, leaving room for the system prompt
    # (the budget is max_length - ll, not max_length - ll2).
    m = msg_[-1]["content"]
    m = encoder.decode(encoder.encode(m)[: max_length - ll])
    msg[-1]["content"] = m
    return max_length, msg


def kb_prompt(kbinfos, max_tokens, hash_id=False):
    from api.db.services.document_service import DocumentService

    knowledges = [get_value(ck, "content", "content_with_weight") for ck in kbinfos["chunks"]]
    kwlg_len = len(knowledges)
    used_token_count = 0
    chunks_num = 0
    for i, c in enumerate(knowledges):
        if not c:
            continue
        used_token_count += num_tokens_from_string(c)
        chunks_num += 1
        if max_tokens * 0.97 < used_token_count:
            knowledges = knowledges[:i]
            logging.warning(f"Not all the retrieval into prompt: {len(knowledges)}/{kwlg_len}")
            break

    docs = DocumentService.get_by_ids([get_value(ck, "doc_id", "document_id") for ck in kbinfos["chunks"][:chunks_num]])
    docs = {d.id: d.meta_fields for d in docs}

    def draw_node(k, line):
        if not line:
            return ""
        return f"\n├── {k}: " + re.sub(r"\n+", " ", line, flags=re.DOTALL)

    knowledges = []
    for i, ck in enumerate(kbinfos["chunks"][:chunks_num]):
        cnt = "\nID: {}".format(i if not hash_id else hash_str2int(get_value(ck, "id", "chunk_id"), 100))
        cnt += draw_node("Title", get_value(ck, "docnm_kwd", "document_name"))
        cnt += draw_node("URL", ck['url']) if "url" in ck else ""
        for k, v in docs.get(get_value(ck, "doc_id", "document_id"), {}).items():
            cnt += draw_node(k, v)
        cnt += "\n└── Content:\n"
        cnt += get_value(ck, "content", "content_with_weight")
        knowledges.append(cnt)

    return knowledges


CITATION_PROMPT_TEMPLATE = load_prompt("citation_prompt")
CITATION_PLUS_TEMPLATE = load_prompt("citation_plus")
CONTENT_TAGGING_PROMPT_TEMPLATE = load_prompt("content_tagging_prompt")
CROSS_LANGUAGES_SYS_PROMPT_TEMPLATE = load_prompt("cross_languages_sys_prompt")
CROSS_LANGUAGES_USER_PROMPT_TEMPLATE = load_prompt("cross_languages_user_prompt")
FULL_QUESTION_PROMPT_TEMPLATE = load_prompt("full_question_prompt")
KEYWORD_PROMPT_TEMPLATE = load_prompt("keyword_prompt")
QUESTION_PROMPT_TEMPLATE = load_prompt("question_prompt")
VISION_LLM_DESCRIBE_PROMPT = load_prompt("vision_llm_describe_prompt")
VISION_LLM_FIGURE_DESCRIBE_PROMPT = load_prompt("vision_llm_figure_describe_prompt")

ANALYZE_TASK_SYSTEM = load_prompt("analyze_task_system")
ANALYZE_TASK_USER = load_prompt("analyze_task_user")
NEXT_STEP = load_prompt("next_step")
REFLECT = load_prompt("reflect")
SUMMARY4MEMORY = load_prompt("summary4memory")
RANK_MEMORY = load_prompt("rank_memory")

PROMPT_JINJA_ENV = jinja2.Environment(autoescape=False, trim_blocks=True, lstrip_blocks=True)


def citation_prompt() -> str:
    template = PROMPT_JINJA_ENV.from_string(CITATION_PROMPT_TEMPLATE)
    return template.render()


def citation_plus(sources: str) -> str:
    template = PROMPT_JINJA_ENV.from_string(CITATION_PLUS_TEMPLATE)
    return template.render(example=citation_prompt(), sources=sources)


def keyword_extraction(chat_mdl, content, topn=3):
    template = PROMPT_JINJA_ENV.from_string(KEYWORD_PROMPT_TEMPLATE)
    rendered_prompt = template.render(content=content, topn=topn)

    msg = [{"role": "system", "content": rendered_prompt}, {"role": "user", "content": "Output: "}]
    _, msg = message_fit_in(msg, chat_mdl.max_length)
    kwd = chat_mdl.chat(rendered_prompt, msg[1:], {"temperature": 0.2})
    if isinstance(kwd, tuple):
        kwd = kwd[0]
    kwd = re.sub(r"^.*</think>", "", kwd, flags=re.DOTALL)
    if kwd.find("**ERROR**") >= 0:
        return ""
    return kwd


def question_proposal(chat_mdl, content, topn=3):
    template = PROMPT_JINJA_ENV.from_string(QUESTION_PROMPT_TEMPLATE)
    rendered_prompt = template.render(content=content, topn=topn)

    msg = [{"role": "system", "content": rendered_prompt}, {"role": "user", "content": "Output: "}]
    _, msg = message_fit_in(msg, chat_mdl.max_length)
    kwd = chat_mdl.chat(rendered_prompt, msg[1:], {"temperature": 0.2})
    if isinstance(kwd, tuple):
        kwd = kwd[0]
    kwd = re.sub(r"^.*</think>", "", kwd, flags=re.DOTALL)
    if kwd.find("**ERROR**") >= 0:
        return ""
    return kwd


def full_question(tenant_id=None, llm_id=None, messages=[], language=None, chat_mdl=None):
    from api.db import LLMType
    from api.db.services.llm_service import LLMBundle
    from api.db.services.llm_service import TenantLLMService

    if not chat_mdl:
        if TenantLLMService.llm_id2llm_type(llm_id) == "image2text":
            chat_mdl = LLMBundle(tenant_id, LLMType.IMAGE2TEXT, llm_id)
        else:
            chat_mdl = LLMBundle(tenant_id, LLMType.CHAT, llm_id)
    conv = []
    for m in messages:
        if m["role"] not in ["user", "assistant"]:
            continue
        conv.append("{}: {}".format(m["role"].upper(), m["content"]))
    conversation = "\n".join(conv)
    today = datetime.date.today().isoformat()
    yesterday = (datetime.date.today() - datetime.timedelta(days=1)).isoformat()
    tomorrow = (datetime.date.today() + datetime.timedelta(days=1)).isoformat()

    template = PROMPT_JINJA_ENV.from_string(FULL_QUESTION_PROMPT_TEMPLATE)
    rendered_prompt = template.render(
        today=today,
        yesterday=yesterday,
        tomorrow=tomorrow,
        conversation=conversation,
        language=language,
    )

    ans = chat_mdl.chat(rendered_prompt, [{"role": "user", "content": "Output: "}])
    ans = re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
    return ans if ans.find("**ERROR**") < 0 else messages[-1]["content"]


def cross_languages(tenant_id, llm_id, query, languages=[]):
    from api.db import LLMType
    from api.db.services.llm_service import LLMBundle
    from api.db.services.llm_service import TenantLLMService

    if llm_id and TenantLLMService.llm_id2llm_type(llm_id) == "image2text":
        chat_mdl = LLMBundle(tenant_id, LLMType.IMAGE2TEXT, llm_id)
    else:
        chat_mdl = LLMBundle(tenant_id, LLMType.CHAT, llm_id)

    rendered_sys_prompt = PROMPT_JINJA_ENV.from_string(CROSS_LANGUAGES_SYS_PROMPT_TEMPLATE).render()
    rendered_user_prompt = PROMPT_JINJA_ENV.from_string(CROSS_LANGUAGES_USER_PROMPT_TEMPLATE).render(query=query, languages=languages)

    ans = chat_mdl.chat(rendered_sys_prompt, [{"role": "user", "content": rendered_user_prompt}], {"temperature": 0.2})
    ans = re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
    if ans.find("**ERROR**") >= 0:
        return query
    return "\n".join([a for a in re.sub(r"(^Output:|\n+)", "", ans, flags=re.DOTALL).split("===") if a.strip()])


def content_tagging(chat_mdl, content, all_tags, examples, topn=3):
    template = PROMPT_JINJA_ENV.from_string(CONTENT_TAGGING_PROMPT_TEMPLATE)

    for ex in examples:
        ex["tags_json"] = json.dumps(ex[TAG_FLD], indent=2, ensure_ascii=False)

    rendered_prompt = template.render(
        topn=topn,
        all_tags=all_tags,
        examples=examples,
        content=content,
    )

    msg = [{"role": "system", "content": rendered_prompt}, {"role": "user", "content": "Output: "}]
    _, msg = message_fit_in(msg, chat_mdl.max_length)
    kwd = chat_mdl.chat(rendered_prompt, msg[1:], {"temperature": 0.5})
    if isinstance(kwd, tuple):
        kwd = kwd[0]
    kwd = re.sub(r"^.*</think>", "", kwd, flags=re.DOTALL)
    if kwd.find("**ERROR**") >= 0:
        raise Exception(kwd)

    try:
        obj = json_repair.loads(kwd)
    except json_repair.JSONDecodeError:
        try:
            result = kwd.replace(rendered_prompt[:-1], "").replace("user", "").replace("model", "").strip()
            result = "{" + result.split("{")[1].split("}")[0] + "}"
            obj = json_repair.loads(result)
        except Exception as e:
            logging.exception(f"JSON parsing error: {result} -> {e}")
            raise e
    res = {}
    for k, v in obj.items():
        try:
            if int(v) > 0:
                res[str(k)] = int(v)
        except Exception:
            pass
    return res


def vision_llm_describe_prompt(page=None) -> str:
    template = PROMPT_JINJA_ENV.from_string(VISION_LLM_DESCRIBE_PROMPT)

    return template.render(page=page)


def vision_llm_figure_describe_prompt() -> str:
    template = PROMPT_JINJA_ENV.from_string(VISION_LLM_FIGURE_DESCRIBE_PROMPT)
    return template.render()


def tool_schema(tools_description: list[dict], complete_task=False):
    if not tools_description:
        return ""
    desc = {}
    if complete_task:
        desc[COMPLETE_TASK] = {
            "type": "function",
            "function": {
                "name": COMPLETE_TASK,
                "description": "When you have the final answer and are ready to complete the task, call this function with your answer",
                "parameters": {
                    "type": "object",
                    "properties": {"answer": {"type": "string", "description": "The final answer to the user's question"}},
                    "required": ["answer"]
                }
            }
        }
    for tool in tools_description:
        desc[tool["function"]["name"]] = tool

    return "\n\n".join([f"## {i+1}. {fnm}\n{json.dumps(des, ensure_ascii=False, indent=4)}" for i, (fnm, des) in enumerate(desc.items())])


def form_history(history, limit=-6):
    context = ""
    for h in history[limit:]:
        if h["role"] == "system":
            continue
        role = "USER"
        if h["role"].upper() != role:
            role = "AGENT"
        context += f"\n{role}: {h['content'][:2048] + ('...' if len(h['content'])>2048 else '')}"
    return context


def analyze_task(chat_mdl, task_name, tools_description: list[dict]):
    tools_desc = tool_schema(tools_description)
    context = ""

    template = PROMPT_JINJA_ENV.from_string(ANALYZE_TASK_USER)

    kwd = chat_mdl.chat(ANALYZE_TASK_SYSTEM, [{"role": "user", "content": template.render(task=task_name, context=context, tools_desc=tools_desc)}], {})
    if isinstance(kwd, tuple):
        kwd = kwd[0]
    kwd = re.sub(r"^.*</think>", "", kwd, flags=re.DOTALL)
    if kwd.find("**ERROR**") >= 0:
        return ""
    return kwd


def next_step(chat_mdl, history: list, tools_description: list[dict], task_desc):
    if not tools_description:
        return ""
    desc = tool_schema(tools_description)
    template = PROMPT_JINJA_ENV.from_string(NEXT_STEP)
    user_prompt = "\nWhat's the next tool to call? If ready OR IMPOSSIBLE TO BE READY, then call `complete_task`."
    hist = deepcopy(history)
    if hist[-1]["role"] == "user":
        hist[-1]["content"] += user_prompt
    else:
        hist.append({"role": "user", "content": user_prompt})
    json_str = chat_mdl.chat(template.render(task_analisys=task_desc, desc=desc, today=datetime.datetime.now().strftime("%Y-%m-%d")),
                             hist[1:], stop=["<|stop|>"])
    tk_cnt = num_tokens_from_string(json_str)
    json_str = re.sub(r"^.*</think>", "", json_str, flags=re.DOTALL)
    return json_str, tk_cnt


def reflect(chat_mdl, history: list[dict], tool_call_res: list[Tuple]):
    tool_calls = [{"name": p[0], "result": p[1]} for p in tool_call_res]
    goal = history[1]["content"]
    template = PROMPT_JINJA_ENV.from_string(REFLECT)
    user_prompt = template.render(goal=goal, tool_calls=tool_calls)
    hist = deepcopy(history)
    if hist[-1]["role"] == "user":
        hist[-1]["content"] += user_prompt
    else:
        hist.append({"role": "user", "content": user_prompt})
    _, msg = message_fit_in(hist, chat_mdl.max_length)
    ans = chat_mdl.chat(msg[0]["content"], msg[1:])
    ans = re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
    return """
**Observation**
{}

**Reflection**
{}
    """.format(json.dumps(tool_calls, ensure_ascii=False, indent=2), ans)


def form_message(system_prompt, user_prompt):
    return [{"role": "system", "content": system_prompt}, {"role": "user", "content": user_prompt}]


def tool_call_summary(chat_mdl, name: str, params: dict, result: str) -> str:
    template = PROMPT_JINJA_ENV.from_string(SUMMARY4MEMORY)
    system_prompt = template.render(name=name,
                                    params=json.dumps(params, ensure_ascii=False, indent=2),
                                    result=result)
    user_prompt = "→ Summary: "
    _, msg = message_fit_in(form_message(system_prompt, user_prompt), chat_mdl.max_length)
    ans = chat_mdl.chat(msg[0]["content"], msg[1:])
    return re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)


def rank_memories(chat_mdl, goal: str, sub_goal: str, tool_call_summaries: list[str]):
    template = PROMPT_JINJA_ENV.from_string(RANK_MEMORY)
    system_prompt = template.render(goal=goal, sub_goal=sub_goal, results=[{"i": i, "content": s} for i, s in enumerate(tool_call_summaries)])
    user_prompt = " → rank: "
    _, msg = message_fit_in(form_message(system_prompt, user_prompt), chat_mdl.max_length)
    ans = chat_mdl.chat(msg[0]["content"], msg[1:], stop="<|stop|>")
    return re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)

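The strategy used by `message_fit_in` in the file above can be illustrated without the real tokenizer: return the messages unchanged if they fit, otherwise keep only the system messages and the latest message, and if that still does not fit, truncate whichever of the two dominates. The sketch below is an approximation under stated assumptions. `count_tokens` and `truncate` are whitespace-based stand-ins for `num_tokens_from_string` and `encoder`, `fit_messages` is a hypothetical name, the guard for a history with no system message is an addition, and the budget for the last message (`max_tokens - sys_len`) reflects what the truncation is presumably meant to do.

```python
def count_tokens(text):
    # Stand-in tokenizer: one token per whitespace-separated word.
    return len(text.split())


def truncate(text, budget):
    # Keep at most `budget` stand-in tokens.
    return " ".join(text.split()[: max(budget, 0)])


def fit_messages(messages, max_tokens):
    """Return a message list trimmed toward `max_tokens` stand-in tokens."""
    if sum(count_tokens(m["content"]) for m in messages) < max_tokens:
        return messages

    # Step 1: keep only system messages plus the most recent message.
    kept = [dict(m) for m in messages if m["role"] == "system"]
    if len(messages) > 1 or not kept:
        kept.append(dict(messages[-1]))
    if sum(count_tokens(m["content"]) for m in kept) < max_tokens:
        return kept

    # Step 2: truncate the side that dominates, leaving room for the other.
    sys_len = count_tokens(kept[0]["content"])
    last_len = count_tokens(kept[-1]["content"])
    if sys_len / (sys_len + last_len) > 0.8:
        kept[0]["content"] = truncate(kept[0]["content"], max_tokens - last_len)
    else:
        kept[-1]["content"] = truncate(kept[-1]["content"], max_tokens - sys_len)
    return kept
```

Note that intermediate turns are dropped entirely in step 1, so callers that need conversational context may want to summarize it into the last message before calling a function like this.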
30
rag/prompts/rank_memory.md
Normal file
@ -0,0 +1,30 @@
**Task**: Sort the tool call results based on relevance to the overall goal and current sub-goal. Return ONLY a sorted list of indices (0-indexed).

**Rules**:
1. Analyze each result's contribution to both:
   - The overall goal (primary priority)
   - The current sub-goal (secondary priority)
2. Sort from MOST relevant (highest impact) to LEAST relevant
3. Output format: Strictly a Python-style list of integers. Example: [2, 0, 1]

🔹 Overall Goal: {{ goal }}
🔹 Sub-goal: {{ sub_goal }}

**Examples**:
🔹 Tool Response:
- index: 0
> Tokyo temperature is 78°F.
- index: 1
> Error: Authentication failed (expired API key).
- index: 2
> Available: 12 widgets in stock (max 5 per customer).

→ rank: [1,2,0]<|stop|>


**Your Turn**:
🔹 Tool Response:
{% for f in results %}
- index: {{ f.i }}
> {{ f.content }}
{% endfor %}
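The prompt asks for a "Python-style list of integers", while `rank_memories` returns the reply as a raw string. A minimal sketch of how a caller might turn that string into a usable ordering is shown below; `parse_rank` is a hypothetical helper, and its fallback policy (drop invalid or repeated indices, append omitted ones in original order) is an assumption rather than behavior taken from this commit.

```python
import json
import re


def parse_rank(ans: str, n: int) -> list[int]:
    """Hypothetical parser for a reply such as '[1,2,0]'; n is the number of results."""
    match = re.search(r"\[[\d,\s]*\]", ans)
    try:
        order = json.loads(match.group(0)) if match else []
    except ValueError:
        # Malformed list such as "[1,,2]": fall back to the original order.
        order = []
    seen: set[int] = set()
    ranked: list[int] = []
    for i in order:
        # Keep only valid indices that have not been seen yet.
        if 0 <= i < n and i not in seen:
            seen.add(i)
            ranked.append(i)
    # Append anything the model left out, in original order.
    ranked += [i for i in range(n) if i not in seen]
    return ranked


print(parse_rank("[1,2,0]", 3))
```

Returning a full permutation in every case means a malformed reply degrades to the original order instead of raising.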
34
rag/prompts/reflect.md
Normal file
@ -0,0 +1,34 @@
**Context**:
- To achieve the goal: {{ goal }}.
- You have executed the following tool calls:
{% for call in tool_calls %}
Tool call: `{{ call.name }}`
Results: {{ call.result }}
{% endfor %}


**Reflection Instructions:**

Analyze the current state of the overall task ({{ goal }}), then provide structured responses to the following:

## 1. Goal Achievement Status
- Does the current outcome align with the original purpose of this task phase?
- If not, what critical gaps exist?

## 2. Step Completion Check
- Which planned steps were completed? (List verified items)
- Which steps are pending/incomplete? (Specify exactly what’s missing)

## 3. Information Adequacy
- Is the collected data sufficient to proceed?
- What key information is still needed? (e.g., metrics, user input, external data)

## 4. Critical Observations
- Unexpected outcomes: [Flag anomalies/errors]
- Risks/blockers: [Identify immediate obstacles]
- Accuracy concerns: [Highlight unreliable results]

## 5. Next-Step Recommendations
- Proposed immediate action: [Concrete next step]
- Alternative strategies if blocked: [Workaround solution]
- Tools/inputs required for next phase: [Specify resources]
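The template reads only `name` and `result` from each entry of `tool_calls`, and `reflect()` in `prompts.py` wraps the model's reply in an Observation/Reflection block. The sketch below illustrates both with made-up data; `format_reflection` is a hypothetical stand-in for the final `return` of `reflect()`, and the sample records and reflection text are assumptions.

```python
import json

# Hypothetical tool-call records; the template only reads `name` and `result`.
tool_calls = [
    {"name": "weather", "result": "Tokyo temperature is 78°F."},
    {"name": "inventory", "result": "12 widgets in stock (max 5 per customer)."},
]


def format_reflection(tool_calls: list[dict], reflection: str) -> str:
    """Wrap the raw tool calls and the model's reflection, mirroring reflect()."""
    observation = json.dumps(tool_calls, ensure_ascii=False, indent=2)
    return "\n**Observation**\n{}\n\n**Reflection**\n{}\n".format(observation, reflection)


report = format_reflection(tool_calls, "Goal partially met; stock limit still unverified.")
print(report)
```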
35
rag/prompts/summary4memory.md
Normal file
@ -0,0 +1,35 @@
**Role**: AI Assistant
**Task**: Summarize tool call responses
**Rules**:
1. Context: You've executed a tool (API/function) and received a response.
2. Condense the response into 1-2 short sentences.
3. Never omit:
   - Success/error status
   - Core results (e.g., data points, decisions)
   - Critical constraints (e.g., limits, conditions)
4. Exclude technical details like timestamps/request IDs unless crucial.
5. Use the same language as the main content of the tool response.

**Response Template**:
"[Status] + [Key Outcome] + [Critical Constraints]"

**Examples**:
🔹 Tool Response:
{"status": "success", "temperature": 78.2, "unit": "F", "location": "Tokyo", "timestamp": 16923456}
→ Summary: "Success: Tokyo temperature is 78°F."

🔹 Tool Response:
{"error": "invalid_api_key", "message": "Authentication failed: expired key"}
→ Summary: "Error: Authentication failed (expired API key)."

🔹 Tool Response:
{"available": true, "inventory": 12, "product": "widget", "limit": "max 5 per customer"}
→ Summary: "Available: 12 widgets in stock (max 5 per customer)."

**Your Turn**:
- Tool call: {{ name }}
- Tool inputs are as follows:
{{ params }}

- Tool Response:
{{ result }}
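`tool_call_summary` fills `{{ params }}` with `json.dumps(params, ensure_ascii=False, indent=2)`. The sketch below, using made-up inputs, shows what `ensure_ascii=False` changes: non-ASCII text reaches the prompt as written instead of as `\uXXXX` escapes, which fits rule 5 about answering in the language of the tool response.

```python
import json

params = {"city": "東京", "unit": "F"}  # hypothetical tool inputs

escaped = json.dumps(params, indent=2)
readable = json.dumps(params, ensure_ascii=False, indent=2)

print(escaped)   # non-ASCII characters appear as \uXXXX escapes
print(readable)  # characters are kept as written
```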
19
rag/prompts/tool_call_summary.md
Normal file
@ -0,0 +1,19 @@
**Task Instruction:**

You are tasked with reading and analyzing the tool call result based on the following inputs: **Inputs for current call** and **Results**. Your objective is to extract relevant and helpful information for **Inputs for current call** from the **Results** and seamlessly integrate this information into the previous steps to continue reasoning for the original question.

**Guidelines:**

1. **Analyze the Results:**
- Carefully review the content of each tool call result.
- Identify factual information that is relevant to the **Inputs for current call** and can aid in the reasoning process for the original question.

2. **Extract Relevant Information:**
- Select the information from the **Results** that directly contributes to advancing the previous reasoning steps.
- Ensure that the extracted information is accurate and relevant.

- **Inputs for current call:**
{{ inputs }}

- **Results:**
{{ results }}