Feat: add OR logic operations for metadata filters. (#11404)

### What problem does this PR solve?

#11376 #11387

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
Kevin Hu
2025-11-20 14:31:12 +08:00
committed by GitHub
parent d2b1da0e26
commit 06cef71ba6
11 changed files with 129 additions and 48 deletions

@@ -429,7 +429,7 @@ def rank_memories(chat_mdl, goal:str, sub_goal:str, tool_call_summaries: list[st
     return re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)
-def gen_meta_filter(chat_mdl, meta_data:dict, query: str) -> list:
+def gen_meta_filter(chat_mdl, meta_data:dict, query: str) -> dict:
     sys_prompt = PROMPT_JINJA_ENV.from_string(META_FILTER).render(
         current_date=datetime.datetime.today().strftime('%Y-%m-%d'),
         metadata_keys=json.dumps(meta_data),
@@ -440,11 +440,13 @@ def gen_meta_filter(chat_mdl, meta_data:dict, query: str) -> list:
     ans = re.sub(r"(^.*</think>|```json\n|```\n*$)", "", ans, flags=re.DOTALL)
     try:
         ans = json_repair.loads(ans)
-        assert isinstance(ans, list), ans
+        assert isinstance(ans, dict), ans
+        assert "conditions" in ans and isinstance(ans["conditions"], list), ans
         return ans
     except Exception:
         logging.exception(f"Loading json failure: {ans}")
-    return []
+    return {"conditions": []}
 def gen_json(system_prompt:str, user_prompt:str, chat_mdl, gen_conf = None):
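
As a rough illustration of the new return contract, the sketch below normalizes a model reply into the dictionary shape that the updated `gen_meta_filter` asserts. The helper name `normalize_meta_filter` is hypothetical, it uses the standard-library `json` module instead of `json_repair`, and defaulting `logic` to `"and"` is an assumption taken from the schema description in the prompt, not from this hunk.

```python
import json


def normalize_meta_filter(raw: str) -> dict:
    """Parse a model reply into {"logic": ..., "conditions": [...]}.

    Hypothetical helper: anything that is not a dict with a list under
    "conditions" falls back to an empty filter.
    """
    empty = {"logic": "and", "conditions": []}
    try:
        ans = json.loads(raw)
    except json.JSONDecodeError:
        return empty
    if not isinstance(ans, dict) or not isinstance(ans.get("conditions"), list):
        return empty
    # Assumption: a missing or unknown logic value is treated as "and".
    logic = ans.get("logic", "and")
    if logic not in ("and", "or"):
        logic = "and"
    return {"logic": logic, "conditions": ans["conditions"]}


# An old-style reply (a bare JSON array) no longer passes the shape check.
print(normalize_meta_filter("[]"))
print(normalize_meta_filter('{"logic": "or", "conditions": []}'))
```

Callers that previously iterated over the returned list would, under this shape, read `result["conditions"]` and `result.get("logic", "and")` instead.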

@@ -9,11 +9,13 @@ You are a metadata filtering condition generator. Analyze the user's question an
 }
 2. **Output Requirements**:
-- Always output a JSON array of filter objects
-- Each object must have:
+- Always output a JSON dictionary with only 2 keys: 'conditions' (filter objects) and 'logic' between the conditions ('and' or 'or').
+- Each filter object in conditions must have:
 "key": (metadata attribute name),
 "value": (string value to compare),
 "op": (operator from allowed list)
+- Logic between all the conditions: 'and' (intersection of the results for each condition) / 'or' (union of the results for all conditions)
 3. **Operator Guide**:
 - Use these operators only: ["contains", "not contains", "start with", "end with", "empty", "not empty", "=", "≠", ">", "<", "≥", "≤"]
@@ -32,22 +34,97 @@ You are a metadata filtering condition generator. Analyze the user's question an
 - Attribute doesn't exist in metadata
 - Value has no match in metadata
-5. **Example**:
+5. **Example A**:
 - User query: "Which products were listed in July, excluding the blue ones?"
 - Metadata: { "color": {...}, "listing_date": {...} }
 - Output:
-[
+{
+"logic": "and",
+"conditions": [
 {"key": "listing_date", "value": "2025-07-01", "op": "≥"},
 {"key": "listing_date", "value": "2025-08-01", "op": "<"},
 {"key": "color", "value": "blue", "op": "≠"}
 ]
+}
-6. **Final Output**:
-- ONLY output valid JSON array
+6. **Example B**:
+- User query: "Both blue and red are acceptable."
+- Metadata: { "color": {...}, "listing_date": {...} }
+- Output:
+{
+"logic": "or",
+"conditions": [
+{"key": "color", "value": "blue", "op": "="},
+{"key": "color", "value": "red", "op": "="}
+]
+}
+7. **Final Output**:
+- ONLY output valid JSON dictionary
 - NO additional text/explanations
+- The JSON schema is as follows:
+```json
+{
+  "type": "object",
+  "properties": {
+    "logic": {
+      "type": "string",
+      "description": "Logic relationship between all the conditions, the default is 'and'.",
+      "enum": [
+        "and",
+        "or"
+      ]
+    },
+    "conditions": {
+      "type": "array",
+      "items": {
+        "type": "object",
+        "properties": {
+          "key": {
+            "type": "string",
+            "description": "Metadata attribute name."
+          },
+          "value": {
+            "type": "string",
+            "description": "Value to compare."
+          },
+          "op": {
+            "type": "string",
+            "description": "Operator from allowed list.",
+            "enum": [
+              "contains",
+              "not contains",
+              "start with",
+              "end with",
+              "empty",
+              "not empty",
+              "=",
+              "≠",
+              ">",
+              "<",
+              "≥",
+              "≤"
+            ]
+          }
+        },
+        "required": [
+          "key",
+          "value",
+          "op"
+        ],
+        "additionalProperties": false
+      }
+    }
+  },
+  "required": [
+    "conditions"
+  ],
+  "additionalProperties": false
+}
+```
 **Current Task**:
-- Today's date: {{current_date}}
-- Available metadata keys: {{metadata_keys}}
-- User query: "{{user_question}}"
+- Today's date: {{ current_date }}
+- Available metadata keys: {{ metadata_keys }}
+- User query: "{{ user_question }}"
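
How a caller might combine per-condition results under the two `logic` values can be sketched as follows. This is a minimal illustration, not the code in this commit: `apply_meta_filter`, `OPS`, and the `docs` mapping of document ID to metadata are hypothetical names. Two behaviors are assumptions: a document that lacks the key matches only `empty`, and an empty condition list means no filtering.

```python
import operator


def _ordered(op):
    """Build a comparison that is numeric when both sides parse as numbers."""
    def check(value: str, target: str) -> bool:
        try:
            return op(float(value), float(target))
        except ValueError:
            # Fall back to string order, which also works for ISO dates.
            return op(value, target)
    return check


# Hypothetical operator table covering the operators listed in the prompt.
OPS = {
    "contains": lambda v, t: t in v,
    "not contains": lambda v, t: t not in v,
    "start with": lambda v, t: v.startswith(t),
    "end with": lambda v, t: v.endswith(t),
    "empty": lambda v, t: v == "",
    "not empty": lambda v, t: v != "",
    "=": operator.eq,
    "≠": operator.ne,
    ">": _ordered(operator.gt),
    "<": _ordered(operator.lt),
    "≥": _ordered(operator.ge),
    "≤": _ordered(operator.le),
}


def _matches(meta: dict, cond: dict) -> bool:
    value = meta.get(cond["key"])
    if value is None:
        # Assumption: a missing attribute only satisfies "empty".
        return cond["op"] == "empty"
    return OPS[cond["op"]](str(value), str(cond.get("value", "")))


def apply_meta_filter(docs: dict, flt: dict) -> set:
    """Return the IDs of documents selected by a {"logic", "conditions"} filter."""
    conditions = flt.get("conditions", [])
    if not conditions:
        return set(docs)
    hits = [
        {doc_id for doc_id, meta in docs.items() if _matches(meta, cond)}
        for cond in conditions
    ]
    if flt.get("logic", "and") == "or":
        return set.union(*hits)       # union of the per-condition results
    return set.intersection(*hits)    # intersection of the per-condition results


docs = {
    "d1": {"color": "blue", "listing_date": "2025-07-15"},
    "d2": {"color": "red", "listing_date": "2025-07-20"},
    "d3": {"color": "green", "listing_date": "2025-09-01"},
}
example_a = {
    "logic": "and",
    "conditions": [
        {"key": "listing_date", "value": "2025-07-01", "op": "≥"},
        {"key": "listing_date", "value": "2025-08-01", "op": "<"},
        {"key": "color", "value": "blue", "op": "≠"},
    ],
}
example_b = {
    "logic": "or",
    "conditions": [
        {"key": "color", "value": "blue", "op": "="},
        {"key": "color", "value": "red", "op": "="},
    ],
}
print(sorted(apply_meta_filter(docs, example_a)))  # ['d2']
print(sorted(apply_meta_filter(docs, example_b)))  # ['d1', 'd2']
```

With `"and"`, Example B would select nothing, since no document is both blue and red; that is the case the `"or"` option addresses.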