Feat: add or logic operations for meta data filters. (#11404)

### What problem does this PR solve?

#11376 #11387

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-02-02 08:35:08 +08:00 · 2025-11-20 14:31:12 +08:00
parent d2b1da0e26
commit 06cef71ba6
11 changed files with 129 additions and 48 deletions
--- a/rag/prompts/generator.py
+++ b/rag/prompts/generator.py
@@ -429,7 +429,7 @@ def rank_memories(chat_mdl, goal:str, sub_goal:str, tool_call_summaries: list[st
    return re.sub(r"^.*</think>", "", ans, flags=re.DOTALL)


-def gen_meta_filter(chat_mdl, meta_data:dict, query: str) -> list:
+def gen_meta_filter(chat_mdl, meta_data:dict, query: str) -> dict:
    sys_prompt = PROMPT_JINJA_ENV.from_string(META_FILTER).render(
        current_date=datetime.datetime.today().strftime('%Y-%m-%d'),
        metadata_keys=json.dumps(meta_data),
@@ -440,11 +440,13 @@ def gen_meta_filter(chat_mdl, meta_data:dict, query: str) -> list:
    ans = re.sub(r"(^.*</think>|```json\n|```\n*$)", "", ans, flags=re.DOTALL)
    try:
        ans = json_repair.loads(ans)
-        assert isinstance(ans, list), ans
+        assert isinstance(ans, dict), ans
+        assert "conditions" in ans and isinstance(ans["conditions"], list), ans
        return ans
    except Exception:
        logging.exception(f"Loading json failure: {ans}")
-    return []
+
+    return {"conditions": []}


 def gen_json(system_prompt:str, user_prompt:str, chat_mdl, gen_conf = None):
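
The return-type change above (a `list` becomes a `dict` with a `conditions` list) can be illustrated with a small standalone sketch. The helper name `normalize_meta_filter` is hypothetical and not part of this commit. It assumes the reply text has already been stripped of `</think>` prefixes and code fences, it uses the standard-library `json.loads` in place of `json_repair.loads`, and it fills in a missing or invalid `logic` value with `"and"`, which follows the default stated in the prompt's schema rather than anything the diff itself does.

```python
import json

ALLOWED_LOGIC = {"and", "or"}


def normalize_meta_filter(cleaned: str) -> dict:
    """Hypothetical helper: turn a cleaned model reply into {"logic", "conditions"}."""
    empty = {"logic": "and", "conditions": []}
    try:
        # Assumption: stdlib json stands in for json_repair used in the diff.
        ans = json.loads(cleaned)
    except json.JSONDecodeError:
        return empty
    # Mirror the two assertions in the diff: a dict that carries a "conditions" list.
    if not isinstance(ans, dict) or not isinstance(ans.get("conditions"), list):
        return empty
    # Assumption: default to "and" when "logic" is absent or not an allowed value.
    logic = ans.get("logic", "and")
    if logic not in ALLOWED_LOGIC:
        logic = "and"
    return {"logic": logic, "conditions": ans["conditions"]}
```

With this shape, callers can always read both keys without checking whether the model followed the prompt.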
--- a/rag/prompts/meta_filter.md
+++ b/rag/prompts/meta_filter.md
@@ -9,11 +9,13 @@ You are a metadata filtering condition generator. Analyze the user's question an
     }

 2. **Output Requirements**:
-   - Always output a JSON array of filter objects
-   - Each object must have:
+   - Always output a JSON dictionary with only 2 keys: 'conditions'(filter objects) and 'logic' between the conditions ('and' or 'or').
+   - Each filter object in conditions must have:
        "key": (metadata attribute name),
        "value": (string value to compare),
        "op": (operator from allowed list)
+   - Logic between all the conditions: 'and'(Intersection of results for each condition) / 'or' (union of results for all conditions)
+

 3. **Operator Guide**:
   - Use these operators only: ["contains", "not contains", "start with", "end with", "empty", "not empty", "=", "≠", ">", "<", "≥", "≤"]
@@ -32,22 +34,97 @@ You are a metadata filtering condition generator. Analyze the user's question an
        - Attribute doesn't exist in metadata
        - Value has no match in metadata

-5. **Example**:
+5. **Example A**:
   - User query: "上市日期七月份的有哪些商品，不要蓝色的"
   - Metadata: { "color": {...}, "listing_date": {...} }
   - Output: 
-        [
+   {
+        "logic": "and",
+        "conditions": [
          {"key": "listing_date", "value": "2025-07-01", "op": "≥"},
          {"key": "listing_date", "value": "2025-08-01", "op": "<"},
          {"key": "color", "value": "blue", "op": "≠"}
        ]
+   }

-6. **Final Output**:
-   - ONLY output valid JSON array
+6. **Example B**:
+   - User query: "Both blue and red are acceptable."
+   - Metadata: { "color": {...}, "listing_date": {...} }
+   - Output: 
+   {
+        "logic": "or",
+        "conditions": [
+          {"key": "color", "value": "blue", "op": "="},
+          {"key": "color", "value": "red", "op": "="}
+        ]
+   }
+
+7. **Final Output**:
+   - ONLY output valid JSON dictionary
   - NO additional text/explanations
+   - Json schema is as following:
+```json
+{
+  "type": "object",
+  "properties": {
+    "logic": {
+      "type": "string",
+      "description": "Logic relationship between all the conditions, the default is 'and'.",
+      "enum": [
+        "and",
+        "or"
+      ]
+    },
+    "conditions": {
+      "type": "array",
+      "items": {
+        "type": "object",
+        "properties": {
+          "key": {
+            "type": "string",
+            "description": "Metadata attribute name."
+          },
+          "value": {
+            "type": "string",
+            "description": "Value to compare."
+          },
+          "op": {
+            "type": "string",
+            "description": "Operator from allowed list.",
+            "enum": [
+              "contains",
+              "not contains",
+              "start with",
+              "end with",
+              "empty",
+              "not empty",
+              "=",
+              "≠",
+              ">",
+              "<",
+              "≥",
+              "≤"
+            ]
+          }
+        },
+        "required": [
+          "key",
+          "value",
+          "op"
+        ],
+        "additionalProperties": false
+      }
+    }
+  },
+  "required": [
+    "conditions"
+  ],
+  "additionalProperties": false
+}
+```

 **Current Task**:
-- Today's date: {{current_date}}
-- Available metadata keys: {{metadata_keys}}
-- User query: "{{user_question}}"
+- Today's date: {{ current_date }}
+- Available metadata keys: {{ metadata_keys }}
+- User query: "{{ user_question }}"
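
To make the `and` / `or` semantics described in the prompt concrete, here is a minimal sketch of how a consumer might combine per-condition results: intersection for `and`, union for `or`. It is not the code from this commit's other changed files. The function `matching_ids`, the `OPS` table, and the `docs` layout (document id mapped to a flat metadata dict) are assumptions made for illustration. All values are compared as strings, which works for ISO dates, and the `empty` / `not empty` operators are left out.

```python
from typing import Callable

# Hypothetical operator table covering part of the allowed list.
OPS: dict[str, Callable[[str, str], bool]] = {
    "=": lambda a, b: a == b,
    "≠": lambda a, b: a != b,
    "contains": lambda a, b: b in a,
    "not contains": lambda a, b: b not in a,
    "start with": lambda a, b: a.startswith(b),
    "end with": lambda a, b: a.endswith(b),
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
    "≥": lambda a, b: a >= b,
    "≤": lambda a, b: a <= b,
}


def matching_ids(docs: dict[str, dict[str, str]], flt: dict) -> set[str]:
    """Return ids of documents whose metadata satisfies the filter dict."""
    conditions = flt.get("conditions", [])
    if not conditions:
        # Assumption: an empty condition list means "do not filter".
        return set(docs)
    per_condition = []
    for cond in conditions:
        op = OPS[cond["op"]]
        key, value = cond["key"], cond["value"]
        per_condition.append(
            {doc_id for doc_id, meta in docs.items()
             if key in meta and op(str(meta[key]), value)}
        )
    if flt.get("logic", "and") == "or":
        return set().union(*per_condition)   # union of results for all conditions
    return set.intersection(*per_condition)  # intersection of results for each condition


docs = {
    "d1": {"color": "blue", "listing_date": "2025-07-15"},
    "d2": {"color": "red", "listing_date": "2025-07-20"},
    "d3": {"color": "green", "listing_date": "2025-09-01"},
}
example_a = {
    "logic": "and",
    "conditions": [
        {"key": "listing_date", "value": "2025-07-01", "op": "≥"},
        {"key": "listing_date", "value": "2025-08-01", "op": "<"},
        {"key": "color", "value": "blue", "op": "≠"},
    ],
}
example_b = {
    "logic": "or",
    "conditions": [
        {"key": "color", "value": "blue", "op": "="},
        {"key": "color", "value": "red", "op": "="},
    ],
}
```

On the sample `docs`, `example_a` (the filter from Example A) selects only `d2`, while `example_b` (the filter from Example B) selects `d1` and `d2`. Under `and` logic the two color conditions of Example B could never both hold for one document, which is why the `or` option is needed.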