Commit Graph

5245 Commits

Author SHA1 Message Date
c130ac0f88 Fix: Lazy loading adds a loading state to the page (#13038)
### What problem does this PR solve?

Fix: Lazy loading adds a loading state to the page

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-06 16:20:52 +08:00
301ed76aa4 Fix: task cancel (#13034)
### What problem does this PR solve?

Fix: task cancel #11745 
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-06 14:48:24 +08:00
13a6545e48 fix(rdbms): use brackets around field names to preserve distinction after chunking (#13010)
Fix RDBMS field separation after chunking by wrapping field names in
brackets (【field】: value). This ensures fields remain distinguishable
even when TxtParser strips newline delimiters during chunk merging.

Closes  #13001

Co-authored-by: mkdev11 <YOUR_GITHUB_ID+MkDev11@users.noreply.github.com>
2026-02-06 14:44:58 +08:00
yH
5333e764fc fix: optimize Excel row counting for files with abnormal max_row (#13018)
### What problem does this PR solve?

Some Excel files have abnormal `max_row` metadata (e.g.,
`max_row=1,048,534` with only 300 actual data rows). This causes:
- `row_number()` returns incorrect count, creating 350+ tasks instead of
1
- `list(ws.rows)` iterates through millions of empty rows, causing
system hang

This PR uses binary search to find the actual last row with data.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-06 14:43:52 +08:00
00c392e633 Fix: dataset page enter key to save (#13035)
### What problem does this PR solve?

Fix dataset page enter key to save 
Fix the warnings and optimize the code.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-06 14:42:16 +08:00
4b0d65f089 Fix: correct llm_id for graphrag (#13032)
### What problem does this PR solve?

Fix: correct llm_id for graphrag #13030

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-06 14:05:32 +08:00
6a17e8cc85 Update basics (#13033)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2026-02-06 13:15:33 +08:00
a68c56def7 fix: ensure all metadata filters are processed in AND logic (#13019)
### What problem does this PR solve?

Bug: When a filter key doesn't exist in metas or has no matching values,
the filter was skipped entirely, causing AND logic to fail.

Example:
- Filter 1: meeting_series = '宏观早8点' (matches doc1, doc2, doc3)
- Filter 2: date = '2026-03-05' (no matches)
- Expected: [] (AND should return empty)
- Actual: [doc1, doc2, doc3] (Filter 2 was skipped)

Root cause:
Old logic iterated metas.items() first, then filters. If a filter's key
wasn't in metas, it was never processed.

Fix:
Iterate filters first, then look up in metas. If key not found, treat as
no match (empty result), which correctly applies AND logic.

Changes:
- Changed loop order from 'for k in metas: for f in filters' to 'for f
in filters: if f.key in metas'
- Explicitly handle missing keys as empty results


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: Clint-chan <Clint-chan@users.noreply.github.com>
2026-02-06 12:57:27 +08:00
0586d5148d fixed vulnerabilities CVE-2025-53859 & CVE-2025-23419 (#13016)
### What problem does this PR solve?

Fixed vulnerabilities CVE-2025-53859 & CVE-2025-23419 by updating nginx
to 1.29.5-1~noble

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
<img width="709" height="54" alt="image"
src="https://github.com/user-attachments/assets/d8c3518f-bca4-4314-a85c-1aed1678f72e"
/>
2026-02-06 12:55:06 +08:00
11703d957d Refactor: Improve Picture.py resource usage (#13011)
### What problem does this PR solve?

Improve Picture.py resource usage

### Type of change


- [x] Refactoring
2026-02-06 09:50:53 +08:00
1262533b74 Feat: support verify to set llm key and boost bigrams. (#12980)
#12863

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
nightly
2026-02-05 19:19:09 +08:00
bbd8ba64a1 Feat: Control interface documentation directory display and hiding (#13008)
### What problem does this PR solve?

Feat: Control interface documentation directory display and hiding

### Type of change


- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2026-02-05 16:59:20 +08:00
1a85d2f8de Fix: prevent streaming message width collapse (#12999)
## Summary
- keep assistant message containers stretched to available width
- avoid width collapse during streaming by allowing flex items to shrink

## Test plan
- not run (not requested)

Fixes #12985

Made with [Cursor](https://cursor.com)

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Liu An <asiro@qq.com>
2026-02-05 15:58:55 +08:00
2a7dca6fc9 Fix: parser bug (#13014)
…, clicking "Parse" will still ask if you want to clear the chunks of
the already parsed files.

### What problem does this PR solve?

Fix: After selecting all and then unchecking the already parsed files,
clicking "Parse" will still ask if you want to clear the chunks of the
already parsed files.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-05 15:57:38 +08:00
0a08fc7b07 Fix: example code in session.py (#13004)
### What problem does this PR solve?

Fix: example code in session.py #12950

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Levi <stupse-tipp0j@icloud.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Co-authored-by: Liu An <asiro@qq.com>
2026-02-05 15:56:58 +08:00
75b2d482e2 Fix: ingestion pipeline (#13012)
### What problem does this PR solve?

Fix ingestion pipeline
Only 1 file is acceptable for ingestion pipeline.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-05 15:55:41 +08:00
89fdb1d498 Feat: Add model verify (#13005)
### What problem does this PR solve?

Feat: Add model verify

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Liu An <asiro@qq.com>
2026-02-05 15:53:20 +08:00
90b726c988 fix: support date comparison operators (>=, <=, >, <) in metadata filtering (#12982)
## Description

This PR fixes the issue where date metadata conditions with comparison
operators (`>=`, `<=`, `>`, `<`) did not work correctly in the
`/api/v1/retrieval` endpoint.

  ## Problem

  When using metadata conditions like:
  ```json
  {
    "metadata_condition": {
      "conditions": [
        {
          "name": "date",
          "comparison_operator": ">=",
          "value": "2027-01-13"
        }
      ]
    }
  }

  The filtering did not work as expected because:
  1. Operators >= and <= were not mapped to internal symbols ≥ and ≤
2. Date strings like "2027-01-13" failed to parse with
ast.literal_eval()
  3. Non-standard date formats were incorrectly compared as strings

  Solution

  Changes in common/metadata_utils.py:

  1. Added operator mapping in convert_conditions():
    - >= → ≥
    - <= → ≤
    - != → ≠
  2. Implemented strict date format detection in meta_filter():
- Only processes dates in YYYY-MM-DD format (10 characters, properly
formatted)
- When query value is a date, only matches data in the same standard
format
    - Non-standard formats (e.g., "2026年1月13日", "2026-1-22") are skipped
  3. Maintained backward compatibility:
    - Numeric comparisons still work
    - String comparisons still work
    - Only affects date-formatted queries

  Testing

  All test cases pass (8/8):
  -  Date >= comparison
  -  Date > comparison
  -  Date < comparison
  -  Date <= comparison
  -  Date = comparison
  -  Date range queries
  -  Non-date string comparison (backward compatibility)
  -  Numeric comparison (backward compatibility)

  Example Usage

  {
    "dataset_ids": ["xxx"],
    "question": "test",
    "metadata_condition": {
      "conditions": [
        {
          "name": "date",
          "comparison_operator": ">=",
          "value": "2027-01-13"
        }
      ]
    }
  }

  Notes

  - Only supports standard YYYY-MM-DD format
- Non-standard date formats in data are treated as data quality issues
and will not match
  - Users should ensure their date metadata is in the correct format

---------

Co-authored-by: Clint-chan <Clint-chan@users.noreply.github.com>
2026-02-05 13:52:51 +08:00
1349e6b7d1 Fix: adressing style without a default value (#13009)
### What problem does this PR solve?

Fix: adressing style without a default value #12396 #11510

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-05 13:52:23 +08:00
6361fc4b33 Feat: update stepfun list (#12991)
### What problem does this PR solve?

Update stepfun list.

Add TTS and Sequence2Text functionalities.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-02-05 12:47:04 +08:00
803b480f9c feat: Add optional document metadata in OpenAI-compatible response references (#12950)
### What problem does this PR solve?

This PR adds an opt‑in way to include document‑level metadata in
OpenAI‑compatible reference chunks. Until now, metadata could be used
for filtering but wasn’t returned in responses. The change enables
clients to show richer citations (author/year/source, etc.) while
keeping payload size and privacy under control via an explicit request
flag and optional field allowlist.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Contribution during my time at RAGcon GmbH.
2026-02-05 09:54:33 +08:00
2843570d8e Refact: Updated Agent template description. (#12995)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2026-02-05 09:50:44 +08:00
3a86e7c224 Feat: support doubao-embedding-vision model (#12983)
### What problem does this PR solve?

Add support `doubao-embedding-vision` model.
`doubao-embedding-large-text` is deprecated.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2026-02-05 09:49:46 +08:00
2ff2e72488 Fix: Fixed the issue where deleted images in the agent chat box would still be sent to the backend. (#12992)
### What problem does this PR solve?

Fix: Fixed the issue where deleted images in the agent chat box would
still be sent to the backend.
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-05 09:49:01 +08:00
2627a7f5a8 Feat: Move the reasoning field to the root of the payload in the completion interface. (#12990)
### What problem does this PR solve?


### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2026-02-04 19:21:49 +08:00
4d4b5a978d feat: enable multi-file upload for chat and agent workflows (#12977)
### Closes: #12921 

### What problem does this PR solve?

Previously, multi-file upload was not working correctly across the
application:

- **Chat**: UI displayed "Upload max 5 files" but only the first file
was actually uploaded
- **Agent conversational mode**: Frontend sent multiple files but
backend only processed one
- **Agent task-mode file inputs**: Explicitly limited to single file
only

This PR enables proper multi-file upload support for both chat and agent
workflows, allowing users to upload and process multiple files (up to 5)
as the UI originally suggested.

**Changes:**
- `web/src/pages/next-chats/hooks/use-upload-file.ts`: Process all files
instead of only `files[0]`
- `api/apps/canvas_app.py`: Handle multiple files via
`files.getlist("file")`
- `web/src/pages/agent/debug-content/uploader.tsx`: Allow up to 5 files
with `multiple={true}`
- `agent/component/begin.py` & `fillup.py`: Support file arrays while
maintaining backward compatibility

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-02-04 18:03:21 +08:00
ffdf19b27f Fix: Variables within multiple parentheses cannot be displayed correctly. #12987 (#12988)
### What problem does this PR solve?
Fix: Variables within multiple parentheses cannot be displayed
correctly. #12987

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-04 17:54:02 +08:00
a37d287fad Fix: pdf chunking / table rotation (#12981)
### What problem does this PR solve?

Fix: PDF chunking issue for single-page documents
Refactor: Change the default refresh frequency to 5
Fix: Add a 0-degree threshold; require other rotation angles to exceed
it by at least 0.2
Fix: Put connector name tips to correct place
Fix: incorrect example response in delete datasets.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2026-02-04 17:00:25 +08:00
0470fc59b1 Docs: Update error message example in HTTP API reference (#12984)
### What problem does this PR solve?

Changed the error message example in the HTTP API reference
documentation from a duplicate dataset name error to a validation error
about string length requirements. This update reflects the current
behavior of the API when validation fails.

### Type of change

- [x] Documentation Update
2026-02-04 15:42:53 +08:00
6f31c5fed2 feat/add MySQL and PostgreSQL data source connectors (#12817)
### What problem does this PR solve?

This PR adds MySQL and PostgreSQL as data source connectors, allowing
users to import data directly from relational databases into RAGFlow for
RAG workflows.

Many users store their knowledge in databases (product catalogs,
documentation, FAQs, etc.) and currently have no way to sync this data
into RAGFlow without exporting to files first. This feature lets them
connect directly to their databases, run SQL queries, and automatically
create documents from the results.

Closes #763
Closes #11560

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

### What this PR does

**New capabilities:**
- Connect to MySQL and PostgreSQL databases
- Run custom SQL queries to extract data
- Map database columns to document content (vectorized) and metadata
(searchable)
- Support incremental sync using a timestamp column
- Full frontend UI with connection form and tooltips

**Files changed:**

Backend:
- `common/constants.py` - Added MYSQL/POSTGRESQL to FileSource enum
- `common/data_source/config.py` - Added to DocumentSource enum
- `common/data_source/rdbms_connector.py` - New connector (368 lines)
- `common/data_source/__init__.py` - Exported the connector
- `rag/svr/sync_data_source.py` - Added MySQL and PostgreSQL sync
classes
- `pyproject.toml` - Added mysql-connector-python dependency

Frontend:
- `web/src/pages/user-setting/data-source/constant/index.tsx` - Form
fields
- `web/src/locales/en.ts` - English translations
- `web/src/assets/svg/data-source/mysql.svg` - MySQL icon
- `web/src/assets/svg/data-source/postgresql.svg` - PostgreSQL icon

### Testing done

Tested with MySQL 8.0 and PostgreSQL 16:
- Connection validation works correctly
- Full sync imports all query results as documents
- Incremental sync only fetches rows updated since last sync
- Custom SQL queries filter data as expected
- Invalid credentials show clear error messages
- Lint checks pass (`ruff check` returns no errors)

---------

Co-authored-by: mkdev11 <YOUR_GITHUB_ID+MkDev11@users.noreply.github.com>
2026-02-04 10:14:32 +08:00
0ab02854d9 Refact: Updated UI tips (#12976)
### What problem does this PR solve?

Updated UI tips.

### Type of change

- [x] Refactoring
2026-02-04 09:48:59 +08:00
414e261eda Fix: If the agent debug sheet contains too much content, some of it will not be displayed. #12974 (#12975)
### What problem does this PR solve?

Fix: If the agent debug sheet contains too much content, some of it will
not be displayed. #12974

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-04 09:48:28 +08:00
7b230aadf4 chore(tests): move oceanbase peewee test under test/ and fix enum check (#12969)
### What problem does this PR solve?

This mistake was made by PR #12926 
This PR makes the OceanBase peewee unit test discoverable by the default
unit test runner/CI (by moving it under test/), so it’s included in the
unified unit test suite.
It also fixes `test_database_lock_enum_values` to correctly handle Enum
alias members (DatabaseLock uses the same value for MYSQL and
OCEANBASE).

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

### Screenshots
The original `test_oceanbase_peewee.py` was placed under tests/, which
isn’t included in the default unit test runner’s testpaths, so it wasn’t
picked up by the unit test suite. So we need to move it to correct path.
<img width="670" height="540" alt="image"
src="https://github.com/user-attachments/assets/69d39346-450f-46dc-8965-29c3d7b32bc9"
/>

When using old version in `test_oceanbase_peewee.py`:
```
    def test_database_lock_enum_values(self):
        """Test DatabaseLock enum has all expected values."""
        expected = {'MYSQL', 'OCEANBASE', 'POSTGRES'}
        actual = {e.name for e in DatabaseLock}
        assert expected.issubset(actual), f"Missing: {expected - actual}"
```
The old check iterated Enum members, so alias values were skipped and
only `MYSQL/POSTGRES` were seen, making OCEANBASE appear missing.

<img width="1998" height="931" alt="65e2837f23b7b298980a410c7d5c2f09"
src="https://github.com/user-attachments/assets/d8e98c5a-2cfa-4182-ae35-a3ef03554a27"
/>

and new version uses `DatabaseLock.__members__` and passes:
<img width="2024" height="1170" alt="1aa8c6facb28d24149270fe1bc4a9dd9"
src="https://github.com/user-attachments/assets/d8688936-ccac-4a39-a389-23dc6f0fe276"
/>
2026-02-03 17:28:53 +08:00
205ae769bb Fix "metadata table not exists" (#12949)
### What problem does this PR solve?

Fix "metadata table not exists" when updating a meta data.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-03 17:28:10 +08:00
ff7afcbe5f feat: add OceanBase memory store (#12955)
### What problem does this PR solve?

Add OceanBase memory store and extracting base class `OBConnectionBase`.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-03 16:46:17 +08:00
4d9a3a5739 fix(docs): fix #12963 , rename "total" field to "total_datasets" for clarity (#12967)
### What problem does this PR solve?

Update HTTP API reference to rename "total" field to

### Type of change

- [x] Bug Fix #12963

Co-authored-by: Yun.kou <yunkou@deepglint.com>
2026-02-03 15:40:17 +08:00
c3f71e9ef9 Fix:Incorrect ingestion pipeline template (#12961)
### What problem does this PR solve?

This PR fixes an incorrect variable reference in the Advanced Ingestion
Pipeline template, which causes a runtime failure in the Auto Keywords
stage.

When creating a pipeline using the `advanced ingestion pipeline`
template, the **Auto Keywords** stage fails with the following error:
Can't find variable: 'Splitter:NineTiesSin@chunks'

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: sunsui <suisun@trip.com>
2026-02-03 15:39:32 +08:00
32f9a87b2e Fix: default admin tenant (#12964)
### What problem does this PR solve?

Add tenant for default admin, and allow login to ragflow server as
default admin.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-03 15:37:36 +08:00
f11ca54e0e Fix: docx parser output consistent (#12965)
### What problem does this PR solve?

Fix: docx parser output consistent

> File "/home/bxy/ragflow/rag/flow/parser/parser.py", line 506, in _word
>     sections, tbls = docx_parser(name, binary=blob)
>     ^^^^^^^^^^^^^^
> ValueError: too many values to unpack (expected 2)
> 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-03 15:36:58 +08:00
deeae8dba4 feat(connector): add Seafile as data source (#12945)
### What problem does this PR solve?
This PR adds **Seafile** as a new data source connector for RAGFlow.

[Seafile](https://www.seafile.com/) is an open-source, self-hosted file
sync and share platform widely used by enterprises, universities, and
organizations that require data sovereignty and privacy. Users who store
documents in Seafile currently have no way to index and search their
content through RAGFlow.

This connector enables RAGFlow users to:
- Connect to self-hosted Seafile servers via API token
- Index documents from personal and shared libraries
- Support incremental polling for updated files
- Seamlessly integrate Seafile-stored documents into their RAG pipelines


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
### Changes included

- `SeaFileConnector` implementing `LoadConnector` and `PollConnector`
interfaces
- Support for API token
- Recursive file traversal across libraries
- Time-based filtering for incremental updates
- Seafile logo (sourced from Simple Icons, CC0)
- Connector configuration and registration

### Testing

- Tested against self-hosted Seafile Community Edition
- Verified authentication (token)
- Verified document ingestion from personal and shared libraries
- Verified incremental polling with time filters
2026-02-03 13:42:05 +08:00
25bb2e1616 Fix:Optimize metadata and optimize the empty state style of the agent page. (#12960)
### What problem does this PR solve?
Fix:Optimize metadata and optimize the empty state style of the agent
page.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-03 11:43:44 +08:00
fafaaa26c3 feat: memory status (#12959)
### What problem does this PR solve?

Add memory status indicator and detail message dialog 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-02-03 11:16:18 +08:00
ad06c042c4 Support operator constraints in semi-automatic metadata filtering (#12956)
### What problem does this PR solve?

#### Summary
This PR enhances the Semi-automatic metadata filtering mode by allowing
users to explicitly pre-define operators (e.g., contains, =, >, etc.)
for selected metadata keys. While the LLM still dynamically extracts the
filter value from the user's query, it is now strictly constrained to
use the operator specified in the UI configuration.

Using this feature is optional. By default the operator selection is set
to "automatic" resulting in the LLM choosing the operator (as
presently).

#### Rationale & Use Case
This enhancement was driven by a concrete challenge I encountered while
working with technical documentation.
In my specific use case, I was trying to filter for software versions
within a technical manual. In this dataset, a single document chunk
often applies to multiple software versions. These versions are stored
as a combined string within the metadata for each chunk.

When using the standard semi-automatic filter, the LLM would
inconsistently choose between the contains and equals operators. When it
chose equals, it would exclude every chunk that applied to more than one
version, even if the version I was searching for was clearly included in
that metadata string. This led to incomplete and frustrating retrieval
results.

By extending the semi-automatic filter to allow pre-defining the
operator for a specific key, I was able to force the use of contains for
the version field. This change immediately led to significantly improved
and more reliable results in my case.

I believe this functionality will be equally useful for others dealing
with "tagged" or multi-value metadata where the relationship between the
query and the field is known, but the specific value needs to remain
dynamic.

#### Key Changes
##### Backend & Core Logic
- `common/metadata_utils.py`: Updated apply_meta_data_filter to support
a mixed data structure for semi_auto (handling both legacy string arrays
and the new object-based format {"key": "...", "op": "..."}).
- `rag/prompts/generator.py`: Extended gen_meta_filter to accept and
pass operator constraints to the LLM.
- `rag/prompts/meta_filter.md`: Updated the system prompt to instruct
the LLM to strictly respect provided operator constraints.

##### Frontend
- `web/src/components/metadata-filter/metadata-semi-auto-fields.tsx`:
Enhanced the UI to include an operator dropdown for each selected
metadata key, utilizing existing operator constants.
- `web/src/components/metadata-filter/index.tsx`: Updated the validation
schema to accommodate the new state structure.

#### Test Plan
- Backward Compatibility: Verified that existing semi-auto filters
stored as simple strings still function correctly.
- Prompt Verification: Confirmed that constraints are correctly rendered
in the LLM system prompt when specified.
- Added unit tests as
`test/unit_test/common/test_apply_semi_auto_meta_data_filter.py`
 - Manual End-to-End:
- Configured a "Semi-automatic" filter for a "Version" key with the
"contains" operator.
   - Asked a version-specific query.
   - Result
   
<img width="1173" height="704" alt="Screenshot 2026-02-02 145359"
src="https://github.com/user-attachments/assets/510a6a61-a231-4dc2-a7fe-cdfc07219132"
/>




### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: Philipp Heyken Soares <philipp.heyken-soares@am.ai>
2026-02-03 11:11:34 +08:00
7cbe8b5b53 feat: Add a custom header to the SDK for chatting with the agent. (#12430)
### What problem does this PR solve?

As title.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Liu An <asiro@qq.com>
2026-02-03 11:01:18 +08:00
aa8d0a36f1 Update default Docling version to 2.71.0 to resolve table parsing issues (#12952)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-03 10:24:51 +08:00
6c9ca45b30 Refactor: improve close for presentation (#12957)
### What problem does this PR solve?

improve close for presentation

### Type of change

- [x] Refactoring
2026-02-03 10:24:27 +08:00
f028f74883 Fixed 12787 with syntax error in generated MySql json path expression (#12929)
### What problem does this PR solve?

Fixed 12787 with syntax error in generated MySql json path expression
https://github.com/infiniflow/ragflow/issues/12787 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

#### What was fixed:
- Changed line 237 in ob_conn.py from value_str = get_value_str(value)
if value else "" to value_str = get_value_str(value)
- This fixes the bug where falsy but valid values (0, False, "", [], {})
were being converted to empty strings, causing invalid SQL syntax
#### What was tested:
- Comprehensive unit tests covering all edge cases
- Regression tests specifically for the bug scenario

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2026-02-03 09:50:14 +08:00
59d7f3f456 Sandbox (#12951)
### What problem does this PR solve?

Proofread the Sandbox Specification document and moved it to a dedicated
folder outside of the original docs.

### Type of change


- [x] Documentation Update
2026-02-03 09:43:41 +08:00
7be3dacdaa Fix: custom delimeter in docx (#12946)
### What problem does this PR solve?

Fix: custom delimeter in docx

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-02-03 09:43:18 +08:00
2e5a18602b refactor: optimize agent list payload and improve multimodal detection logic (#12942)
## Description
This PR focuses on API performance optimization and refining the model
capability detection logic in the Agent/Canvas module.

### 1. Performance Optimization (Backend)
- **Changes**: Removed `cls.model.dsl` from query fields in
`UserCanvasService.get_by_tenant_ids`.
- **Reasoning**: The `dsl` object is large and unnecessary for the Agent
list view. Excluding it reduces the payload size of the
`/v1/canvas/list` API, leading to faster serialization and reduced
network latency.
- **Consistency**: Full DSL data remains accessible via the individual
`/v1/canvas/get/<id>` endpoint used in the detail view.

### 2. Multimodal Detection Refinement (Frontend)
- **Changes**: Replaced `model_type === LlmModelType.Image2text` with
`tags?.includes('IMAGE2TEXT')`.
- **Reasoning**: In RAGFlow, `model_type` defines the primary role of a
model (e.g., `chat`). However, many advanced Chat models are also
vision-capable. Since `model_type` is a single-value field, it cannot
represent these multiple capabilities.
- **Solution**: Utilizing the `tags` field (which supports multiple
attributes) to check for `IMAGE2TEXT` ensures that models like
`gpt-5.2-pro` correctly display multimodal input options.



## Type of Change
- [x] Bug fix (logic correction for multimodal detection)
- [x] Optimization (performance improvement for list API)

## Main Changes
- `api/db/services/canvas_service.py`: Optimized DB query by excluding
heavy DSL fields.
- `web/src/pages/agent/form/agent-form/index.tsx`: Enhanced capability
detection using the tags system.

## Verification
- [x] Verified Agent list loads faster with reduced response payload.
- [x] Confirmed that `chat` models with the `IMAGE2TEXT` tag now
correctly enable the multimodal input UI.
2026-02-02 17:35:54 +08:00