Commit Graph

4763 Commits

Author SHA1 Message Date
aceca266ff Feat: Images appearing consecutively in the dialogue are merged and displayed in a carousel. #10427 (#12051)
### What problem does this PR solve?

Feat: Images appearing consecutively in the dialogue are merged and
displayed in a carousel. #10427
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-19 19:13:18 +08:00
d82e502a71 Add AI Badgr as OpenAI-compatible chat model provider (#12018)
## What problem does this PR solve?

Adds AI Badgr as an optional LLM provider in RAGFlow. Users can use AI
Badgr for chat completions and embeddings via its OpenAI-compatible API.

**Background:**
- AI Badgr provides OpenAI-compatible endpoints (`/v1/chat/completions`,
`/v1/embeddings`, `/v1/models`)
- Previously, RAGFlow didn't support AI Badgr
- This PR adds support following the existing provider pattern (e.g.,
CometAPI, DeerAPI)

**Implementation details:**
- Added AI Badgr to the provider registry and configuration
- Supports chat completions (via LiteLLMBase) and embeddings (via
AIBadgrEmbed)
- Uses standard API key authentication
- Base URL: `https://aibadgr.com/api/v1`
- Environment variables: `AIBADGR_API_KEY`, `AIBADGR_BASE_URL`
(optional)

## Type of change

- [x] New Feature (non-breaking change which adds functionality)

This is a new feature that adds support for a new provider without
changing existing functionality.

---------

Co-authored-by: michaelmanley <55236695+michaelbrinkworth@users.noreply.github.com>
2025-12-19 17:45:20 +08:00
0494b92371 Feat: Display error messages from intermediate nodes. #10427 (#12038)
### What problem does this PR solve?

Feat: Display error messages from intermediate nodes. #10427

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-19 17:44:45 +08:00
8683a5b1b7 Docs: How to call MinerU as a remote service (#12004)
### Type of change

- [x] Documentation Update
2025-12-19 17:06:32 +08:00
4cbe470089 Feat: Display error messages from intermediate nodes of the webhook. #10427 (#11954)
### What problem does this PR solve?

Feat: Remove HMAC from the webhook #10427

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-19 12:56:56 +08:00
6cd1824a77 Feat: chats completions API supports metadata filtering (#12023)
### What problem does this PR solve?

Chats completions API supports metadata filtering.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-19 11:36:35 +08:00
2844700dc4 Refa: better UX for adding OCR model (#12034)
### What problem does this PR solve?

Better UX for adding OCR model.

### Type of change

- [x] Refactoring
2025-12-19 11:34:21 +08:00
f8fd1ea7e1 Feat: Further update Bedrock model configs (#12029)
### What problem does this PR solve?

Feat: Further update Bedrock model configs #12020 #12008

<img width="700" alt="2b4f0f7fab803a2a2d5f345c756a2c69"
src="https://github.com/user-attachments/assets/e1b9eaad-5c60-47bd-a6f4-88a104ce0c63"
/>
<img width="700" alt="afe88ec3c58f745f85c5c507b040c250"
src="https://github.com/user-attachments/assets/9de39745-395d-4145-930b-96eb452ad6ef"
/>
<img width="700" alt="1a21bb2b7cd8003dce1e5207f27efc69"
src="https://github.com/user-attachments/assets/ddba1682-6654-4954-aa71-41b8ebc04ac0"
/>

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-19 11:32:20 +08:00
57edc215d7 Feat:update webhook component (#11739)
### What problem does this PR solve?
issue:
https://github.com/infiniflow/ragflow/issues/10427

https://github.com/infiniflow/ragflow/issues/8115

change:

- Support for Multiple HTTP Methods (POST / GET / PUT / PATCH / DELETE /
HEAD)
- Security Validation
  1. max_body_size
  2. IP whitelist
  3. rate limit
  4. token / basic / jwt authentication
- File Upload Support
- Unified Content-Type Handling
- Full Schema-Based Extraction & Type Validation
- Two Execution Modes: Immediately / Streaming


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-18 19:34:39 +08:00
7a4044b05f Feat: use filepath for files with the same name for all data source types (#11819)
### What problem does this PR solve?

When there are multiple files with the same name the file would just
duplicate, making it hard to distinguish between the different files.
Now if there are multiple files with the same name, they will be named
after their folder path in the storage unit.

This was done for the webdav connector and with this PR also for Notion,
Confluence and S3 Storage.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Contribution by RAGcon GmbH, visit us [here](https://www.ragcon.ai/)
2025-12-18 17:42:43 +08:00
e84d5412bc Feat: bedrock iam authentication (#12020)
### What problem does this PR solve?

Feat: bedrock iam authentication #12008 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-18 17:13:09 +08:00
151480dc85 Feat: trace information can be returned by the agent completion API (#12019)
### What problem does this PR solve?

Trace information can be returned by the agent completion API.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-18 15:52:11 +08:00
2331b3a270 Refact: Update loggings (#12014)
### What problem does this PR solve?

Refact: Update loggings

### Type of change

- [x] Refactoring
2025-12-18 14:18:03 +08:00
5cd1a678c8 Fix: image edit in edit_chunk (#12009)
### What problem does this PR solve?

Fix: image edit in edit_chunk #11971

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-18 11:35:01 +08:00
cc9546b761 Fix IDE warnings (#12010)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-18 11:27:02 +08:00
a63dcfed6f Refactor: improve cohere calculate total counts (#12007)
### What problem does this PR solve?

improve cohere calculate total counts

### Type of change


- [x] Refactoring
2025-12-18 10:04:28 +08:00
4dd8cdc38b task executor issues (#12006)
### What problem does this PR solve?

**Fixes #8706** - `InfinityException: TOO_MANY_CONNECTIONS` when running
multiple task executor workers

### Problem Description

When running RAGFlow with 8-16 task executor workers, most workers fail
to start properly. Checking logs revealed that workers were
stuck/hanging during Infinity connection initialization - only 1-2
workers would successfully register in Redis while the rest remained
blocked.

### Root Cause

The Infinity SDK `ConnectionPool` pre-allocates all connections in
`__init__`. With the default `max_size=32` and multiple workers (e.g.,
16), this creates 16×32=512 connections immediately on startup,
exceeding Infinity's default 128 connection limit. Workers hang while
waiting for connections that can never be established.

### Changes

1. **Prevent Infinity connection storm** (`rag/utils/infinity_conn.py`,
`rag/svr/task_executor.py`)
- Reduced ConnectionPool `max_size` from 32 to 4 (sufficient since
operations are synchronous)
- Added staggered startup delay (2s per worker) to spread connection
initialization

2. **Handle None children_delimiter** (`rag/app/naive.py`)
   - Use `or ""` to handle explicitly set None values from parser config

3. **MinerU parser robustness** (`deepdoc/parser/mineru_parser.py`)
   - Use `.get()` for optional output fields that may be missing
- Fix DISCARDED block handling: change `pass` to `continue` to skip
discarded blocks entirely

### Why `max_size=4` is sufficient

| Workers | Pool Size | Total Connections | Infinity Limit |
|---------|-----------|-------------------|----------------|
| 16      | 32        | 512               | 128          |
| 16      | 4         | 64                | 128          |
| 32      | 4         | 128               | 128          |

- All RAGFlow operations are synchronous: `get_conn()` → operation →
`release_conn()`
- No parallel `docStoreConn` operations in the codebase
- Maximum 1-2 concurrent connections needed per worker; 4 provides
safety margin

### MinerU DISCARDED block bug

When MinerU returns blocks with `type: "discarded"` (headers, footers,
watermarks, page numbers, artifacts), the previous code used `pass`
which left the `section` variable undefined, causing:

- **UnboundLocalError** if DISCARDED is the first block
- **Duplicate content** if DISCARDED follows another block (stale value
from previous iteration)

**Root cause confirmed via MinerU source code:**

From
[`mineru/utils/enum_class.py`](https://github.com/opendatalab/MinerU/blob/main/mineru/utils/enum_class.py#L14):
```python
class BlockType:
    DISCARDED = 'discarded'
    # VLM 2.5+ also has: HEADER, FOOTER, PAGE_NUMBER, ASIDE_TEXT, PAGE_FOOTNOTE
```

Per [MinerU
documentation](https://opendatalab.github.io/MinerU/reference/output_files/),
discarded blocks contain content that should be filtered out for clean
text extraction.

**Fix:** Changed `pass` to `continue` to skip discarded blocks entirely.

### Testing

- Verified all 16 workers now register successfully in Redis
- All workers heartbeating correctly
- Document parsing works as expected
- MinerU parsing with DISCARDED blocks no longer crashes

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: user210 <user210@rt>
2025-12-18 10:03:30 +08:00
1a4822d6be Refactor: Improve the timestamp consistency (#11942)
### What problem does this PR solve?

Improve the timestamp consistency 

### Type of change
- [x] Refactoring
2025-12-18 09:40:33 +08:00
ce161f09cc feat: add image uploader in edit chunk dialog (#12003)
### What problem does this PR solve?

Add image uploader in edit chunk dialog for replacing image chunk

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-18 09:33:52 +08:00
672958a192 Fix: model not authorized (#12001)
### What problem does this PR solve?

Fix model not authorized. #11973.


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-17 19:48:24 +08:00
3820de916c Fix: duplicated PDF parser (#12000)
### What problem does this PR solve?

Fix duplicated PDF parser.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-17 19:48:10 +08:00
ef44979b5c Fix table format warning in Markdown file (#12002)
### What problem does this PR solve?

As title

### Type of change

- [x] Documentation Update
- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-17 19:27:47 +08:00
d38f8a1562 Add license and Fix IDE warnings (#11985)
### What problem does this PR solve?

- Add license
- Fix IDE warnings

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-17 17:04:44 +08:00
8e4d011b15 Fix: parent-children chunking method. (#11997)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2025-12-17 16:50:36 +08:00
7baa67dfe8 Feat: Reject default admin account log in to normal services (#11994)
### What problem does this PR solve?

Feat: Reject default admin account log in to normal services
#11854
#11673

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-17 16:29:20 +08:00
e58271ef76 feat: add toc option in transformer node in ingestion pipeline (#11992)
### What problem does this PR solve?

Add TOC (Table of contents) option in Ingestion Pipeline canvas >
Transformer node

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-17 15:51:55 +08:00
4fd4a41e7c Fix: add multimodel models in chat api (#11986)
…tant, but model is available via UI

Fix: add multimodel models in chat api
Fixes #8549

### What problem does this PR solve?

Add a parameter model_type in chat api.


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
2025-12-17 15:46:43 +08:00
82d4e5fb87 Ref: update loggings (#11987)
### What problem does this PR solve?

Ref: update loggins

### Type of change

- [x] Refactoring
2025-12-17 15:43:25 +08:00
d16643a53d Fix: Fixed the issue of empty memory parameters (#11988)
### What problem does this PR solve?

Fix: Fixed the issue of empty memory parameters

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-17 15:42:29 +08:00
93ca1e0b91 Fix: update document api sample reponse is out of date. (#11989)
### What problem does this PR solve?

Fix: update document api sample reponse is out of date.

### Type of change

- [x] Documentation Update
2025-12-17 15:39:12 +08:00
4046bffaf1 fix: unable to save ingestion pipeline config without modifying children delimiter (#11991)
…ildren delimiter

### What problem does this PR solve?

Fix the issue of unable to save **Files > Ingestion Pipeline (Modal)**
config without modifying children delimiter

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-17 15:37:28 +08:00
03f9be7cbb Refa: only support MinerU-API now (#11977)
### What problem does this PR solve?

Only support MinerU-API now, still need to complete frontend for
pipeline to allow the configuration of MinerU options.

### Type of change

- [x] Refactoring
2025-12-17 12:58:48 +08:00
5e05f43c3d Update default prompt (#11984)
### What problem does this PR solve?

New default prompt:

```
You are an intelligent assistant. Your primary function is to answer questions based strictly on the provided knowledge base.

**Essential Rules:**
- Your answer must be derived **solely** from this knowledge base: `{knowledge}`.
- **When information is available**: Summarize the content to give a detailed answer.
- **When information is unavailable**: Your response must contain this exact sentence: "The answer you are looking for is not found in the knowledge base!"
- **Always consider** the entire conversation history.
```

Also fix some grammar errors.

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-17 12:57:24 +08:00
205a6483f5 Feature:memory function complete (#11982)
### What problem does this PR solve?

memory function complete

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-17 12:35:26 +08:00
2595644dfd feat: add ingestion pipeline children delimiters configs (#11979)
### What problem does this PR solve?

Add children delimiters for Ingestion pipeline config

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-17 11:18:54 +08:00
30019dab9f Change knowledge base to dataset (#11976)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-17 10:03:33 +08:00
4d46726eb7 Docs: Updated executor manager prerequisites (#11978)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-12-16 19:59:56 +08:00
0e8b9588ba Fix error and format issue (#11975)
### What problem does this PR solve?

1. Fix error of book chunking.
2. Fix format issues.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-16 19:29:37 +08:00
344a106eba Feat: Enable image edit in edit_chunk (#11971)
### What problem does this PR solve?

Feat: Enable image edit in edit_chunk

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-16 17:57:00 +08:00
bccad7b4a8 Docs: Migrate to single bucket mode (contributed by community) (#11972)
### What problem does this PR solve?

Update

### Type of change

- [x] Documentation Update
2025-12-16 16:31:47 +08:00
f7926724aa Fix security issue (#11965)
### What problem does this PR solve?

- CVE-2024-6866
- CVE-2024-6844
- CVE-2024-6839

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-16 13:31:45 +08:00
5bba562048 Feature/excel export fix (#11914)
### PR details 
feat: Add Excel export support and fix variable reference regex
Changes:
- Add Excel export output format option to Message component
- Apply nest_asyncio patch to handle nested event loops
- Fix async generator iteration in canvas_app.py debug endpoint
- Add underscore support in variable reference regex pattern


### What problem does this PR solve?



### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Shivam Johri <shivamjohri@Shivams-MacBook-Air.local>
2025-12-16 13:15:52 +08:00
49c74d08e8 Feature/mineru improvements (#11938)
我已在下面的评论中用中文重复说明。

### What problem does this PR solve?

## Summary
This PR enhances the MinerU document parser with additional
configuration options, giving users more control over PDF parsing
behavior and improving support for multilingual documents.

## Changes

### Backend (`deepdoc/parser/mineru_parser.py`)
- Added configurable parsing options:
- **Parse Method**: `auto`, `txt`, or `ocr` — allows users to choose the
extraction strategy
- **Formula Recognition**: Toggle for enabling/disabling formula
extraction (useful to disable for Cyrillic documents where it may cause
issues)
- **Table Recognition**: Toggle for enabling/disabling table extraction
- Added language code mapping (`LANGUAGE_TO_MINERU_MAP`) to translate
RAGFlow language settings to MinerU-compatible language codes for better
OCR accuracy
- Improved parser configuration handling to pass these options through
the processing pipeline

### Frontend (`web/`)
- Created new `MinerUOptionsFormField` component that conditionally
renders when MinerU is selected as the layout recognition engine
- Added UI controls for:
  - Parse method selection (dropdown)
  - Formula recognition toggle (switch)
  - Table recognition toggle (switch)
- Added i18n translations for English and Chinese
- Integrated the options into both the dataset creation dialog and
dataset settings page

### Integration
- Updated `rag/app/naive.py` to forward MinerU options to the parser
- Updated task service to handle the new configuration parameters

## Why
MinerU is a powerful document parser, but the default settings don't
work well for all document types. This PR allows users to:
1. Choose the best parsing method for their documents
2. Disable formula recognition for Cyrillic/non-Latin scripts where it
causes issues
3. Control table extraction based on document needs
4. Benefit from automatic language detection for better OCR results

## Testing
- [x] Tested MinerU parsing with different parse methods
- [x] Verified UI renders correctly when MinerU is selected/deselected
- [x] Confirmed settings persist correctly in dataset configuration

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: user210 <user210@rt>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-16 13:15:25 +08:00
1112b6291b Remove unused py module dependencies (#11964)
### What problem does this PR solve?

```
pip list --not-required
Package                      Version
---------------------------- ---------------------
aiosmtplib                   5.0.0
akshare                      1.17.94
anthropic                    0.34.1
arxiv                        2.1.3
Aspose.Slides                24.7.0
atlassian-python-api         4.0.7
azure-identity               1.17.1
azure-storage-file-datalake  12.16.0
bio                          1.7.1
boxsdk                       10.2.0
captcha                      0.7.1
cn2an                        0.5.22
cohere                       5.6.2
Crawl4AI                     0.4.247
dashscope                    1.20.11
deepl                        1.18.0
demjson3                     3.0.6
discord.py                   2.3.2
dropbox                      12.0.2
duckduckgo_search            7.5.5
editdistance                 0.8.1
elasticsearch-dsl            8.12.0
exceptiongroup               1.3.1
extract-msg                  0.55.0
ffmpeg-python                0.2.0
flasgger                     0.9.7.1
Flask-Cors                   5.0.0
Flask-Login                  0.6.3
Flask-Mail                   0.10.0
Flask-Session                0.8.0
google-auth-oauthlib         1.2.3
google-genai                 1.55.0
google-generativeai          0.8.5
google_search_results        2.4.2
graspologic                  0.1.dev847+g38e680cab
groq                         0.9.0
grpcio-status                1.67.1
html_text                    0.6.2
imageio-ffmpeg               0.6.0
infinity_emb                 0.0.66
infinity-sdk                 0.6.11
jira                         3.10.5
json_repair                  0.35.0
langfuse                     3.10.5
mammoth                      1.11.0
Markdown                     3.6
markdown_to_json             2.1.1
markdownify                  1.2.2
mcp                          1.19.0
mini-racer                   0.12.4
minio                        7.2.4
mistralai                    0.4.2
moodlepy                     0.24.1
mypy-boto3-s3                1.40.26
Office365-REST-Python-Client 2.6.2
ollama                       0.6.1
onnxruntime-gpu              1.23.2
opencv-python                4.10.0.84
opencv-python-headless       4.10.0.84
opendal                      0.45.20
opensearch-py                2.7.1
ormsgpack                    1.5.0
pdfplumber                   0.10.4
pip                          25.3
pluginlib                    0.9.4
psycopg2-binary              2.9.11
pyclipper                    1.4.0
pycryptodomex                3.20.0
pyobvector                   0.2.18
pyodbc                       5.3.0
pypandoc                     1.16.2
pypdf                        6.4.0
PyPDF2                       3.0.1
python-calamine              0.6.1
python-docx                  1.2.0
python-pptx                  1.0.2
pywencai                     0.13.1
qianfan                      0.4.6
quart-auth                   0.11.0
quart-cors                   0.8.0
ranx                         0.3.20
readability-lxml             0.8.4.1
replicate                    0.31.0
reportlab                    4.4.6
roman-numbers                1.0.2
ruamel.base                  1.0.0
ruamel.yaml                  0.18.16
scholarly                    1.7.11
selenium-wire                5.1.0
slack_sdk                    3.37.0
socksio                      1.0.0
sqlglotrs                    0.9.0
StrEnum                      0.4.15
tavily-python                0.5.1
tencentcloud-sdk-python      3.0.1478
tika                         2.6.0
valkey                       6.0.2
vertexai                     1.70.0
volcengine                   1.0.194
voyageai                     0.2.3
webdav4                      0.10.0
webdriver-manager            4.0.1
wikipedia                    1.4.0
word2number                  1.1
xgboost                      1.6.0
xpinyin                      0.7.6
yfinance                     0.2.65
zhipuai                      2.0.1

```

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-16 12:40:03 +08:00
ef5d1d4b74 Fix: 'AzureEmbed' object has no attribute 'total_token_count_from_response' (#11962)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/11956

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-16 11:29:07 +08:00
a98887d4ca Fix: Bug fixes (#11960)
### What problem does this PR solve?

Fix: Bug fixes

New search popup style modification
Fixed multilingual settings not updating immediately on personal center
page
Changed overlapped percent to percentage format, with maximum value of
30%
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-16 09:44:06 +08:00
7ca3e11566 Update dataset config and retrieval testing (#11958)
### What problem does this PR solve?

1. Refactor the order of the dataset config items.
2. Refactor the text of retrieval test.
3. Refactor typos

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-15 19:56:28 +08:00
a2e080c2d3 feat: display name instead of key in user fillup form submission (#11931)
### What problem does this PR solve?

- Change the message format from 'key: value' to 'name: value' when user
submits the fillup form in agent chat.
- This resolves #11865
- I think this change makes sense, better aligning the form and the
replied message.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-15 19:12:01 +08:00
ad6f7fd4b0 Fix: pipeline ignore MinerU backend config and vllm module is missing (#11955)
### What problem does this PR solve?

Fix pipeline ignore MinerU backend config and vllm module is missing.
#11944, #11947.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-15 18:03:34 +08:00
2a0f835ffe Refactor: Improve the logic to calculate embedding total token count (#11943)
### What problem does this PR solve?

 Improve the logic to calculate embedding total token count 

### Type of change

- [x] Refactoring
2025-12-15 11:33:57 +08:00