Commit Graph

3738 Commits

Author SHA1 Message Date
b1baa91ff0 feat(next-search): Implements document preview functionality #3221 (#9465)
### What problem does this PR solve?

feat(next-search): Implements document preview functionality

- Adds a new document preview modal component
- Implements document preview page logic
- Adds document preview-related hooks
- Optimizes document preview rendering logic
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-14 12:11:53 +08:00
b55c3d07dc Test: Update error message assertions for chunk update tests (#9468)
### What problem does this PR solve?

Modify test cases to accept additional error message format when
updating chunks.
fix actions:
https://github.com/infiniflow/ragflow/actions/runs/16942741621/job/48015850297

### Type of change

- [x] Update test cases
2025-08-14 12:11:20 +08:00
2b3318cd3d Fix: KB folder may not there while creating virtual file (#9431)
### What problem does this PR solve?

KB folder may not there while creating virtual file. #9423 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-14 09:40:30 +08:00
434b55be70 Feat: Display a separate chat multi-model comparison page #3221 (#9461)
### What problem does this PR solve?
Feat: Display a separate chat multi-model comparison page #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-14 09:39:20 +08:00
98b4c67292 Trival. (#9460)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-14 09:39:00 +08:00
3d645ff31a Docs: Update HTTP API reference with simplified response format and parameters (#9454)
### What problem does this PR solve?

- Make `session_id` optional and add `inputs` parameter
- Remove deprecated `sync_dsl` parameter
- Update request/response examples to match current API behavior

### Type of change

- [x] Documentation Update
2025-08-13 21:02:54 +08:00
5e8cd693a5 Refa: split services about llm. (#9450)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2025-08-13 16:41:01 +08:00
29f297b850 Fix: update broken create agent session due to v0.20.0 changes (#9445)
### What problem does this PR solve?

 Update broken create agent session due to v0.20.0 changes. #9383


**NOTE: A session ID is no longer required to interact with the agent.**

See: #9241, #9309.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-13 16:01:54 +08:00
7235638607 Feat: Show multiple chat boxes #3221 (#9443)
### What problem does this PR solve?

Feat: Show multiple chat boxes #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-13 15:59:51 +08:00
00919fd599 Fix typo in issue template (#9444) 2025-08-13 14:27:15 +08:00
43c0792ffd Add issue template for agent scenario feature request (#9437) 2025-08-13 12:50:06 +08:00
4b1b68c5fc Fix: no doc hits after meta data filter. (#9435)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-13 12:43:31 +08:00
3492f54c7a Docs: Update HTTP API reference with new response fields (#9434)
### What problem does this PR solve?

Add `url`, `doc_type`, and `created_at` fields to the API response
example in the documentation.

### Type of change

- [x] Documentation Update
2025-08-13 12:18:39 +08:00
da5cef0686 Refactor:Improve the float compare for LocalAIRerank (#9428)
### What problem does this PR solve?
Improve the float compare for LocalAIRerank

### Type of change

- [x] Refactoring
2025-08-13 10:26:42 +08:00
9098efb8aa Feat: Fixed the issue where some fields in the chat configuration could not be displayed #3221 (#9430)
### What problem does this PR solve?

Feat: Fixed the issue where some fields in the chat configuration could
not be displayed #3221
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-13 10:26:26 +08:00
421657f64b Feat: allows setting multiple types of default models in service config (#9404)
### What problem does this PR solve?

Allows set multiple types of default models in service config.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-13 09:46:05 +08:00
7ee5e0d152 Fix KeyError in session listing endpoint when accessing conversation reference (#9419)
- Add type and boundary checks for conv["reference"] access
- Prevent KeyError: 0 when reference list is empty or malformed
- Ensure reference is list type before indexing
- Handle cases where reference items are None or missing chunks
- Maintains backward compatibility with existing data structures

This resolves crashes in /api/v1/agents/<agent_id>/sessions endpoint
when conversation reference data is not properly structured.

### What problem does this PR solve?

This PR fixes a critical `KeyError: 0` that occurs in the
`/api/v1/agents/<agent_id>/sessions` endpoint when the system attempts
to access conversation reference data that is not properly structured.

**Background Context:**
The `list_agent_session` method in `api/apps/sdk/session.py` assumes
that `conv["reference"]` is always a properly indexed list with valid
dictionary structures. However, in real-world scenarios, this data can
be:
- Not a list type (could be None, string, or other types)
- An empty list when `chunk_num` tries to access index 0
- Contains None values or malformed dictionary structures
- Missing expected "chunks" keys in reference items

**Impact Before Fix:**
When malformed reference data is encountered, the API crashes with:
```json
{
    "code": 100,
    "data": null,
    "message": "KeyError(0)"
}
```
**Solution:**
Added comprehensive safety checks including type validation, boundary
checking, null safety, and structure validation to ensure the API
gracefully handles all reference data formats while maintaining backward
compatibility.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-08-13 09:23:52 +08:00
22915223d4 Fix: citation issue. (#9424)
### What problem does this PR solve?

#8474

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-12 18:53:34 +08:00
d7b4e84cda Refa: Update LLM stream response type to Generator (#9420)
### What problem does this PR solve?

Change return type of _generate_streamly from str to Generator[str,
None, None] to properly type hint streaming responses.

### Type of change

- [x] Refactoring
2025-08-12 18:05:52 +08:00
e845d5f9f8 Fix:valueERROR when file is optional but not exist value (#9414)
### What problem does this PR solve?

when begin component has optional file but not exist , it rase error

### Type of change

- [x] Bug Fix

Co-authored-by: Popmio <zhengyihao036@gamil.com>
2025-08-12 17:39:03 +08:00
3d18284dd6 Feat: Added meta data to the chat configuration page #8531 (#9417)
### What problem does this PR solve?

Feat: Added meta data to the chat configuration page #8531

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-12 16:19:23 +08:00
96783aa82c Fix: remove doc error. (#9413)
### What problem does this PR solve?

Close #9407

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-12 15:55:04 +08:00
a0c2da1219 Fix: Patch LiteLLM (#9416)
### What problem does this PR solve?

Patch LiteLLM refactor. #9408

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-12 15:54:30 +08:00
79e2edc835 Fix "File contains no valid workbook part" (#9360)
### What problem does this PR solve?

fix "File contains no valid workbook part"

stacktrace:
```
Traceback (most recent call last):
  File "/ragflow/deepdoc/parser/excel_parser.py", line 54, in _load_excel_to_workbook
    return RAGFlowExcelParser._dataframe_to_workbook(df)
  File "/ragflow/deepdoc/parser/excel_parser.py", line 69, in _dataframe_to_workbook
    ws.cell(row=row_num, column=col_num, value=value)
  File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/worksheet/worksheet.py", line 246, in cell
    cell.value = value
  File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/cell/cell.py", line 218, in value
    self._bind_value(value)
  File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/cell/cell.py", line 197, in _bind_value
    value = self.check_string(value)
  File "/ragflow/.venv/lib/python3.10/site-packages/openpyxl/cell/cell.py", line 165, in check_string
    raise IllegalCharacterError(f"{value} cannot be used in worksheets.")
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-08-12 14:58:36 +08:00
57b87fa9d9 Fix:TypeError: OllamaCV.chat() got an unexpected keyword argument 'stop' (#9363)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9351
Support filter argument before invoking
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-12 14:55:27 +08:00
153e430b00 Feat: add meta data filter. (#9405)
### What problem does this PR solve?

#8531 
#7417 
#6761 
#6573
#6477

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-12 14:12:56 +08:00
3ccaa06031 Fix: Before executing the SQL, remove tags in the format [ID: number] to avoid execution errors. (#9326)
### What problem does this PR solve?

Before executing the SQL, remove tags in the format [ID: number] to
avoid execution errors.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: wangyazhou <wangyazhou@sdibd.cn>
2025-08-12 12:42:28 +08:00
569ab011c4 Add fallback to use 'calamine' parse engine in excel_parser.py (#9374)
### What problem does this PR solve?

add fallback to `calamine` engine when parse error raised using the
default `openpyxl` / `xlrd` engine.
e.g. the following error can be fixed:
```
Traceback (most recent call last):
  File "/ragflow/deepdoc/parser/excel_parser.py", line 53, in _load_excel_to_workbook
    df = pd.read_excel(file_like_object)
  File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 495, in read_excel
    io = ExcelFile(
  File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 1567, in __init__
    self._reader = self._engines[engine](
  File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_xlrd.py", line 46, in __init__
    super().__init__(
  File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 573, in __init__
    self.book = self.load_workbook(self.handles.handle, engine_kwargs)
  File "/ragflow/.venv/lib/python3.10/site-packages/pandas/io/excel/_xlrd.py", line 63, in load_workbook
    return open_workbook(file_contents=data, **engine_kwargs)
  File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/__init__.py", line 172, in open_workbook
    bk = open_workbook_xls(
  File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/book.py", line 68, in open_workbook_xls
    bk.biff2_8_load(
  File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/book.py", line 641, in biff2_8_load
    cd.locate_named_stream(UNICODE_LITERAL(qname))
  File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/compdoc.py", line 398, in locate_named_stream
    result = self._locate_stream(
  File "/ragflow/.venv/lib/python3.10/site-packages/xlrd/compdoc.py", line 429, in _locate_stream
    raise CompDocError("%s corruption: seen[%d] == %d" % (qname, s, self.seen[s]))
xlrd.compdoc.CompDocError: Workbook corruption: seen[2] == 4
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-12 12:41:33 +08:00
96b1538b3e Fix:HTTP request component failed to retrieve the corresponding value (#9399)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/9385
Based on my understanding, I think checking empty string is fine

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-08-12 12:27:22 +08:00
735570486f feat(next-search): Added AI summary functionality #3221 (#9402)
### What problem does this PR solve?

feat(next-search): Added AI summary functionality #3221

- Added the LlmSettingFieldItems component for AI summary settings
- Updated the SearchSetting component to integrate AI summary
functionality
- Added the updateSearch hook and related service methods
- Modified the ISearchAppDetailProps interface to add the llm_setting
field

### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-08-12 12:27:00 +08:00
da68f541b6 Feat: add full list of supported AWS Bedrock regions (#9395)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-12 11:01:16 +08:00
83771e500c Refa: migrate chat models to LiteLLM (#9394)
### What problem does this PR solve?

All models pass the mock response tests, which means that if a model can
return the correct response, everything should work as expected.
However, not all models have been fully tested in a real environment,
the real API_KEY. I suggest actively monitoring the refactored models
over the coming period to ensure they work correctly and fixing them
step by step, or waiting to merge until most have been tested in
practical environment.

### Type of change

- [x] Refactoring
2025-08-12 10:59:20 +08:00
a6d2119498 Refa: list canvas (#9341)
### What problem does this PR solve?

Refactor list canvas.

### Type of change

- [x] Refactoring
2025-08-12 10:58:06 +08:00
57b9f8cf52 Fix: Update test assertions and simplify test cases (#9400)
### What problem does this PR solve?

- Fix error message assertion in test_update_chunk.py to match new
ownership validation
- Simplify dataset listing test cases by removing lambda assertions for
sorting
- Fix actions:
https://github.com/infiniflow/ragflow/actions/runs/16885465524/job/47831942553

### Type of change

- [x] Fix test cases
2025-08-12 10:57:30 +08:00
5c3577c4c9 Python SDK: add meta_fields to Document class (#9387)
### What problem does this PR solve?

Python class Document was missing "meta_fields", e.g. when querying, the
document instances came without meta_fields

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-12 10:16:12 +08:00
76118000c1 Feat: Allow chat to use meta data #3221 (#9393)
### What problem does this PR solve?

Feat:  Allow chat to use meta data #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-12 10:15:10 +08:00
9433f64fe2 Feat: added functionality to choose all datasets if no id is provided (#9184)
### What problem does this PR solve?

Using the mcp server in n8n sometimes (with smaller models) results in
errors because the llm misses a char or adds one to the list of
dataset_ids provided. It first asks for the list of datasets and if you
got a larger list of them it makes a error recalling the list
completely. So adding the feature to just search through all available
datasets solves this and makes the retrieval of data more stable. The
functionality to just call special datasets by id is not changed, the
dataset_ids are now not required anymore (only the "question" is). You
can provide (like before) a list of datasets, a empty list or no list at
all.

### Type of change

- [X] New Feature (non-breaking change which adds functionality)
<img width="1897" height="880" alt="mcp error dataset id"
src="https://github.com/user-attachments/assets/71076d24-f875-4663-a69a-60839fc7a545"
/>
2025-08-11 17:20:35 +08:00
d7c9611d45 docs(sandbox): update /etc/hosts entry to include required services (#9144)
Fixes an issue where running the sandbox (code component) fails due to
unresolved hostnames. Added missing service names (es01, infinity,
mysql, minio, redis) to 127.0.0.1 in the /etc/hosts example.

Reference: https://github.com/infiniflow/ragflow/issues/8226

## What this PR does

Updates the sandbox quickstart documentation to fix a known issue where
the sandbox fails to resolve required service hostnames.

## Why

Following the original instruction leads to a `Failed to resolve 'none'`
error, as discussed in issue #8226. Adding the missing service names to
`127.0.0.1` resolves the problem.

## Related issue

https://github.com/infiniflow/ragflow/issues/8226

## Note

It might be better to add `127.0.0.1 es01 infinity mysql minio redis` to
docs/quickstart.mdx, but since no issues appeared at the time without
adding this line—and the problem occurred while working with the code
component—I added it here.

### Type of change

- [X] Documentation Update
2025-08-11 17:18:56 +08:00
79399f7f25 Support the case of one cell split by multiple columns. (#9225)
### What problem does this PR solve?
Support the case of one cell split by multiple columns. Besides, the
codes are compatible with the common cell case.
#8606 can be fixed.
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

I provide a case of one cell split by multiple columns:

[test.xlsx](https://github.com/user-attachments/files/21578693/test.xlsx)

The chunk res:
<img width="236" height="57" alt="2025-06-17 16-04-07 的屏幕截图"
src="https://github.com/user-attachments/assets/b0a499ac-349d-4c3d-8c6e-0931c8fc26de"
/>
2025-08-11 17:17:56 +08:00
23522f1ea8 Fix: handle missing dataset_ids when creating chat assistant (#9324)
- Root cause: accessing req.get("dataset_ids") returns None when the key
is absent, causing KeyError.
- Fix: use req.get("dataset_ids", []) to default to empty list.
2025-08-11 17:17:20 +08:00
46dc3f1c48 Fix: Update test assertions and add GraphRAG config in dataset tests (#9386)
### What problem does this PR solve?

- Modify error message assertion in chunk update test to check for
document ownership
- Add GraphRAG configuration with `use_graphrag: False` in dataset
update tests
- Fix actions:
https://github.com/infiniflow/ragflow/actions/runs/16863637898/job/47767511582
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-11 17:15:48 +08:00
c9b156fa6d Fix: Remove default dataset_ids from Chat class initialization (#9381)
### What problem does this PR solve?

- The default dataset_ids "kb1" was removed from the Chat class. 
- The HTTP API response does not include the dataset_ids field.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-11 17:15:30 +08:00
83939b1a63 Feat: add full list of supported AWS Bedrock regions (#9378)
### What problem does this PR solve?

Add full list of supported AWS Bedrock regions.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-11 17:15:07 +08:00
7f08ba47d7 Fix "no tc element at grid_offset" (#9375)
### What problem does this PR solve?

fix "no `tc` element at grid_offset", just log warning and ignore.
stacktrace:
```
Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 620, in handle_task
    await do_handle_task(task)
  File "/ragflow/rag/svr/task_executor.py", line 553, in do_handle_task
    chunks = await build_chunks(task, progress_callback)
  File "/ragflow/rag/svr/task_executor.py", line 257, in build_chunks
    cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"],
  File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync
    return msg_from_thread.unwrap()
  File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    raise captured_error
  File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result
    return result.unwrap()
  File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    raise captured_error
  File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn
    ret = context.run(sync_fn, *args)
  File "/ragflow/rag/svr/task_executor.py", line 257, in <lambda>
    cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"],
  File "/ragflow/rag/app/naive.py", line 384, in chunk
    sections, tables = Docx()(filename, binary)
  File "/ragflow/rag/app/naive.py", line 230, in __call__
    while i < len(r.cells):
  File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 438, in cells
    return tuple(_iter_row_cells())
  File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 436, in _iter_row_cells
    yield from iter_tc_cells(tc)
  File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 424, in iter_tc_cells
    yield from iter_tc_cells(tc._tc_above)  # pyright: ignore[reportPrivateUsage]
  File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 741, in _tc_above
    return self._tr_above.tc_at_grid_offset(self.grid_offset)
  File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 98, in tc_at_grid_offset
    raise ValueError(f"no `tc` element at grid_offset={grid_offset}")
ValueError: no `tc` element at grid_offset=10
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-11 17:13:10 +08:00
ce3dd019c3 Fix broken data stream when writing image file (#9354)
### What problem does this PR solve?

fix "broken data stream when writing image file", just log warning and
ignore

Close #8379 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-11 17:07:49 +08:00
476c56868d Agent plans tasks by referring to its own prompt. (#9315)
### What problem does this PR solve?

Fixes the issue in the analyze_task execution flow where the Lead Agent
was not utilizing its own sys_prompt during task analysis, resulting in
incorrect or incomplete task planning.
https://github.com/infiniflow/ragflow/issues/9294
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-11 17:05:06 +08:00
b9c4954c2f Fix: Replace StrEnum with strenum in code_exec.py (#9376)
### What problem does this PR solve?

- The enum import was changed from Python's built-in StrEnum to the
strenum package.
- Fix error `Warning: Failed to import module code_exec: cannot import
name 'StrEnum' from 'enum' (/usr/lib/python3.10/enum.py)`

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-11 15:32:04 +08:00
a060672b31 Feat: Run eslint when the project is running to standardize everyone's code #9377 (#9379)
### What problem does this PR solve?

Feat: Run eslint when the project is running to standardize everyone's
code #9377

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-08-11 15:31:38 +08:00
f022504ef9 Support Russian in UI Update config.ts (#9361)
add ru

### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2025-08-11 15:30:35 +08:00
1a78b8b295 Support Russian in UI (#9362)
### What problem does this PR solve?

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-11 14:06:18 +08:00