Commit Graph

2776 Commits

Author SHA1 Message Date
b578451e6a docs: update Docker build commands to specify platform as linux/amd64 (#6977)
### What problem does this PR solve?

Considering the ragflow_deps image is only available for `linux/amd64`
platform, if we try to run the docker build commands in ,macOS for
instance, without the platform flag, we get an error due to the
different platform. Specifying the platform in the docker build command
fixes this issue.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [X] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-04-14 10:07:39 +08:00
53c653b099 fix RAGFlowPdfParser AttributeError: 'PdfReader' object has no attribute 'close' err (#6859)
i use PdfParser in local(refer to this case:
https://github.com/infiniflow/ragflow/blob/main/rag/app/paper.py) like
this:
```
import re
import openpyxl

from ragflow.api.db import ParserType
from ragflow.rag.nlp import rag_tokenizer, tokenize, tokenize_table, add_positions, bullets_category, \
    title_frequency, \
    tokenize_chunks
from ragflow.rag.utils import num_tokens_from_string
from ragflow.deepdoc.parser import PdfParser, ExcelParser, DocxParser,PlainParser


def logger(prog=None, msg=""):
    print(msg)


class Pdf(PdfParser):
    def __init__(self):
        self.model_speciess = ParserType.MANUAL.value
        super().__init__()

    def __call__(self, filename, binary=None, from_page=0,
                 to_page=100000, zoomin=3, callback=None):
        from timeit import default_timer as timer
        start = timer()
        callback(msg="OCR is running...")

        self.__images__(
            filename if not binary else binary,
            zoomin,
            from_page,
            to_page,
            callback
        )
        callback(msg="OCR finished.")
        print("OCR:", timer() - start)
   
        self._layouts_rec(zoomin)
        callback(0.65, "Layout analysis finished.")
        print("layouts:", timer() - start)

        self._table_transformer_job(zoomin)
        callback(0.67, "Table analysis finished.")


        self._text_merge()
        tbls = self._extract_table_figure(True, zoomin, True, True)
        self._concat_downward()  
        self._filter_forpages()   
        callback(0.68, "Text merging finished")

        # clean mess
        for b in self.boxes:
            b["text"] = re.sub(r"([\t  ]|\u3000){2,}", " ", b["text"].strip())

        return [(b["text"], b.get("layout_no", ""), self.get_position(b, zoomin))
                for i, b in enumerate(self.boxes)], tbls


```

show err like this:
```
  File "xxxxx/third_party/ragflow/deepdoc/parser/pdf_parser.py", line 1039, in __images__
    self.pdf.close()
AttributeError: 'PdfReader' object has no attribute 'close'
```

i found ragflow source code use
`pdfplumber.open`(https://github.com/infiniflow/ragflow/blob/main/deepdoc/parser/pdf_parser.py#L1007C28-L1007C43)

and replace` self.pdf `with ` pdf2_read` (from pypdf import PdfReader as
pdf2_read)in line 1024
(https://github.com/infiniflow/ragflow/blob/main/deepdoc/parser/pdf_parser.py#L1024)
```
self.pdf = pdf2_read
```


---
and I found that `pdfplumber` can be used in this way:
```
file_path="xxx.pdf"
res = pdfplumber.open(file_path)
res.close()
```

but `pypdf.PdfReader` source code do not has `close` func, source code
use like this

```
 with open(stream, "rb") as fh:
         stream = BytesIO(fh.read())
          self._stream_opened = True
```
> https://github.com/py-pdf/pypdf/blob/main/pypdf/_reader.py#L156

so I moved the `self.pdf.close` function call and fixed this problem
hoping to help the project😊
2025-04-14 09:40:13 +08:00
b70abe52b2 Fix: Ensure lock is released in update_progress using context manager (#6975)
ragflow: v0.17 also encountered this problem. #1453 The task table shows
that the actual task has been completed. Since the process_msg of the
task is not synchronized to the document table, there is no progress
update on the page.
This may be caused by the lock not being released when the exception
occurs.

ragflow:v0.17同样碰到这个问题, 看task表实际任务已经完成,由于没有把task的process_msg同步给document表,
所以在页面看没有进度更新。
可能是这里异常时没有释放锁导致的。

```/api/ragflow_server.py
def update_progress():
    lock_value = str(uuid.uuid4())
    redis_lock = RedisDistributedLock("update_progress", lock_value=lock_value, timeout=60)
    logging.info(f"update_progress lock_value: {lock_value}")
    while not stop_event.is_set():
        try:
            if redis_lock.acquire():
                DocumentService.update_progress()
                redis_lock.release()
            stop_event.wait(6)
        except Exception:
            logging.exception("update_progress exception")
++       if redis_lock.acquired:
++               redis_lock.release()
```
2025-04-11 20:46:19 +08:00
98670c3755 Fix: KB update_time changed whenever system relaunched (#6959)
### What problem does this PR solve?

Fix KB update_time changed whenever system relaunched. #6953 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-11 20:10:49 +08:00
9b789c2ae9 Test: Added test cases for Update Session With Chat Assistant HTTP API (#6968)
### What problem does this PR solve?

cover [Update chat assistant's
sessions](https://ragflow.io/docs/dev/http_api_reference#update-chat-assistants-session)
endpoints

### Type of change

- [x] Update test cases
2025-04-11 20:10:24 +08:00
ffb9f01bea Test: Update test cases for PR 6906 ISSUE 6875 (#6971)
### What problem does this PR solve?

PR #6906 ISSUE #6875

### Type of change

- [ ] Update test cases
2025-04-11 20:09:44 +08:00
ed7244f5f5 Fix: Delete unused pages (#6973)
### What problem does this PR solve?

Fix: Delete unused pages

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-11 20:06:58 +08:00
e54c0e39b5 fix bug [ERROR][Exception]: 8 vs. 9 (#6955)
### What problem does this PR solve?

Sometimes, the **s** in **chunks (s, a)** is an empty string. This
causes the condition **if s and len(a) > 0** in the line **chunks = [(s,
a) for s, a in chunks if s and len(a) > 0]** to fail, which changes the
length of the new chunks. As a result, the final assertion **assert
len(chunks) - end == n_clusters, "{} vs. {}".format(len(chunks) - end,
n_clusters)** fails and raises a confusing error like 7 vs. 8

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-04-11 17:01:49 +08:00
056ea68e52 Fix: In the dark night theme, the message input box is not displayed correctly. #6950 (#6951)
### What problem does this PR solve?

Fix: In the dark night theme, the message input box is not displayed
correctly. #6950

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-11 12:37:16 +08:00
d9266ed65a Fix: incorrect total chunks count in retrieval function after similarity filtering (#6741) (#6932)
### Related Issue:
https://github.com/infiniflow/ragflow/issues/6741

### Environment:
Using nightly version
Commit version:
[[6051abb](6051abb4a3)]

### Bug Description:
The retrieval function in rag/nlp/search.py returns the original total
chunks number
even after chunks are filtered by similarity_threshold. This creates
inconsistency
between the actual returned chunks and the reported total.

### Changes Made:
Added code to count how many search results actually meet or exceed the
configured similarity threshold
Positioned the calculation after the doc_ids conditional logic to ensure
special cases are handled correctly
Updated the ranks["total"] value to store this filtered count instead of
using the raw search result count
Using NumPy leverages optimized C-level batch operations to optimize
speed
2025-04-11 12:31:36 +08:00
6051abb4a3 Miscellaneous UI updates (#6947)
### What problem does this PR solve?


### Type of change


- [x] Documentation Update
2025-04-10 20:09:46 +08:00
4b125f0ffe Feat: Add translation text to the prompt word of the generate operator to distinguish it from the prompt word of the knowledge base #6934 (#6935)
### What problem does this PR solve?

Feat: Add translation text to the prompt word of the generate operator
to distinguish it from the prompt word of the knowledge base #6934

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-10 19:24:04 +08:00
43cf321942 Added similarity scores in reference chunks (#6918)
- Returning 3 similarity scores to the chat completion's `reference`
field. It gives the user more transparency and added flexibility to
display/rerank the reference when needed

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2025-04-10 19:17:45 +08:00
9283e91aa0 Fix: remove deprecated permission field (#6912)
### What problem does this PR solve?

Fix: remove deprecated KB updating `permission` field. #6911 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-10 18:56:41 +08:00
dc59aba132 Test: Added test cases for List Sessions With Chat Assistant HTTP API (#6938)
### What problem does this PR solve?

cover [List chat assistant's
sessions](https://ragflow.io/docs/dev/http_api_reference#list-chat-assistants-sessions)
endpoints

### Type of change

- [x] Update test cases
2025-04-10 17:31:01 +08:00
8fb5edd927 Test: Update test cases for PR 6906 (#6929)
### What problem does this PR solve?

PR #6906

### Type of change

- [x] Update test cases
2025-04-10 12:28:56 +08:00
3bb1e012e6 Fix: assistant deleteion issue. (#6906)
### What problem does this PR solve?

#6875

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-09 20:29:40 +08:00
22758a2763 Test: Update test cases for PR 6888 ISSUE 6876 (#6907)
### What problem does this PR solve?

PR #6888 ISSUE #6876

### Type of change

- [x] Update test case
2025-04-09 20:29:29 +08:00
a008b38cf5 Fix: local variable referenced before assignment (#6909)
### What problem does this PR solve?

Fix: local variable referenced before assignment. #6803 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-09 20:29:12 +08:00
d0897312ac Added a guide on setting chat variables (#6904)
### What problem does this PR solve?



### Type of change

- [x] Documentation Update
2025-04-09 19:32:25 +08:00
aa99c6b896 Fix delete duplicate assistant (#6888)
### What problem does this PR solve?

resolve this issue:https://github.com/infiniflow/ragflow/issues/6876

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
2025-04-09 19:10:08 +08:00
ae107f31d9 Test: Added test cases for Create Session With Chat Assistant HTTP API (#6902)
### What problem does this PR solve?

cover [create session with chat
assistant](https://ragflow.io/docs/dev/http_api_reference#create-session-with-chat-assistant)
endpoints

### Type of change

- [x] add test cases
2025-04-09 17:21:48 +08:00
9d9f2dacd2 fix Conversation roles must alternate user/assistant/user/assistant/... bug (#6880)
### What problem does this PR solve?

The old logic filters out all assistant messages from messages, which,
in multi-turn conversations, results in only user messages being
retained. This leads to an error in locally deployed models:
Conversation roles must alternate user/assistant/user/assistant/...

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-04-09 17:21:27 +08:00
08bc5d3521 Feat: Install sonner library #3221 (#6898)
### What problem does this PR solve?

Feat: Install sonner library #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-09 17:21:01 +08:00
6e7fb75618 Fix: handle waiting tasks when upstream is switch/categorize/relevant and normal path fails (#6874)
### What problem does this PR solve?

Fix the issue where waiting tasks couldn't be processed when upstream
components were "switch", "categorize", or "relevant" and the normal
processing path couldn't continue.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-04-09 12:37:21 +08:00
c26c38ee12 Test: Added test cases for Delete Chat Assistants HTTP API (#6879)
### What problem does this PR solve?

cover [delete chat
assistants](https://ragflow.io/docs/dev/http_api_reference#delete-chat-assistants)
endpoints

### Type of change

- [x] add test cases
2025-04-08 18:53:02 +08:00
dc2c74b249 Feat: add primitive support for function calls (#6840)
### What problem does this PR solve?

This PR introduces ​**​primitive support for function calls​**​,
enabling the system to handle basic function call capabilities.
However, this feature is currently experimental and ​**​not yet enabled
for general use​**​, as it is only supported by a subset of models,
namely, Qwen and OpenAI models.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-04-08 16:09:03 +08:00
a20439bf81 fix: add exception handling for get_by_id method (#6861)
### What problem does this PR solve?

Fixes #6548 

Add exception handling to prevent exceptions from propagating back to
the web, which may lead to failure in displaying conversation content.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: cm <caiming@sict.ac.cn>
2025-04-08 16:06:57 +08:00
a1fb32908d Fix: Error message is incorrect when updating chat name #6850 (#6851)
### What problem does this PR solve?

#6850 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-07 17:13:17 +08:00
0b89458eb8 Test: Added test cases for Update Chat Assistant HTTP API (#6843)
### What problem does this PR solve?

cover [update chat
assistant](https://ragflow.io/docs/v0.17.2/http_api_reference#update-chat-assistant)
endpoints

### Type of change

- [x] add test cases
2025-04-07 15:04:23 +08:00
14a3efd756 Fix: docx image exceptions. (#6839)
### What problem does this PR solve?

Close #6784

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-07 12:33:34 +08:00
d64c6870bb Fix:When parsing documents with graph, an error occurred:[ERROR][Exception]: 'method' (#6836)
[When parsing documents with graph, an error
occurred:[ERROR][Exception]: 'method']
(https://github.com/infiniflow/ragflow/issues/6835)
### What problem does this PR solve?

Close #6786

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: cm <caiming@sict.ac.cn>
2025-04-07 12:29:25 +08:00
dc87c91f9d Update broken discord link (#6841)
### Type of change

- [x] Documentation Update
2025-04-07 12:18:43 +08:00
d4574ffb49 Fix: improve Dockerfile build for China (#6812)
### What problem does this PR solve?
This PR addresses the build and dependency issues faced by developers in
regions with poor connectivity to official Ubuntu repositories and
standard dependency sources. Currently, developers in these regions
experience slow or failed Docker builds and dependency downloads,
significantly impacting development efficiency.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

The changes include:
1. Modified Dockerfile to use alternative Ubuntu mirrors with better
connectivity in affected regions
2. Added a new script (download_deps_CN.py) that provides
region-specific alternative download links for dependencies
2025-04-07 11:58:46 +08:00
5a8c479ff3 Miscellaneous editorial updates (#6805)
### What problem does this PR solve?



### Type of change

- [x] Documentation Update
2025-04-07 09:33:55 +08:00
c6b26a3159 update some setting to README_zh.md (#6737)
### What problem does this PR solve?
#6731 #6722 
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [ ] Documentation Update

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2025-04-03 22:12:49 +08:00
2a5ad74ac6 Test: Update test cases for #6800 (#6804)
### What problem does this PR solve?

update test case for PR #6800 issue #6539

### Type of change

- [x] update test cases
2025-04-03 21:22:41 +08:00
2caf15b24c Refa: trival. (#6802)
### What problem does this PR solve?


### Type of change


- [x] Refactoring
2025-04-03 19:01:24 +08:00
f49588756e Feat: Load the dialog page, prohibit calling the dialog/get interface #6798 (#6799)
### What problem does this PR solve?

Feat: Load the dialog page, prohibit calling the dialog/get interface
#6798

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-03 18:04:40 +08:00
57e760883e Fix: update chunk, empty question issue. (#6800)
### What problem does this PR solve?

fix issue #6539, refer to pr #6405

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-03 18:04:19 +08:00
b213e88cca Test: Added test cases for List Chat Assistants HTTP API (#6792)
### What problem does this PR solve?

cover [list chat
assistant](https://ragflow.io/docs/v0.17.2/http_api_reference#list-chat-assistants)
endpoints

### Type of change

- [x] add test cases
2025-04-03 17:22:23 +08:00
e8f46c9207 Fix: missing redis pvc storageclass in helm (#6788)
fix redis pvc in helm deployment

### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-03 16:55:47 +08:00
cded812b97 Feat: add OpenAI compatible API for agent (#6329)
### What problem does this PR solve?
add openai agent
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-04-03 16:51:37 +08:00
2acb02366e Feat: Clarify the use of OpenAI-API-compatible #6782 (#6783)
### What problem does this PR solve?

Feat: Clarify the use of OpenAI-API-compatible #6782

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-03 11:38:21 +08:00
9ecc78feeb Refa: copywriting refinement. (#6779)
### What problem does this PR solve?

Close #6762

### Type of change

- [x] Refactoring
2025-04-03 11:38:02 +08:00
fdc410e743 Fix set_graph on non-existing edge (#6777)
### What problem does this PR solve?

Fix set_graph on non-existing edge

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-03 11:09:04 +08:00
5b5558300a Feat: add gemini-2.5-pro-exp-03-25 (#6774)
### What problem does this PR solve?

#6733

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-04-03 10:48:58 +08:00
b5918e7158 Docs: Fix for issue #6713 (#6775)
### What problem does this PR solve?

update fo issue #6713

### Type of change

- [x] Documentation Update
2025-04-03 10:19:58 +08:00
58f8026632 Test: Update test cases for PR #6643 (#6766)
### What problem does this PR solve?

Update test cases for PR #6643 issue #6607

### Type of change

- [x] update test cases
2025-04-03 10:10:40 +08:00
a73fbc61ff Fix: Handle the case of deleting empty blocks. Update the relevant message (#6643)
…gic to return the correct deletion message. Add handling for empty
arrays to ensure no errors occur during the deletion operation. Update
the test cases to verify the new logic.

### What problem does this PR solve?

fix this bug:https://github.com/infiniflow/ragflow/issues/6607

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
2025-04-02 19:20:17 +08:00