ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2025-12-08 20:42:30 +08:00

Author	SHA1	Message	Date
Stephen Hu	5fa6f2f151	Update embedding_model.py (#8836 ) ### What problem does this PR solve? Remove useless convert for bge encode_queries ### Type of change - [x] Performance Improvement	2025-07-15 14:04:58 +08:00
Stephen Hu	5383e254c4	Perf:Remove Useless Convert When BGE Embedding (#8816 ) ### What problem does this PR solve? FlagModel internally supports returning numpy arrays ### Type of change - [x] Performance Improvement	2025-07-14 14:02:48 +08:00
Stephen Hu	8d027813f5	Refactor: Improve How To Handle QWenEmbed (#8765 ) ### What problem does this PR solve? Based on https://github.com/infiniflow/ragflow/issues/8740 1. A better handle for 'NoneType' object is not subscriptable 2. Add some logs to get the internal message ### Type of change - [x] Refactoring	2025-07-10 10:30:18 +08:00
Stephen Hu	19419281c3	Fix: Change Ollama Embedding Keep Alive (#8734 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8733 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-09 12:17:26 +08:00
Stephen Hu	e60ec0a31b	Fix:disallowed special token while embedding (#8692 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8567 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-07 14:13:37 +08:00
6607changchun	9580e99650	fix: retry embedding with Qwen family models when limits temporarily reached. (#8690 ) APIs of the Qwen family models are rate-limited. When a limit is reached, the "output" attribute of "resp" is None, which in turn causes a TypeError when trying to retrieve "embeddings". Since these limits are almost always temporary, I have added a simple retry mechanism to avoid it. Besides, if retry_max is reached, the error is raised early instead of being hidden behind a "TypeError". ### What problem does this PR solve? Sometimes Qwen blocks calls due to rate limits, which causes the whole parsing procedure to stop when creating a knowledge base. In this situation, resp["output"] is None, and resp["output"]["embeddings"] raises a TypeError. Since the limits are temporary, I apply a simple retry mechanism to solve it. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-07-07 12:15:52 +08:00
Yongteng Lei	f8a6987f1e	Refa: automatic LLMs registration (#8651 ) ### What problem does this PR solve? Support automatic LLMs registration. ### Type of change - [x] Refactoring	2025-07-03 19:05:31 +08:00
Kevin Hu	d46c24045f	Feat: add GiteeAI as a llm provider. (#8572 ) ### What problem does this PR solve? #1853 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-30 11:22:11 +08:00
Kevin Hu	aafeffa292	Feat: add gitee as LLM provider. (#8545 ) ### What problem does this PR solve? ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-30 09:22:31 +08:00
Rainman	49d67cbcb7	fix a bug when using huggingface embedding api (#8432 ) ### What problem does this PR solve? image_version: v0.19.1 This PR fixes a bug in the HuggingFace embedding API method that was causing AssertionError: assert len(vects) == len(docs) during the document embedding process. #### Problem The HuggingFaceEmbed.encode() method had an early return statement inside the for loop, causing it to return after processing only the first text input instead of processing all texts in the input list. Error Message
```python
AssertionError: assert len(vects) == len(docs) # input chunks != embedded vectors from embedding api
File "/ragflow/rag/svr/task_executor.py", line 442, in embedding
```
Buggy code (/ragflow/rag/llm/embedding_model.py)
```python
class HuggingFaceEmbed(Base):
    def __init__(self, key, model_name, base_url=None):
        if not model_name:
            raise ValueError("Model name cannot be None")
        self.key = key
        self.model_name = model_name.split("___")[0]
        self.base_url = base_url or "http://127.0.0.1:8080"

    def encode(self, texts: list):
        embeddings = []
        for text in texts:
            response = requests.post(...)
            if response.status_code == 200:
                try:
                    embedding = response.json()
                    embeddings.append(embedding[0])
                    # ❌ Early return
                    return np.array(embeddings), sum([num_tokens_from_string(text) for text in texts])
                except Exception as _e:
                    log_exception(_e, response)
            else:
                raise Exception(...)
```
Fixed code (I just rolled this function back to the v0.19.0 version)
```python
class HuggingFaceEmbed(Base):
    def __init__(self, key, model_name, base_url=None):
        if not model_name:
            raise ValueError("Model name cannot be None")
        self.key = key
        self.model_name = model_name.split("___")[0]
        self.base_url = base_url or "http://127.0.0.1:8080"

    def encode(self, texts: list):
        embeddings = []
        for text in texts:
            response = requests.post(...)
            if response.status_code == 200:
                embedding = response.json()
                embeddings.append(embedding[0])  # ✅ Only append, no return
            else:
                raise Exception(...)
        # ✅ Return after processing all
        return np.array(embeddings), sum([num_tokens_from_string(text) for text in texts])
```
### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-24 09:35:02 +08:00
Stephen Hu	ef5e7d8c44	Fix:embedding_model class SILICONFLOWEmbed(Base)Function reusing json (#8378 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8360 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-20 11:13:00 +08:00
Kevin Hu	65d5268439	Feat: implement novitaAI embedding and reranking. (#8250 ) ### What problem does this PR solve? Close #8227 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-13 15:42:17 +08:00
Kevin Hu	d36c8d18b1	Refa: make exception more clear. (#8224 ) ### What problem does this PR solve? #8156 ### Type of change - [x] Refactoring	2025-06-12 17:53:59 +08:00
Liu An	a43adafc6b	Refa: Add error handling for JSON decode in embedding models (#8162 ) ### What problem does this PR solve? Improve robustness of Jina, Nvidia, and SILICONFLOW embedding models by: 1. Adding try-catch blocks for JSON decode errors 2. Logging error details including response content 3. Raising exceptions with meaningful error messages ### Type of change - [x] Refactoring	2025-06-10 19:04:17 +08:00
Kevin Hu	156290f8d0	Fix: url path join issue. (#8013 ) ### What problem does this PR solve? Close #7980 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-03 14:18:40 +08:00
Stephen Hu	65537b8200	Fix:Set CUDA_VISIBLE_DEVICES In DefaultEmbedding (#7465 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/7420 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-05-06 14:38:36 +08:00
Alex Chen	46b5e32cd7	Feat: support vision llm for gpustack (#6636 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/6138 This PR is going to support vision llm for gpustack, modify url path from `/v1-openai` to `/v1` ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-31 15:33:52 +08:00
Kevin Hu	b77ce4e846	Feat: support api-key for Ollama. (#6448 ) ### What problem does this PR solve? #6189 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-24 14:53:17 +08:00
zhou	85480f6292	Fix: the error of Ollama embeddings interface returning "500 Internal Server Error" (#6350 ) ### What problem does this PR solve? Fix the error where the Ollama embeddings interface returns a “500 Internal Server Error” when using models such as xiaobu-embedding-v2 for embedding. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 15:25:48 +08:00
Omar Leonardo Sanchez Granados	4f2816c01c	Add support to boto3 default connection (#5246 ) ### What problem does this PR solve? This pull request includes changes to the initialization logic of the `ChatModel` and `EmbeddingModel` classes to enhance the handling of AWS credentials. Use cases: - Use env variables for credentials instead of managing them on the DB - Easy connection when deploying on an AWS machine ### Type of change - [X] New Feature (non-breaking change which adds functionality)	2025-02-24 11:01:14 +08:00
Kevin Hu	4776fa5e4e	Refactor for total_tokens. (#4652 ) ### What problem does this PR solve? #4567 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-01-26 13:54:26 +08:00
Kevin Hu	f1d9f4290e	Fix TogetherAIEmbed. (#4623 ) ### What problem does this PR solve? #4567 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-01-24 10:29:30 +08:00
Kevin Hu	be5f830878	Truncate text for zhipu embedding. (#4490 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-01-15 14:36:27 +08:00
Alex Chen	7944aacafa	Feat: add gpustack model provider (#4469 ) ### What problem does this PR solve? Add GPUStack as a new model provider. [GPUStack](https://github.com/gpustack/gpustack) is an open-source GPU cluster manager for running LLMs. Currently, locally deployed models in GPUStack cannot integrate well with RAGFlow. GPUStack provides both OpenAI compatible APIs (Models / Chat Completions / Embeddings / Speech2Text / TTS) and other APIs like Rerank. We would like to use GPUStack as a model provider in ragflow. [GPUStack Docs](https://docs.gpustack.ai/latest/quickstart/) Related issue: https://github.com/infiniflow/ragflow/issues/4064. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### Testing Instructions 1. Install GPUStack and deploy the `llama-3.2-1b-instruct` llm, `bge-m3` text embedding model, `bge-reranker-v2-m3` rerank model, `faster-whisper-medium` Speech-to-Text model, `cosyvoice-300m-sft` in GPUStack. 2. Add provider in ragflow settings. 3. Testing in ragflow.	2025-01-15 14:15:58 +08:00
Kevin Hu	b93c136797	Fix gemini embedding error. (#4356 ) ### What problem does this PR solve? #4314 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-01-06 14:41:29 +08:00
Jin Hai	4abc144d3d	Fix error of changing embedding model (#4184 ) ### What problem does this PR solve? 1. Change embedding model of knowledge base won't change the default embedding model. 2. Retrieval test bug ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Signed-off-by: jinhai <haijin.chn@gmail.com>	2024-12-23 16:23:54 +08:00
Kevin Hu	d8fca43017	Make fast embed and default embed mutually exclusive. (#4121 ) ### What problem does this PR solve? ### Type of change - [x] Performance Improvement	2024-12-19 17:27:09 +08:00
Kevin Hu	7474348394	Fix fastembed reloading issue. (#4117 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-12-19 16:18:18 +08:00
Kevin Hu	593ffc4067	Fix HuggingFace model error. (#3870 ) ### What problem does this PR solve? #3865 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-12-05 13:28:42 +08:00
Zhichang Yu	92ab7ef659	Refactor embedding batch_size (#3825 ) ### What problem does this PR solve? Refactor embedding batch_size. Close #3657 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2024-12-03 16:22:39 +08:00
Kevin Hu	6a0583f5ad	Fix voyage embedding. (#3818 ) ### What problem does this PR solve? #3816 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-12-03 09:33:54 +08:00
Zhichang Yu	d19f059f34	Detect invalid response from api.siliconflow.cn (#3792 ) ### What problem does this PR solve? Detect invalid response from api.siliconflow.cn. Close #2643 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-12-02 12:55:05 +08:00
devMls	59a5813f1b	add jina new models in jina connector (#3770 ) ### What problem does this PR solve? Add new models to the Jina connector to allow the use of models with multilingual support ### Type of change - [X] Other (please describe): new connectors no breaking change	2024-12-02 10:06:39 +08:00
Kevin Hu	57208d8e53	Fix batch size issue. (#3675 ) ### What problem does this PR solve? #3657 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-11-27 18:06:43 +08:00
liuhua	8b35776916	Fix a bug in VolcEngine (#3658 ) ### What problem does this PR solve? Fix a bug in VolcEngine #3553 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: liuhua <10215101452@stu.ecun.edu.cn>	2024-11-27 09:30:49 +08:00
Kevin Hu	e5af18d5ea	Update docs for v0.14.0 (#3625 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2024-11-25 11:37:56 +08:00
liuhua	d42362deb6	Add api for sessions and add max_tokens for tenant_llm (#3472 ) ### What problem does this PR solve? Add api for sessions and add max_tokens for tenant_llm ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: liuhua <10215101452@stu.ecun.edu.cn>	2024-11-19 14:51:33 +08:00
Zhichang Yu	4413683898	Introduced beartype (#3460 ) ### What problem does this PR solve? Introduced [beartype](https://github.com/beartype/beartype) for runtime type-checking. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2024-11-18 17:38:17 +08:00
Jin Hai	1e90a1bf36	Move settings initialization after module init phase (#3438 ) ### What problem does this PR solve? 1. Module init won't connect database any more. 2. Config in settings need to be used with settings.CONFIG_NAME ### Type of change - [x] Refactoring Signed-off-by: jinhai <haijin.chn@gmail.com>	2024-11-15 17:30:56 +08:00
Zhichang Yu	30f6421760	Use consistent log file names, introduced initLogger (#3403 ) ### What problem does this PR solve? Use consistent log file names, introduced initLogger ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [x] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2024-11-14 17:13:48 +08:00
roc king	fa54cd5f5c	extract model dir from model's full name (#3368 ) ### What problem does this PR solve? When a model's group name contains 0-9, we can't find the downloaded model, because we do not correctly extract the model dir's name from the model's full name ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: 王志鹏 <zhipeng3.wang@midea.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-11-13 14:10:16 +08:00
Zhichang Yu	a2a5631da4	Rework logging (#3358 ) Unified all log files into one. ### What problem does this PR solve? Unified all log files into one. ### Type of change - [x] Refactoring	2024-11-12 17:35:13 +08:00
ksztone-huanggonghao	0dff64f6ad	fix: TypeError: only length-1 arrays can be converted to Python scalars (#3211 ) ### What problem does this PR solve? fix "TypeError: only length-1 arrays can be converted to Python scalars" while using cohere embedding model. ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) ![image](https://github.com/user-attachments/assets/2c21a69f-cd76-4d25-b320-058964812db8)	2024-11-06 11:15:00 +08:00
0000sir	4991107822	Fix keys of Xinference deployed models, especially those with the same model name as publicly hosted models. (#2832 ) ### What problem does this PR solve? Fix keys of Xinference deployed models, especially those with the same model name as publicly hosted models. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: 0000sir <0000sir@gmail.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-10-16 10:21:08 +08:00
JobSmithManipulation	18f80743eb	support api-version and change default-model in adding azure-openai and openai (#2799 ) ### What problem does this PR solve? #2701 #2712 #2749 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-10-11 11:26:42 +08:00
Kevin Hu	7f44cf543a	move import positions (#2753 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2024-10-09 10:34:58 +08:00
Omar Leonardo Sanchez Granados	34761fa4ca	Fix/bedrock issues (#2718 ) ### What problem does this PR solve? Adding a Bedrock API key for Claude Sonnet was broken. I found that the issue came up when trying to test the LLM configuration, because system is a required parameter in boto3. There were also problems in the Bedrock implementation for embeddings when trying to encode queries. ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue)	2024-10-05 16:44:50 +08:00
JobSmithManipulation	96f56a3c43	add huggingface model (#2624 ) ### What problem does this PR solve? #2469 ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-09-27 19:15:38 +08:00
Kevin Hu	dda1367ab2	make it lighten (#2577 ) ### What problem does this PR solve? #2295 ### Type of change - [x] Refactoring	2024-09-25 13:38:40 +08:00
Kevin Hu	7bb28ca2bd	add lighten control (#2567 ) ### What problem does this PR solve? #2295 ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2024-09-24 19:22:01 +08:00
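
Note on commit 9580e99650 (#8690): the retry behaviour described in that message can be sketched roughly as below. This is a minimal illustration only, not the code from the PR. The function name `embed_with_retry`, the `call_api` callable, and the `delay_seconds` parameter are hypothetical; only `retry_max`, `resp["output"]`, and `"embeddings"` come from the commit message.

```python
import time


def embed_with_retry(call_api, texts, retry_max=5, delay_seconds=1.0):
    """Call `call_api(texts)` until the response carries embeddings.

    `call_api` is a hypothetical stand-in for the provider call. It is
    assumed to return a dict whose "output" is None while rate-limited.
    """
    last_resp = None
    for attempt in range(retry_max):
        last_resp = call_api(texts)
        output = last_resp.get("output")
        if output is not None and "embeddings" in output:
            return output["embeddings"]
        # Rate limits are usually temporary, so wait before the next attempt.
        if attempt < retry_max - 1:
            time.sleep(delay_seconds)
    # Raise a descriptive error early instead of letting a TypeError surface.
    raise RuntimeError(
        f"Embedding failed after {retry_max} attempts; last response: {last_resp}"
    )
```

A caller would wrap the real provider request in a small function and pass it as `call_api`; a fixed delay is used here for brevity, and a backoff schedule would be an equally plausible choice.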

112 Commits