Commit Graph

633 Commits

65a8cd1772 Fix knowledge_graph_kwd on infinity. Close #6476 and #6624 (#6651)
### What problem does this PR solve?

Fix knowledge_graph_kwd on infinity. Close #6476 and #6624

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-28 22:05:40 +08:00
c61df5dd25 Dynamic Context Window Size for Ollama Chat (#6582)
# Dynamic Context Window Size for Ollama Chat

## Problem Statement
Previously, the Ollama chat implementation used a fixed context window
size of 32768 tokens. This caused two main issues:
1. Performance degradation due to unnecessarily large context windows
for small conversations
2. Potential business logic failures when using smaller fixed sizes
(e.g., 2048 tokens)

## Solution
Implemented a dynamic context window size calculation that:
1. Uses a base context size of 8192 tokens
2. Applies a 1.2x buffer ratio to the total token count
3. Adds multiples of 8192 tokens based on the buffered token count
4. Implements a smart context size update strategy

## Implementation Details

### Token Counting Logic
```python
def count_tokens(text):
    """Calculate token count for text"""
    # Simple calculation: 1 token per ASCII character
    # 2 tokens for non-ASCII characters (Chinese, Japanese, Korean, etc.)
    total = 0
    for char in text:
        if ord(char) < 128:  # ASCII characters
            total += 1
        else:  # Non-ASCII characters
            total += 2
    return total
```

### Dynamic Context Calculation
```python
def _calculate_dynamic_ctx(self, history):
    """Calculate dynamic context window size"""
    # Calculate total tokens for all messages
    total_tokens = 0
    for message in history:
        content = message.get("content", "")
        content_tokens = count_tokens(content)
        role_tokens = 4  # Role marker token overhead
        total_tokens += content_tokens + role_tokens

    # Apply 1.2x buffer ratio
    total_tokens_with_buffer = int(total_tokens * 1.2)
    
    # Calculate context size in multiples of 8192
    if total_tokens_with_buffer <= 8192:
        ctx_size = 8192
    else:
        ctx_multiplier = (total_tokens_with_buffer // 8192) + 1
        ctx_size = ctx_multiplier * 8192
    
    return ctx_size
```
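
For example, a history whose messages total 10,000 counted tokens is buffered to int(10000 * 1.2) = 12,000; since 12,000 exceeds 8192, the multiplier is (12000 // 8192) + 1 = 2, giving a context window of 16,384 tokens.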

### Integration in Chat Method
```python
def chat(self, system, history, gen_conf):
    if system:
        history.insert(0, {"role": "system", "content": system})
    if "max_tokens" in gen_conf:
        del gen_conf["max_tokens"]
    try:
        # Calculate new context size
        new_ctx_size = self._calculate_dynamic_ctx(history)
        
        # Prepare options with context size
        options = {
            "num_ctx": new_ctx_size
        }
        # Add other generation options
        if "temperature" in gen_conf:
            options["temperature"] = gen_conf["temperature"]
        if "max_tokens" in gen_conf:
            options["num_predict"] = gen_conf["max_tokens"]
        if "top_p" in gen_conf:
            options["top_p"] = gen_conf["top_p"]
        if "presence_penalty" in gen_conf:
            options["presence_penalty"] = gen_conf["presence_penalty"]
        if "frequency_penalty" in gen_conf:
            options["frequency_penalty"] = gen_conf["frequency_penalty"]
            
        # Make API call with dynamic context size
        response = self.client.chat(
            model=self.model_name,
            messages=history,
            options=options,
            keep_alive=60
        )
        return response["message"]["content"].strip(), response.get("eval_count", 0) + response.get("prompt_eval_count", 0)
    except Exception as e:
        return "**ERROR**: " + str(e), 0
```

## Benefits
1. **Improved Performance**: Uses appropriate context windows based on
conversation length
2. **Better Resource Utilization**: Context window size scales with
content
3. **Maintained Compatibility**: Works with existing business logic
4. **Predictable Scaling**: Context growth in 8192-token increments
5. **Smart Updates**: Context size updates are optimized to reduce
unnecessary model reloads

## Future Considerations
1. Fine-tune buffer ratio based on usage patterns
2. Add monitoring for context window utilization
3. Consider language-specific token counting optimizations
4. Implement adaptive threshold based on conversation patterns
5. Add metrics for context size update frequency

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-28 12:38:27 +08:00
0758c04941 Refa: token similarity calculations. (#6614)
### What problem does this PR solve?

#6507

### Type of change

- [x] Performance Improvement
2025-03-28 09:33:08 +08:00
d2043ff9f2 Fix: LmStudioChat issue. (#6591)
### What problem does this PR solve?

#6577

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-27 14:59:15 +08:00
82ccbd2cba fix: Remove unnecessary minio initialization (#6544)
### What problem does this PR solve?

Prevent the application from failing to start due to reading non-existent
or incorrect Minio connection configurations when a file storage backend
other than Minio is in use.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-27 09:54:25 +08:00
df3890827d Refa: change LLM chat output from full to delta (incremental) (#6534)
### What problem does this PR solve?

Change LLM chat output from full to delta (incremental)
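
A minimal sketch of the change, assuming an OpenAI-compatible streaming client; the function name and structure are illustrative, not this PR's actual code:

```python
def chat_streamly(client, model_name, history, gen_conf):
    """Yield only the incremental (delta) text of each chunk,
    instead of re-sending the full accumulated answer every time."""
    response = client.chat.completions.create(
        model=model_name, messages=history, stream=True, **gen_conf
    )
    for chunk in response:
        delta = chunk.choices[0].delta.content
        if delta:  # keep-alive chunks may carry no content
            yield delta
```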

### Type of change

- [x] Refactoring
2025-03-26 19:33:14 +08:00
cc8029a732 Fix: uploading in chat box issue. (#6547)
### What problem does this PR solve?

#6228

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 15:37:48 +08:00
6bf26e2a81 Optimize graphrag again (#6513)
### What problem does this PR solve?

Removed set_entity and set_relation to avoid accessing the doc engine
during graph computation.
Introduced GraphChange to avoid writing unchanged chunks.
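
A rough sketch of the GraphChange idea; the field and method names below are illustrative, not necessarily the PR's:

```python
from dataclasses import dataclass, field

@dataclass
class GraphChange:
    """Records which graph chunks changed during (re)computation,
    so persistence touches only those instead of rewriting everything."""
    removed_nodes: set = field(default_factory=set)
    added_updated_nodes: set = field(default_factory=set)
    removed_edges: set = field(default_factory=set)
    added_updated_edges: set = field(default_factory=set)

def persist_change(change: GraphChange, doc_store):
    # Only the recorded deltas hit the doc engine; unchanged chunks are skipped.
    for name in change.added_updated_nodes:
        doc_store.upsert_node_chunk(name)   # hypothetical doc-store API
    for name in change.removed_nodes:
        doc_store.delete_node_chunk(name)   # hypothetical doc-store API
```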

### Type of change

- [x] Performance Improvement
2025-03-26 15:34:42 +08:00
12ad746ee6 Fix: Bedrock model invocation error. (#6533)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 11:27:12 +08:00
60c3a253ad Fix: api-key issue for xinference. (#6490)
### What problem does this PR solve?

#2792

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-25 15:01:13 +08:00
095fc84cf2 Fix: claude max tokens. (#6484)
### What problem does this PR solve?

#6458

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-25 10:41:55 +08:00
b77ce4e846 Feat: support api-key for Ollama. (#6448)
### What problem does this PR solve?

#6189

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-24 14:53:17 +08:00
85eb3775d6 Refa: update Anthropic models. (#6445)
### What problem does this PR solve?

#6421

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-24 12:34:57 +08:00
ee5aa51d43 Fix: point in tag issue. (#6436)
### What problem does this PR solve?

#6414

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-24 10:45:29 +08:00
a6aed0da46 Fix: rerank with YoudaoRerank issue. (#6396)
### What problem does this PR solve?

Fix rerank with YoudaoRerank issue: "'YoudaoRerank' object has no
attribute '_dynamic_batch_size'".


![17425412353825](https://github.com/user-attachments/assets/9ed304c7-317a-440e-acff-fe895fc20f07)


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-24 10:09:16 +08:00
efc4796f01 Fix ratelimit errors during document parsing (#6413)
### What problem does this PR solve?

When using an online large-model API to extract knowledge graphs for the
knowledge base, frequent rate-limit errors were triggered, causing
document parsing to fail. This commit fixes the issue by optimizing the
API calls: exponential backoff and jitter were added to reduce the
frequency of rate-limit errors, as sketched below.
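
A minimal sketch of the backoff-with-jitter pattern described above; the helper name, retry counts, and delays are illustrative:

```python
import random
import time

def call_with_backoff(api_call, max_retries=5, base_delay=1.0, max_delay=60.0):
    """Retry a rate-limited API call with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return api_call()
        except Exception as e:
            if "rate limit" not in str(e).lower() or attempt == max_retries - 1:
                raise
            delay = min(base_delay * (2 ** attempt), max_delay)  # exponential, capped
            time.sleep(delay + random.uniform(0, delay))         # add jitter
```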


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-03-22 23:07:03 +08:00
0e0ebaac5f Feat: Adds hierarchical title path tracking for tables in DOCX documents to improve context association (#6374)
### What problem does this PR solve?

Adds hierarchical title path tracking for tables in DOCX documents to
improve context association. Previously, extracted tables lacked
positional context within document structure.
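
A sketch of how such title-path tracking can work with python-docx, walking the body in document order; all names here are illustrative, not the PR's code:

```python
from docx import Document
from docx.table import Table
from docx.text.paragraph import Paragraph

def tables_with_title_path(docx_path):
    """Attach the hierarchical heading path ("H1 > H2 > ...") to each table."""
    doc = Document(docx_path)
    stack = []  # (heading_level, heading_text)
    results = []
    for child in doc.element.body.iterchildren():
        if child.tag.endswith("}p"):
            para = Paragraph(child, doc)
            style = para.style.name or ""
            if style.startswith("Heading") and para.text.strip():
                level = int(style.split()[-1])
                while stack and stack[-1][0] >= level:  # pop same/deeper headings
                    stack.pop()
                stack.append((level, para.text.strip()))
        elif child.tag.endswith("}tbl"):
            title_path = " > ".join(text for _, text in stack)
            results.append((title_path, Table(child, doc)))
    return results
```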

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-21 18:42:36 +08:00
a2a4bfe3e3 Fix: change ollama default num_ctx. (#6395)
### What problem does this PR solve?

#6163

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 16:22:03 +08:00
85480f6292 Fix: Ollama embeddings interface returning "500 Internal Server Error" (#6350)
### What problem does this PR solve?

Fix the error where the Ollama embeddings interface returns a “500
Internal Server Error” when using models such as xiaobu-embedding-v2 for
embedding.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 15:25:48 +08:00
d83911b632 Fix: huggingface rerank model issue. (#6385)
### What problem does this PR solve?

#6348

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 12:43:32 +08:00
ca9c3e59fa Call register_scripts when connecting to Redis (#6361)
### What problem does this PR solve?

Call register_scripts when connecting to Redis.
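
A minimal sketch of the pattern, assuming redis-py; the Lua script body is illustrative:

```python
import redis

COMPARE_AND_DELETE = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
"""

class RedisConn:
    def __init__(self, url):
        self._connect(url)

    def _connect(self, url):
        self.client = redis.Redis.from_url(url)
        # Register Lua scripts on every (re)connect, so their SHAs are
        # guaranteed to exist on whichever server the client ends up on.
        self.compare_and_delete = self.client.register_script(COMPARE_AND_DELETE)
```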

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 23:20:37 +08:00
dba0caa00b Fix update_progress (#6340)
### What problem does this PR solve?

Fix update_progress

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 17:01:28 +08:00
95497b4aab Fix: adapt to old configurations. (#6321)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 14:50:59 +08:00
5b04b7d972 Fix: rerank with vllm issue. (#6306)
### What problem does this PR solve?

#6301

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 11:52:42 +08:00
9611185eb4 Feat: add VLM-boosted DocX parser (#6307)
### What problem does this PR solve?

Add VLM-boosted DocX parser

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-20 11:24:44 +08:00
e4380843c4 Feat: add fallback for PDF figure parser (#6305)
### What problem does this PR solve?

Add fallback for PDF figure parser

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-20 10:48:38 +08:00
046f0bba74 Fix: optimize setting config initialization to resolve Minio initialization error (#6282)
### What problem does this PR solve?

Optimize setting configuration initialization to resolve Minio
initialization error caused by using a specific storage.

Reproduction Scenario:
Using Aliyun OSS as the backend storage with the STORAGE_IMPL
environment variable set to OSS.
The service_conf.yaml.template configuration file contains OSS-related
configurations, while other storage configurations are commented out.
When the service starts, it still attempts to initialize the Minio
storage. Since service_conf.yaml.template contains no Minio
configuration, startup fails with a missing-configuration error.

Optimization Measures:
Automatically determine the required initialization configuration based
on the environment variable.
Do not initialize configurations for unused resources.
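
A sketch of the env-driven initialization; the module paths and class names below are hypothetical:

```python
import os

def init_storage():
    """Construct only the storage backend selected by STORAGE_IMPL;
    other backends' configurations are never read."""
    impl = os.environ.get("STORAGE_IMPL", "MINIO").upper()
    if impl == "OSS":
        from rag.utils.oss_conn import OSSConnection  # hypothetical import path
        return OSSConnection()
    if impl == "MINIO":
        from rag.utils.minio_conn import MinioConnection  # hypothetical import path
        return MinioConnection()
    raise ValueError(f"Unsupported STORAGE_IMPL: {impl}")
```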

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 10:45:40 +08:00
1d6760dd84 Feat: add VLM-boosted PDF parser (#6278)
### What problem does this PR solve?

Add VLM-boosted PDF parser if VLM is set.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-20 09:39:32 +08:00
bb869aca33 Fix get_unacked_iterator (#6280)
### What problem does this PR solve?

Fix get_unacked_iterator. Close #6132 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 17:46:58 +08:00
9cad60fa6d Fix: Add a basic example when the content_tagging example list is empty (#6276)
### What problem does this PR solve?

When using an LLM for auto-tagging, if no examples are provided, the tag
format the LLM generates may be wrong, which causes Elasticsearch insert
errors. Adding a basic example avoids this problem.
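
A sketch of the fallback, assuming tags are expected as a JSON object of tag-to-relevance scores; the example content and names are invented for illustration:

```python
# Hypothetical built-in fallback example for the auto-tagging prompt.
DEFAULT_TAG_EXAMPLES = [{
    "content": "A short guide to deploying web applications on Kubernetes.",
    "tags": {"kubernetes": 8, "deployment": 7, "web": 5},
}]

def build_examples(user_examples):
    """Seed the prompt with a basic example when none are provided,
    so the LLM emits tags in the expected JSON format."""
    return user_examples if user_examples else DEFAULT_TAG_EXAMPLES
```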

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 17:30:47 +08:00
c6e1a2ca8a Feat: add TTS support for SILICONFLOW. (#6264)
### What problem does this PR solve?

#6244

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-19 12:52:12 +08:00
49086964b8 Fix: type violations. (#6262)
### What problem does this PR solve?

#6238
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 12:12:34 +08:00
dd81c30976 Fix: tag_feas deletion error. (#6257)
### What problem does this PR solve?

#6218

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 11:25:11 +08:00
a087d13ccb Feat: text file support position retaining. (#6231)
### What problem does this PR solve?

#5832

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-18 16:55:11 +08:00
6e8d0e3177 Fix: rank feat issue. (#6225)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-18 16:07:29 +08:00
5cf610af40 Feat: add vision LLM PDF parser (#6173)
### What problem does this PR solve?

Add vision LLM PDF parser

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-18 14:52:20 +08:00
e9a6675c40 Fix: enable ollama api-key. (#6205)
### What problem does this PR solve?

#6189

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-18 13:37:34 +08:00
1333d3c02a Fix: float transfer exception. (#6197)
### What problem does this PR solve?

#6177

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-18 11:13:44 +08:00
7e4d693054 Fix: in case response.choices[0].message.content is None. (#6190)
### What problem does this PR solve?

#6164

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-18 10:00:27 +08:00
3a99c2b5f4 Refa: PARALLEL_DEVICES is a static parameter. (#6168)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2025-03-17 16:49:54 +08:00
fabc5e9259 Refa: fix re-rank scope. (#6152)
### What problem does this PR solve?

#6140

### Type of change


- [x] Refactoring
2025-03-17 13:26:29 +08:00
bfa8d342b3 Fix: retrieval debug mode issue. (#6150)
### What problem does this PR solve?

#6139

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-17 13:07:13 +08:00
3e19044dee Feat: add multi-GPU and parallel processing support for OCR (#5972)
### What problem does this PR solve?

Add multi-GPU and parallel processing support for OCR, as sketched below.
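
A rough sketch of the fan-out across devices; the OCR session itself is elided and all names are illustrative:

```python
import multiprocessing as mp

def ocr_worker(device_id, pages):
    """Process a slice of pages pinned to one device (GPU id, or -1 for CPU).
    Building the per-device OCR session is elided in this sketch."""
    return [(device_id, page) for page in pages]

def run_ocr(pages, num_gpus):
    # Shard pages round-robin across GPUs; with none, use a single CPU worker.
    device_ids = list(range(num_gpus)) or [-1]
    shards = [(d, pages[i::len(device_ids)]) for i, d in enumerate(device_ids)]
    with mp.Pool(len(device_ids)) as pool:
        results = pool.starmap(ocr_worker, shards)
    return [item for shard in results for item in shard]
```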

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

@yuzhichang I've tried to resolve the comments in #5697. OCR jobs can
now run on both CPU and GPU. (By the way, I've encountered a
“Generate embedding error” issue #5954 that might be due to my outdated
GPUs; I'm not sure.) Please review it and give me suggestions.

GPU:

![gpu_ocr](https://github.com/user-attachments/assets/0ee2ecfb-a665-4e50-8bc7-15941b9cd80e)

![smi](https://github.com/user-attachments/assets/a2312f8c-cf24-443d-bf89-bec50503546d)

CPU:

![cpu_ocr](https://github.com/user-attachments/assets/1ba6bb0b-94df-41ea-be79-790096da4bf1)
2025-03-17 11:58:40 +08:00
89a69eed72 Introduced task priority (#6118)
### What problem does this PR solve?

Introduced task priority

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-14 23:43:46 +08:00
e5a8b23684 Fix: empty tag field issue. (#6103)
### What problem does this PR solve?

#6102

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-14 17:35:57 +08:00
4fffee6695 Respect kb_id on Elasticsearch insert, update, and delete (#6105)
### What problem does this PR solve?

Respect kb_id on Elasticsearch insert, update, and delete. Close #6066

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-14 17:34:02 +08:00
485bc7d7d6 Fix: limit the depth of DFS (#6101)
### What problem does this PR solve?

#6085
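
The log does not show the fix itself; a generic depth-limited DFS looks like this (names illustrative):

```python
def dfs(node, graph, depth=0, max_depth=10, visited=None):
    """Depth-limited DFS: stop descending past max_depth so deep or
    cyclic structures cannot blow the recursion stack."""
    visited = set() if visited is None else visited
    if depth > max_depth or node in visited:
        return visited
    visited.add(node)
    for neighbor in graph.get(node, ()):
        dfs(neighbor, graph, depth + 1, max_depth, visited)
    return visited
```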

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-14 17:10:38 +08:00
5d75b6be62 Fix executor name (#6080)
### What problem does this PR solve?

Fix executor name

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-14 14:13:47 +08:00
56b228f187 Refa: remove max tokens for image2txt models. (#6078)
### What problem does this PR solve?

#6063

### Type of change


- [x] Refactoring
2025-03-14 13:51:45 +08:00
2d4a60cae6 Fix: Reduce excessive IO operations by loading LLM factory configurations (#6047)
### What problem does this PR solve?

This PR fixes an issue where the application was repeatedly reading the
llm_factories.json file from disk in multiple places, which could lead
to "Too many open files" errors under high load conditions. The fix
centralizes the file reading operation in the settings.py module and
stores the data in a global variable that can be accessed by other
modules.
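
A minimal sketch of the centralization described above, assuming a settings.py module; names are illustrative:

```python
# settings.py (sketch): read llm_factories.json once, share it globally.
import json

LLM_FACTORIES = None

def init_settings(path="conf/llm_factories.json"):
    global LLM_FACTORIES
    if LLM_FACTORIES is None:
        with open(path, encoding="utf-8") as f:  # one read at startup,
            LLM_FACTORIES = json.load(f)         # not one per caller
    return LLM_FACTORIES
```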

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
2025-03-14 09:54:38 +08:00