Improve task executor heartbeat handling and cleanup (#12390)

### What problem does this PR solve?

- **Reduce lock contention during executor cleanup**: The cleanup lock
is now acquired only when expired executors are being removed, not on
every heartbeat report.

- **Optimize own heartbeat cleanup**: Each executor removes its own
expired heartbeat entries with a single `zremrangebyscore` call instead
of `zcount` followed by `zpopmin`, reducing the number of Redis
operations.

- **Improve cleanup of other executors' heartbeats**: Expired executors
are detected by checking their latest heartbeat, and stale entries are
removed safely.

- **Other improvements**: IP address and PID are captured once at
startup, and unnecessary global declarations are removed.
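
The cleanup flow described in the bullets above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `FakeSortedSets` is a hypothetical in-memory stand-in for Redis sorted sets, `latest_score` is a made-up helper, `threading.Lock` stands in for whatever lock the executors actually share, and the key names and 60-second timeout are invented for the demo.

```python
import threading
from typing import Dict, List, Optional


class FakeSortedSets:
    """Hypothetical in-memory stand-in for the sorted-set calls used below."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, float]] = {}

    def zadd(self, key: str, mapping: Dict[str, float]) -> None:
        self._data.setdefault(key, {}).update(mapping)

    def zremrangebyscore(self, key: str, lo: float, hi: float) -> int:
        # Remove every member whose score lies in [lo, hi]; return the count.
        zset = self._data.get(key, {})
        doomed = [m for m, s in zset.items() if lo <= s <= hi]
        for m in doomed:
            del zset[m]
        return len(doomed)

    def latest_score(self, key: str) -> Optional[float]:
        # Made-up helper: score of the newest heartbeat, or None if empty.
        zset = self._data.get(key, {})
        return max(zset.values()) if zset else None

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def report_heartbeat(db: FakeSortedSets, name: str, now: float, timeout: float) -> int:
    """Record a heartbeat, then trim this executor's own expired entries."""
    db.zadd(name, {f"hb@{now}": now})
    # One range delete replaces a count followed by repeated pops.
    return db.zremrangebyscore(name, 0, now - timeout)


def cleanup_expired(db: FakeSortedSets, names: List[str], now: float,
                    timeout: float, lock: threading.Lock) -> List[str]:
    """Drop executors whose latest heartbeat is older than the timeout."""
    stale = []
    for name in names:
        latest = db.latest_score(name)
        if latest is None or latest < now - timeout:
            stale.append(name)
    if not stale:
        return []  # common case: the lock is never taken
    with lock:
        for name in stale:
            db.delete(name)
    return stale


db = FakeSortedSets()
lock = threading.Lock()
report_heartbeat(db, "exec_a", now=100.0, timeout=60.0)
report_heartbeat(db, "exec_b", now=100.0, timeout=60.0)
removed = report_heartbeat(db, "exec_a", now=200.0, timeout=60.0)
stale = cleanup_expired(db, ["exec_a", "exec_b"], 200.0, 60.0, lock)
```

In this run `exec_a` trims its own entry from t=100 without touching the lock, while `exec_b`, which stopped reporting, is the only executor removed under the lock.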

### Type of change

- [x] Performance Improvement

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
OliverW
2026-01-04 11:24:05 +08:00
committed by GitHub
parent d39fa75d36
commit d6e006f086
2 changed files with 79 additions and 44 deletions


@@ -273,6 +273,17 @@ class RedisDB:
            self.__open__()
            return None

    def zremrangebyscore(self, key: str, min: float, max: float):
        try:
            res = self.REDIS.zremrangebyscore(key, min, max)
            return res
        except Exception as e:
            logging.warning(
                f"RedisDB.zremrangebyscore {key} got exception: {e}"
            )
            self.__open__()
            return 0

    def incrby(self, key: str, increment: int):
        return self.REDIS.incrby(key, increment)
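
One consequence of the wrapper in the hunk above is that a return value of `0` can mean either "nothing was in range" or "the call failed and the connection was reopened". The sketch below reproduces that behavior in isolation; `FlakyClient` and `MiniRedisDB` are hypothetical stand-ins written for this illustration, not classes from the repository.

```python
import logging


class FlakyClient:
    """Hypothetical stub: fails on the first call, succeeds afterwards."""

    def __init__(self) -> None:
        self.calls = 0

    def zremrangebyscore(self, key, min, max):
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("simulated outage")
        return 2  # pretend two expired entries were removed


class MiniRedisDB:
    """Cut-down stand-in that mirrors the wrapper's error handling."""

    def __init__(self, client) -> None:
        self.REDIS = client
        self.reopened = 0

    def __open__(self) -> None:
        # The real method reconnects; here we only count the attempts.
        self.reopened += 1

    def zremrangebyscore(self, key: str, min: float, max: float):
        try:
            return self.REDIS.zremrangebyscore(key, min, max)
        except Exception as e:
            logging.warning(f"zremrangebyscore {key} got exception: {e}")
            self.__open__()
            return 0


mini = MiniRedisDB(FlakyClient())
first = mini.zremrangebyscore("exec_a", 0, 140.0)   # fails, returns 0
second = mini.zremrangebyscore("exec_a", 0, 140.0)  # succeeds
```

Callers that treat the result as a best-effort count, as a periodic heartbeat trim can, are unaffected by the ambiguity, since the next report retries the same range.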