Commit af06676a by Julien Muchembled

Fix potential deadlock when connecting to primary master

This is a regression caused by commit eef52c27
("Tickless poll loop, for lowest latency and cpu usage"), affecting:
- admins
- storages
- primary masters of backup clusters

parent 9531c9cb
@@ -136,6 +136,12 @@ class BootstrapManager(EventHandler):
     if conn is None:
         # open the connection
         conn = ClientConnection(em, self, self.current)
+        # Yes, the connection may be already closed. This happens when
+        # the kernel reacts so quickly to a closed port that 'connect'
+        # fails on the first call. In such case, poll(1) would deadlock
+        # if there's no other connection to timeout.
+        if conn.isClosed():
+            continue
     # still processing
     em.poll(1)
 return (self.current, conn, self.uuid, self.num_partitions,
......
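
The failure mode described in the added comment can be reproduced with a small simulation. This is only a sketch under stated assumptions: `ToyEventManager`, `ToyConnection`, `bootstrap`, and the `refused` and `check_closed` parameters are hypothetical stand-ins, not NEO's classes or API. The toy `poll` raises instead of hanging, so the missing check shows up as an error rather than a real deadlock, and the loop resets `conn` itself, whereas the real handler is assumed to do that elsewhere.

```python
class ToyEventManager:
    """Hypothetical stand-in for a tickless event manager (not NEO's API)."""

    def __init__(self):
        self.connections = set()

    def poll(self, blocking):
        # A tickless loop sleeps until a registered connection has an event
        # or a timeout. With nothing registered, a blocking poll would never
        # return; raise instead of hanging so the sketch stays runnable.
        if blocking and not self.connections:
            raise RuntimeError("poll would block forever")
        for conn in list(self.connections):
            conn.on_event()


class ToyConnection:
    """Hypothetical connection; 'refused' addresses fail during construction."""

    def __init__(self, em, address, refused):
        self.closed = address in refused
        self.connected = False
        if not self.closed:
            # Only a live connection is registered with the event manager.
            em.connections.add(self)

    def isClosed(self):
        return self.closed

    def on_event(self):
        self.connected = True


def bootstrap(em, addresses, refused, check_closed=True):
    """Try each address in turn until one connects (simplified loop)."""
    candidates = iter(addresses)
    conn = None
    while conn is None or not conn.connected:
        if conn is None:
            address = next(candidates)
            conn = ToyConnection(em, address, refused)
            if check_closed and conn.isClosed():
                # 'connect' failed on the first call: skip poll and retry.
                conn = None
                continue
        # still processing
        em.poll(1)
    return address


# With the check, the refused address is skipped.
print(bootstrap(ToyEventManager(), ["a", "b"], refused={"a"}))

# Without the check, the blocking poll has nothing to wait for.
try:
    bootstrap(ToyEventManager(), ["a", "b"], refused={"a"}, check_closed=False)
except RuntimeError as exc:
    print("without the check:", exc)
```

The design point is that a connection which closes during construction never registers anything that could wake the poll loop, so the caller has to test for it before blocking.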