Commit 4d3f3723 authored by Julien Muchembled

storage: start replicating the partition which is furthest behind

This fixes the following case when the backup is far behind the upstream DB,
and there are transactions being committed at the same time:

1. replicate partition 0
2. replicate partition 0
3. replicate partition 1
4. replicate partition 0
5. replicate partition 1
6. replicate partition 2
7. replicate partition 0
...
and so on in a quadratic way.

When the upstream activity was too high, the backup could even be stuck looping
on the first partitions.
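
The stuck case can be reproduced with a toy model. This is only a sketch under
simplifying assumptions, not NEO code: `simulate`, `pick_first` and
`pick_furthest_behind` are hypothetical names, TIDs are plain integers, each
step replicates one partition up to the current upstream TID, and upstream
commits exactly one new transaction per step.

```python
def simulate(pick, n_partitions=3, steps=6):
    """Replicate one partition per step while upstream keeps committing."""
    done = [0] * n_partitions      # last TID replicated, per partition
    upstream_tid = 1
    order = []
    for _ in range(steps):
        # A partition is pending while it lags behind upstream.
        pending = [o for o in range(n_partitions) if done[o] < upstream_tid]
        offset = pick(pending, done)
        order.append(offset)
        done[offset] = upstream_tid
        upstream_tid += 1          # a commit happens during replication
    # The overall backup TID is bounded by the slowest partition.
    return order, min(done)


def pick_first(pending, done):
    # Assumed stand-in for the old behaviour: first pending partition.
    return pending[0]


def pick_furthest_behind(pending, done):
    # Assumed stand-in for the new behaviour: lowest progress first.
    return min(pending, key=lambda offset: done[offset])


print(simulate(pick_first))
print(simulate(pick_furthest_behind))
```

In this model the first policy never leaves partition 0, so the minimum over
all partitions does not advance, whereas the second policy cycles through all
partitions.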
parent 17af3b47
Pipeline #4471 skipped
@@ -214,6 +214,10 @@ class Replicator(object):
         self.updateBackupTID()
         self._nextPartition()
 
+    def _nextPartitionSortKey(self, offset):
+        p = self.partition_dict[offset]
+        return p.next_obj, bool(p.max_ttid)
+
     def _nextPartition(self):
         # XXX: One connection to another storage may remain open forever.
         # All other previous connections are automatically closed
@@ -227,12 +231,12 @@ class Replicator(object):
         if self.current_partition is not None or not self.replicate_dict:
             return
         app = self.app
-        # Choose a partition with no unfinished transaction if possible.
+        # Start replicating the partition which is furthest behind,
+        # to increase the overall backup_tid as soon as possible.
+        # Then prefer a partition with no unfinished transaction.
         # XXX: When leaving backup mode, we should only consider UP_TO_DATE
         # cells.
-        for offset in self.replicate_dict:
-            if not self.partition_dict[offset].max_ttid:
-                break
+        offset = min(self.replicate_dict, key=self._nextPartitionSortKey)
         try:
             addr, name = self.source_dict[offset]
         except KeyError:
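
The effect of the tuple sort key can be checked in isolation. The snippet
below is a minimal sketch rather than the actual Replicator code: the
`Partition` namedtuple, the integer TIDs and the module-level dicts are
assumptions made for illustration, and `old_choice` relies on dicts keeping
insertion order (Python 3.7+).

```python
from collections import namedtuple

# Hypothetical stand-in for the per-partition state used by the diff.
Partition = namedtuple("Partition", ["next_obj", "max_ttid"])

partition_dict = {
    0: Partition(next_obj=90, max_ttid=None),  # almost up to date
    1: Partition(next_obj=10, max_ttid=95),    # far behind, unfinished txn
    2: Partition(next_obj=10, max_ttid=None),  # far behind, nothing pending
}
replicate_dict = {0: 100, 1: 100, 2: 100}


def next_partition_sort_key(offset):
    p = partition_dict[offset]
    # Tuples compare element by element: lowest next_obj wins, and on a
    # tie False (no unfinished transaction) sorts before True.
    return p.next_obj, bool(p.max_ttid)


def old_choice():
    # Removed loop: first partition without an unfinished transaction,
    # regardless of how far behind it is.
    for offset in replicate_dict:
        if not partition_dict[offset].max_ttid:
            break
    return offset


chosen = min(replicate_dict, key=next_partition_sort_key)
print(old_choice(), chosen)
```

With this data the removed loop settles on partition 0, while the sort key
selects partition 2: it is as far behind as partition 1 but has no unfinished
transaction.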