    client: fix possible data corruption after conflict resolutions with replicas · 46c36465
    Julien Muchembled authored
    This really fixes the bug described in
    commit 40bac312,
    which could probably be reverted: that commit only reduced the probability of
    failure.
    
    What happened is that the second conflict on 'a' for t3 was first reported by
    an answer to the first store with:
    - a base serial at which a=0
    - a conflict serial at which a=7
    However, the cached data is not 8 anymore but 12, since a second store already
    occurred after the first conflict (reported by the other storage node).
    
    When this conflict was resolved before receiving the conflict for the second
    store, it gave:
    
      resolve(old=0, saved=7, new=12) -> 19
    
    instead of:
    
      resolve(old=4, saved=7, new=12) -> 15
    
    (if we still had the data of the first store, we could also do
      resolve(old=0, saved=7, new=8)
     but that would be inefficient from a memory point of view)
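
The three resolve() calls above are consistent with an additive, counter-style
merge in which the transaction's own delta (new - old) is reapplied on top of
the saved value. The sketch below assumes that rule; the function body is
inferred from the numbers in this message and is not taken from the test or
from the client code.

```python
def resolve(old, saved, new):
    """Counter-style three-way merge (assumed rule, inferred from the
    figures in this commit message): reapply our delta on top of 'saved'."""
    return saved + (new - old)

# Stale base (a=0) paired with the data of the second store (12):
# the delta of the first store is counted twice.
stale = resolve(old=0, saved=7, new=12)

# Base matching the data of the second store (a=4).
expected = resolve(old=4, saved=7, new=12)

# Same result if the data of the first store (8) were still cached.
from_first_store = resolve(old=0, saved=7, new=8)

print(stale, expected, from_first_store)  # 19 15 15
```

Under this assumption, the wrong result comes only from pairing a base serial
with data that was computed from a different base.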
    
    The bug was difficult to reproduce. testNotifyReplicated had to be run many,
    many times before the race conditions triggered it. The test was changed to
    enforce some of them, and the above scenario now happens almost always.