Commit 7af9d2d3 authored by Julien Muchembled

storage: partially fix a potential crash during replication

And document 3 bugs found by running testBackupNodeLost many times. About the
tic() issue, I had a case where the test exited instead of looping forever after
the storage crash.
parent 6da59ae8
Although NEO is considered ready for production use in most cases, there are
a few bugs to be aware of because they concern basic features of ZODB, or
promised features of NEO.

All the listed bugs will be fixed with high priority.

Conflict resolution not fully implemented
-----------------------------------------

@@ -10,6 +11,7 @@ When a cluster has several storage nodes, so-called 'deadlock avoidance' may
happen in order to resolve conflicts. In such cases, conflicts will not be
resolved even if your _p_resolveConflict() method would succeed, leading to a
normal ConflictError.

Although this should happen rarely enough not to affect performance, this can
be an issue if your application can't afford restarting the transaction,
e.g. because it interacted with the external environment.
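
For background, conflict resolution in ZODB is opt-in per persistent class via
_p_resolveConflict(). Below is a minimal, hypothetical example (not part of NEO
or this commit) of the kind of method whose result may be bypassed by deadlock
avoidance:

from persistent import Persistent

class Counter(Persistent):
    """Hypothetical counter whose concurrent increments can be merged."""

    def __init__(self):
        self.count = 0

    def hit(self):
        self.count += 1

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # Merge both concurrent updates relative to the common ancestor
        # state.  With NEO's deadlock avoidance, this may not even be
        # called and a ConflictError can still be raised (see above).
        merged = dict(new_state)
        merged['count'] = (saved_state['count'] + new_state['count']
                           - old_state['count'])
        return merged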

@@ -27,4 +29,37 @@ Storage failure or update may lead to POSException or break undoLog()
Storage nodes are queried at most once, and if all of them (for the requested
partition) failed, the client raises an exception instead of asking the master
whether it has an up-to-date partition table (and retrying if useful).

In the case of undoLog(), incomplete results may be returned.
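
A rough sketch of the missing retry behaviour described above; every name below
is invented for illustration and does not correspond to NEO's client API:

def load_from_cells(cells, read, refresh_partition_table):
    """Try each cell of the partition once; if all fail, ask the master for
    a newer partition table and retry once before giving up.

    `cells` is a list of storage cells, `read(cell)` performs the actual
    load, and `refresh_partition_table()` returns the cells of the same
    partition according to the master's current partition table.
    """
    for attempt in (1, 2):
        for cell in cells:
            try:
                return read(cell)
            except Exception:
                continue  # try the next cell of this partition
        if attempt == 2:
            break
        new_cells = refresh_partition_table()
        if new_cells == cells:
            break  # the table was already up to date: no point retrying
        cells = new_cells
    # The real client raises a POSException here; undoLog() instead returns
    # incomplete results.
    raise RuntimeError("all storage nodes failed for this partition")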
Storage does not discard answers from aborted replications
----------------------------------------------------------
In some cases, this can lead to data corruption (wrong AnswerFetch*) or crashes
(e.g. a KeyError because self.current_partition is None at the beginning of
Replicator.fetchObjects).

The assumption that aborting the replication of a partition implies the closure
of the connection turned out to be wrong, e.g. when a partition is aborted by a
third party, such as a CORRUPTED/DISCARDED event from the master.

Workaround: do not replicate or tweak while checking replicas.

Currently, this can be reproduced by running testBackupNodeLost
(neo.tests.threaded.testReplication.ReplicationTests) many times.
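
A condensed, purely illustrative sketch of the failure mode; class and
attribute names are simplified and do not match NEO's real Replicator:

class ToyReplicator(object):
    def __init__(self):
        self.current_partition = None
        self.fetch_requests = {}    # partition -> pending fetch state

    def abort(self, partition):
        # A partition may be aborted by a third party (e.g. a
        # CORRUPTED/DISCARDED event from the master) while the
        # replication connection stays open.
        self.fetch_requests.pop(partition, None)
        self.current_partition = None

    def answerFetchObjects(self, answer):
        # A late answer from the aborted replication still arrives on the
        # open connection: this is the KeyError mentioned above when
        # current_partition is None, or silent corruption if another
        # partition is being replicated by then.
        state = self.fetch_requests[self.current_partition]
        state.update(answer)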
A backup cell may be wrongly marked as corrupted while checking replicas
------------------------------------------------------------------------
This happens in the following conditions:

1. a backup cluster starts to check replicas while a cell is still outdated
2. this cell becomes updated, but only up to a tid smaller than the max tid
   to check (this can't happen for a non-backup cluster)
3. the cluster actually starts to check the related partition
4. the cell is checked completely before it could replicate up to the max tid
   to check

Workaround: make sure all cells are up-to-date before checking replicas.

Found by running testBackupNodeLost many times.

@@ -23,8 +23,8 @@ from neo.lib.protocol import Errors, NodeStates, Packets, ProtocolError, \
def checkConnectionIsReplicatorConnection(func):
    def wrapper(self, conn, *args, **kw):
-        assert self.app.replicator.getCurrentConnection() is conn
-        return func(self, conn, *args, **kw)
+        if self.app.replicator.getCurrentConnection() is conn:
+            return func(self, conn, *args, **kw)
    return wraps(func)(wrapper)

def checkFeedingConnection(check):
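
For readability, here is how the decorator reads after this change, reassembled
from the hunk above with indentation restored; only the assert-to-if change is
from this commit:

from functools import wraps

def checkConnectionIsReplicatorConnection(func):
    def wrapper(self, conn, *args, **kw):
        # Packets arriving on a connection that is no longer the
        # replicator's current connection are now ignored instead of
        # crashing the node on a failed assertion.
        if self.app.replicator.getCurrentConnection() is conn:
            return func(self, conn, *args, **kw)
    return wraps(func)(wrapper)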

@@ -83,6 +83,10 @@ class Serialized(object):
    The epoll object of each node is hooked so that thread switching happens
    before polling for network activity. An extra epoll object is used to
    detect which node has a readable epoll object.

    XXX: It seems wrong to rely only on epoll as a way to know if there are
         pending network messages. I had rare random failures due to tic()
         returning prematurely.
    """
    check_timeout = False
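
A small, self-contained illustration (Linux-only, not NEO code) of why epoll
readiness alone is a weak signal: a message sent just after an empty poll is
invisible to a scheduler that already decided to stop, which is consistent with
tic() sometimes returning prematurely:

import select
import socket

a, b = socket.socketpair()
ep = select.epoll()
ep.register(a.fileno(), select.EPOLLIN)

# Nothing readable yet: a scheduler relying only on this result would stop.
assert ep.poll(0) == []

# Another thread/node only queues its message now...
b.send(b'ping')

# ...and the very same epoll object immediately becomes readable.
assert ep.poll(0)

ep.close()
a.close()
b.close()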