Commits · c6b80f7b25c0bda112aa594c901667ff8c7328e9 · nexedi / neoppod

13 Dec, 2015 2 commits

importer: allow truncation after the last tid to import, during or after the import · c6b80f7b

Julien Muchembled authored Dec 13, 2015

This is a partial implementation. To truncate at a smaller tid, you must wait
until data is imported up to this tid and then stop using the Importer backend.

c6b80f7b

importer: do not implement deleteTransaction, now only used for replication · 24a9f1b8

Julien Muchembled authored Dec 13, 2015

This backend does not support replication. Even if we implemented it, such a node
could only be a source for other nodes, so we should never delete transactions.

24a9f1b8

12 Dec, 2015 1 commit
- neolog: fix crash on unknown packets · af8a8370
  Julien Muchembled authored Dec 12, 2015
  
  af8a8370
11 Dec, 2015 1 commit
- client: dump cache stats on SIGRTMIN+2 · 9e543d76
  Julien Muchembled authored Dec 11, 2015
  
  9e543d76
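The commit above only states that the client dumps cache statistics on SIGRTMIN+2. As a rough illustration of the mechanism, and not NEO's actual code, the sketch below installs a handler for that signal with Python's standard `signal` module. The names `cache_stats`, `dump_cache_stats`, `DUMP_SIGNAL` and `request_dump` are hypothetical, and real-time signals are assumed to be available (as on Linux).

```python
import os
import signal
import sys

# Hypothetical stand-in for a client cache's counters (not NEO's real structure).
cache_stats = {"hits": 0, "misses": 0}

def dump_cache_stats(signum, frame):
    # Keep the handler small: format the counters and write them to stderr.
    line = ", ".join("%s=%s" % item for item in sorted(cache_stats.items()))
    sys.stderr.write("cache stats: %s\n" % line)

# SIGRTMIN is only defined on platforms with real-time signals (e.g. Linux).
DUMP_SIGNAL = signal.SIGRTMIN + 2
signal.signal(DUMP_SIGNAL, dump_cache_stats)

def request_dump(pid):
    # Ask the process identified by `pid` to dump its statistics.
    os.kill(pid, DUMP_SIGNAL)
```

Calling `request_dump(os.getpid())` from the same process is enough to try the handler out. In a real deployment the signal would normally be sent from outside the client process.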
09 Dec, 2015 1 commit
- client: fix spurious connection timeouts · 06a64d80
  Julien Muchembled authored Dec 09, 2015
```
This fixes a regression caused by
commit eef52c27
```
  06a64d80
02 Dec, 2015 1 commit
- Release version 1.6 · f180b00e
  Julien Muchembled authored Dec 02, 2015
  
  f180b00e
01 Dec, 2015 3 commits

master: fix verification when nodes don't have any readable cell · cd669221
Julien Muchembled authored Nov 24, 2015

cd669221
Bump protocol version and upgrade storages automatically · ca2caf87
Julien Muchembled authored Nov 25, 2015

ca2caf87

Safer DB truncation, new 'truncate' ctl command · d3c8b76d

Julien Muchembled authored Dec 01, 2015

With the previous commit, the request to truncate the DB was not stored
persistently, which means that this operation was still vulnerable to the case
where the master is restarted after some nodes, but not all, have already
truncated. The master didn't have the information to fix this and the result
was a DB partially truncated.

-> On a Truncate packet, a storage node only stores the tid somewhere, to send
   it back to the master, which stays in RECOVERING state as long as any node
   has a different value than that of the node with the latest partition table.

We also want to make sure that there is no unfinished data, because a user may
truncate at a tid higher than a locked one.

-> Truncation is now effective at the end of the VERIFYING phase, just before
   returning the last ids to the master.

Lastly, all nodes should be truncated, so that an offline node cannot come back
with a different history. Currently, this would not be an issue since
replication is always restarted from the beginning, but later we'd like nodes to
remember where they stopped replicating.

-> If a truncation is requested, the master waits for all nodes to be pending,
   even if it was previously started (the user can still force the cluster to
   start with neoctl). And any lost node during verification also causes the
   master to go back to recovery.

Obviously, the protocol has been changed to split the LastIDs packet and
introduce a new Recovery packet, since it does not make sense anymore to ask for
last ids during recovery.

d3c8b76d
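The rule from the message above, that the master stays in RECOVERING as long as any node reports a truncation tid different from the one stored by the node with the latest partition table, can be sketched as a small predicate. This is a simplified illustration under stated assumptions, not the master's real code: the function name, the `answers` mapping and the use of `None` for "no pending truncation" are all hypothetical.

```python
def must_stay_in_recovery(answers):
    """Return True while storage nodes disagree on the truncation tid.

    `answers` is a non-empty mapping from a node id to a pair
    (ptid, truncate_tid), where truncate_tid is None when that node
    has no pending truncation (an assumption of this sketch).
    """
    # Reference value: the one stored by the node with the latest partition table.
    _, reference_tid = max(answers.values(), key=lambda answer: answer[0])
    # Any node with a different value keeps the cluster in RECOVERING.
    return any(tid != reference_tid for _, tid in answers.values())
```

For example, `must_stay_in_recovery({"s1": (5, 100), "s2": (4, None)})` is true because `s2` has not yet recorded the truncation that `s1`, which holds the latest partition table, knows about.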

30 Nov, 2015 10 commits

Perform DB truncation during recovery, send PT to storages before verification · 3e3eab5b

Julien Muchembled authored Nov 25, 2015

Currently, the database may only be truncated when leaving backup mode, but
the issue will be the same when neoctl gets a new command to truncate at an
arbitrary tid: we want to be sure that all nodes are truncated before anything
else.

Therefore, we stop sending Truncate orders before stopping operation because
nodes could fail/exit before actually processing them. Truncation must also
happen before asking nodes for their last ids.

With this commit, if a truncation is requested:
- this is always the first thing done when a storage node connects to the
  primary master during the RECOVERING phase,
- and the cluster does not start automatically if there are missing nodes,
  unless an admin forces it.

Other changes:
- Connections to storage nodes don't need to be aborted anymore when leaving
  backup mode.
- The master always initiates communication when a storage node identifies,
  which simplifies code and reduces the number of exchanged packets.

3e3eab5b

master: fix possible blockage during recovery after a storage disconnection · 2485f151

Julien Muchembled authored Nov 19, 2015

At some point, the master asks a storage node for its partition table. If this
node is lost before getting an answer, another node (or the same one if it comes
back) must be asked.

Before this change, the master node had to be restarted.

2485f151

master: last tid/oid after recovery/verification · dec81519

Julien Muchembled authored Nov 20, 2015

The important bugfix is to update the last oid when the master verifies a
transaction with new oids.

By resetting the transaction manager at the beginning of the recovery phase,
it becomes possible to avoid tid/oid holes:
- by reallocating previously unused allocated oids
- when going back "in the past", i.e. reverting to an older version of the
  database (with fewer oids) and/or adjusting the clock

dec81519

Go back/stay in RECOVERING state when the partition table can't be operational · e1f9a7da

Julien Muchembled authored Nov 25, 2015

This fixes several cases where the partition table could become corrupt and
the whole cluster could get stuck in VERIFYING state.

This also reduces the probability of having out-of-date cells when restarting
several storage nodes simultaneously.

Lastly, if a master node becomes primary again, a cluster must not be started
automatically if nodes with readable cells are missing, in order to avoid
a split of the database. This could happen if this master node was previously
forced to start it.

e1f9a7da

Minimize the amount of work during tpc_finish · 7eb7cf1b

Julien Muchembled authored Nov 25, 2015

NEO did not ensure that all data and metadata are written on disk before
tpc_finish, and it was for example vulnerable to ENOSPC errors.
In other words, some work had to be moved to tpc_vote:

- In tpc_vote, all involved storage nodes are now asked to write all metadata
  to ttrans/tobj and _commit_. Because the final tid is not known yet, the tid
  column of ttrans and tobj now contains NULL and the ttid respectively.

- In tpc_finish, AskLockInformation is still required for read locking,
  ttrans.tid is updated with the final value and this change is _committed_.

- The verification phase is greatly simplified, more reliable and faster. For
  all voted transactions, we can know if a tpc_finish was started by getting
  the final tid from the ttid, either from ttrans or from trans. And we know
  that such transactions can't be partial so we don't need to check oids.

So in addition to minimizing the risk of failures during tpc_finish, we also
fix a bug causing the verification phase to discard transactions with objects
for which readCurrent was called.

On the performance side:

- Although tpc_vote now asks all involved storages, instead of only those
  storing the transaction metadata, the client has been improved to do this
  in parallel. The additional commits are also all done in parallel.

- A possible improvement to compensate for the additional commits is to delay
  the commit done by the unlock.

- By minimizing the time to lock transactions, objects are read-locked for a
  much shorter period. This is all the more important because locked
  transactions must be unlocked in the same order.

Transactions with too many modified objects will now time out inside tpc_vote
instead of tpc_finish. Of course, such transactions may still cause other
transactions to time out in tpc_finish.

7eb7cf1b
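To make the vote/finish split described in the message above more concrete, here is a minimal sketch using SQLite. It is not NEO's schema or code: the two-column tables, the function names and the use of `sqlite3` are assumptions made for illustration only. It shows the final tid being NULL after the vote, filled in during tpc_finish, and recoverable from the ttid during verification, either from `ttrans` or from `trans`.

```python
import sqlite3

def open_db():
    # Deliberately simplified tables (assumption): only the columns needed here.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE ttrans (tid INTEGER, ttid INTEGER NOT NULL)")
    db.execute("CREATE TABLE trans (tid INTEGER NOT NULL, ttid INTEGER NOT NULL)")
    db.commit()
    return db

def vote(db, ttid):
    # tpc_vote: metadata is written and committed; the final tid is unknown (NULL).
    db.execute("INSERT INTO ttrans (tid, ttid) VALUES (NULL, ?)", (ttid,))
    db.commit()

def lock(db, ttid, tid):
    # tpc_finish: only the final tid is stored, and this change is committed.
    db.execute("UPDATE ttrans SET tid = ? WHERE ttid = ?", (tid, ttid))
    db.commit()

def unlock(db, ttid):
    # After the finish: move the row from the temporary table to the final one.
    db.execute("INSERT INTO trans (tid, ttid) "
               "SELECT tid, ttid FROM ttrans WHERE ttid = ?", (ttid,))
    db.execute("DELETE FROM ttrans WHERE ttid = ?", (ttid,))
    db.commit()

def final_tid(db, ttid):
    # Verification: a non-NULL tid means that tpc_finish was started.
    for table in ("trans", "ttrans"):
        row = db.execute("SELECT tid FROM %s WHERE ttid = ?" % table,
                         (ttid,)).fetchone()
        if row is not None and row[0] is not None:
            return row[0]
    return None
```

With this layout, a transaction that was voted but never finished yields `None` from `final_tid`, whereas one whose finish has started yields its final tid whether or not it has been unlocked yet.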

Do not send useless node information to bootstrapping node · 99ac542c
Julien Muchembled authored Nov 23, 2015

99ac542c
fixup! storage: fix pruning of data when deleting partial transactions during verification · cff279af
Julien Muchembled authored Nov 30, 2015
```
This fixes a regression in commit 83fe64bf
when ttrans has several rows referring to the same data_id.
```
cff279af
threaded: prevent neoctl from looping forever when something went wrong during the test · a63bf12f
Julien Muchembled authored Nov 26, 2015

a63bf12f
ssl: fix handshaking connections being stuck when they're aborted · fe487c07
Julien Muchembled authored Nov 27, 2015

fe487c07

ssl: consider connections completed after the handshake · aaefaf8b

Julien Muchembled authored Nov 27, 2015

- Server connections can now be in 'connecting' state.
- connectionAccepted event (which has never been used so far) is merged into
  connectionCompleted.

aaefaf8b

25 Nov, 2015 13 commits
- storage: always restart replication of outdated cells from the beginning (ZERO_TID) · 6b1f198f
  Julien Muchembled authored Nov 25, 2015
```
This is a workaround to fix holes if replication is interrupted after new data
is committed.
```
  6b1f198f
- threaded: fix typo · 949f7e0f
  Julien Muchembled authored Nov 25, 2015
  
  949f7e0f
- Ignore but log exceptions while closing a connection for which an assertion failed · 34a2fea3
  Julien Muchembled authored Nov 24, 2015
```
AssertionErrors are certainly more severe than any other exception
(including OperationFailure) because the process is in an unknown state.
```
  34a2fea3
- threaded: make it possible to send packets from a connection filter · 50134569
  Julien Muchembled authored Nov 24, 2015
```
This could have been useful in testStorageFailureDuringTpcFinish:
close() could not be called from answerTransactionFinished because it
deadlocked while trying to send notifications.
```
  50134569
- tests: clarify intention in testStorageFailureDuringTpcFinish · c5913373
  Julien Muchembled authored Nov 24, 2015
```
The test was relying on the fact that 'c.abort()' caused an assertion
failure, which closed the connection and then raised OperationFailure.
Actually, I wanted to close the connection on master, but it's clearer this way.
```
  c5913373
- TODO: review election timeouts and transaction aborting on client disconnection · 20b7cecd
  Julien Muchembled authored Nov 20, 2015
  
  20b7cecd
- Small optimizations & cleanups · 79ea07c8
  Julien Muchembled authored Nov 19, 2015
  
  79ea07c8
- Fix 2 'except' statements that will bug when moving to Python 3 · 0d36de7b
  Julien Muchembled authored Nov 19, 2015
```
Previous code relied on the fact that the exception target is kept past
the end of the except clause. 2to3 is not smart enough to detect that.

Without this change, a different OperationalError exception would be
ignored because there's already a local variable of the same name.
```
  0d36de7b
- mysql: drop 'bigdata' table when erasing the database · b0023b43
  Julien Muchembled authored Nov 19, 2015
```
This was forgotten when this table was introduced in
commit f9a8500d
```
  b0023b43
- threaded: new method to sort storage nodes · 9d24294a
  Julien Muchembled authored Nov 13, 2015
```
If needed, sortStorageList can be extended in the
future to support a 'readable' parameter.
```
  9d24294a
- threaded: expose a method to stop an A/M/S node · 93f5b0d8
  Julien Muchembled authored Nov 13, 2015
  
  93f5b0d8
- neolog: new --node option to filter logs produced by threaded tests · e57b1bdd
  Julien Muchembled authored Nov 12, 2015
  
  e57b1bdd
- master: simplify code in verification by removing useless checks · 259539e5
  Julien Muchembled authored Nov 09, 2015
```
We can never receive several answers from the same node.

testVerification is dropped for the same reason as for testEvent and most of
testConnection, since there are many incoming changes for verification.
```
  259539e5
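Commit 0d36de7b in the list above mentions that the exception target is no longer kept past the end of the except clause in Python 3. The following sketch, with purely illustrative function names that are unrelated to NEO's code, shows that behaviour and one portable way to keep the exception around.

```python
def leaked_target_is_gone():
    try:
        raise ValueError("boom")
    except ValueError as e:
        pass
    try:
        # Python 2 kept `e` bound here; Python 3 unbinds it when the clause ends.
        str(e)
    except NameError:  # UnboundLocalError is a subclass of NameError
        return True
    return False

def portable():
    error = None
    try:
        raise ValueError("boom")
    except ValueError as e:
        # Copy the exception to a name that outlives the except clause.
        error = e
    return error
```

Under Python 3, `leaked_target_is_gone()` returns `True`, while `portable()` returns the caught exception on both major versions.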
03 Nov, 2015 2 commits
- storage: fix pruning of data when deleting partial transactions during verification · 83fe64bf
  Julien Muchembled authored Nov 02, 2015
  
  83fe64bf
- master: fix 2 bugs in verification phase · daa83cb4
  Julien Muchembled authored Nov 02, 2015
```
- Last known TID was not updated when recovering a transaction.
- Missing OIDs were ignored, which caused partial transactions to be committed
  instead of being deleted.
```
  daa83cb4
29 Oct, 2015 2 commits
- BUGS: mark whether bugs concern basic features of ZODB or promised features of NEO · 524463e8
  Julien Muchembled authored Oct 27, 2015
  
  524463e8
- TODO: safer tpc_finish and faster storage · 63324838
  Julien Muchembled authored Oct 27, 2015
  
  63324838
26 Oct, 2015 2 commits
- Release version 1.5.1 · 6275f7c6
  Julien Muchembled authored Oct 26, 2015
  
  6275f7c6
- storage: faster resumption when many transactions have already been imported to MySQL · 7469e55b
  Julien Muchembled authored Oct 26, 2015
```
The previous SQL query caused a full table scan of the 'trans' table at startup.
```
  7469e55b
21 Oct, 2015 2 commits
- tests: regenerate patch to ZODB3 using git · 0f0700a8
  Julien Muchembled authored Oct 21, 2015
```
I used git-diff for each file and concatenated the result to preserve the order.
```
  0f0700a8
- client: add assertion in cache to detect wrong invalidation · badd9de3
  Julien Muchembled authored Oct 21, 2015
  
  badd9de3