Commits · 38583af937e2ca0c568f960345c6426daa720ed9 · Iliya Manolov / neoppod

27 Jul, 2016 5 commits

client: better exception handling in tpc_abort · 38583af9
Julien Muchembled authored Jul 27, 2016

38583af9

client: do not limit the number of open connections to storage nodes · 77132157

Julien Muchembled authored Jul 27, 2016

There was a bug where connections were not maintained during a TPC,
which caused transactions to be aborted when the limit was reached.

Given that oids are spread evenly over all partitions, and that clients always
write to all cells of each involved partition, clients would spend their time
reconnecting to storage nodes as soon as the limit is reached. So such a feature
really looks counter-productive.

77132157

client: small optimization when iterating over storage connections · cfe1b5ca
Julien Muchembled authored Jul 27, 2016

cfe1b5ca

client: fix conflict of node id by never reading from storage without being connected to the master · 11d83ad9

Julien Muchembled authored Jul 26, 2016

Client nodes ignored the state of the connection to the master node when reading
data from storage, as long as their partition tables were recent enough. This
way, they were able to finish read-only transactions even if they couldn't reach
the master, which could be useful for high availability. The downside is that
the master node ignored that their node ids were still used, which caused "uuid"
conflicts when reallocating them.

Rejected solutions:
- An unused NEO Storage should not insist on staying connected to the master node.
- Reverting to big random node identifiers is a lot of work and it would make
  debugging annoying (see commit 23fad3af).
- Always increasing node ids could have been a simple solution if we accepted
  that the cluster dies once all 2^24 possible ids have been allocated.

Given that reading from storage without being connected to the master can only
be useful to finish the current transaction (because we always ping the master
at the beginning of every transaction), keeping such a feature is not worth the
effort.

This commit fixes id conflicts in a very simple way, by clearing the partition
table upon primary node failure, which forces reconnection to the master before
querying any storage node. In such a case, we raise a special exception that will
cause the transaction to be restarted, so that the user does not get errors for
temporary connection failures.
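
The mechanism described above can be sketched roughly as follows. This is only an illustration in Python 3, not the code of this commit: `MasterConnectionLost`, `Client`, `on_primary_failure` and `run_transaction` are hypothetical names, and the partition table is a plain dict standing in for the real one.

```python
class MasterConnectionLost(Exception):
    """Hypothetical exception asking the caller to restart the transaction."""


class Client:
    def __init__(self):
        self.partition_table = None  # None means "no usable partition table"
        self.connect_count = 0

    def connect_to_master(self):
        # Stand-in for the real handshake: fetch a fresh partition table.
        self.connect_count += 1
        self.partition_table = {0: ["S1"], 1: ["S2"]}

    def on_primary_failure(self):
        # Forget the table, so that no storage node can be queried
        # before the client has reconnected to the master.
        self.partition_table = None

    def load(self, oid):
        if self.partition_table is None:
            self.connect_to_master()
            # The snapshot of the current transaction may be stale now.
            raise MasterConnectionLost
        cells = self.partition_table[oid % len(self.partition_table)]
        return "data for oid %d from %s" % (oid, cells[0])


def run_transaction(client, oid, max_retries=3):
    # Restart the transaction instead of reporting a temporary failure.
    for _ in range(max_retries):
        try:
            return client.load(oid)
        except MasterConnectionLost:
            continue
    raise RuntimeError("master unreachable")
```

In this sketch the user of `run_transaction` never sees the temporary failure: the first attempt after `on_primary_failure()` reconnects and raises, and the retry succeeds.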

11d83ad9

storage: add comment about the idea to lock an oid before reporting a resolvable conflict · 4e17456b

Julien Muchembled authored Jul 26, 2016

Currently, another argument not to lock is that we would not be able to test
incremental resolution anymore. We can think about this again when deadlock
resolution is implemented.

4e17456b

24 Jul, 2016 5 commits

Fix race conditions in EventManager between _poll/connection_dict and (un)registration · 8b91706a

Julien Muchembled authored Jul 24, 2016

The following error was reported on a client node:

    #0x0000 Error                   < None (2001:...:2051)
    1 (Retry Later)
    connection closed for <MTClientConnection(uuid=None, address=2001:...:2051, handler=PrimaryNotificationsHandler, closed, client) at 7f1ea7c42f90>
    Event Manager:
    connection started for <MTClientConnection(uuid=None, address=2001:...:2051, handler=PrimaryNotificationsHandler, fd=13, on_close=onConnectionClosed, connecting, client) at 7f1ea7c25c10>
    #0x0000 RequestIdentification          > None (2001:...:2051)
      Readers: []
      Writers: []
      Connections:
        13: <MTClientConnection(uuid=None, address=2001:...:2051, handler=PrimaryNotificationsHandler, fd=13, on_close=onConnectionClosed, connecting, client) at 7f1ea7c25c10> (pending=False)
    Node manager : 1 nodes
    * None |   MASTER | 2001:...:2051 | UNKNOWN
    <ClientCache history_size=0 oid_count=0 size=0 time=0 queue_length=[0] (life_time=10000 max_history_size=100000 max_size=20971520)>
    poll raised, retrying
    Traceback (most recent call last):
      File "neo/lib/threaded_app.py", line 93, in _run
        poll(1)
      File "neo/lib/event.py", line 134, in poll
        self._poll(0)
      File "neo/lib/event.py", line 164, in _poll
        conn = self.connection_dict[fd]
    KeyError: 13

which means that:
- while the poll thread is getting a (13, EPOLLIN) event because the
  connection is closed (aborted by the master)
- another thread processes the error packet, by closing it in
  PrimaryBootstrapHandler.notReady
- next, the poll thread resumes the execution of EpollEventManager._poll
  and fails to find fd=13 in self.connection_dict

So here, we have a race condition between epoll_wait and any further use
of connection_dict to map returned fds.

However, what commit a4731a0c does to handle the case of fd reallocation
only works for single-threaded applications.
In EPOLLIN, wrapping 'self.connection_dict[fd]' the same way as for other
events is not enough. For example:
- case 1:
  - thread 1: epoll returns fd=13
  - thread 2: close(13)
  - thread 2: open(13)
  - thread 1: self.connection_dict[13] does not match
              but this would be handled by the 'unregistered' list
- case 2:
  - thread 1: reset 'unregistered'
  - thread 2: close(13)
  - thread 2: open(13)
  - thread 1: epoll returns fd=13
  - thread 1: self.connection_dict[13] matches
              but it would be wrongly ignored by 'unregistered'
- case 3:
  - thread 1: about to call readable/writable/onTimeout on a connection
  - thread 2: this connection is closed
  - thread 1: readable/writable/onTimeout wrongly called on a closed connection

We could protect _poll() with a lock, and make unregister() use wakeup() so
that it gets a chance to acquire it, but that causes threaded tests to deadlock
(continuing in this direction seems too complicated).

So we have to deal with the fact that there can be race conditions at any time
and there's no way to make 'connection_dict' match exactly what epoll returns.
We solve this by preventing fd reallocation inside _poll(), which is fortunately
possible with sockets, using 'shutdown': the closing of fds is delayed.
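
As an illustration of the delayed closing, here is a minimal Python 3 sketch. It is not the code of this commit: `DelayedCloseRegistry`, `begin_poll` and `end_poll` are hypothetical names. It relies only on the fact that `shutdown` terminates the connection while the fd number stays allocated until `close()`, so the number cannot be handed out again while a poll is in progress.

```python
import socket
import threading


class DelayedCloseRegistry:
    """Hypothetical helper: shut sockets down at once, close them later."""

    def __init__(self):
        self._lock = threading.Lock()
        self._polling = False
        self._pending = []

    def begin_poll(self):
        with self._lock:
            self._polling = True

    def close(self, sock):
        try:
            # The peer sees the end of the stream immediately,
            # but the fd number is not released yet.
            sock.shutdown(socket.SHUT_RDWR)
        except OSError:
            pass  # not connected, or already shut down
        with self._lock:
            if self._polling:
                # A poll is running: keep the fd allocated for now.
                self._pending.append(sock)
            else:
                sock.close()

    def end_poll(self):
        with self._lock:
            self._polling = False
            pending, self._pending = self._pending, []
        # Only now can the kernel reuse these fd numbers.
        for sock in pending:
            sock.close()
```

With this scheme, an fd returned by the poll call still designates the same socket for the whole duration of the poll, even if another thread asked to close it in the meantime.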

For above case 3, readable/writable/onTimeout for MTClientConnection are also
changed to test whether the connection is still open while it has the lock.
Just for safety, we do the same for 'process'.
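
The check for case 3 could look like the following Python 3 sketch. `LockedConnection` is a hypothetical stand-in for MTClientConnection, reduced to a lock, a `closed` flag and a counter; the real methods do much more.

```python
import threading


class LockedConnection:
    """Hypothetical connection whose callbacks re-check its state."""

    def __init__(self):
        self.lock = threading.RLock()
        self.closed = False
        self.read_count = 0

    def close(self):
        with self.lock:
            self.closed = True

    def readable(self):
        with self.lock:
            # Another thread may have closed the connection between the
            # moment the poll thread picked it and the moment it got the lock.
            if self.closed:
                return False
            self.read_count += 1
            return True
```

Because the test is done while holding the same lock as `close()`, the callback either runs completely before the close or sees `closed` and does nothing.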

Lastly, another kind of race condition that this commit also fixes concerns
the use of itervalues() on EventManager.connection_dict.
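
The commit message does not say how the itervalues() issue is fixed. One common way, shown here as an assumption in Python 3 (where `values()` behaves like Python 2's `itervalues()` in this respect), is to iterate over a snapshot of the values. `visit_unsafe` and `visit_snapshot` are hypothetical names.

```python
def visit_unsafe(connection_dict, on_visit):
    # Iterates over the live dict: if on_visit (or another thread)
    # adds or removes an entry, the next step raises RuntimeError.
    for conn in connection_dict.values():
        on_visit(conn)


def visit_snapshot(connection_dict, on_visit):
    # Iterates over a copy of the values, so the dict may change
    # freely while the loop is running.
    for conn in list(connection_dict.values()):
        on_visit(conn)
```

The snapshot may contain connections that were unregistered after the copy was taken, so callbacks still have to tolerate closed connections.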

8b91706a

Indent many lines before any real change · 4a0b936f

Julien Muchembled authored Jul 22, 2016

This is a preliminary commit, without any functional change,
just to make the next one easier to review.

4a0b936f

client: remove redundant check of new connections to the master · 9f4dd15e
Julien Muchembled authored Jul 24, 2016
```
We already have logs when a connection fails,
and ask() raises ConnectionClosed if the connection is closed.
```
9f4dd15e
Control verbose locking via an environment variable · e791dc3f
Vincent Pelletier authored Jun 04, 2016

e791dc3f
client: avoid (harmless) variable shadowing · b7e0ec7f
Vincent Pelletier authored Jun 04, 2016

b7e0ec7f

13 Jul, 2016 1 commit
- setup: first try to get 'mock.py' from the backup in repository · a4f34eaa
  Julien Muchembled authored Jul 13, 2016
```
SourceForge currently has too many issues.
```
  a4f34eaa
17 Jun, 2016 2 commits
- tests: an expected failure was actually due to a misuse of undo API · 4dfdf05a
  Julien Muchembled authored Jun 17, 2016
```
Obviously, oids can't be automatically invalidated if the undo is done directly
at the storage level.

In commit 9cca0f8e, only 1 bug was found.
```
  4dfdf05a
- tests: comment the assertion that detects file descriptor leaks · 920484d7
  Julien Muchembled authored Jun 17, 2016
  
  920484d7
15 Jun, 2016 2 commits
- Release version 1.6.3 · b57d0dae
  Julien Muchembled authored Jun 15, 2016
  
  b57d0dae
- client: use @implementer instead of deprecated implements() when declaring interfaces · a5ffd19d
  Julien Muchembled authored Jun 15, 2016
  
  a5ffd19d
08 Jun, 2016 2 commits
- Refresh patch to ZODB test suite, for new 4.3.1 release · e57ef6cc
  Julien Muchembled authored Jun 08, 2016
  
  e57ef6cc
- client: remove obsolete comment · 25a2f1cc
  Julien Muchembled authored Jun 08, 2016
```
FileStorage has been fixed in commit b7ea4e6f708dcded329332b24a9d70211a6b6368
```
  25a2f1cc
26 May, 2016 1 commit

client: fix the count of history items in the cache · e61b017f

Julien Muchembled authored May 26, 2016

Cache items are stored in doubly-linked chains. In order to quickly know the
number of history items, an extra attribute is used to count them. It was not
always decremented when a history item was removed.

This led to the following exception:
  <ClientCache history_size=100000 oid_count=1959 size=20970973 time=2849049 queue_length=[1, 7, 738, 355, 480, 66, 255, 44, 3, 5, 2, 1, 3, 4, 2, 2] (life_time=10000 max_history_size=100000 max_size=20971520)>
  poll raised, retrying
  Traceback (most recent call last):
    ...
    File "neo/client/handlers/master.py", line 137, in packetReceived
      cache.store(oid, data, tid, None)
    File "neo/client/cache.py", line 247, in store
      self._add(head)
    File "neo/client/cache.py", line 129, in _add
      self._remove(head)
    File "neo/client/cache.py", line 136, in _remove
      level = item.level
  AttributeError: 'NoneType' object has no attribute 'level'
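
To illustrate the invariant that was broken, here is a minimal Python 3 sketch of a doubly-linked chain with a separate counter. It is not the NEO cache: `Item`, `History` and `walk_count` are hypothetical names. Every removal goes through `_unlink`, which is the only place where the counter is decremented, so the counter cannot drift from the real length of the chain.

```python
class Item:
    def __init__(self, oid):
        self.oid = oid
        self.prev = self.next = None


class History:
    """Hypothetical bounded history kept as a doubly-linked chain."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.head = self.tail = None
        self.size = 0  # must always equal the length of the chain

    def _unlink(self, item):
        if item.prev is None:
            self.head = item.next
        else:
            item.prev.next = item.next
        if item.next is None:
            self.tail = item.prev
        else:
            item.next.prev = item.prev
        item.prev = item.next = None
        self.size -= 1  # the decrement that must never be skipped

    def append(self, item):
        item.prev = self.tail
        if self.tail is None:
            self.head = item
        else:
            self.tail.next = item
        self.tail = item
        self.size += 1
        # Evict the oldest items; with a counter that is too high,
        # this loop would eventually reach an empty chain.
        while self.size > self.max_size:
            self._unlink(self.head)

    def remove(self, item):
        self._unlink(item)

    def walk_count(self):
        # Slow count, only useful to check the counter.
        n, item = 0, self.head
        while item is not None:
            n += 1
            item = item.next
        return n
```

In this sketch, a counter left too high would make the eviction loop call `_unlink(None)`, which fails with an AttributeError on None. That is one plausible reading of the traceback above, not something the commit message states.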

e61b017f

25 Apr, 2016 1 commit
- fixup! Recover from failures during tpc_finish when the transaction got successfully committed · e2536c08
  Julien Muchembled authored Apr 25, 2016
  
  e2536c08
20 Apr, 2016 2 commits

neoctl: better error message when connection to admin fails · c329ab95
Julien Muchembled authored Apr 20, 2016

c329ab95

storage: fix crash when trying to replicate from an unreachable node · 8b07ff98

Julien Muchembled authored Apr 20, 2016

This fixes the following issue:

    WARNING replication aborted for partition 1
    DEBUG   connection started for <ClientConnection(uuid=None, address=...:43776, handler=StorageOperationHandler, fd=10, on_close=onConnectionClosed, connecting, client) at 7f5d2067fdd0>
    DEBUG   connect failed for <SocketConnectorIPv6 at 0x7f5d2067fe10 fileno 10 ('::', 0), opened to ('...', 43776)>: ENETUNREACH (Network is unreachable)
    WARNING replication aborted for partition 5
    DEBUG   connection started for <ClientConnection(uuid=None, address=...:43776, handler=StorageOperationHandler, fd=10, on_close=onConnectionClosed, connecting, client) at 7f5d1c409510>
    PACKET  #0x0000 RequestIdentification          > None (...:43776)  | (<EnumItem STORAGE (1)>, None, ('...', 60533), '...')
    ERROR   Pre-mortem data:
    ERROR   Traceback (most recent call last):
    ERROR     File "neo/storage/app.py", line 157, in run
    ERROR       self._run()
    ERROR     File "neo/storage/app.py", line 197, in _run
    ERROR       self.doOperation()
    ERROR     File "neo/storage/app.py", line 285, in doOperation
    ERROR       poll()
    ERROR     File "neo/storage/app.py", line 95, in _poll
    ERROR       self.em.poll(1)
    ERROR     File "neo/lib/event.py", line 121, in poll
    ERROR       self._poll(blocking)
    ERROR     File "neo/lib/event.py", line 165, in _poll
    ERROR       if conn.readable():
    ERROR     File "neo/lib/connection.py", line 481, in readable
    ERROR       self._closure()
    ERROR     File "neo/lib/connection.py", line 539, in _closure
    ERROR       self.close()
    ERROR     File "neo/lib/connection.py", line 531, in close
    ERROR       handler.connectionClosed(self)
    ERROR     File "neo/lib/handler.py", line 135, in connectionClosed
    ERROR       self.connectionLost(conn, NodeStates.TEMPORARILY_DOWN)
    ERROR     File "neo/storage/handlers/storage.py", line 59, in connectionLost
    ERROR       replicator.abort()
    ERROR     File "neo/storage/replicator.py", line 339, in abort
    ERROR       self._nextPartition()
    ERROR     File "neo/storage/replicator.py", line 260, in _nextPartition
    ERROR       None if name else app.uuid, app.server, name or app.name))
    ERROR     File "neo/lib/connection.py", line 562, in ask
    ERROR       raise ConnectionClosed
    ERROR   ConnectionClosed

8b07ff98

18 Apr, 2016 1 commit
- client: fix abort for storages where only current serials were checked · 2bd827fa
  Julien Muchembled authored Apr 18, 2016
```
This fixes a lock leak on storages, causing further transactions to timeout.
```
  2bd827fa
01 Apr, 2016 1 commit
- Document cluster states · 35491a4a
  Julien Muchembled authored Apr 01, 2016
  
  35491a4a
31 Mar, 2016 1 commit
- Update list of excluded tests in testSSL · b1ea96c0
  Julien Muchembled authored Apr 01, 2016
  
  b1ea96c0
30 Mar, 2016 2 commits
- Found 2 bugs in undo, add tests · 9cca0f8e
  Julien Muchembled authored Mar 30, 2016
  
  9cca0f8e
- doc: fix rst formatting in README · 8ec0faf7
  Julien Muchembled authored Mar 30, 2016
  
  8ec0faf7
28 Mar, 2016 2 commits
- Add support for recent ZODB · 03042a69
  Julien Muchembled authored Mar 28, 2016
  
  03042a69
- tests: sort hunks of ZODB3.patch to match output of git-show · 04681e54
  Julien Muchembled authored Mar 28, 2016
  
  04681e54
22 Mar, 2016 2 commits
- client: fix invalidation issues when reconnecting to the master · 694c27f4
  Julien Muchembled authored Mar 22, 2016
  
  694c27f4
- Recover from failures during tpc_finish when the transaction got successfully committed · dd74d662
  Julien Muchembled authored Mar 21, 2016
  
  dd74d662
21 Mar, 2016 3 commits

master: do never abort a prepared transaction · 7ee7ff4e

Julien Muchembled authored Mar 20, 2016

This fixes the following crash (for example when a client disconnects during
tpc_finish):

    Traceback (most recent call last):
      ...
      File "neo/master/handlers/storage.py", line 68, in answerInformationLocked
        self.app.tm.lock(ttid, conn.getUUID())
      File "neo/master/transactions.py", line 338, in lock
        if self._ttid_dict[ttid].lock(uuid) and self._queue[0][1] == ttid:
    IndexError: list index out of range

7ee7ff4e

storage: fix crash when a client disconnects just after it requested to finish a transaction · 7aecdada
Julien Muchembled authored Mar 21, 2016

7aecdada
doc: minor changes in importer.conf · b05b961b
Julien Muchembled authored Mar 20, 2016

b05b961b

09 Mar, 2016 2 commits
- Release version 1.6.2 · 1705d828
  Julien Muchembled authored Mar 09, 2016
  
  1705d828
- BUGS: possible "uuid" conflict issue after clients got disconnected from the master · 24780e8e
  Julien Muchembled authored Mar 09, 2016
  
  24780e8e
08 Mar, 2016 2 commits
- tests: check case of multiple conflict resolutions for the same (oid, txn) · eee74faf
  Julien Muchembled authored Mar 08, 2016
  
  eee74faf
- tests: new helper to synchronize threads · 15bcd495
  Julien Muchembled authored Mar 08, 2016
  
  15bcd495
04 Mar, 2016 3 commits

storage: move the commit at tpc_vote from the backends to the unique caller · 645920e8
Julien Muchembled authored Mar 04, 2016

645920e8

storage: defer commit when unlocking a transaction (-> better performance) · eaa07e25

Julien Muchembled authored Mar 04, 2016

Before this change, a storage node did 3 commits per transaction:
- once all data are stored
- when locking the transaction
- when unlocking the transaction

The last one is not important for ACID. In case of a crash, the transaction
is unlocked again (verification phase). By deferring it by 1 second, we
only have 2 commits per transaction during high activity because all pending
changes are merged with the commits caused by other transactions.

This change compensates for the extra commit(s) per transaction that were
introduced in commit 7eb7cf1b ("Minimize the amount of work during tpc_finish").
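
The deferral can be pictured with the following Python 3 sketch. It is only an illustration under stated assumptions, not the storage code: `DeferredCommitter`, `commit_now`, `commit_later` and `poll` are hypothetical names, and the clock is injectable so that the behaviour can be checked without sleeping.

```python
import time


class DeferredCommitter:
    """Hypothetical wrapper merging optional commits into mandatory ones."""

    def __init__(self, commit, delay=1.0, clock=time.monotonic):
        self._commit = commit    # callable doing the real backend commit
        self._delay = delay
        self._clock = clock
        self._deadline = None    # None means "nothing deferred"

    def commit_now(self):
        # A mandatory commit also flushes any deferred change.
        self._deadline = None
        self._commit()

    def commit_later(self):
        # Non-critical change (e.g. unlocking): commit within `delay`
        # seconds, unless a mandatory commit happens first.
        if self._deadline is None:
            self._deadline = self._clock() + self._delay

    def poll(self):
        # To be called regularly from the main loop.
        if self._deadline is not None and self._clock() >= self._deadline:
            self.commit_now()
```

Under load, the next transaction calls `commit_now()` before the deadline, so the unlock of the previous transaction costs no commit of its own. When the node is idle, `poll()` performs the deferred commit once the delay has passed.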

eaa07e25

client: optimize cache by not keeping items with counter=0 in history queue · 254878a8
Julien Muchembled authored Mar 02, 2016

254878a8