- 31 Mar, 2017 2 commits
-
-
Julien Muchembled authored
The bug could lead to data corruption (if a partition is wrongly marked as UP_TO_DATE) or crashes (assertion failure on either the storage or the master). The protocol is extended to handle the following scenario:

    S                                  M
    partition 0 outdated          <-- UnfinishedTransactions
    ------> replication of partition 0 ...
    partition 1 outdated          --- UnfinishedTransactions ...
    ... replication finished      --- ReplicationDone ...
                                  tweak <-- partition 1 discarded --------
                                  tweak <-- partition 1 outdated ---------
    ... UnfinishedTransactions -->
    ... ReplicationDone --------->

The master can't simply mark all outdated cells as being updatable when it receives an UnfinishedTransactions packet (see the sketch below).
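A minimal, self-contained sketch (hypothetical names and states, not NEO's real protocol code) of the rule the extended protocol lets the master enforce, namely that only the cells a given UnfinishedTransactions exchange was about may become updatable:

    OUT_OF_DATE, UPDATABLE = 'OUT_OF_DATE', 'UPDATABLE'

    class Master(object):
        def __init__(self, partitions):
            # one cell per partition for a single storage, for brevity
            self.cells = dict.fromkeys(range(partitions), OUT_OF_DATE)

        def askUnfinishedTransactions(self, offset_list):
            # Only the partitions named in the request may become
            # updatable: a cell that was discarded and outdated again by
            # a later tweak is not covered by this (possibly stale) ask.
            for offset in offset_list:
                if self.cells.get(offset) == OUT_OF_DATE:
                    self.cells[offset] = UPDATABLE

    master = Master(2)
    master.askUnfinishedTransactions([0])
    assert master.cells == {0: UPDATABLE, 1: OUT_OF_DATE}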
-
Julien Muchembled authored
After an attempt to read from a non-readable cell, which happens when a client has a newer or older PT than the storage's, the client now retries the read. This bugfix covers all kinds of read access except undoLog, which can still report incomplete results.
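A minimal sketch of the retry loop this describes (hypothetical names; NonReadableCell stands in for whatever exception signals the failed read):

    class NonReadableCell(Exception):
        """Read hit a cell that is not readable (client/storage PT mismatch)."""

    def load(client, oid):
        while True:
            try:
                return client.load_from_storage(oid)   # any read access
            except NonReadableCell:
                client.wait_for_pt_update()   # retry with a newer PT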
-
- 23 Mar, 2017 1 commit
-
-
Julien Muchembled authored
It becomes possible to answer with several packets:
- the last is the usual associated answer packet
- all other (previously sent) packets are notifications

Connection.send does not return the packet id anymore: this was not useful enough, and the caller can inspect the sent packet (getId) instead.
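A minimal, self-contained sketch (hypothetical classes, not the real Connection API) of the resulting pattern:

    class Packet(object):
        _counter = 0
        def __init__(self, payload):
            Packet._counter += 1
            self._id = Packet._counter
            self.payload = payload
        def getId(self):
            return self._id

    class Connection(object):
        def send(self, packet):   # no longer returns the packet id
            print('sent #%d: %s' % (packet.getId(), packet.payload))

    conn = Connection()
    for chunk in ('part 1', 'part 2'):   # previously sent packets:
        p = Packet(chunk)                # plain notifications
        conn.send(p)
        last_sent_id = p.getId()         # inspect the packet instead
    conn.send(Packet('the usual associated answer, last'))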
-
- 18 Mar, 2017 1 commit
-
-
Julien Muchembled authored
Traceback (most recent call last):
  ...
  File "neo/lib/handler.py", line 72, in dispatch
    method(conn, *args, **kw)
  File "neo/master/handlers/client.py", line 70, in askFinishTransaction
    conn.getPeerId(),
  File "neo/master/transactions.py", line 387, in prepare
    assert node_list, (ready, failed)
AssertionError: (set([]), frozenset([]))

Master log leading to the crash:

    PACKET  #0x0009 StartOperation          > S1
    PACKET  #0x0004 BeginTransaction        < C1
    DEBUG   Begin <...>
    PACKET  #0x0004 AnswerBeginTransaction  > C1
    PACKET  #0x0001 NotifyReady             < S1

It was wrong to process BeginTransaction before receiving NotifyReady. The changes in the storage are cosmetic: the 'ready' attribute has become redundant with 'operational'.
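One way to enforce the ordering, as a minimal sketch (hypothetical names; the actual fix may delay the packet rather than reject it):

    class NotReadyError(Exception):
        pass

    class Master(object):
        def __init__(self, expected_storages):
            self.expected = set(expected_storages)
            self.ready = set()

        def notifyReady(self, uuid):       # NotifyReady from a storage
            self.ready.add(uuid)

        def beginTransaction(self):        # BeginTransaction from a client
            if not self.ready >= self.expected:
                # processing it now would fail later in prepare() with an
                # empty node_list, as in the traceback above
                raise NotReadyError('waiting for NotifyReady from storages')
            return object()                # stands for a fresh ttid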
-
- 21 Feb, 2017 1 commit
-
-
Julien Muchembled authored
This is a first version with several optimizations possible:
- improve EventQueue (or implement a specific queue) to minimize deadlocks
- turn the RebaseObject packet into a notification

Sorting oids could also be useful to reduce the probability of deadlocks, but that would never be enough to avoid them completely, even if there's a single storage. For example (reproduced in the sketch after this message):
1. C1 does a first store (x or y)
2. C2 stores x and y; one is delayed
3. C1 stores the other -> deadlock

When solving the deadlock, the data of the first store may only exist on the storage.

2 functional tests are removed because they're redundant, either with ZODB tests or with the new threaded tests.
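The numbered scenario reduces to a classic lock-ordering race; it can be reproduced with plain Python threads and two locks standing in for oids x and y (ordinary threading, not NEO code):

    import threading

    x = threading.Lock()
    y = threading.Lock()
    c1_has_x = threading.Event()
    c2_has_y = threading.Event()

    def c1():
        with x:                    # 1. C1 does a first store (here: x)
            c1_has_x.set()
            c2_has_y.wait()        # force the bad interleaving
            # 3. C1 stores the other -> deadlock (y is held by C2)
            print('C1 got y:', y.acquire(timeout=1))

    def c2():
        with y:                    # 2. C2 stores x and y; y succeeds,
            c2_has_y.set()         #    the store of x is delayed
            c1_has_x.wait()
            print('C2 got x:', x.acquire(timeout=1))

    t1 = threading.Thread(target=c1)
    t2 = threading.Thread(target=c2)
    t1.start(); t2.start(); t1.join(); t2.join()
    # Both acquires time out: each client waits for a lock the other
    # holds. Stores happen as the transaction progresses, so clients
    # cannot globally sort their oids, and deadlock resolution remains
    # necessary.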
-
- 14 Feb, 2017 1 commit
-
-
Julien Muchembled authored
-
- 18 Jan, 2017 1 commit
-
-
Julien Muchembled authored
-
- 04 Jan, 2017 1 commit
-
-
Julien Muchembled authored
It is extended to check that the storage is only notified about the transactions that existed at the time it asked for them. Otherwise, Replicator.transactionFinished would be called more than once, and `self.ttid_set.remove(ttid)` would raise KeyError. The functional version also contained an annoying 'sleep(10)'.
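The crash mechanism, in miniature (plain Python, not the test code):

    ttid_set = {0x0123}
    ttid_set.remove(0x0123)   # first transactionFinished: fine
    ttid_set.remove(0x0123)   # duplicate notification: raises KeyError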
-
- 22 Dec, 2016 1 commit
-
-
Julien Muchembled authored
-
- 29 Nov, 2016 1 commit
-
-
Kirill Smelkov authored
-
- 27 Nov, 2016 2 commits
-
-
Julien Muchembled authored
When Client (including backup master) and admin nodes are identified, the primary master now automatically sends them the whole list of nodes via NotifyNodeInformation, as is already done for storage nodes.
-
Julien Muchembled authored
-
- 21 Mar, 2016 1 commit
-
-
Julien Muchembled authored
This fixes the following crash (for example when a client disconnects during tpc_finish):

Traceback (most recent call last):
  ...
  File "neo/master/handlers/storage.py", line 68, in answerInformationLocked
    self.app.tm.lock(ttid, conn.getUUID())
  File "neo/master/transactions.py", line 338, in lock
    if self._ttid_dict[ttid].lock(uuid) and self._queue[0][1] == ttid:
IndexError: list index out of range
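A plausible shape of the guard, reconstructed from the traceback (a hypothetical simplification, not the actual fix):

    class TransactionManager(object):
        def __init__(self):
            self._ttid_dict = {}   # ttid -> transaction being committed
            self._queue = []       # [(client uuid, ttid)] in commit order

        def lock(self, ttid, uuid):
            txn = self._ttid_dict.get(ttid)
            # the queue may already be empty if the client disconnected
            # during tpc_finish, hence the extra check before indexing
            if txn is not None and txn.lock(uuid) \
                    and self._queue and self._queue[0][1] == ttid:
                self._unlockPending()

        def _unlockPending(self):
            pass   # notify client & storages (elided)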
-
- 25 Jan, 2016 1 commit
-
-
Julien Muchembled authored
-
- 01 Dec, 2015 1 commit
-
-
Julien Muchembled authored
With the previous commit, the request to truncate the DB was not stored persistently, which means that this operation was still vulnerable to the case where the master is restarted after some nodes, but not all, have already truncated. The master didn't have the information to fix this and the result was a partially truncated DB.

-> On a Truncate packet, a storage node only stores the tid somewhere, to send it back to the master, which stays in RECOVERING state as long as any node has a different value than that of the node with the latest partition table (sketched below).

We also want to make sure that there is no unfinished data, because a user may truncate at a tid higher than a locked one.

-> Truncation is now effective at the end of the VERIFYING phase, just before returning the last ids to the master.

Finally, all nodes should be truncated, to avoid an offline node coming back with a different history. Currently, this would not be an issue since replication always restarts from the beginning, but later we'd like nodes to remember where they stopped replicating.

-> If a truncation is requested, the master waits for all nodes to be pending, even if it was previously started (the user can still force the cluster to start with neoctl). And any node lost during verification also causes the master to go back to recovery.

Obviously, the protocol has been changed to split the LastIDs packet and introduce a new Recovery packet, since it does not make sense anymore to ask last ids during recovery.
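A minimal sketch (hypothetical names, not NEO's actual code) of the storage-side behaviour:

    class Storage(object):
        def __init__(self, dm):
            self.dm = dm   # persistent database manager

        def truncate(self, tid):
            # RECOVERING: only record the request, so that it survives a
            # restart and can be reported back to the master
            self.dm.set_truncate_tid(tid)

        def get_last_ids(self):
            # end of VERIFYING: apply the pending truncation just before
            # answering the master with the last ids
            tid = self.dm.get_truncate_tid()
            if tid is not None:
                self.dm.delete_transactions_above(tid)
                self.dm.set_truncate_tid(None)
            return self.dm.get_last_tid(), self.dm.get_last_oid()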
-
- 30 Nov, 2015 2 commits
-
-
Julien Muchembled authored
Currently, the database may only be truncated when leaving backup mode, but the issue will be the same when neoctl gets a new command to truncate at an arbitrary tid: we want to be sure that all nodes are truncated before anything else.

Therefore, we stop sending Truncate orders before stopping operation because nodes could fail/exit before actually processing them. Truncation must also happen before asking nodes their last ids.

With this commit, if a truncation is requested:
- this is always the first thing done when a storage node connects to the primary master during the RECOVERING phase,
- and the cluster does not start automatically if there are missing nodes, unless an admin forces it.

Other changes:
- Connections to storage nodes don't need to be aborted anymore when leaving backup mode.
- The master always initiates communication when a storage node identifies, which simplifies code and reduces the number of exchanged packets.
-
Julien Muchembled authored
This fixes several cases where the partition table could become corrupt and the whole cluster get stuck in VERIFYING state. This also reduces the probability of having out-of-date cells when restarting several storage nodes simultaneously. At last, if a master node becomes primary again, a cluster must not be started automatically if nodes with readable cells are missing, in order to avoid a split of the database. This could happen if this master node was previously forced to start it.
-
- 21 May, 2015 1 commit
-
-
Julien Muchembled authored
-
- 07 Jan, 2014 1 commit
-
-
Julien Muchembled authored
-
- 21 Aug, 2012 2 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 20 Aug, 2012 3 commits
-
-
Vincent Pelletier authored
-
Julien Muchembled authored
-
Julien Muchembled authored
- catch OperationFailure
- reset transaction manager when leaving backup mode
- send appropriate target tid to a storage that updates an outdated cell
- clean up partition table when leaving BACKINGUP state unexpectedly
- make sure all readable cells of a partition have the same 'backup_tid' if they have the same data, so that we know when internal replication is finished when leaving backup mode
- fix storage not finishing internal replication when leaving backup mode
-
- 20 Mar, 2012 1 commit
-
-
Julien Muchembled authored
-
- 13 Mar, 2012 1 commit
-
-
Julien Muchembled authored
-
- 12 Mar, 2012 1 commit
-
-
Julien Muchembled authored
This includes an API change of Node.isIdentified, which now tells whether identification packets have been exchanged or not. All handlers must be updated to implement '_acceptIdentification' instead of overriding EventHandler.acceptIdentification; this patch only does it for StorageOperationHandler (see the sketch below).
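A minimal sketch (not the actual NEO code) of the template-method pattern this implies:

    class EventHandler(object):
        def acceptIdentification(self, conn, node_type, uuid):
            # common bookkeeping shared by all handlers, e.g. marking
            # the peer node as identified, would go here
            self._acceptIdentification(conn, node_type, uuid)

        def _acceptIdentification(self, conn, node_type, uuid):
            pass   # default: nothing more to do

    class StorageOperationHandler(EventHandler):
        def _acceptIdentification(self, conn, node_type, uuid):
            # only the storage-specific reaction remains in subclasses
            print('storage %r identified' % (uuid,))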
-
- 24 Feb, 2012 1 commit
-
-
Julien Muchembled authored
Replication is also fully reimplemented:
- It is not done anymore on whole partitions.
- It runs at the lowest priority so as not to degrade performance for client nodes.

The schema of the MySQL table is changed to optimize storage layout: rows are now grouped by age, for good partial replication performance. This certainly also speeds up simple loads/stores.
-
- 10 Jan, 2012 1 commit
-
-
Julien Muchembled authored
-
- 26 Oct, 2011 1 commit
-
-
Julien Muchembled authored
-
- 08 Feb, 2011 1 commit
-
-
Grégory Wisniewski authored
- Storage nodes start to replicate a partition when all transactions that were pending when the outdated partition was added are committed.
- Transactions are registered by the master from the tpc_begin step.

Signed-off-by: Grégory <gregory@nexedi.com>

git-svn-id: https://svn.erp5.org/repos/neo/trunk@2649 71dcc9de-d417-0410-9af5-da40c76e7ee4
-
- 17 Jan, 2011 1 commit
-
-
Olivier Cros authored
In order to prepare the eggification of the different neo parts, we created a new neo/lib module containing all of neo's main functions. This makes neo a virtual namespace, which no longer contains any module code itself.

git-svn-id: https://svn.erp5.org/repos/neo/trunk@2615 71dcc9de-d417-0410-9af5-da40c76e7ee4
-
- 11 Jan, 2011 1 commit
-
-
Grégory Wisniewski authored
- AnswerInformationLocked gives ttid instead of tid.
- The master transaction manager always uses ttid in its data structures.
- It no longer makes sense to check whether the tid is greater than the last generated one, as it never comes back from a storage; just check that the ttid is known by the transaction manager.
- Rename all tid variables that now hold a ttid.
- The transaction manager's queue contains ttids, but the corresponding tids are increasing to keep commit order (see the sketch below).
- Adjust tests.

git-svn-id: https://svn.erp5.org/repos/neo/trunk@2613 71dcc9de-d417-0410-9af5-da40c76e7ee4
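A minimal sketch (a hypothetical simplification, not the real transaction manager) of the ttid/tid split: the queue holds ttids in begin order, and the real, strictly increasing tids are only assigned when transactions are locked, in commit order:

    from collections import deque

    class TransactionManager(object):
        def __init__(self):
            self._queue = deque()   # ttids, in tpc_begin order
            self._last_tid = 0

        def begin(self, ttid):
            self._queue.append(ttid)

        def has(self, ttid):
            # the check described above: is the ttid known at all?
            return ttid in self._queue

        def lock(self, ttid):
            assert ttid == self._queue.popleft()   # commit order kept
            self._last_tid += 1
            return self._last_tid                  # the final tid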
-
- 22 Dec, 2010 1 commit
-
-
Grégory Wisniewski authored
git-svn-id: https://svn.erp5.org/repos/neo/trunk@2563 71dcc9de-d417-0410-9af5-da40c76e7ee4
-
- 14 Dec, 2010 2 commits
-
-
Vincent Pelletier authored
This allows parallel execution of tpc_begin, stores & related conflict resolution, and tpc_vote for different transactions. This requires an extension to ZODB allowing the TID to be kept secret until tpc_finish (i.e. so that tpc_vote is not required to return the tid for each stored object). git-svn-id: https://svn.erp5.org/repos/neo/trunk@2534 71dcc9de-d417-0410-9af5-da40c76e7ee4
-
Vincent Pelletier authored
This ensures invalidations are sent in strictly ascending TID order. git-svn-id: https://svn.erp5.org/repos/neo/trunk@2530 71dcc9de-d417-0410-9af5-da40c76e7ee4
-
- 07 Dec, 2010 1 commit
-
-
Vincent Pelletier authored
git-svn-id: https://svn.erp5.org/repos/neo/trunk@2478 71dcc9de-d417-0410-9af5-da40c76e7ee4
-
- 08 Nov, 2010 2 commits
-
-
Grégory Wisniewski authored
git-svn-id: https://svn.erp5.org/repos/neo/trunk@2427 71dcc9de-d417-0410-9af5-da40c76e7ee4
-
Grégory Wisniewski authored
A storage might notify the master after the cluster has fallen back to the verification state. git-svn-id: https://svn.erp5.org/repos/neo/trunk@2426 71dcc9de-d417-0410-9af5-da40c76e7ee4
-
- 05 Nov, 2010 1 commit
-
-
Vincent Pelletier authored
Some requests can be safely ignored when received over a closed connection. This was previously done explicitly in handlers, but it turned out to cause a lot of code duplication. Instead, define the policy on a packet-type basis and apply it to all packets upon reception, before passing them to the handler (see the sketch below). Also, protect request handlers when they respond, as the connection might be closed. git-svn-id: https://svn.erp5.org/repos/neo/trunk@2419 71dcc9de-d417-0410-9af5-da40c76e7ee4
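A minimal sketch (hypothetical names and attribute, not the actual NEO code) of such a per-packet-type policy:

    class Packet(object):
        ignore_when_closed = False      # default: must be processed

    class AskObject(Packet):
        ignore_when_closed = True       # a gone client needs no answer

    def dispatch(conn, packet, handler):
        # one check at reception, instead of one per handler method
        if conn.is_closed() and packet.ignore_when_closed:
            return                      # drop silently
        handler.handle(conn, packet)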
-