neoppod:bd5ba87ae6e0b8169b0da15fed0e32e56e7964af commits
https://lab.nexedi.com/nexedi/neoppod/-/commits/bd5ba87ae6e0b8169b0da15fed0e32e56e7964af
2019-05-09T15:22:16+02:00

https://lab.nexedi.com/nexedi/neoppod/-/commit/bd5ba87ae6e0b8169b0da15fed0e32e56e7964af
Fix undo of transactions during which readCurrent() was used
2019-05-09T15:22:16+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/1a72a60f4ad7d2d4129ef325732e89c0fc7e7799
storage: require backends to use @fallback implementation explicitly
2019-05-09T15:04:54+02:00
Julien Muchembled <jm@nexedi.com>
... rather than logging when the backend does not override.

https://lab.nexedi.com/nexedi/neoppod/-/commit/042f5ac0a72dfcddc504eb68e30c0533cfb0a1c1
importer: fix writeback of transactions during which readCurrent() was used
2019-04-30T17:36:03+02:00
Julien Muchembled <jm@nexedi.com>
Contrary to FileStorage, NEO remembers uses of readCurrent().

https://lab.nexedi.com/nexedi/neoppod/-/commit/68f2641575bb372475d18dd08f876fee2e7c23bb
importer: forbid truncation when writeback is active
2019-04-30T17:09:50+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/dba07e72874f34a85a20725648a9f3fdc081138c
master: fix crash in STARTING_BACKUP when connecting to an upstream secondary...
2019-04-30T16:53:59+02:00
Julien Muchembled <jm@nexedi.com>
This fixes the following assertion:
Traceback (most recent call last):
  File "neo/master/app.py", line 172, in run
    self._run()
  File "neo/master/app.py", line 182, in _run
    self.playPrimaryRole()
  File "neo/master/app.py", line 302, in playPrimaryRole
    self.backup_app.provideService())
  File "neo/master/backup_app.py", line 114, in provideService
    node, conn = bootstrap.getPrimaryConnection()
  File "neo/lib/bootstrap.py", line 74, in getPrimaryConnection
    poll(1)
  File "neo/lib/event.py", line 160, in poll
    to_process.process()
  File "neo/lib/connection.py", line 504, in process
    self._handlers.handle(self, self._queue.pop(0))
  File "neo/lib/connection.py", line 92, in handle
    self._handle(connection, packet)
  File "neo/lib/connection.py", line 107, in _handle
    pending[0][1].packetReceived(connection, packet)
  File "neo/lib/handler.py", line 125, in packetReceived
    self.dispatch(*args)
  File "neo/lib/handler.py", line 75, in dispatch
    method(conn, *args, **kw)
  File "neo/lib/handler.py", line 159, in notPrimaryMaster
    assert primary != self.app.server
AttributeError: 'BackupApplication' object has no attribute 'server'

https://lab.nexedi.com/nexedi/neoppod/-/commit/e3cd5c5bfe1350bd9402c7b7e47901e0c2b9bad8
qa: add testrunner options to dump/check the format of network packets
2019-04-28T02:02:30+02:00
Julien Muchembled <jm@nexedi.com>
With the switch to msgpack, there was no schema anymore whereas it was
sometimes used for both automatic conversion (e.g. the last argument of
AskStoreTransaction must now be explicitly cast to list) and type checking.
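To illustrate the kind of type checking such a schema enables, here is a minimal sketch in Python. It is not NEO's implementation: the check() helper, the schema notation and EXAMPLE_PACKET are all hypothetical, chosen only to show why a tuple passed where a list is expected would be flagged.

```python
def check(schema, value):
    """Return True if value matches schema (hypothetical notation).

    A schema is a type, a tuple of schemas (fixed-length sequence,
    one schema per item) or a one-item list (homogeneous list).
    """
    if isinstance(schema, type):
        return isinstance(value, schema)
    if isinstance(schema, list):
        (item,) = schema
        # Strict on purpose: a tuple is rejected where a list is expected.
        return isinstance(value, list) and all(check(item, x) for x in value)
    if isinstance(schema, tuple):
        return (isinstance(value, (list, tuple))
                and len(value) == len(schema)
                and all(check(s, x) for s, x in zip(schema, value)))
    raise TypeError("unsupported schema: %r" % (schema,))

# Hypothetical packet layout: (tid, user, list of oids)
EXAMPLE_PACKET = (bytes, bytes, [bytes])

print(check(EXAMPLE_PACKET, (b"\0" * 8, b"user", [b"\0" * 8])))   # True
print(check(EXAMPLE_PACKET, (b"\0" * 8, b"user", (b"\0" * 8,))))  # False
```

The second call fails only because its last argument is a tuple, which is the situation where an explicit cast to list is needed.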
This somewhat reintroduces a kind of schema that:
- is used by the test suite for type checking
- can be generated automatically from the test suite
when one changes the protocol

https://lab.nexedi.com/nexedi/neoppod/-/commit/9d0bf97a1327182ac29e95d65fd9e18742c43d1f
protocol: switch to msgpack for packet serialization
2019-04-28T02:02:30+02:00
Julien Muchembled <jm@nexedi.com>
Not only for performance reasons (at least 3% faster) but also because of
several ugly things in the way packets were defined:
- packet field names, which are only documentary; for root fields,
they even just duplicate the packet names
- a lot of repetitions for packet names, and even confusion between the name
of the packet definition and the name of the actual notify/request packet
- the need to implement field types for anything, like PByte to support new
compression formats, since PBoolean is not enough
neo/lib/protocol.py is now much smaller.

https://lab.nexedi.com/nexedi/neoppod/-/commit/6332112cba979dfd29b40fe9f98d097911fde696
Release version 1.12
2019-04-28T02:00:15+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/55a6dd0fd3515a46171c1479b1f312984e1f4949
master: reject drop/tweak ctl commands that could lead to unwanted status
2019-04-27T15:54:10+02:00
Julien Muchembled <jm@nexedi.com>
The following 2 operations can be onerous and they should not be
directly usable without some kind of confirmation by the user:
- Dropping a node now requires stopping it first.
- Tweaking no longer automatically excludes DOWN nodes,
because a node could go DOWN between the moment the user sends
the command to tweak and the actual tweak by the master.

https://lab.nexedi.com/nexedi/neoppod/-/commit/ef4d58f69035fffbfb8c72982142b3780c2044cc
qa: extend test reproducing the migration of a big ZODB to NEO
2019-04-27T15:54:10+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/ab082d7eb5f2ae499e6e48844220beaa378e55b0
neoctl: better display of full partition tables
2019-04-27T15:54:10+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/c6453626137f0c2381f4e0524429d170881e5c75
Bump protocol version
2019-04-27T15:53:52+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/2a27239de74ba8f4794df1fcd68d9e15b67b3fee
tweak: add option to simulate
2019-04-27T15:52:46+02:00
Julien Muchembled <jm@nexedi.com>
Initially, I wanted to do the simulation inside neoctl but it has no knowledge
of the topology (the master doesn't send devpath values of storage nodes).
Therefore, the work is delegated to the master node, which implies a change
of the protocol.

https://lab.nexedi.com/nexedi/neoppod/-/commit/3839d224ebaab2b8ba8f02e7a0cb25f9eece87e6
tweak: do not crash when trying to remove all nodes
2019-04-27T15:52:46+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/8a645d9f0f16e0669e598525c2278bb82578e681
tweak: do not touch cells of nodes that are intended to be dropped
2019-04-27T15:52:45+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/c2c9e99da4ba995fb383c052625d16100f165eba
Better error reporting from the master to neoctl for denied requests
2019-04-27T15:52:45+02:00
Julien Muchembled <jm@nexedi.com>
This stops abusing ProtocolError, which disconnects the admin node needlessly.
The many 'if ... raise RuntimeError' in neo/neoctl/neoctl.py
could be turned into assertions.

https://lab.nexedi.com/nexedi/neoppod/-/commit/21190ee7a6e535bfc27a60d896a4e06aa6fe4127
Make 'neoctl print pt' report the number of replicas
2019-04-27T15:52:45+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/ef5fc50859c73345c86499ac381ab164b4e00023
Make the number of replicas modifiable when the cluster is running
2019-04-27T15:52:45+02:00
Julien Muchembled <jm@nexedi.com>
neoctl gets a new command to change the number of replicas.
The number of replicas becomes a new partition table attribute and
like the PT id, it is stored in the config table. On the other hand,
the configuration value for the number of partitions is dropped,
since it can be computed from the partition table, which is
always stored in full.
The -p/-r master options now only apply at database creation.
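As a rough illustration of why the number of partitions no longer needs its own configuration value, the sketch below derives it from a partition table stored in full. The (partition, nid, state) row layout and the count_partitions() helper are assumptions made for this example, not NEO's actual schema or API.

```python
def count_partitions(rows):
    """Derive the number of partitions from a fully stored partition table.

    rows: iterable of (partition, nid, state) tuples (hypothetical layout).
    """
    offsets = {partition for partition, _nid, _state in rows}
    # "Stored in full" is taken to mean that every offset from 0 to n-1
    # has at least one cell; otherwise the count cannot be trusted.
    if offsets != set(range(len(offsets))):
        raise ValueError("partition table is not stored in full")
    return len(offsets)

# 3 partitions spread over 2 nodes (nid 1 and 2), with made-up cell states.
rows = [(0, 1, "U"), (0, 2, "U"), (1, 1, "U"), (1, 2, "O"), (2, 2, "U")]
print(count_partitions(rows))  # 3
```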
Some implementation notes:
- The protocol is slightly optimized in that the master now sends
automatically the whole partition tables to the admin & client
nodes upon connection, like for storage nodes.
This makes the protocol more consistent, and the master is the
only remaining node requesting partition tables, during recovery.
- Some parts become tricky because app.pt can be None in more cases.
For example, the extra condition in NodeManager.update
(before app.pt.dropNode) was added for this reason.
Or the 'loadPartitionTable' method (storage) that is not inlined
because of unit tests.
Overall, this commit simplifies more than it complicates.
- In the master handlers, we stop hijacking the 'connectionCompleted'
method for tasks to be performed (often send the full partition
table) on handler switches.
- The admin's 'bootstrapped' flag could have been removed earlier:
race conditions can't happen since the AskNodeInformation packet
was removed (commit d048a52d).

https://lab.nexedi.com/nexedi/neoppod/-/commit/27e3f620789bb5d747aff33129281320018b73ca
New --new-nid storage option for fast cloning
2019-04-27T15:52:45+02:00
Julien Muchembled <jm@nexedi.com>
It is often faster to set up replicas by stopping a node (and any
underlying database server like MariaDB) and doing a raw copy of the
database (e.g. with rsync). Until now, this required stopping the whole
cluster and using tools like 'mysql' or 'sqlite3' to edit:
- the 'pt' table in databases,
- the 'config.nid' values of the new nodes.
With this new option, if you already have 1 replica, you can set up
new replicas with such fast raw copy, and without interruption of
service. Obviously, this implies less redundancy during the operation.

https://lab.nexedi.com/nexedi/neoppod/-/commit/64e02391fb595fc57d8b152550f62366525a7422
qa: fix 2 tests with ZODB5
2019-04-27T15:51:37+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/491f4c89a1d80531cd5752e92beb9104262590ef
qa: new tools/stress options to evaluate MySQL engines
2019-04-26T19:14:24+02:00
Julien Muchembled <jm@nexedi.com>
--kill-mysqld should be combined with something like -f .3 -r .1
to give storage nodes enough time to recover.
And also -D 0 to focus testing on the storage backend rather than NEO.

https://lab.nexedi.com/nexedi/neoppod/-/commit/c11410ef1570938b1f3568ba77d6623f6f267480
qa: provide a way to let tests start 1 mysqld per storage node
2019-04-26T19:14:24+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/74ec44e3c824e152e9f7a14e3e6d828111f1c022
mysql: make 'user' actually optional in the DB connection string
2019-04-26T19:14:24+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/87c1de3b4d079a82b7af745e8d02d0a50b0225e4
mysql: specify column families for RocksDB
2019-04-26T19:14:24+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/aa7b654f8855cb9eacec74b32fd5859cc4e7caa1
qa: add testIncremental (testImporter) test
2019-04-16T23:33:43+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/d5834ee9cdd9194cdbebb25d78401e1bdf085d39
importer: fix hidden "maximum recursion depth exceeded" at startup
2019-04-16T23:33:43+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/c37bcfa3579663760e15aae9cb52d487c3cde1f3
importer: fix closure of ZODB, and also do it when the import is finished
2019-04-16T23:33:43+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/6608a86886a36d24314dafdff8989d33b7714d0b
sqlite: fix resumption of migration to NEO with Importer
2019-04-16T23:33:43+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/989e9920567bf104881524a997b26c502f75692c
qa: fix a random failure in threaded tests
2019-04-16T23:33:28+02:00
Julien Muchembled <jm@nexedi.com>
This also reverts commit 442bb43a.

https://lab.nexedi.com/nexedi/neoppod/-/commit/26b1246afb9efc686a7dc7e825d5a652fdbd3011
importer: speed up startup when the import is already finished
2019-04-05T20:24:48+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/9d14ea1b30c7499549255be970292e887c48c111
importer: fix replication (as source) once import is finished
2019-04-05T20:24:44+02:00
Julien Muchembled <jm@nexedi.com>
This fixes up commit be839e92.

https://lab.nexedi.com/nexedi/neoppod/-/commit/c58d4862402eb68316b72355834b3bbc70a8bc90
storage: fix DatabaseManager.getLastTID with max_tid
2019-04-05T20:21:47+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/b10cc7506aaad619ad42c5cda321069f26648530
qa: remove 2 useless unit tests
2019-04-01T16:49:30+02:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/15369269da7102f1774754b095ef05175673b32b
storage: allow the master to change our node id
2019-03-21T19:44:34+01:00
Julien Muchembled <jm@nexedi.com>
This is not used currently.

https://lab.nexedi.com/nexedi/neoppod/-/commit/e8473a238e633ca8df7b804013371a2e58c716a2
Rename --uuid command-line options into --nid
2019-03-21T15:57:13+01:00
Julien Muchembled <jm@nexedi.com>
This breaks compatibility but it was mentioned from the beginning
that these options are only there for testing purposes.
TODO: rename all remaining occurrences of UUID into NID in the code

https://lab.nexedi.com/nexedi/neoppod/-/commit/e387ad5915c444d1776f6aad6ac74ee42b7d7064
importer: fix possible data loss on writeback
2019-03-16T16:21:30+01:00
Julien Muchembled <jm@nexedi.com>
If the source DB is lost during the import and then restored from a backup,
all new transactions have to be written back again on resume. It is the most
common case for which the writeback hits the maximum number of transactions
per partition to process at each iteration; the previous code was buggy in
that it could skip transactions.

https://lab.nexedi.com/nexedi/neoppod/-/commit/48d936cbd500933e8bb39bdb7ed32dcea7cd0c22
Release version 1.11
2019-03-11T19:18:32+01:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/af2e209b70c90ed5b9d953c344d3161fd8e718bf
Fix short descriptions of neoctl & neomigrate in their headers
2019-03-11T19:17:01+01:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/342168cd163a01cedf824f65ce8f6c1afa20f7ab
Update copyright year
2019-03-11T19:15:41+01:00
Julien Muchembled <jm@nexedi.com>

https://lab.nexedi.com/nexedi/neoppod/-/commit/38e98a12337c07aada699606a182e9211a8e7022
qa: new tool to stress-test NEO
2019-02-26T16:35:53+01:00
Julien Muchembled <jm@nexedi.com>
Example output:
stress: yes (toggle with F1)
cluster state: RUNNING
last oid: 0x44c0
last tid: 0x3cdee272ef19355 (2019-02-26 15:35:11.002419)
clients: 2308, 2311, 2302, 2173, 2226, 2215, 2306, 2255, 2314, 2356 (+48)
8m53.988s (42.633861/s)
pt id: 4107
RRRDDRRR
0: OU......
1: ..UO....
2: ....OU..
3: ......UU
4: OU......
5: ..UO....
6: ....OU..
7: ......UU
8: OU......
9: ..UO....
10: ....OU..
11: ......UU
12: OU......
13: ..UO....
14: ....OU..
15: ......UU
16: OU......
17: ..UO....
18: ....OU..
19: ......UU
20: OU......
21: ..UO....
22: ....OU..
23: ......UU