- 30 May, 2018 4 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
I made a mistake in commit 13a64cfe ("Simplify definition of packets by computing automatically their codes"). My intention was that the code an answer packet continues to only differ by the highest bit, as implemented now by this commit. Before: 0x0001, 0x8002 Ask1, Answer1 0x0003 Notify2 0x0004, 0x8005 Ask3, Answer3 0x0006, 0x8007 Ask4, Answer4 After: 0x0001, 0x8001 Ask1, Answer1 0x0002 Notify2 0x0003, 0x8003 Ask3, Answer3 0x0004, 0x8004 Ask4, Answer4 This makes the protocol easier to document. And by not wasting the range of possible values, it seems we have enough space to shrink to a single byte. This also removes code that became meaningless since that codes are generated automatically.
-
Julien Muchembled authored
Although data that are already transferred aren't transferred again, checking that the data are there for a whole partition can still be a lot of work for big databases. This commit is a major performance improvement in that a storage node that gets disconnected for a short time now gets fully operational quite instantaneously because it only has to replicate the new data. Before, the time to recover depended on the size of the DB. For OUT_OF_DATE cells, the difficult part was that they are writable and can then contain holes, so we can't just take the last TID in trans/obj (we wrongly did that at the beginning, and then committed 6b1f198f as a workaround). We solve that by storing up to where it was up-to-date: this value is initialized from the last TIDs in trans/obj when the state switches from UP_TO_DATE/FEEDING. There's actually one such OUT_OF_DATE TID per assigned cell (backends store these values in the 'pt' table). Otherwise, a cell that still has a lot to replicate would still cause all other cells to resume from the a very small TID, or even ZERO_TID; the worse case is when a new cell is assigned to a node (as a result of tweak). For UP_TO_DATE cells of a backup cluster, replication was resumed from the maximum TID at which all assigned cells are known to be fully replicated. Like for OUT_OF_DATE cells, the presence of a late cell could cause a lot of extra work for others, the worst case being when setting up a backup cluster (it always restarted from ZERO_TID as long as at least 1 cell was still empty). Because UP_TO_DATE cells are guaranteed to have no holes, there's no need to store extra information: we simply look at the last TIDs in trans/obj. We even handle trans & obj independently, to minimize the work in 1 table (i.e. trans since it's processed first) if the other is late (obj). There's a small change in the protocol so that OUT_OF_DATE enum value equals 0. This way, backends can store the OUT_OF_DATE TID (verbatim) in the same column as the cell state. Note about MySQL changes in commit ca58ccd7: what we did as a workaround is not one any more. Now, we do so much on Python side that it's unlikely we could reduce the number of queries using GROUP BY. We even stopped doing that for SQLite.
-
Julien Muchembled authored
-
- 25 May, 2018 1 commit
-
-
Julien Muchembled authored
-
- 24 May, 2018 9 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
It was confusing and there's already the 'Unlock TXN' log just before abort() is called (in this case, it's more a cleanup than an abort).
-
Julien Muchembled authored
Future migration steps are likely to alter tables, possibly with transformation of data, and this is complicated for both supported backend.
-
Julien Muchembled authored
-
Julien Muchembled authored
Some changes in the storage format are minor and applying them automatically would cost too much for big databases. Here, we apply them manually so that testStorageUpgrade will be able to compare dumps. We hope however that with improvements like https://jira.mariadb.org/browse/MDEV-12836 we'll be able to implement more migration steps and revert parts of this commit.
-
Julien Muchembled authored
These dumps were generated with an old version of NEO, plus a backport of the test that will use them. In MySQL dumps, --hex-blob was used only for inserts in the 'data' table.
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 17 May, 2018 1 commit
-
-
Julien Muchembled authored
- for FileStorage DB, make sure a transaction index is built at most once - for other DB types, reopen the DB in the subprocess Now that we have specific code for FileStorage, the generic case is not tested anymore. We should add a test using ZEO. Or better, and in some way crazy, one with NEO, but one would need to fix a special case in getObject.
-
- 16 May, 2018 5 commits
-
-
Julien Muchembled authored
The protocol version is increased to ensure that client nodes are able to handle an empty 'extension' field in AnswerTransactionInformation. It also means that once new transactions are written, going back to a previous revision is not possible.
-
Julien Muchembled authored
The correct way to specify a start/stop tid is when constructing the 'source' object, hence the remove of start/stop args. In fact, source.iterator() does not always take such args. On the other hand, when resuming import, Application.importFrom must manage with incomplete preindex.
-
Julien Muchembled authored
Same as previous commit: only cosmetics so optional.
-
Julien Muchembled authored
'title' means both process name and command line. This is cosmetics so it won't fail if the 'setproctitle' module is not available.
-
Julien Muchembled authored
A new subprocess is used to: - fetch data from the source DB - repickle to change oids (when merging several DB) - compress - checksum This is mostly useful for the second step, which is relatively much slower than any other step, while not releasing the GIL. By using a second CPU core, it is also often possible to use a better compression algorithm for free (e.g. zlib=9). Actually, smaller data can speed up the writing process. In addition to greatly speed up the import by parallelizing fetch+process with write, it also makes the main process more reactive to queries from client nodes.
-
- 15 May, 2018 1 commit
-
-
Julien Muchembled authored
By doing the work with secondary connections to the underlying databases, asynchronously and in a separate process, this should have minimal impact on the performance of the storage node. Extra complexity comes from backends that may lose connection to the database (here MySQL): this commit fully implements reconnection.
-
- 11 May, 2018 3 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
For FileStorage DB, this avoids: - keeping a lock on the source DB during the whole import, - saving the whole index when the import was resumed.
-
Julien Muchembled authored
-
- 07 May, 2018 4 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 18 Apr, 2018 3 commits
-
-
Julien Muchembled authored
It was disabled by mistake in commit fd80cc30.
-
Julien Muchembled authored
- Stop using NEO source code as sample data. - For ZODB5, add a test that does not merge several DB.
-
Julien Muchembled authored
-
- 16 Apr, 2018 3 commits
-
-
Julien Muchembled authored
In the Importer storage backend, the repickler code never really worked with ZODB 5 (use of protocol > 1), and now the test does not pass anymore. The other issues caused by ZODB commit 12ee41c47310156027a674932df34b60de86ba36 are fixed: TypeError: list indices must be integers, not binary ValueError: unsupported pickle protocol: 3 Although not necessary as long as we don't support Python 3, this commit also replaces `str` by `bytes` in a few places.
-
Julien Muchembled authored
-
Julien Muchembled authored
When importing a FileStorage DB without interruption and without having to serve client nodes, the index built by speedupFileStorageTxnLookup is useless. Such case happens when doing simulation tests and on DB with many oids, it can take a lot of time and memory for nothing.
-
- 13 Apr, 2018 2 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
This was forgotten in commit 5de0ff3a.
-
- 12 Apr, 2018 2 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
The Importer storage backend already does this.
-
- 10 Apr, 2018 1 commit
-
-
Julien Muchembled authored
This fixes a random failure in testSafeTweak: failureException: 'UU.|U.U|.UU' != 'UU.|.UU|U.U'
-
- 29 Mar, 2018 1 commit
-
-
Julien Muchembled authored
This is a follow-up of commit 2ca7c335, which changed 'tweak' not to discard readable cells too quickly. The scenario of a storage being lost whereas it has feeding cells was forgotten. These must be discarded immediately, otherwise we end up with more up-to-date cells than wanted. Without the change in outdate(), testSafeTweak would end with: UU.|U.U|UUU Once replication is optimized not to always restart checking cells from the beginning: - Remembering that an out-of-date cell was feeding could be a safer option, but it may not be worth the extra complexity. - Another possibility may be to replace the FEEDING state by an automatic partial tweak that only discards up-to-date cells too many whenever a cell becomes up-to-date.
-