CHANGELOG.rst 16 KB
Julien Muchembled's avatar
Julien Muchembled committed
Change History
==============

1.7.0 (2016-12-19)
------------------

- Identification issues, mainly caused by id conflicts, are fixed:

  - Storage nodes now only accept clients that are known by the master.
  - When reconnecting to a master, a client gets a new id if the previous id is
    already reallocated to another client.
  - The consequences were either crashes or clients being unable to connect.

- Added support for the latest versions of ZODB (4.4.4 & 5.0.1). A notable
  change is that lastTransaction() does not ping the master anymore (but it
  still causes a connection to the master if the client is disconnected).

- A cluster in BACKINGUP state can now serve regular clients in read-only mode.
  Invalidations are not supported yet, however, so clients must reconnect
  whenever they want to see newer data.

- Fixed crash of client nodes (including backup master) while trying to process
  notifications before complete initialization, instead of ignoring them.

- Client:

  - Fix race condition leading to invalid mapping between internal connection
    objects and their file descriptors. This resulted in KeyError exceptions.
  - Fix item eviction from cache, which could break loading from storage.
  - Better exception handling in tpc_abort.
  - Do not limit the number of open connections to storage nodes.

- Storage:

  - Fix crash when a client loses connection to the master just before voting.
  - MySQL: Force index for a few queries. Unfortunately, this is not perfect
    because sometimes MySQL still ignores our hints.
  - MySQL: Do not use unsafe TRUNCATE statement.

- Make 'neoctl print ids' display time of TIDs.
- Various neoctl/neolog formatting improvements/fixes.
- Plus a few other changes for debugging and developers, as well as small
  optimizations.
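
As background for the 'neoctl print ids' change above, the following is a
minimal sketch of how a time can be derived from a TID. It assumes that TIDs
follow the standard ZODB TimeStamp layout; ``tid_to_datetime`` is a
hypothetical helper written for illustration, not a function taken from NEO.

.. code-block:: python

    import struct
    from datetime import datetime, timedelta


    def tid_to_datetime(tid):
        """Decode an 8-byte TID, assuming the standard ZODB TimeStamp layout."""
        minutes, fraction = struct.unpack(">II", tid)
        # High word: minutes since 1900, counted with 12-month years and
        # 31-day months.
        year, rest = divmod(minutes, 12 * 31 * 24 * 60)
        month, rest = divmod(rest, 31 * 24 * 60)
        day, rest = divmod(rest, 24 * 60)
        hour, minute = divmod(rest, 60)
        # Low word: fraction of a minute, scaled to 2**32.
        seconds = fraction * 60.0 / 2 ** 32
        return (datetime(year + 1900, month + 1, day + 1, hour, minute)
                + timedelta(seconds=seconds))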

1.6.3 (2016-06-15)
------------------

- Added support for ZODB 4.x

- Clients are now able to recover from failures during tpc_finish when the
  transaction got successfully committed.

- Other fixes related to node disconnection:

  - storage: fix crash when a client disconnects just after it requested to
    finish a transaction
  - storage: fix crash when trying to replicate from an unreachable node
  - master: never abort a prepared transaction (for example,
    a client disconnecting during tpc_finish could cause a crash)
  - client: fix invalidation issues when reconnecting to the master

- Client:

  - fix abort for storages where only current serials were checked
  - fix the count of history items in the cache

- neoctl: better error message when connection to admin fails

1.6.2 (2016-03-09)
------------------

- storage: switch to a maintained fork of MySQL-python (mysqlclient)
- storage: for better performance, the backend commit after an unlocked
  transaction is deferred by 1 second, with the hope it's merged by a
  subsequent commit (in case of a crash, the transaction is unlocked again),
  so there are only 2 commits per transaction during high activity
- client: optimize cache by not keeping items with counter=0 in history queue
- client: fix possible assertion failure on load in case of a late invalidation
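
The deferred backend commit described above can be sketched as follows. This is
only an illustration of the coalescing idea, not NEO's implementation:
``DeferredCommitter`` and its methods are hypothetical names, and the 1-second
delay is the only value taken from the entry above.

.. code-block:: python

    import time


    class DeferredCommitter:
        """Hypothetical wrapper merging a deferred commit into the next one."""

        def __init__(self, commit, delay=1.0, clock=time.monotonic):
            self._commit = commit    # callable doing the real backend commit
            self._delay = delay
            self._clock = clock
            self._deadline = None    # None: nothing is waiting to be committed

        def commit_later(self):
            # Called after an unlock: a commit is due, but it can wait a bit.
            if self._deadline is None:
                self._deadline = self._clock() + self._delay

        def commit_now(self):
            # An immediate commit also covers whatever was deferred.
            self._deadline = None
            self._commit()

        def poll(self):
            # Called regularly by the event loop: commit once the delay expired.
            if self._deadline is not None and self._clock() >= self._deadline:
                self.commit_now()

Under high activity, the next transaction calls ``commit_now`` before the delay
expires, so the deferred commit never runs on its own.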

1.6.1 (2016-01-25)
------------------

NEO repository has moved to https://lab.nexedi.com/nexedi/neoppod.git

- client: fix spurious connection timeouts
- client: add cache stats to information dumped on SIGRTMIN+2
- storage: when using the Importer backend, allow truncation after the last
  tid to import, during or after the import
- neoctl: don't print 'None' on successful check/truncate commands
- neolog: fix crash on unknown packets
- plus a few other changes for debugging and developers

1.6 (2015-12-02)
----------------

This release has changes in storage format. The upgrade is done automatically,
but only if the cluster was stopped cleanly: see UPGRADE notes for more
information.

- NEO did not ensure that all data and metadata were written on disk before
  tpc_finish, and it was for example vulnerable to ENOSPC errors. In order to
  minimize the risk of failures during tpc_finish, the writing of metadata to
  temporary tables is now done in tpc_vote. See commit `7eb7cf1`_ for more
  information about possible changes on performance side.

  This change comes with a new algorithm to verify unfinished data, which also
  fixes a bug discarding transactions with objects for which readCurrent was
  called.

- The RECOVERING/VERIFYING phases, as well as transitions from/to other states,
  have been completely reviewed, to fix many bugs:

  - Possible corruption of partition table.
  - The cluster could be stuck in RECOVERING or VERIFYING state.
  - The probability to have cells out-of-date when restarting several storage
    nodes simultaneously has been reduced.
  - During recovery, a newly elected master now always waits for all the
    storage nodes with readable cells to be pending, in order to avoid a split
    of the database.
  - The last tid/oid could be wrong in several cases, for example after
    transactions are recovered during VERIFYING phase.

- neoctl gets a new command to truncate the database at an arbitrary TID.
  Internally, NEO was already able to truncate the database, because this was
  necessary to make the database consistent when leaving the backup mode.
  However, there were several bugs that caused the database to be partially
  truncated:

  - The master now first stores persistently the decision to truncate,
    so that it can recover from any kind of connection failure.
  - The cluster goes back to RUNNING state only after an acknowledgment from
    all storage nodes (including those without any readable cell) that they
    truncated.

- Storage:

  - As a workaround to fix holes if replication is interrupted after new data
    is committed, outdated cells always restart to replicate from the beginning.
  - The deletion of partial transactions during verification didn't try to free
    the associated raw data.
  - The MySQL backend didn't drop the 'bigdata' table when erasing the database.

- Handshaking SSL connections could be stuck when they're aborted.

- 'neoctl print ids' displays a new value in backup mode: the highest common TID
  up to which all readable cells have replicated, i.e. the TID at which the
  database would be truncated when leaving the backup mode.

.. _7eb7cf1: https://lab.nexedi.com/nexedi/neoppod/commit/7eb7cf1

1.5.1 (2015-10-26)
------------------

Several bugs and performance issues have been fixed in this release, mainly
in the storage node.

- Importer storage backend:

  - Fix retrieval of an object from ZODB when the next serial is in NEO.
  - Fix crash of storage nodes when a transaction is aborted.
  - Faster resumption when many transactions
    have already been imported to MySQL.

- MySQL storage backend:

  - Refuse to start if max_allowed_packet is too small.
  - Faster commit of transaction metadata.

- Replication & checking of replicas:

  - Fix crash when a corruption is found while checking TIDs.
    2 other issues remain unfixed: see BUGS.rst file.
  - Speed up checking of replicas, at the cost of storage nodes being
    less responsive to other events.

- The master wrongly sent invalidations for objects on which only readCurrent
  was called, which caused invalid entries in client caches, or assertion
  failures in Connection._setstate_noncurrent.

1.5 (2015-10-05)
----------------

In this version, the connectivity between nodes has been greatly improved:

- Added SSL support.
- IPv4 & IPv6 can be mixed: some nodes can have an IPv4 binding address,
  whereas others listen on IPv6.
- Version 1.4 fixed several cases where nodes could reconnect too quickly,
  using 100% CPU and flooding logs. This is now fixed completely, for example
  when a backup storage node was rejected because the upstream cluster was not
  ready.
- Tickless poll loop, for lower latency and CPU usage: nodes don't wake up
  every second anymore to check if a timeout has expired.
- Connections could be wrongly processed before being polled (for reading or
  writing). This happened if a file descriptor number was reallocated by the
  kernel for a connection, just after a connection was closed.
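
The tickless poll loop mentioned above can be illustrated with a short sketch.
It is not NEO's code: ``TicklessLoop`` and its methods are hypothetical names,
and it only shows the principle of sleeping until the nearest deadline instead
of waking up at a fixed interval.

.. code-block:: python

    import heapq
    import selectors
    import time


    class TicklessLoop:
        """Hypothetical event loop that sleeps until the next deadline."""

        def __init__(self):
            self.selector = selectors.DefaultSelector()
            self.deadlines = []  # min-heap of (deadline, sequence, callback)
            self._seq = 0

        def call_later(self, delay, callback):
            self._seq += 1  # tie-breaker so that callbacks are never compared
            heapq.heappush(
                self.deadlines, (time.monotonic() + delay, self._seq, callback))

        def next_timeout(self):
            # None means "block until there is I/O": no periodic wake-up.
            if not self.deadlines:
                return None
            return max(0.0, self.deadlines[0][0] - time.monotonic())

        def run_once(self):
            for key, _ in self.selector.select(self.next_timeout()):
                key.data(key.fileobj)  # I/O callback stored at registration
            now = time.monotonic()
            while self.deadlines and self.deadlines[0][0] <= now:
                heapq.heappop(self.deadlines)[2]()

With no pending timeout, ``next_timeout`` returns ``None`` and the process
stays asleep until a file descriptor becomes ready.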

Other changes are:

- IStorage: history() did not wait for the oid to be unlocked. This means that the
  latest version of an object could be missing from the result.
- Log files can now be specified in configuration files.
- ~(user) constructions are expanded for all paths in configuration (file or
  command line). This does not concern non-daemon executables like neoctl.
- For neoctl, -l option now logs everything on disk automatically.
- The admin node no longer resets the list of known masters from
  configuration when reconnecting, for consistency with client nodes.
- Code refactoring and improvements to logging and debugging.
- A notable change in the test suite is that the occasional deadlocks that
  affected threaded tests have been fixed.

1.4 (2015-07-13)
----------------

This version comes with a change in the SQL tables format, to fix a potential
crash of storage nodes when storing values that only differ by the compression
flag. See UPGRADE notes if you think your application may be affected by this
bug.

- Performance and features:

  - 'Importer' storage backend has been significantly sped up.

  - Support for TokuDB has been added to MySQL storage backend. The engine is
    still InnoDB by default, and it can be selected via a new 'neostorage'
    option.

  - A 'neomaster' option has been added to automatically start a new cluster
    if the number of pending storage nodes is greater than or equal to the
    specified value.

- Bugfixes:

  - Storage crashed when reading empty transactions. We still need to decide
    whether NEO should:

    - continue to store such transactions;
    - ignore them on commit, like other ZODB implementations;
    - or fail on commit.

  - Storage crashed when a client tried to "steal" the UUID of another client.

  - Client could get stuck forever on unreadable cells when not connected to the
    master.

  - Client could only instantiate NEOStorage from the main thread, and the
    RTMIN+2 signal displayed logs for only 1 NEOStorage. Now, RTMIN+2 & RTMIN+3
    are set up when neo.client module is imported.

- Plus fixes and improvements to logging and debugging.

1.3 (2015-01-13)
----------------

- Version 1.2 added a new 'Importer' storage backend but it had 2 bugs.

  - An interrupted migration could not be resumed.
  - Merging several ZODB databases only worked if NEO could import all classes
    used by the application. This has been fixed by repickling without loading
    any object.

- Logging has been improved for a better integration with the environment:

  - RTMIN+1 signal was changed to reopen logs. RTMIN+1 & RTMIN+2 signals, which
    were previously used for debugging, have been remapped to RTMIN+2 & RTMIN+3
  - In Zope, client registers automatically for log rotation (USR2).
  - NEO logs are SQLite DBs that are no longer opened with a persistent
    journal, because this is incompatible with the rename+reopen way to rotate
    logs, and we want to support logrotate.
  - 'neolog' can now open gzip/bz2 compressed logs transparently.
  - 'neolog' does not spam the console anymore when piped to a process that
    exits prematurely.

- MySQL backend has been updated to work with recent MariaDB (>=10).
- 2 'neomaster' command-line options were added to set upstream cluster/masters.

1.2 (2014-07-30)
----------------

The most important changes in this version concern the conversion of
databases from/to NEO:

- A new 'Importer' storage backend has been implemented and this is now the
  recommended way to migrate existing Zope databases. See 'importer.conf'
  example file for more information.
- 'neomigrate' command refused to run since version 1.0
- Exported data serials by NEO iterator were wrong. There are still differences
  with FileStorage:

  - NEO always resolves to original serial, to avoid any indirection
    (which slightly speeds up undo at the expense of a more complex pack code)
  - NEO does not make any difference between object deletion and creation undone
    (data serial always null in storage)

  Apart from that, conversion of database back from NEO should be fixed.

Other changes are:

- A warning was added in 'neo.conf' about a possible misuse of replicas.
- Compatibility with Python 2.6 has been dropped.
- Support for recent versions of SQLite has been added.
- A memory leak has been fixed in replication.
- MySQL backend now fails instead of silently reconnecting if there is any
  pending change, which could cause data loss.
- Optimization and minor bugfixes.

1.1 (2014-01-07)
----------------

- Client failed at reconnecting properly to master. It could kill the master
  (during tpc_finish!) or end up with invalid caches (i.e. possible data
  corruption). Now, connection to master is even optional between
  transaction.begin() and tpc_begin, as long as partition table contains
  up-to-date data.
- Compatibility with ZODB 3.9 has been dropped. Only 3.10.x branch is supported.
- checkCurrentSerialInTransaction was not working.
- Optimization and minor bugfixes.

1.0 (2012-08-28)
----------------

This version mainly comes with stabilized SQL tables format and efficient backup
feature, relying on replication, which has been fully reimplemented:

- It is now incremental, instead of being done on whole partitions.
  The schema of MySQL tables has been changed in order to optimize storage
  layout, for good partial replication performance.
- It runs at lowest priority so as not to degrade performance for client nodes.
- A cluster in the new BACKINGUP state is a client to a normal cluster and all
  its storage nodes are notified of invalidations and replicate from upstream
  nodes.

Other changes are:

- Compatibility with Python < 2.6 and ZODB < 3.9 has been dropped.
- Cluster is now automatically started when all storage nodes of UP_TO_DATE
  cells are available, similarly to ``mdadm assemble --no-degraded`` behaviour.
- NEO learned to check replicas, to detect data corruption or bugs during
  replication. When done on a backup cluster, upstream data is used as
  reference. This is still limited to data indexes (tid & oid/serial).
- NEO logs are now SQLite DBs that always contain all debugging information
  including exchanged packets. Records are first kept in RAM, at most 16 MB by
  default, and they are flushed to disk only upon RTMIN signal or any important
  record. A 'neolog' script has been written to help reading such DBs.
- Master addresses must be separated by spaces. '/' can't be used anymore.
- Adding and removing master nodes is now easier: unknown incoming master nodes
  are now accepted instead of rejected, and nodes can be given a path to a file
  that maintains a list of known master nodes.
- Node UUIDs have been shortened from 16 to 4 bytes, for better performance and
  easier debugging.
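
The log buffering described above (records kept in RAM and flushed only on
demand or on an important record) resembles what the standard library's
``logging.handlers.MemoryHandler`` does. The sketch below only illustrates
that pattern and is not NEO's logger: ``make_buffered_logger`` is a
hypothetical helper, and the capacity is a number of records, not a size in
bytes.

.. code-block:: python

    import logging
    import logging.handlers


    def make_buffered_logger(target, capacity=1000):
        """Return a logger buffering records in RAM in front of `target`."""
        handler = logging.handlers.MemoryHandler(
            capacity,                    # flush when this many records queue up
            flushLevel=logging.ERROR,    # or as soon as an important one arrives
            target=target)
        logger = logging.getLogger("buffered-demo")
        logger.setLevel(logging.DEBUG)
        logger.propagate = False
        logger.addHandler(handler)
        return logger, handler

Calling ``handler.flush()`` explicitly, for example from a signal handler,
writes the buffered records to the target without waiting for an error.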

Also contains code clean-ups and bugfixes.

0.10.1 (2012-03-13)
-------------------

- Client didn't limit its memory usage when committing big transactions.
- Master failed to disconnect clients when cluster leaves RUNNING state.

0.10 (2011-10-17)
-----------------

- Storage was unable or slow to process large-sized transactions.
  This required changing the protocol and the MySQL tables format.
- NEO learned to store empty values (although it's useless when managed by
  a ZODB Connection).

0.9.2 (2011-10-17)
------------------

- storage: a specific socket can be given to MySQL backend
- storage: a ConflictError could happen when client is much faster than master
- 'verbose' command line option of 'neomigrate' did not work
- client: ZODB monkey-patch randomly raised a NameError

0.9.1 (2011-09-24)
------------------

- client: method to retrieve history of persistent objects was incompatible
  with recent ZODB and needlessly asked all storages systematically.
- neoctl: 'print node' command (to get list of all nodes) raised an
  AssertionError.
- 'neomigrate' raised a TypeError when converting NEO DB back to FileStorage.

0.9 (2011-09-12)
----------------

Initial release.

NEO is considered stable enough to replace existing ZEO setups, except that:

- there's no backup mechanism (aka efficient snapshotting): there's only
  replication and underlying MySQL tools

- MySQL tables format may change in the future