1. 16 Jan, 2017 4 commits
  2. 13 Jan, 2017 4 commits
  3. 12 Jan, 2017 1 commit
    • qa: make closure of NEOCluster more reliable in threaded tests · e2183483
      Julien Muchembled authored
      Instances of NEOCluster were not deleted as soon as the only referrers were
      weak proxies (at least that's what a quick check with the 'gc' module showed
      at the beginning of tearDown). In some cases, __del__ was called while the next
      test was logging a message, which led to deadlocks.
      
      Without those proxies, it may be reliable, but only on CPython. See
        http://doc.pypy.org/en/latest/cpython_differences.html#differences-related-to-garbage-collection-strategies
      
      Relying on __del__ to close a cluster was wrong. NEOCluster is now a context
      manager that closes it explicitly at exit, in addition to stopping it automatically.
      The NEOCluster.stop method combines the previous stop/__del__/reset methods.
      
      A new 'with_cluster' decorator is also added to avoid excessive indentation
      in tests. Unindentation of existing tests will be done later.
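
      The pattern described above can be sketched as follows. This is only an
      illustration under stated assumptions: 'Cluster' is a hypothetical stand-in
      for NEOCluster, and the decorator signature and the 'partitions' option are
      guesses, not the actual NEO test API.

```python
import functools


class Cluster(object):
    """Hypothetical stand-in for NEOCluster; not the real class."""

    def __init__(self, **kw):
        self.options = kw
        self.running = False
        self.closed = False

    def start(self):
        self.running = True

    def stop(self):
        # Single, idempotent teardown path, instead of spreading the
        # cleanup over separate stop/__del__/reset methods.
        self.running = False
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Explicit cleanup: runs even if the test body raised, and does
        # not depend on when the garbage collector calls __del__.
        self.stop()


def with_cluster(**cluster_kw):
    """Pass a fresh cluster to the test and stop it afterwards."""
    def decorator(test):
        @functools.wraps(test)
        def wrapper(self, *args, **kw):
            with Cluster(**cluster_kw) as cluster:
                return test(self, cluster, *args, **kw)
        return wrapper
    return decorator


class ExampleTest(object):
    # The decorator saves one indentation level compared to writing
    # "with Cluster(...) as cluster:" inside every test.
    @with_cluster(partitions=2)
    def test_start(self, cluster):
        cluster.start()
        self.cluster = cluster  # kept only so the result can be inspected
        return cluster.running
```

      Because cleanup happens in __exit__, the same test should behave
      identically on CPython and on interpreters without reference counting.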
  4. 11 Jan, 2017 3 commits
  5. 09 Jan, 2017 1 commit
  6. 06 Jan, 2017 3 commits
  7. 04 Jan, 2017 2 commits
  8. 03 Jan, 2017 1 commit
  9. 30 Dec, 2016 1 commit
  10. 28 Dec, 2016 6 commits
  11. 27 Dec, 2016 1 commit
  12. 26 Dec, 2016 5 commits
  13. 23 Dec, 2016 1 commit
  14. 22 Dec, 2016 1 commit
  15. 21 Dec, 2016 3 commits
    • storage: start replicating the partition which is furthest behind · 4d3f3723
      Julien Muchembled authored
      This fixes the following case when the backup is far behind the upstream DB,
      and there are transactions being committed at the same time:
      
      1. replicate partition 0
      2. replicate partition 0
      3. replicate partition 1
      4. replicate partition 0
      5. replicate partition 1
      6. replicate partition 2
      7. replicate partition 0
      ...
      and so on in a quadratic way.
      
      When the upstream activity was too high, the backup could even be stuck looping
      on the first partitions.
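
      A minimal sketch of the selection rule named in the title, assuming the
      storage node keeps, for each partition offset, the TID up to which it has
      replicated. The names 'next_partition', 'replicate_all' and the 'backup_tid'
      mapping are hypothetical and do not come from the NEO code base.

```python
def next_partition(backup_tid):
    """Return the offset of the partition that is furthest behind.

    backup_tid maps a partition offset to the last TID replicated for it.
    Ties are broken by the lowest offset so that the choice is deterministic.
    """
    return min(backup_tid, key=lambda offset: (backup_tid[offset], offset))


def replicate_all(backup_tid, upstream_tid):
    """Replicate partitions until all have reached upstream_tid.

    Returns the order in which partitions were replicated. Each partition
    is visited once here, because the one that lags most is always picked
    first instead of restarting from partition 0.
    """
    order = []
    while any(tid < upstream_tid for tid in backup_tid.values()):
        offset = next_partition(backup_tid)
        backup_tid[offset] = upstream_tid  # stands in for the real replication
        order.append(offset)
    return order
```

      With this rule, new upstream commits only push already replicated
      partitions back to the end of the queue; they cannot keep the backup
      looping on the first partitions.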
    • master: fix possibly wrong knowledge of cells' backup_tid when resuming backup · 17af3b47
      Julien Muchembled authored
      The issue happens when there were commits while the backup cluster was down.
      In this case, the master thinks that these commits are already replicated,
      reporting a wrong backup_tid to neoctl. The problem resolved itself once:
      - there are new commits triggering replication for all partitions;
      - all storage nodes have really replicated.
      
      This also resulted in an inconsistent database when leaving backup mode during
      this period.
    • Minor comment/doc changes · c95c6c39
      Julien Muchembled authored
  16. 20 Dec, 2016 1 commit
  17. 06 Dec, 2016 2 commits
    • master,client: ignore notifications before complete initialization · 36b2d141
      Julien Muchembled authored
      A backup master crashed with the following traceback after a reconnection:
      
          Traceback (most recent call last):
            File "neo/master/app.py", line 127, in run
              self._run()
            File "neo/master/app.py", line 147, in _run
              self.playPrimaryRole()
            File "neo/master/app.py", line 348, in playPrimaryRole
              self.backup_app.provideService())
            File "neo/master/backup_app.py", line 123, in provideService
              poll(1)
            File "neo/lib/event.py", line 126, in poll
              to_process.process()
            File "neo/lib/connection.py", line 500, in process
              self._handlers.handle(self, self._queue.pop(0))
            File "neo/lib/connection.py", line 110, in handle
              self._handle(connection, packet)
            File "neo/lib/connection.py", line 125, in _handle
              handler.packetReceived(connection, packet)
            File "neo/lib/handler.py", line 117, in packetReceived
              self.dispatch(*args)
            File "neo/lib/handler.py", line 66, in dispatch
              method(conn, *args, **kw)
            File "neo/master/handlers/backup.py", line 52, in invalidateObjects
              app.invalidatePartitions(tid, partition_set)
            File "neo/master/backup_app.py", line 257, in invalidatePartitions
              self.triggerBackup(node)
            File "neo/master/backup_app.py", line 281, in triggerBackup
              assert cell_list, offset
          AssertionError: 0
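
      The idea in the title can be illustrated with a guard in the notification
      handler. This is a sketch only: the class, its attributes and its method
      names are hypothetical, and the actual fix in NEO may be structured
      differently.

```python
class BackupHandler(object):
    """Hypothetical handler; the names do not come from NEO."""

    def __init__(self):
        self.initialized = False
        self.invalidated = []

    def initialization_complete(self):
        # Called once the state needed to process notifications
        # (for example the partition table) has been fully received.
        self.initialized = True

    def invalidate_objects(self, tid, partition_set):
        if not self.initialized:
            # Too early: acting on the notification now would use
            # incomplete state, so it is ignored. Return False so that
            # callers can tell it was dropped.
            return False
        self.invalidated.append((tid, sorted(partition_set)))
        return True
```

      Dropping early notifications is assumed to be safe here because the
      state fetched during initialization already covers them.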