  1. 16 Jan, 2017 1 commit
  2. 13 Jan, 2017 4 commits
  3. 12 Jan, 2017 1 commit
    • qa: make closure of NEOCluster more reliable in threaded tests · e2183483
      Julien Muchembled authored
      Instances of NEOCluster were not deleted as soon as the only referrers were
      weak proxies (at least that's what a quick check with the 'gc' module showed
      at the beginning of tearDown). In some cases, __del__ was called while the next
      test was logging a message, which led to deadlocks.
      
      Without those proxies, it may be reliable, but only on CPython. See
        http://doc.pypy.org/en/latest/cpython_differences.html#differences-related-to-garbage-collection-strategies
      
      Relying on __del__ to close a cluster was wrong. NEOCluster is now a context
      manager that closes it explicitly at exit, in addition to stopping it
      automatically. The NEOCluster.stop method combines the previous
      stop/__del__/reset methods.
      
      A new 'with_cluster' decorator is also added to avoid excessive indentation
      in tests. Unindentation of existing tests will be done later.
      e2183483
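
      The pattern described in this commit can be sketched as follows. This is a
      minimal illustration, not the NEO test code: the Cluster class, its
      attributes, and the signature of with_cluster are assumptions made for the
      example; only the idea (explicit cleanup in __exit__, plus a decorator
      that saves one indentation level) comes from the commit message.

```python
import functools


class Cluster(object):
    """Hypothetical stand-in for NEOCluster (not the real class)."""

    def __init__(self, **kw):
        self.kw = kw
        self.running = False
        self.closed = False

    def start(self):
        self.running = True

    def stop(self):
        # Single explicit cleanup path, safe to call more than once.
        # Nothing here depends on __del__ or on garbage-collection timing.
        self.running = False
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()
        return False  # never swallow an exception raised by the test


def with_cluster(**cluster_kw):
    """Hypothetical decorator: pass a managed cluster to the test method."""
    def decorator(test):
        @functools.wraps(test)
        def wrapper(self, *args, **kw):
            with Cluster(**cluster_kw) as cluster:
                return test(self, cluster, *args, **kw)
        return wrapper
    return decorator


class DemoTest(object):
    seen = []

    @with_cluster(partitions=2)
    def test_something(self, cluster):
        cluster.start()
        self.seen.append(cluster)
        return cluster.running


result = DemoTest().test_something()
```

      The cluster is stopped when the with block exits, whether the test passes
      or raises, so the next test never races with a late finalizer.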
  4. 11 Jan, 2017 3 commits
  5. 09 Jan, 2017 1 commit
  6. 06 Jan, 2017 3 commits
  7. 04 Jan, 2017 2 commits
  8. 03 Jan, 2017 1 commit
  9. 30 Dec, 2016 1 commit
  10. 28 Dec, 2016 6 commits
  11. 27 Dec, 2016 1 commit
  12. 26 Dec, 2016 5 commits
  13. 23 Dec, 2016 1 commit
  14. 22 Dec, 2016 1 commit
  15. 21 Dec, 2016 3 commits
    • storage: start replicating the partition which is furthest behind · 4d3f3723
      Julien Muchembled authored
      This fixes the following case when the backup is far behind the upstream DB,
      and there are transactions being committed at the same time:
      
      1. replicate partition 0
      2. replicate partition 0
      3. replicate partition 1
      4. replicate partition 0
      5. replicate partition 1
      6. replicate partition 2
      7. replicate partition 0
      ...
      and so on in a quadratic way.
      
      When the upstream activity was too high, the backup could even be stuck looping
      on the first partitions.
      4d3f3723
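
      The selection rule named in the commit title can be sketched as below.
      This is an assumption-laden illustration, not the storage node code:
      replicated_tid is a hypothetical mapping from partition offset to the last
      replicated transaction id, and plain integers stand in for real TIDs.

```python
def pick_furthest_behind(replicated_tid):
    """Return the offset of the partition with the oldest replicated tid.

    Ties are broken by the lowest offset so the choice is deterministic.
    """
    return min(replicated_tid,
               key=lambda offset: (replicated_tid[offset], offset))


# Hypothetical state: partition 1 is the furthest behind.
replicated_tid = {0: 50, 1: 10, 2: 30}
upstream_tid = 60
order = []
for _ in range(3):
    offset = pick_furthest_behind(replicated_tid)
    order.append(offset)
    # Pretend the partition caught up with upstream, which keeps advancing.
    replicated_tid[offset] = upstream_tid
    upstream_tid += 10
```

      Each partition is visited once here, instead of restarting from
      partition 0 whenever a new commit arrives upstream.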
    • master: fix possibly wrong knowledge of cells' backup_tid when resuming backup · 17af3b47
      Julien Muchembled authored
      The issue happens when there were commits while the backup cluster was down.
      In this case, the master thinks that these commits are already replicated,
      reporting a wrong backup_tid to neoctl. The issue resolved itself once:
      - there are new commits triggering replication for all partitions;
      - all storage nodes have really replicated.
      
      This also resulted in an inconsistent database when leaving backup mode during
      this period.
      17af3b47
    • Minor comment/doc changes · c95c6c39
      Julien Muchembled authored
      c95c6c39
  16. 20 Dec, 2016 1 commit
  17. 06 Dec, 2016 2 commits
    • master,client: ignore notifications before complete initialization · 36b2d141
      Julien Muchembled authored
      A backup master crashed with the following traceback after a reconnection:
      
          Traceback (most recent call last):
            File "neo/master/app.py", line 127, in run
              self._run()
            File "neo/master/app.py", line 147, in _run
              self.playPrimaryRole()
            File "neo/master/app.py", line 348, in playPrimaryRole
              self.backup_app.provideService())
            File "neo/master/backup_app.py", line 123, in provideService
              poll(1)
            File "neo/lib/event.py", line 126, in poll
              to_process.process()
            File "neo/lib/connection.py", line 500, in process
              self._handlers.handle(self, self._queue.pop(0))
            File "neo/lib/connection.py", line 110, in handle
              self._handle(connection, packet)
            File "neo/lib/connection.py", line 125, in _handle
              handler.packetReceived(connection, packet)
            File "neo/lib/handler.py", line 117, in packetReceived
              self.dispatch(*args)
            File "neo/lib/handler.py", line 66, in dispatch
              method(conn, *args, **kw)
            File "neo/master/handlers/backup.py", line 52, in invalidateObjects
              app.invalidatePartitions(tid, partition_set)
            File "neo/master/backup_app.py", line 257, in invalidatePartitions
              self.triggerBackup(node)
            File "neo/master/backup_app.py", line 281, in triggerBackup
              assert cell_list, offset
          AssertionError: 0
      36b2d141
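
      The commit message shows only the crash, not the fix. One way to read the
      title is a handler that drops notifications until initialization is
      complete. The sketch below assumes that reading; the class, the
      initialized flag, and the method name are hypothetical and are not taken
      from neo/master/handlers/backup.py.

```python
class NotificationHandler(object):
    """Hypothetical handler that ignores early notifications."""

    def __init__(self):
        self.initialized = False  # set once the partition table is known
        self.processed = []

    def invalidate_objects(self, tid, partition_set):
        if not self.initialized:
            # Too early: the state needed to act on the notification
            # (e.g. the list of cells per partition) is not loaded yet.
            return
        self.processed.append((tid, frozenset(partition_set)))


handler = NotificationHandler()
handler.invalidate_objects(1, {0})   # ignored, not yet initialized
handler.initialized = True
handler.invalidate_objects(2, {0, 1})  # processed
```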
  18. 01 Dec, 2016 3 commits
    • Remove dead code found by coverage · 23b9544d
      Julien Muchembled authored
      23b9544d
    • Remove some useless unit tests · 1e4a4178
      Julien Muchembled authored
      Many "unit" tests (!= "threaded" tests) do no more than check
      implementation details, and increase coverage artificially. As with testEvent
      in commit 71e30fb9, most of these tests will either be removed or rewritten
      as threaded tests.
      
      The fact that the remaining unit tests actually cover code that other tests
      don't gives motivation to maintain them. It will also be less code to update
      when switching to https://pypi.python.org/pypi/mock
      
      I proceeded as follows:
      
      1. Measure coverage for all tests except unit tests. While checking my work,
         I found that coverage stats for threaded/functional/zodb tests are quite
         unstable, so I restarted from the beginning by doing this measure several
         times and only keeping the intersection of coverage data.
      
      2. Measure coverage individually for each 'unit' test, and subtract the
         data in 1 from each result.
      
      3. The candidates for deletion are the tests left with no covered code.
      
      Tests I didn't delete:
      
      - neo.tests.master.testElectionHandler: I always make minimal changes to
        election code as long as there's no serious review.
      
      - neo.tests.master.testMasterPT.MasterPartitionTableTests.test_13_outdate
      
      - 4 tests in neo.tests.testPT:
        test_01_Cell, test_04_removeCell, test_06_clear, test_08_filled
      
      - neo.tests.storage.testStorage{MySQL,SQLite}
      
      - neo.tests.testUtil.UtilTests.testReadBufferRead
      
      In a way, this commit is actually quite conservative. There are still many
      useless tests that only check error paths; for simple methods, such tests
      just duplicate the tested code.
      1e4a4178
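
      The three-step procedure in this commit amounts to set arithmetic on
      coverage data. The sketch below assumes covered lines are represented as
      (filename, line number) pairs; the function names and sample data are
      hypothetical, and no real coverage tool API is used.

```python
def stable_coverage(runs):
    """Step 1: keep only lines covered in every run (needs >= 1 run)."""
    return set.intersection(*[set(run) for run in runs])


def deletion_candidates(unit_coverage, baseline):
    """Steps 2 and 3: tests whose coverage adds nothing to the baseline."""
    return sorted(
        name for name, covered in unit_coverage.items()
        if not set(covered) - baseline
    )


# Hypothetical data: two runs of the non-unit tests, with unstable results.
runs = [
    {("a.py", 1), ("a.py", 2), ("b.py", 5)},
    {("a.py", 1), ("a.py", 2)},
]
baseline = stable_coverage(runs)

unit_coverage = {
    "test_x": {("a.py", 1)},               # fully covered elsewhere
    "test_y": {("a.py", 2), ("b.py", 5)},  # only stable cover of b.py:5
}
candidates = deletion_candidates(unit_coverage, baseline)
```

      Taking the intersection over several runs makes the baseline smaller, so
      fewer unit tests qualify for deletion, which errs on the safe side.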