1. 15 Sep, 2015 1 commit
  2. 28 Aug, 2015 1 commit
    • Fix occasional deadlocks in threaded tests · 0b93b1fb
      Julien Muchembled authored
      Deadlocks mainly happened while stopping a cluster, hence the complete review
      of NEOCluster.stop().
      
      A major change is to make the client node handle its lock like other nodes
      (i.e. in the polling thread itself) to better know when to call
      Serialized.background() (there was a race condition with the test of
      'self.poll_thread.isAlive()' in ClientApplication.close).
  3. 14 Aug, 2015 1 commit
    • Do not reconnect too quickly to a node after an error · d898a83d
      Julien Muchembled authored
      For example, a backup storage node that was rejected because the upstream
      cluster was not ready could reconnect in a loop without delay, using 100% CPU
      and flooding logs.
      
      A new 'setReconnectionNoDelay' method on Connection can be used for cases where
      it's legitimate to quickly reconnect.
      
      With this new delayed reconnection, it's possible to remove the remaining
      time.sleep().
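
The delayed reconnection described above can be illustrated with a minimal sketch. This is not the NEO implementation: the class `ReconnectionPolicy`, its methods, and the 1-second default delay are all hypothetical names and values chosen for illustration. Only the idea (wait before retrying, unless a no-delay flag was set) comes from the commit message.

```python
import time


class ReconnectionPolicy:
    """Hypothetical helper: enforce a minimum delay between connection attempts."""

    def __init__(self, delay=1.0, clock=time.monotonic):
        self.delay = delay          # assumed value, not taken from NEO
        self.clock = clock          # injectable so the policy can be tested
        self._next_attempt = 0.0    # earliest time a new attempt is allowed
        self._no_delay = False

    def set_no_delay(self):
        # Similar in spirit to 'setReconnectionNoDelay': the next
        # reconnection is legitimate and may happen immediately.
        self._no_delay = True

    def on_closed(self):
        # Called when a connection is closed or rejected.
        now = self.clock()
        self._next_attempt = now if self._no_delay else now + self.delay
        self._no_delay = False      # the flag applies to one reconnection only

    def time_until_retry(self):
        # Remaining wait in seconds; usable as a poll timeout.
        return max(0.0, self._next_attempt - self.clock())

    def may_connect(self):
        return self.time_until_retry() == 0.0
```

An event loop could pass `time_until_retry()` as its poll timeout, which is one way to avoid an explicit `time.sleep()` between attempts.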
  4. 12 Aug, 2015 1 commit
  5. 13 Jul, 2015 1 commit
  6. 10 Jul, 2015 1 commit
  7. 25 Jul, 2014 2 commits
  8. 20 Jun, 2014 1 commit
    • client: clean up import/export code · d562bf8f
      Julien Muchembled authored
      Export:
      - Remove leftover warning about a bug that was fixed in
        commit e76af297
      - In neomigrate script, open NEO storage read-only.
      - IStorageIteration is already implemented.
      
      Import:
      - Review comments.
      - In neomigrate script, warn that IStorageRestoreable is not implemented.
      - Do not call 'close' method on source iterator. BaseStorage does not do it and
        this is not part of the ZODB API. In the case of FileStorage, resources are
        freed automatically during garbage collection.
  9. 03 Jun, 2014 1 commit
  10. 29 May, 2014 1 commit
  11. 07 Jan, 2014 1 commit
    • Add test showing that clients may be stuck on an old snapshot in case of failure during tpc_finish · fd4cfaa9
      Julien Muchembled authored
      If anything goes wrong after a transaction is locked and before the end of
      onTransactionCommitted, the recovery phase should be run again, so that the
      master gets the correct last tid.
      
      The following patch by Vincent is an attempt to fix this:
      
      --- a/neo/master/app.py
      +++ b/neo/master/app.py
      @@ -329,8 +329,8 @@ def playPrimaryRole(self):
      
               # recover the cluster status at startup
               try:
      -            self.runManager(RecoveryManager)
                   while True:
      +                self.runManager(RecoveryManager)
                       self.runManager(VerificationManager)
                       try:
                           if self.backup_tid:
      @@ -338,10 +338,6 @@ def playPrimaryRole(self):
                                   raise RuntimeError("No upstream cluster to backup"
                                                      " defined in configuration")
                               self.backup_app.provideService()
      -                        # Reset connection with storages (and go through a
      -                        # recovery phase) when leaving backup mode in order
      -                        # to get correct last oid/tid.
      -                        self.runManager(RecoveryManager)
                               continue
                           self.provideService()
                       except OperationFailure:
  12. 23 Aug, 2012 1 commit
  13. 20 Aug, 2012 2 commits
    • Comment about backup limitations · dd556379
      Julien Muchembled authored
    • More bugfixes to backup mode · 08742377
      Julien Muchembled authored
      - catch OperationFailure
      - reset transaction manager when leaving backup mode
      - send the appropriate target tid to a storage that updates an outdated cell
      - clean up partition table when leaving BACKINGUP state unexpectedly
      - make sure all readable cells of a partition have the same 'backup_tid'
        if they have the same data, so that we know when internal replication is
        finished when leaving backup mode
      - fix storage not finishing internal replication when leaving backup mode
  14. 16 Aug, 2012 1 commit
  15. 15 Aug, 2012 1 commit
  16. 10 Aug, 2012 1 commit
    • Start renaming UUID into NID, because node IDs are no longer 128 bits long · b81ae60a
      Julien Muchembled authored
      SQL tables can be upgraded using:
        UPDATE config SET name = 'nid' WHERE name = 'uuid';
      
      then for MySQL:
        ALTER TABLE pt CHANGE uuid nid INT NOT NULL;
      
      or SQLite:
        ALTER TABLE pt RENAME TO old_pt;
        CREATE TABLE pt (rid INTEGER NOT NULL, nid INTEGER NOT NULL, state INTEGER NOT NULL, PRIMARY KEY (rid, nid));
        INSERT INTO pt SELECT * from old_pt;
        DROP TABLE old_pt;
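
As a rough illustration, the SQLite upgrade above can be replayed from Python with the standard `sqlite3` module. The SQL statements are the ones given in the commit message. The helper names and the old schema built by `make_old_schema` (a `config` table with `name`/`value` columns and a `pt` table with a `uuid` column) are assumptions made only so that the sketch is runnable.

```python
import sqlite3


def make_old_schema(conn):
    # Hypothetical pre-upgrade schema, assumed for this example only.
    conn.execute("CREATE TABLE config (name TEXT PRIMARY KEY, value TEXT)")
    conn.execute(
        "CREATE TABLE pt (rid INTEGER NOT NULL, uuid INTEGER NOT NULL,"
        " state INTEGER NOT NULL, PRIMARY KEY (rid, uuid))")
    conn.execute("INSERT INTO config VALUES ('uuid', '42')")
    conn.execute("INSERT INTO pt VALUES (0, 42, 1)")
    conn.commit()


def upgrade_sqlite(conn):
    # Run the statements from the commit message in a single transaction:
    # 'with conn' commits on success and rolls back on error.
    with conn:
        conn.execute("UPDATE config SET name = 'nid' WHERE name = 'uuid'")
        conn.execute("ALTER TABLE pt RENAME TO old_pt")
        conn.execute(
            "CREATE TABLE pt (rid INTEGER NOT NULL, nid INTEGER NOT NULL,"
            " state INTEGER NOT NULL, PRIMARY KEY (rid, nid))")
        conn.execute("INSERT INTO pt SELECT * FROM old_pt")
        conn.execute("DROP TABLE old_pt")


def column_names(conn, table):
    # PRAGMA table_info returns one row per column; index 1 is the name.
    return [row[1] for row in conn.execute("PRAGMA table_info(%s)" % table)]
```

After `upgrade_sqlite(conn)`, `column_names(conn, 'pt')` lists `nid` in place of `uuid`, and the existing rows are preserved.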
  17. 23 Jul, 2012 2 commits
  18. 13 Jul, 2012 1 commit
  19. 06 Jul, 2012 1 commit
  20. 05 Jul, 2012 1 commit
  21. 23 Apr, 2012 1 commit
    • Document an RC bug on tpc_finish. · 6c500078
      Vincent Pelletier authored
      Also, change the way TTIDs are generated in preparation for that bug's fix:
      we will need TTIDs to be monotonic across master restarts, and the TID
      generator provides this feature.
  22. 12 Mar, 2012 2 commits
  23. 24 Feb, 2012 1 commit
    • Implements backup using specialised storage nodes and relying on replication · 8e3c7b01
      Julien Muchembled authored
      Replication is also fully reimplemented:
      - It is not done anymore on whole partitions.
      - It runs at the lowest priority so as not to degrade performance for client nodes.
      
      The schema of the MySQL table is changed to optimize storage layout: rows are now
      grouped by age, for good partial replication performance.
      This certainly also speeds up simple loads/stores.
  24. 29 Sep, 2011 1 commit
  25. 09 Sep, 2011 1 commit
  26. 03 Sep, 2011 1 commit
  27. 10 Jun, 2011 1 commit
    • Introduce light functional tests, using threads and serialized processing · 1ef149c2
      Julien Muchembled authored
      This allows setting up an almost fully functional cluster without additional
      processes. Threads are scheduled so that they never run simultaneously,
      eliminating most randomness.
      There is still much room for improvement, such as controlled randomization
      or easier debugging when switching from one thread to another.
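
The idea of scheduling threads so that they never run simultaneously can be sketched as follows. This is not the scheduler used by the NEO test suite: the `Baton` class, `run_serialized`, and the round-robin order are hypothetical, chosen only to show how serialized threads give a deterministic interleaving.

```python
import threading


class Baton:
    """Hypothetical scheduler: only the thread that owns the baton may run."""

    def __init__(self):
        self._cond = threading.Condition()
        self._owner = None  # name of the thread currently allowed to run

    def give(self, name):
        # Hand control over to the thread called 'name'.
        with self._cond:
            self._owner = name
            self._cond.notify_all()

    def wait_for(self, name):
        # Block until the thread called 'name' owns the baton.
        with self._cond:
            self._cond.wait_for(lambda: self._owner == name)


def run_serialized(names, steps):
    """Run one thread per name, in round-robin order; return the trace."""
    baton = Baton()
    trace = []

    def worker(index):
        name = names[index]
        successor = names[(index + 1) % len(names)]
        for step in range(steps):
            baton.wait_for(name)
            trace.append((name, step))  # safe: no other thread is running
            baton.give(successor)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(len(names))]
    for thread in threads:
        thread.start()
    baton.give(names[0])  # start the first thread
    for thread in threads:
        thread.join()
    return trace
```

Because exactly one thread owns the baton at any time, `run_serialized(['master', 'storage', 'client'], 2)` yields the same trace on every run.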
      
      As mock objects are not usable in such tests, an API should be implemented to
      trace/count any method call we'd like to check.
      
      This fixes test_notifyNodeInformation_checkUnregisterStorage
      
      git-svn-id: https://svn.erp5.org/repos/neo/trunk@2775 71dcc9de-d417-0410-9af5-da40c76e7ee4
  28. 27 May, 2011 1 commit
    • connection: reimplement timeout logic and redefine pings as a keep-alive feature · 737e227a
      Julien Muchembled authored
      - The previous implementation was not able to import transactions with many
        small objects, the client being faster to send a store request than to
        process its answer. If X is the difference in time between these 2
        operations, the maximum number of objects a transaction could contain was
        CRITICAL_TIMEOUT / X.
        And the HasLock feature can't act as a workaround because it is not
        working yet.
      - Change the API of 'on_timeout', which is currently only used by HasLock.
      - Stop pinging when we wait for an answer. Such pings waste resources and
        would never recover from any bad state.
      - Make client connections send pings when they are idle instead.
        This implements keep-alive feature for high availability.
        Start with a non-configurable period of 60 seconds.
      - Move processing of ping/pong to handlers.
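
The keep-alive rule described in the list above (ping only when the connection is idle, never while an answer is pending) can be sketched as below. The `KeepAlive` class and its method names are hypothetical and do not reflect the actual NEO connection code. Only the 60-second period is taken from the commit message.

```python
import time


class KeepAlive:
    """Hypothetical idle tracker for a single client connection."""

    def __init__(self, period=60.0, clock=time.monotonic):
        self.period = period        # 60 seconds, as in the commit message
        self.clock = clock          # injectable so the logic can be tested
        self.pending_answers = 0    # requests still waiting for an answer
        self.last_activity = clock()

    def on_request_sent(self):
        self.pending_answers += 1
        self.last_activity = self.clock()

    def on_answer_received(self):
        self.pending_answers -= 1
        self.last_activity = self.clock()

    def on_ping_sent(self):
        # A ping counts as activity, so the next one waits a full period.
        self.last_activity = self.clock()

    def should_ping(self):
        # Never ping while waiting for an answer: the request timeout
        # already covers that case.
        if self.pending_answers:
            return False
        return self.clock() - self.last_activity >= self.period
```

A polling loop would call `should_ping()` on each iteration and send a ping followed by `on_ping_sent()` when it returns true.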
      
      git-svn-id: https://svn.erp5.org/repos/neo/trunk@2762 71dcc9de-d417-0410-9af5-da40c76e7ee4
  29. 02 May, 2011 1 commit
  30. 10 Jan, 2011 1 commit
  31. 16 Dec, 2010 1 commit
  32. 15 Dec, 2010 5 commits