1. 27 Apr, 2019 6 commits
    • tweak: add option to simulate · 2a27239d
      Julien Muchembled authored
      Initially, I wanted to do the simulation inside neoctl, but it has no knowledge
      of the topology (the master doesn't send the devpath values of storage nodes).
      Therefore, the work is delegated to the master node, which implies a change
      to the protocol.
    • Better error reporting from the master to neoctl for denied requests · c2c9e99d
      Julien Muchembled authored
      This stops abusing ProtocolError, which needlessly disconnects the admin node.
      The many 'if ... raise RuntimeError' statements in neo/neoctl/neoctl.py
      could be turned into assertions.
    • Make the number of replicas modifiable when the cluster is running · ef5fc508
      Julien Muchembled authored
      neoctl gets a new command to change the number of replicas.
      The number of replicas becomes a new partition table attribute and
      like the PT id, it is stored in the config table. On the other side,
      the configuration value for the number of partitions is dropped,
      since it can be computed from the partition table, which is
      always stored in full.
      The -p/-r master options now only apply at database creation.
      Some implementation notes:
      - The protocol is slightly optimized in that the master now automatically
        sends the whole partition table to the admin & client nodes upon
        connection, as it already does for storage nodes.
        This makes the protocol more consistent, and the master is now the only
        node that requests partition tables, during recovery.
      - Some parts become tricky because app.pt can be None in more cases.
        For example, this is why the extra condition in NodeManager.update
        (before app.pt.dropNode) was added.
        Likewise, the 'loadPartitionTable' method (storage) is not inlined
        because of unit tests.
        Overall, this commit simplifies more than it complicates.
      - In the master handlers, we stop hijacking the 'connectionCompleted'
        method for tasks to be performed on handler switches (often sending
        the full partition table).
      - The admin's 'bootstrapped' flag could have been removed earlier:
        race conditions can't happen since the AskNodeInformation packet
        was removed (commit d048a52d).
  2. 11 Mar, 2019 1 commit
  3. 07 Nov, 2018 1 commit
  4. 07 Aug, 2018 1 commit
    • Use argparse instead of optparse · 9f1e4eef
      Julien Muchembled authored
      Besides the use of another module for option parsing, the main change is that
      there's no longer a Config class that mixes configuration for different
      components. Application classes now take a simple 'dict' with parsed values.
      The changes in 'neoctl' are somewhat ugly, because command-line options are not
      defined on the command-line class, but this component is likely to disappear
      in the future.
      It remains possible to pass options via a configuration file. The code is a bit
      complex but isolated in neo.lib.config.
      For SSL, the code might be simpler with a single --ssl option that takes
      3 paths. This was not done so as not to break compatibility; hence the hack
      with an extra OptionList class in neo.lib.app.
      A new functional test tests the 'neomigrate' script, instead of just the
      internal API to migrate data.
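      A minimal sketch of the pattern described above (the option names are invented
      for illustration, not NEO's actual options): argparse produces a namespace,
      and the application receives a plain dict of parsed values instead of a
      shared Config object.

      ```python
      import argparse

      # Sketch only: '--bind' and '--replicas' are hypothetical option names.
      parser = argparse.ArgumentParser()
      parser.add_argument('--bind', default='127.0.0.1:10000')
      parser.add_argument('--replicas', type=int, default=0)

      class Application:
          def __init__(self, config):
              # the application sees a simple dict, not the parser or a Config class
              self.bind = config['bind']
              self.replicas = config['replicas']

      app = Application(vars(parser.parse_args(['--replicas', '2'])))
      ```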
  5. 29 Sep, 2017 1 commit
  6. 25 Apr, 2017 2 commits
  7. 18 Jan, 2017 1 commit
  8. 30 Nov, 2016 1 commit
  9. 27 Nov, 2016 1 commit
  10. 27 Oct, 2016 1 commit
    • neoctl: make 'print ids' command display time of TIDs · d9dd39f0
      Iliya Manolov authored
      Currently, the command "neoctl [arguments] print ids" has the following output:
          last_oid = 0x...
          last_tid = 0x...
          last_ptid = ...
          backup_tid = 0x...
          last_tid = 0x...
          last_ptid = ...
      depending on whether the cluster is in normal or backup mode.
      This is extremely unreadable since the admin is often interested in the time that corresponds to each tid. Now the output is:
          last_oid = 0x...
          last_tid = 0x... (yyyy-mm-dd hh:mm:ss.ssssss)
          last_ptid = ...
          backup_tid = 0x... (yyyy-mm-dd hh:mm:ss.ssssss)
          last_tid = 0x... (yyyy-mm-dd hh:mm:ss.ssssss)
          last_ptid = ...
      /reviewed-on !2
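      The timestamp shown next to each TID can be recovered from the TID itself.
      A sketch of the decoding, assuming the standard ZODB TimeStamp layout (the
      upper 32 bits pack year/month/day/hour/minute, the lower 32 bits are a
      fraction of a minute); the function name is invented for illustration:

      ```python
      import struct
      from datetime import datetime, timezone

      def tid_to_datetime(tid):
          """Decode an 8-byte TID into a UTC datetime (ZODB TimeStamp layout)."""
          high, low = struct.unpack('>II', tid)
          minute = high % 60
          high //= 60
          hour = high % 24
          high //= 24
          day = high % 31 + 1
          high //= 31
          month = high % 12 + 1
          year = high // 12 + 1900
          seconds = low * 60.0 / (1 << 32)   # lower word is a fraction of a minute
          return datetime(year, month, day, hour, minute,
                          int(seconds), int((seconds % 1) * 1e6),
                          tzinfo=timezone.utc)
      ```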
  11. 20 Apr, 2016 1 commit
  12. 25 Jan, 2016 1 commit
  13. 01 Dec, 2015 1 commit
    • Safer DB truncation, new 'truncate' ctl command · d3c8b76d
      Julien Muchembled authored
      With the previous commit, the request to truncate the DB was not stored
      persistently, which means that this operation was still vulnerable to the case
      where the master is restarted after some nodes, but not all, have already
      truncated. The master didn't have the information to fix this, and the result
      was a partially truncated DB.
      -> On a Truncate packet, a storage node only stores the tid somewhere, to send
         it back to the master, which stays in RECOVERING state as long as any node
         has a different value than that of the node with the latest partition table.
      We also want to make sure that there is no unfinished data, because a user may
      truncate at a tid higher than a locked one.
      -> Truncation is now effective at the end of the VERIFYING phase, just before
         returning the last ids to the master.
      Finally, all nodes should be truncated, to avoid an offline node coming back
      with a different history. Currently, this would not be an issue since
      replication always restarts from the beginning, but later we'd like nodes
      to remember where they stopped replicating.
      -> If a truncation is requested, the master waits for all nodes to be pending,
         even if it was previously started (the user can still force the cluster to
         start with neoctl). And any lost node during verification also causes the
         master to go back to recovery.
      Obviously, the protocol has been changed to split the LastIDs packet and
      introduce a new Recovery one, since it no longer makes sense to ask for
      last ids during recovery.
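      The first arrow above can be sketched as a simple predicate (names are
      invented for illustration, not NEO's actual master code): the master leaves
      RECOVERING only once every storage node reports the same truncation tid as
      the node holding the latest partition table.

      ```python
      # Hypothetical sketch of the master's recovery check.
      def may_leave_recovery(reported_tids, reference_tid):
          """True when no storage node still holds a truncation tid different
          from that of the node with the latest partition table."""
          return all(tid == reference_tid for tid in reported_tids)
      ```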
  14. 05 Oct, 2015 1 commit
  15. 24 Sep, 2015 2 commits
  16. 14 Aug, 2015 1 commit
    • Do not reconnect too quickly to a node after an error · d898a83d
      Julien Muchembled authored
      For example, a backup storage node that was rejected because the upstream
      cluster was not ready could reconnect in a loop without delay, using 100% CPU
      and flooding the logs.
      A new 'setReconnectionNoDelay' method on Connection can be used for cases where
      it's legitimate to quickly reconnect.
      With this new delayed reconnection, it's possible to remove the remaining
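      A minimal sketch of the behaviour described (the class shape and getter are
      assumptions; only the 'setReconnectionNoDelay' name comes from the commit):
      reconnections are delayed by default, and a one-shot flag lets legitimate
      callers skip the delay.

      ```python
      class Connection:
          """Sketch only: delayed reconnection with a one-shot opt-out."""
          def __init__(self, base_delay=1.0):
              self._base_delay = base_delay
              self._no_delay = False

          def setReconnectionNoDelay(self):
              # for the cases where reconnecting quickly is legitimate
              self._no_delay = True

          def getReconnectionDelay(self):
              if self._no_delay:
                  self._no_delay = False   # one-shot: the next failure is delayed again
                  return 0.0
              return self._base_delay
      ```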
  17. 12 Aug, 2015 2 commits
  18. 21 May, 2015 1 commit
  19. 03 Jun, 2014 1 commit
  20. 07 Jan, 2014 1 commit
  21. 23 Aug, 2012 2 commits
  22. 20 Aug, 2012 2 commits
  23. 16 Aug, 2012 1 commit
  24. 14 Aug, 2012 1 commit
  25. 01 Aug, 2012 1 commit
  26. 24 Jul, 2012 1 commit
  27. 23 Jul, 2012 1 commit
  28. 13 Jul, 2012 1 commit
  29. 21 Mar, 2012 2 commits