1. 20 Dec, 2016 1 commit
  2. 06 Dec, 2016 2 commits
    • master,client: ignore notifications before complete initialization · 36b2d141
      A backup master crashed with the following traceback after a reconnection:
      
          Traceback (most recent call last):
            File "neo/master/app.py", line 127, in run
              self._run()
            File "neo/master/app.py", line 147, in _run
              self.playPrimaryRole()
            File "neo/master/app.py", line 348, in playPrimaryRole
              self.backup_app.provideService())
            File "neo/master/backup_app.py", line 123, in provideService
              poll(1)
            File "neo/lib/event.py", line 126, in poll
              to_process.process()
            File "neo/lib/connection.py", line 500, in process
              self._handlers.handle(self, self._queue.pop(0))
            File "neo/lib/connection.py", line 110, in handle
              self._handle(connection, packet)
            File "neo/lib/connection.py", line 125, in _handle
              handler.packetReceived(connection, packet)
            File "neo/lib/handler.py", line 117, in packetReceived
              self.dispatch(*args)
            File "neo/lib/handler.py", line 66, in dispatch
              method(conn, *args, **kw)
            File "neo/master/handlers/backup.py", line 52, in invalidateObjects
              app.invalidatePartitions(tid, partition_set)
            File "neo/master/backup_app.py", line 257, in invalidatePartitions
              self.triggerBackup(node)
            File "neo/master/backup_app.py", line 281, in triggerBackup
              assert cell_list, offset
          AssertionError: 0
      Julien Muchembled committed
  3. 01 Dec, 2016 5 commits
    • Remove dead code found by coverage · 23b9544d
      Julien Muchembled committed
    • Remove some useless unit tests · 1e4a4178
      Many "unit" tests (!= "threaded" tests) do no more than check
      implementation details, and they increase coverage artificially. As with
      testEvent in commit 71e30fb9, most of these tests will either be removed
      or rewritten as threaded tests.
      
      The fact that the remaining unit tests actually cover code that other tests
      don't gives motivation to maintain them. It will also be less code to update
      when switching to https://pypi.python.org/pypi/mock
      
      I proceeded as follows:
      
      1. Measure coverage for all tests except unit tests. While checking my work,
         I found that coverage stats for threaded/functional/zodb tests are quite
         unstable, so I restarted from the beginning by doing this measure several
         times and only keeping the intersection of coverage data.
      
      2. Measure coverage individually for each 'unit' test, and subtract the
         data in 1 from each result.
      
      3. The candidates for deletion are those without any code covered.
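
      The three steps above can be sketched with plain sets. This is only an
      illustration, not the script that was actually used: coverage data is
      modelled as sets of hypothetical (filename, line) pairs, and the function
      and test names are made up.

```python
from functools import reduce


def stable_baseline(runs):
    """Step 1: keep only the lines covered in every run of the non-unit tests."""
    return reduce(set.intersection, runs)


def deletion_candidates(unit_coverage, baseline):
    """Steps 2 and 3: unit tests that cover nothing beyond the baseline."""
    return sorted(name for name, covered in unit_coverage.items()
                  if not covered - baseline)


# Two runs of the threaded/functional/zodb tests with unstable results.
runs = [
    {("app.py", 1), ("app.py", 2), ("pt.py", 7)},
    {("app.py", 1), ("app.py", 2)},
]
baseline = stable_baseline(runs)

# Coverage measured individually for each unit test.
unit_coverage = {
    "test_a": {("app.py", 1)},
    "test_b": {("app.py", 2), ("pt.py", 7)},
}
candidates = deletion_candidates(unit_coverage, baseline)
```

      Taking the intersection makes the baseline pessimistic, so a unit test
      is only proposed for deletion when every line it covers is reliably
      covered elsewhere.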
      
      Tests I didn't delete:
      
      - neo.tests.master.testElectionHandler: I only ever make minimal changes
        to election code, as long as there's no serious review.
      
      - neo.tests.master.testMasterPT.MasterPartitionTableTests.test_13_outdate
      
      - 4 tests in neo.tests.testPT:
        test_01_Cell, test_04_removeCell, test_06_clear, test_08_filled
      
      - neo.tests.storage.testStorage{MySQL,SQLite}
      
      - neo.tests.testUtil.UtilTests.testReadBufferRead
      
      In a way, this commit is actually quite conservative. There are still many
      useless tests that only check error paths and, for simple tested methods,
      this just duplicates the tested code.
      Julien Muchembled committed
    • Remove unused imports, found by pylint · 3b5a6edb
      Julien Muchembled committed
    • TODO: tweak should be safer · e0a2a217
      Julien Muchembled committed
  4. 30 Nov, 2016 3 commits
  5. 28 Nov, 2016 4 commits
  6. 27 Nov, 2016 11 commits
    • Bump protocol version · 8eb14b01
      Julien Muchembled committed
    • Fix identification issues, including a race condition causing id conflicts · 9385706f
      The added test describes how the new id timestamps fix the race condition.
      These timestamps could be any unique opaque values, and the protocol is
      extended to exchange them along with node ids.
      
      Internally, nodes also reuse timestamps as a marker to identify the first
      NotifyNodeInformation packet from the master: since this packet is a complete
      list of nodes in the cluster, any other node in the node manager has left the
      cluster for good and is removed.
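
      A minimal sketch of that marker logic, assuming a toy node manager that
      maps node ids to states. The class, its attributes and the shape of
      node_list are hypothetical and only illustrate the idea; this is not
      the real neo.lib.node code.

```python
class NodeManager:
    """Toy node manager: maps node id -> state."""

    def __init__(self):
        self.nodes = {}
        self.id_timestamp = None  # stays None until the first notification

    def update(self, timestamp, node_list):
        """Apply a notification; node_list is [(node_id, state), ...]."""
        if self.id_timestamp is None:
            # First packet from the master: it is a complete list, so any
            # node we still know about but that is not listed has left.
            listed = {node_id for node_id, _ in node_list}
            for node_id in set(self.nodes) - listed:
                del self.nodes[node_id]
            self.id_timestamp = timestamp
        for node_id, state in node_list:
            self.nodes[node_id] = state


nm = NodeManager()
nm.nodes = {1: "RUNNING", 2: "RUNNING"}  # stale data from before reconnecting
nm.update(1480000000.0, [(1, "RUNNING"), (3, "RUNNING")])  # complete list
nm.update(1480000001.0, [(4, "PENDING")])                  # incremental update
```

      Only the first call prunes unknown nodes; later notifications are
      treated as incremental updates.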
      
      The secondary masters didn't receive updates about master nodes.
      It's also useless to send them information about non-master nodes.
      Julien Muchembled committed
    • protocol: simplify definition of Struct-based items · 54e819ff
      Julien Muchembled committed
    • Remove AskNodeInformation packet · d048a52d
      When client (including backup master) and admin nodes are identified,
      the primary master now automatically sends them all nodes with
      NotifyNodeInformation, as with storage nodes.
      Julien Muchembled committed
    • master: fix crashes in identification due to buggy nodes · 35664759
      - check address conflicts
      - on invalid values, reject peer instead of dying
      Julien Muchembled committed
    • lib.node: fix NodeManager accessors returning identified nodes · e7cccf01
      Listing connected/connecting nodes with a UUID is used:
      - in one place by storage nodes: here, it does not matter if we skip nodes that
        aren't really identified
      - in many places by the master, only for server connections, in which case we
        have equivalence with real identification
      
      So in practice, NodeManager is only simplified to reuse the 'identified'
      property of nodes.
      Julien Muchembled committed
    • lib.node: code refactoring · 5941b27d
      Julien Muchembled committed
    • storage: only accept clients that are known by the master · c17f5f91
      Therefore, a client node in the node manager is always RUNNING.
      Julien Muchembled committed
    • Give new ids to clients whose ids were already reallocated · d752aadb
      Although the change applies to any node with a temporary id (all but storage),
      only clients don't have addresses and are therefore not recognizable.
      
      After a client is disconnected from the master and before it reconnects, another
      client may join the cluster and "steal" the id of the first client. This issue
      leads to stuck clients, failing in a loop with exceptions like the following one:
      
          ERROR ZODB.Connection Couldn't load state for 0x0251
          Traceback (most recent call last):
            File "ZODB/Connection.py", line 860, in setstate
              self._setstate(obj)
            File "ZODB/Connection.py", line 901, in _setstate
              p, serial = self._storage.load(obj._p_oid, '')
            File "neo/client/Storage.py", line 82, in load
              return self.app.load(oid)[:2]
            File "neo/client/app.py", line 353, in load
              data, tid, next_tid, _ = self._loadFromStorage(oid, tid, before_tid)
            File "neo/client/app.py", line 373, in _loadFromStorage
              for node, conn in self.cp.iterateForObject(oid, readable=True):
            File "neo/client/pool.py", line 91, in iterateForObject
              pt = self.app.pt
            File "neo/client/app.py", line 145, in __getattr__
              self._getMasterConnection()
            File "neo/client/app.py", line 214, in _getMasterConnection
              result = self.master_conn = self._connectToPrimaryNode()
            File "neo/client/app.py", line 246, in _connectToPrimaryNode
              handler=handler)
            File "neo/lib/threaded_app.py", line 154, in _ask
              _handlePacket(qconn, qpacket, kw, handler)
            File "neo/lib/threaded_app.py", line 135, in _handlePacket
              handler.dispatch(conn, packet, kw)
            File "neo/lib/handler.py", line 66, in dispatch
              method(conn, *args, **kw)
            File "neo/lib/handler.py", line 188, in error
              getattr(self, Errors[code])(conn, message)
            File "neo/client/handlers/__init__.py", line 23, in protocolError
              raise StorageError("protocol error: %s" % message)
          StorageError: protocol error: already connected
      Julien Muchembled committed
    • spelling: oudated -> outdated · b62b8dc3
      Julien Muchembled committed
    • Fix spelling mistakes · 6e32ebb7
      Julien Muchembled committed
  7. 25 Nov, 2016 2 commits
  8. 21 Nov, 2016 2 commits
    • client: fix item eviction from cache, which could break loading from storage · 4ef05b9e
      `ClientCache._oid_dict` shall not have empty values. For a given oid, when the
      last item is removed from the cache, the oid must be removed as well to free
      memory. In some cases, this was not done.
      
      A consequence of this bug is the following exception:
      
          ERROR ZODB.Connection Couldn't load state for 0x02d1e1e4
          Traceback (most recent call last):
            File "ZODB/Connection.py", line 860, in setstate
              self._setstate(obj)
            File "ZODB/Connection.py", line 901, in _setstate
              p, serial = self._storage.load(obj._p_oid, '')
            File "neo/client/Storage.py", line 82, in load
              return self.app.load(oid)[:2]
            File "neo/client/app.py", line 358, in load
              self._cache.store(oid, data, tid, next_tid)
            File "neo/client/cache.py", line 228, in store
              prev = item_list[-1]
          IndexError: list index out of range
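
      The invariant described above (no empty values in the oid dictionary)
      can be illustrated with a much simplified cache. TinyCache and its
      methods are hypothetical and far simpler than the real ClientCache;
      only the name _oid_dict is taken from the commit message.

```python
class TinyCache:
    """Maps each oid to a non-empty list of (tid, data), oldest first."""

    def __init__(self):
        self._oid_dict = {}

    def store(self, oid, data, tid):
        self._oid_dict.setdefault(oid, []).append((tid, data))

    def remove(self, oid, tid):
        item_list = self._oid_dict[oid]
        item_list[:] = [item for item in item_list if item[0] != tid]
        if not item_list:
            # Never leave an empty list behind: it wastes memory and breaks
            # any code that later does item_list[-1] for this oid.
            del self._oid_dict[oid]


cache = TinyCache()
cache.store(0x02d1, b"state", tid=1)
cache.remove(0x02d1, tid=1)
```

      After the last item is evicted, the oid itself is gone, so a later
      store starts again from a fresh list.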
      Julien Muchembled committed
  9. 15 Nov, 2016 2 commits
    • backup: Teach cluster in BACKINGUP state to also serve regular ZODB clients in read-only mode · d4944062
      A backup cluster has all the data needed to provide regular read-only ZODB
      service for tids <= backup_tid. Having regular ZODB access to the data can be
      handy, e.g. for externally verifying data consistency between the main and
      backup clusters. Peeking around without disturbing the main cluster might
      also be useful sometimes.
      
      In this patch:
      
      - master & storage nodes are taught:
      
          * to instantiate read-only or regular client service handler depending on cluster state:
            RUNNING   -> regular
            BACKINGUP -> read-only
      
          * in read-only client handler:
            + to reject write-related operations
            + to provide read operations, but with semantics adjusted as if last_tid
              in the database were equal to backup_tid
      
      - new READ_ONLY_ACCESS protocol error code is introduced so that client can
        raise POSException.ReadOnlyError upon receiving it.
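
      A rough sketch of the handler selection and error mapping listed above.
      All class, method and function names here are hypothetical, the error
      code is a placeholder string, and ReadOnlyError is a local stand-in for
      POSException.ReadOnlyError so that the example runs without ZODB.

```python
class ReadOnlyError(Exception):
    """Stand-in for ZODB's POSException.ReadOnlyError."""


READ_ONLY_ACCESS = "READ_ONLY_ACCESS"  # placeholder for the new error code


class RegularHandler:
    def store(self, oid, data):
        return ("ok", oid)

    def last_tid(self, last_tid, backup_tid):
        return last_tid


class ReadOnlyHandler(RegularHandler):
    def store(self, oid, data):
        # Write-related operations are rejected with the dedicated code.
        return ("error", READ_ONLY_ACCESS)

    def last_tid(self, last_tid, backup_tid):
        # Reads behave as if the database ended at backup_tid.
        return backup_tid


def handler_for(cluster_state):
    """Pick the client service handler from the cluster state."""
    return {"RUNNING": RegularHandler,
            "BACKINGUP": ReadOnlyHandler}[cluster_state]()


def client_store(handler, oid, data):
    """Client side: turn the error code into an exception."""
    status, value = handler.store(oid, data)
    if status == "error" and value == READ_ONLY_ACCESS:
        raise ReadOnlyError("cluster is in backup mode")
    return value
```

      With this shape, client code that already handles ReadOnlyError needs
      no special knowledge of backup clusters.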
      
      I have not implemented a back-channel for invalidations in read-only mode (yet?).
      As a result, once a client connects to a cluster in backup state, it won't see
      new data fetched by the backup cluster from upstream after the client connected.
      
      The reason invalidations are not implemented is that for now (imho)
      there is no ready infrastructure to get updates from the
      replicating node on a transaction-by-transaction basis (it currently only
      notifies when a whole batch is done). For consistency verification (the main
      reason for this patch) we also don't need invalidations to work, as in
      that task we always connect afresh to the backup. So I simply put
      relevant TODOs about invalidations for now.
      
      The patch is not very polished but should work.
      
      /reviewed-on !4
      Kirill Smelkov committed
  10. 27 Oct, 2016 1 commit
    • neoctl: make 'print ids' command display time of TIDs · d9dd39f0
      Currently, the command "neoctl [arguments] print ids" has the following output:
      
          last_oid = 0x...
          last_tid = 0x...
          last_ptid = ...
      
      or
      
          backup_tid = 0x...
          last_tid = 0x...
          last_ptid = ...
      
      depending on whether the cluster is in normal or backup mode.
      
      This is hard to read, since the admin is often interested in the time that corresponds to each tid. Now the output is:
      
          last_oid = 0x...
          last_tid = 0x... (yyyy-mm-dd hh:mm:ss.ssssss)
          last_ptid = ...
      
      or
      
          backup_tid = 0x... (yyyy-mm-dd hh:mm:ss.ssssss)
          last_tid = 0x... (yyyy-mm-dd hh:mm:ss.ssssss)
          last_ptid = ...
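
      The time shown next to each TID can be derived from the TID itself. The
      following is a minimal sketch, assuming the usual ZODB TID layout (high
      32 bits: minutes since 1900 with every month counted as 31 days; low 32
      bits: fraction of a minute). The function name is made up and this is
      not necessarily how neoctl computes it.

```python
def tid_to_time_string(tid):
    """Decode an integer TID into 'yyyy-mm-dd hh:mm:ss.ssssss' (UTC)."""
    high, low = divmod(tid, 1 << 32)
    # The high 32 bits count minutes, with every month taken as 31 days.
    high, minute = divmod(high, 60)
    high, hour = divmod(high, 24)
    high, day = divmod(high, 31)
    year, month = divmod(high, 12)
    # The low 32 bits are a fraction of a minute.
    seconds = low * 60.0 / (1 << 32)
    return "%04d-%02d-%02d %02d:%02d:%09.6f" % (
        year + 1900, month + 1, day + 1, hour, minute, seconds)


# Build a TID for 2016-10-16 16:40:00 and decode it again.
minutes = ((((2016 - 1900) * 12 + 9) * 31 + 15) * 24 + 16) * 60 + 40
example = tid_to_time_string(minutes << 32)
```

      The result is formatted by hand rather than through datetime, so no
      validation of the calendar date is performed.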
      
      /reviewed-on !2
      Iliya Manolov committed
  11. 17 Oct, 2016 1 commit
    • mysql: force _getNextTID() to use appropriate/whole index · eaa00a88
      Similarly to 13911ca3, on the same instance, after MariaDB was upgraded to
      10.1.17, the following query started to execute very slowly, even after
      `OPTIMIZE TABLE obj`:
      
          MariaDB [(none)]> SELECT tid FROM neo1.obj WHERE `partition`=5 AND oid=79613 AND tid>268707071353462798 ORDER BY tid LIMIT 1;
          +--------------------+
          | tid                |
          +--------------------+
          | 268707072758797063 |
          +--------------------+
          1 row in set (4.82 sec)
      
      Both explain and analyze say the query uses the `partition` key, but only partially (note key_len is only 10, not 18):
      
          MariaDB [(none)]> SHOW INDEX FROM neo1.obj;
          +-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
          | Table | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
          +-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
          | obj   |          0 | PRIMARY   |            1 | partition   | A         |    28755928 |     NULL | NULL   |      | BTREE      |         |               |
          | obj   |          0 | PRIMARY   |            2 | tid         | A         |    28755928 |     NULL | NULL   |      | BTREE      |         |               |
          | obj   |          0 | PRIMARY   |            3 | oid         | A         |    28755928 |     NULL | NULL   |      | BTREE      |         |               |
          | obj   |          0 | partition |            1 | partition   | A         |    28755928 |     NULL | NULL   |      | BTREE      |         |               |
          | obj   |          0 | partition |            2 | oid         | A         |    28755928 |     NULL | NULL   |      | BTREE      |         |               |
          | obj   |          0 | partition |            3 | tid         | A         |    28755928 |     NULL | NULL   |      | BTREE      |         |               |
          | obj   |          1 | data_id   |            1 | data_id     | A         |    28755928 |     NULL | NULL   | YES  | BTREE      |         |               |
          +-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
          7 rows in set (0.00 sec)
      
          MariaDB [(none)]> explain SELECT tid FROM neo1.obj WHERE `partition`=5 AND oid=79613 AND tid>268707071353462798 ORDER BY tid LIMIT 1;
          +------+-------------+-------+------+-------------------+-----------+---------+-------------+------+--------------------------+
          | id   | select_type | table | type | possible_keys     | key       | key_len | ref         | rows | Extra                    |
          +------+-------------+-------+------+-------------------+-----------+---------+-------------+------+--------------------------+
          |    1 | SIMPLE      | obj   | ref  | PRIMARY,partition | partition | 10      | const,const |    2 | Using where; Using index |
          +------+-------------+-------+------+-------------------+-----------+---------+-------------+------+--------------------------+
          1 row in set (0.00 sec)
      
          MariaDB [(none)]> analyze SELECT tid FROM neo1.obj WHERE `partition`=5 AND oid=79613 AND tid>268707071353462798 ORDER BY tid LIMIT 1;
          +------+-------------+-------+------+-------------------+-----------+---------+-------------+------+------------+----------+------------+--------------------------+
          | id   | select_type | table | type | possible_keys     | key       | key_len | ref         | rows | r_rows     | filtered | r_filtered | Extra                    |
          +------+-------------+-------+------+-------------------+-----------+---------+-------------+------+------------+----------+------------+--------------------------+
          |    1 | SIMPLE      | obj   | ref  | PRIMARY,partition | partition | 10      | const,const |    2 | 9741121.00 |   100.00 |       0.00 | Using where; Using index |
          +------+-------------+-------+------+-------------------+-----------+---------+-------------+------+------------+----------+------------+--------------------------+
          1 row in set (4.93 sec)
      
      Explicitly forcing usage of the (partition, oid, tid) index, which is precisely designed to serve this and similar queries, prevents the query from being slow:
      
          MariaDB [(none)]> analyze SELECT tid FROM neo1.obj FORCE INDEX(`partition`) WHERE `partition`=5 AND oid=79613 AND tid>268707071353462798 ORDER BY tid LIMIT 1;
          +------+-------------+-------+-------+---------------+-----------+---------+------+------+--------+----------+------------+--------------------------+
          | id   | select_type | table | type  | possible_keys | key       | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra                    |
          +------+-------------+-------+-------+---------------+-----------+---------+------+------+--------+----------+------------+--------------------------+
          |    1 | SIMPLE      | obj   | range | partition     | partition | 18      | NULL |    2 |   1.00 |   100.00 |     100.00 | Using where; Using index |
          +------+-------------+-------+-------+---------------+-----------+---------+------+------+--------+----------+------------+--------------------------+
          1 row in set (0.00 sec)
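
      On the Python side, the hint only has to be added to the query text. The
      helper below is a hypothetical sketch of how such a query string could
      be built (the function name and signature are assumptions, not the
      actual _getNextTID() code).

```python
def next_tid_query(partition, oid, tid):
    """Build the 'next tid' query, forcing the (partition, oid, tid) index."""
    # %d guarantees that only integers end up in the SQL text.
    return ("SELECT tid FROM obj FORCE INDEX(`partition`)"
            " WHERE `partition`=%d AND oid=%d AND tid>%d"
            " ORDER BY tid LIMIT 1" % (partition, oid, tid))


query = next_tid_query(5, 79613, 268707071353462798)
```

      The index is named after its first column, which is why the hint reads
      FORCE INDEX(`partition`) even though the index covers three columns.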
      
      /cc @jm, @vpelltier, @Tyagov
      
      /reviewed-on !1
      Kirill Smelkov committed
  12. 12 Sep, 2016 1 commit
  13. 29 Aug, 2016 2 commits
    • mysql: fix use of wrong SQL index when checking for dropped partitions · 13911ca3
      After partitions were dropped with TokuDB, we had a case where MariaDB 10.1.14
      stopped using the most appropriate index.
      
          MariaDB [neo0]> explain SELECT DISTINCT data_id FROM obj WHERE `partition`=5;
          +------+-------------+-------+-------+-------------------+---------+---------+------+------+---------------------------------------+
          | id   | select_type | table | type  | possible_keys     | key     | key_len | ref  | rows | Extra                                 |
          +------+-------------+-------+-------+-------------------+---------+---------+------+------+---------------------------------------+
          |    1 | SIMPLE      | obj   | range | PRIMARY,partition | data_id | 11      | NULL |   10 | Using where; Using index for group-by |
          +------+-------------+-------+-------+-------------------+---------+---------+------+------+---------------------------------------+
          MariaDB [neo0]> SELECT SQL_NO_CACHE DISTINCT data_id FROM obj WHERE `partition`=5;
          Empty set (1 min 51.47 sec)
      
      Expected:
      
          MariaDB [neo1]> explain SELECT DISTINCT data_id FROM obj WHERE `partition`=4;
          +------+-------------+-------+------+-------------------+---------+---------+-------+------+------------------------------+
          | id   | select_type | table | type | possible_keys     | key     | key_len | ref   | rows | Extra                        |
          +------+-------------+-------+------+-------------------+---------+---------+-------+------+------------------------------+
          |    1 | SIMPLE      | obj   | ref  | PRIMARY,partition | PRIMARY | 2       | const |    1 | Using where; Using temporary |
          +------+-------------+-------+------+-------------------+---------+---------+-------+------+------------------------------+
          1 row in set (0.00 sec)
          MariaDB [neo1]> SELECT SQL_NO_CACHE DISTINCT data_id FROM obj WHERE `partition`=4;
          Empty set (0.00 sec)
      
      Restarting the server or 'OPTIMIZE TABLE obj;' does not help.
      
      Such an issue could prevent the cluster from starting due to timeouts, by
      always going back to RECOVERING state.
      Julien Muchembled committed
    • Update TODO · 00ffb1ef
      Julien Muchembled committed
  14. 11 Aug, 2016 2 commits
  15. 01 Aug, 2016 1 commit