Commit ed7ee13b authored by Kirill Smelkov's avatar Kirill Smelkov

Merge tag 'v1.10'

NEO 1.10

* tag 'v1.10': (55 commits)
  Release version 1.10
  Maximize resiliency by taking into account the topology of storage nodes
  storage: also commit updated cell TID at each replicated chunk of 'obj' records
  storage: skip useless work when unlocking transactions
  qa: flush logs at the end of each test when -L is not used
  qa: add a log in case that a mysterious bug happens again
  storage: clarify log about data deletion of discarded cells
  debug: new example to run the profiler for 1 minute
  mysql: fix replication of big oids (> 16M)
  tests/cluster: speedup waiting a bit
  protocol: update packet docstrings
  Bump protocol version
  protocol: a single byte is more than enough to encode enums
  protocol: small cleanup in packet registration
  Optimize resumption of replication by starting from a greater TID
  importer: update comment about a workaround for ZODB3
  Micro-optimization of p64/u64
  qa: add a log in testBackupNodeLost for easier debugging
  Document that the bug when checking replicas may also cause the master to crash
  storage: stop logging 'Abort TXN' for txn that have been locked
  storage: split _migrate2() for reusable _alterTable()
  qa: new testStorageUpgrade
  qa: update testStorageUpgrade data for what is not automatically upgraded
  qa: original data for the future testStorageUpgrade
  sqlite: fix indexes of upgraded db
  importer: fix NameError when recovering during tpc_finish
  fixup! importer: fetch and process the data to import in a separate process
  Serialize empty transaction extension with an empty string
  client: fix partial import from a source storage
  qa: give a title to subprocesses of functional tests
  importer: give a title to the 'import' and 'writeback' subprocesses
  importer: fetch and process the data to import in a separate process
  importer: new option to write back new transactions to the source database
  importer: log when the transaction index for FileStorage DB is built
  importer: open imported zodb in read-only whenever possible
  fixup! mysql: fix remaining places where a server disconnection was not catched
  fixup! storage: speed up replication by sending bigger network packets
  mysql: do not full-scan for duplicates of big oids if deduplication is disabled
  mysql: fix remaining places where a server disconnection was not catched
  fixup! Add support for custom compression levels
  importer: reenable compression by default
  qa: review testImporter
  qa: remove a few uses of 'chr'
  Fix a few issues with ZODB5
  importer: small code cleanup in speedupFileStorageTxnLookup patch
  importer: do not trigger speedupFileStorageTxnLookup uselessly
  Add support for custom compression levels
  setup: update MANIFEST.in
  importer: do not checksum data twice
  client: store uncompressed if compressed size is equal
  fixup! master: automatically discard feeding cells that get out-of-date
  master: automatically discard feeding cells that get out-of-date
  qa: remove useless indentation in testSafeTweak
  bench: new option to mesure ZEO perfs in matrix test
  bench: reduce number of partitions in matrix test
  storage: fix replication of creation undone
parents 27df6fe8 1ef5c1ba
...@@ -16,6 +16,19 @@ This happens in the following conditions: ...@@ -16,6 +16,19 @@ This happens in the following conditions:
4. the cell is checked completely before it could replicate up to the max tid 4. the cell is checked completely before it could replicate up to the max tid
to check to check
Sometimes, it causes the master to crash::
File "neo/lib/handler.py", line 72, in dispatch
method(conn, *args, **kw)
File "neo/master/handlers/storage.py", line 93, in notifyReplicationDone
cell_list = app.backup_app.notifyReplicationDone(node, offset, tid)
File "neo/master/backup_app.py", line 337, in notifyReplicationDone
assert cell.isReadable()
AssertionError
Workaround: make sure all cells are up-to-date before checking replicas. Workaround: make sure all cells are up-to-date before checking replicas.
Found by running testBackupNodeLost many times. Found by running testBackupNodeLost many times:
- either a failureException: 12 != 11
- or the above assert failure, in which case the unit test freezes
Change History Change History
============== ==============
1.10 (2018-07-16)
-----------------
A important performance improvement is that the replication now remembers where
it was interrupted: a storage node that gets disconnected for a short time now
gets fully operational quite instantaneously because it only has to replicate
the new data. Before, the time to recover depended on the size of the DB, just
to verify that most of the data are already transferred.
As a small optimization, an empty transaction extension is now serialized with
an empty string.
The above 2 changes required a bump of the protocol version, as well as an
upgrade of the storage format. Once upgraded (this is done automatically as
usual), databases can't be opened anymore by older versions of NEO.
Other general changes:
- Add support for custom compression levels.
- Maximize resiliency by taking into account the topology of storage nodes.
- Fix a few issues with ZODB5. Note however that merging several DB with the
Importer backend only works if they were only used with ZODB < 5.
Master:
- Automatically discard feeding cells that get out-of-date.
Client:
- Fix partial import from a source storage.
- Store uncompressed if compressed size is equal.
Storage:
- Fixed v1.9 code that sped up the replication by sending bigger network
packets.
- Fix replication of creation undone.
- Stop logging 'Abort TXN' for txn that have been locked.
- Clarify log about data deletion of discarded cells.
MySQL backend:
- Fix replication of big OIDs (> 16M).
- Do not full-scan for duplicates of big OIDs if deduplication is disabled.
- Fix remaining places where a server disconnection was not catched.
SQlite backend:
- Fix indexes of upgraded databases.
Importer backend:
- Fetch and process the data to import in a separate process. It is even
usually free to use the best compression level.
- New option to write back new transactions to the source database.
See 'importer.conf' for more information.
- Give a title to the 'import' and 'writeback' subprocesses,
if the 'setproctitle' egg is installed.
- Log when the transaction index for FileStorage DB is built.
- Open imported database in read-only whenever possible.
- Do not trigger speedupFileStorageTxnLookup uselessly.
- Do not checksum data twice.
- Fix NameError when recovering during tpc_finish.
1.9 (2018-03-13) 1.9 (2018-03-13)
---------------- ----------------
......
graft tools graft tools
include neo.conf CHANGELOG.rst TODO TESTS.txt ZODB3.patch include neo.conf CHANGELOG.rst TODO ZODB3.patch
...@@ -45,6 +45,12 @@ ...@@ -45,6 +45,12 @@
# (instead of adapter=Importer & database=/path_to_this_file). # (instead of adapter=Importer & database=/path_to_this_file).
adapter=MySQL adapter=MySQL
database=neo database=neo
# Keep writing back new transactions to the source database, provided it is
# not splitted. In case of any issue, the import can be aborted without losing
# data. Note however that it is asynchronous so don't stop the storage node
# too quickly after the last committed transaction (e.g. check with tools like
# fstail).
writeback=true
# The other sections are for source databases. # The other sections are for source databases.
[root] [root]
...@@ -52,7 +58,8 @@ database=neo ...@@ -52,7 +58,8 @@ database=neo
# ZEO is possible but less efficient: ZEO servers must be stopped # ZEO is possible but less efficient: ZEO servers must be stopped
# if NEO opens FileStorage DBs directly. # if NEO opens FileStorage DBs directly.
# Note that NEO uses 'new_oid' method to get the last OID, that's why the # Note that NEO uses 'new_oid' method to get the last OID, that's why the
# source DB can't be open read-only. NEO never modifies a FileStorage DB. # source DB can't be open read-only. Unless 'writeback' is enabled, NEO never
# modifies a FileStorage DB.
storage= storage=
<filestorage> <filestorage>
path /path/to/root.fs path /path/to/root.fs
......
...@@ -160,11 +160,7 @@ class Storage(BaseStorage.BaseStorage, ...@@ -160,11 +160,7 @@ class Storage(BaseStorage.BaseStorage,
def copyTransactionsFrom(self, source, verbose=False): def copyTransactionsFrom(self, source, verbose=False):
""" Zope compliant API """ """ Zope compliant API """
return self.importFrom(source) return self.app.importFrom(self, source)
def importFrom(self, source, start=None, stop=None, preindex=None):
""" Allow import only a part of the source storage """
return self.app.importFrom(self, source, start, stop, preindex)
def pack(self, t, referencesf, gc=False): def pack(self, t, referencesf, gc=False):
if gc: if gc:
......
...@@ -14,11 +14,14 @@ ...@@ -14,11 +14,14 @@
# You should have received a copy of the GNU General Public License # You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>. # along with this program. If not, see <http://www.gnu.org/licenses/>.
from cPickle import dumps, loads
from zlib import compress, decompress
import heapq import heapq
import time import time
try:
from ZODB._compat import dumps, loads, _protocol
except ImportError:
from cPickle import dumps, loads
_protocol = 1
from ZODB.POSException import UndoError, ConflictError, ReadConflictError from ZODB.POSException import UndoError, ConflictError, ReadConflictError
from . import OLD_ZODB from . import OLD_ZODB
if OLD_ZODB: if OLD_ZODB:
...@@ -26,6 +29,7 @@ if OLD_ZODB: ...@@ -26,6 +29,7 @@ if OLD_ZODB:
from persistent.TimeStamp import TimeStamp from persistent.TimeStamp import TimeStamp
from neo.lib import logging from neo.lib import logging
from neo.lib.compress import decompress_list, getCompress
from neo.lib.protocol import NodeTypes, Packets, \ from neo.lib.protocol import NodeTypes, Packets, \
INVALID_PARTITION, MAX_TID, ZERO_HASH, ZERO_TID INVALID_PARTITION, MAX_TID, ZERO_HASH, ZERO_TID
from neo.lib.util import makeChecksum, dump from neo.lib.util import makeChecksum, dump
...@@ -50,7 +54,6 @@ if SignalHandler: ...@@ -50,7 +54,6 @@ if SignalHandler:
import signal import signal
SignalHandler.registerHandler(signal.SIGUSR2, logging.reopen) SignalHandler.registerHandler(signal.SIGUSR2, logging.reopen)
class Application(ThreadedApplication): class Application(ThreadedApplication):
"""The client node application.""" """The client node application."""
...@@ -99,7 +102,7 @@ class Application(ThreadedApplication): ...@@ -99,7 +102,7 @@ class Application(ThreadedApplication):
# _connecting_to_master_node is used to prevent simultaneous master # _connecting_to_master_node is used to prevent simultaneous master
# node connection attempts # node connection attempts
self._connecting_to_master_node = Lock() self._connecting_to_master_node = Lock()
self.compress = compress self.compress = getCompress(compress)
def __getattr__(self, attr): def __getattr__(self, attr):
if attr in ('last_tid', 'pt'): if attr in ('last_tid', 'pt'):
...@@ -215,7 +218,7 @@ class Application(ThreadedApplication): ...@@ -215,7 +218,7 @@ class Application(ThreadedApplication):
node=node, node=node,
dispatcher=self.dispatcher) dispatcher=self.dispatcher)
p = Packets.RequestIdentification( p = Packets.RequestIdentification(
NodeTypes.CLIENT, self.uuid, None, self.name, None) NodeTypes.CLIENT, self.uuid, None, self.name, (), None)
try: try:
ask(conn, p, handler=handler) ask(conn, p, handler=handler)
except ConnectionClosed: except ConnectionClosed:
...@@ -388,7 +391,7 @@ class Application(ThreadedApplication): ...@@ -388,7 +391,7 @@ class Application(ThreadedApplication):
logging.error('wrong checksum from %s for oid %s', logging.error('wrong checksum from %s for oid %s',
conn, dump(oid)) conn, dump(oid))
raise NEOStorageReadRetry(False) raise NEOStorageReadRetry(False)
return (decompress(data) if compression else data, return (decompress_list[compression](data),
tid, next_tid, data_tid) tid, next_tid, data_tid)
raise NEOStorageCreationUndoneError(dump(oid)) raise NEOStorageCreationUndoneError(dump(oid))
return self._askStorageForRead(oid, return self._askStorageForRead(oid,
...@@ -435,17 +438,7 @@ class Application(ThreadedApplication): ...@@ -435,17 +438,7 @@ class Application(ThreadedApplication):
checksum = ZERO_HASH checksum = ZERO_HASH
else: else:
assert data_serial is None assert data_serial is None
size = len(data) size, compression, compressed_data = self.compress(data)
if self.compress:
compressed_data = compress(data)
if size < len(compressed_data):
compressed_data = data
compression = 0
else:
compression = 1
else:
compression = 0
compressed_data = data
checksum = makeChecksum(compressed_data) checksum = makeChecksum(compressed_data)
txn_context.data_size += size txn_context.data_size += size
# Store object in tmp cache # Store object in tmp cache
...@@ -554,9 +547,12 @@ class Application(ThreadedApplication): ...@@ -554,9 +547,12 @@ class Application(ThreadedApplication):
txn_context = self._txn_container.get(transaction) txn_context = self._txn_container.get(transaction)
self.waitStoreResponses(txn_context) self.waitStoreResponses(txn_context)
ttid = txn_context.ttid ttid = txn_context.ttid
ext = transaction._extension
ext = dumps(ext, _protocol) if ext else ''
# user and description are cast to str in case they're unicode.
# BBB: This is not required anymore with recent ZODB.
packet = Packets.AskStoreTransaction(ttid, str(transaction.user), packet = Packets.AskStoreTransaction(ttid, str(transaction.user),
str(transaction.description), dumps(transaction._extension), str(transaction.description), ext, txn_context.cache_dict)
txn_context.cache_dict)
queue = txn_context.queue queue = txn_context.queue
involved_nodes = txn_context.involved_nodes involved_nodes = txn_context.involved_nodes
# Ask in parallel all involved storage nodes to commit object metadata. # Ask in parallel all involved storage nodes to commit object metadata.
...@@ -786,10 +782,6 @@ class Application(ThreadedApplication): ...@@ -786,10 +782,6 @@ class Application(ThreadedApplication):
self.waitStoreResponses(txn_context) self.waitStoreResponses(txn_context)
return None, txn_oid_list return None, txn_oid_list
def _insertMetadata(self, txn_info, extension):
for k, v in loads(extension).items():
txn_info[k] = v
def _getTransactionInformation(self, tid): def _getTransactionInformation(self, tid):
return self._askStorageForRead(tid, return self._askStorageForRead(tid,
Packets.AskTransactionInformation(tid)) Packets.AskTransactionInformation(tid))
...@@ -829,7 +821,8 @@ class Application(ThreadedApplication): ...@@ -829,7 +821,8 @@ class Application(ThreadedApplication):
if filter is None or filter(txn_info): if filter is None or filter(txn_info):
txn_info.pop('packed') txn_info.pop('packed')
txn_info.pop("oids") txn_info.pop("oids")
self._insertMetadata(txn_info, txn_ext) if txn_ext:
txn_info.update(loads(txn_ext))
append(txn_info) append(txn_info)
if len(undo_info) >= last - first: if len(undo_info) >= last - first:
break break
...@@ -857,7 +850,7 @@ class Application(ThreadedApplication): ...@@ -857,7 +850,7 @@ class Application(ThreadedApplication):
tid = None tid = None
for tid in tid_list: for tid in tid_list:
(txn_info, txn_ext) = self._getTransactionInformation(tid) (txn_info, txn_ext) = self._getTransactionInformation(tid)
txn_info['ext'] = loads(txn_ext) txn_info['ext'] = loads(txn_ext) if txn_ext else {}
append(txn_info) append(txn_info)
return (tid, txn_list) return (tid, txn_list)
...@@ -876,23 +869,29 @@ class Application(ThreadedApplication): ...@@ -876,23 +869,29 @@ class Application(ThreadedApplication):
txn_info['size'] = size txn_info['size'] = size
if filter is None or filter(txn_info): if filter is None or filter(txn_info):
result.append(txn_info) result.append(txn_info)
self._insertMetadata(txn_info, txn_ext) if txn_ext:
txn_info.update(loads(txn_ext))
return result return result
def importFrom(self, storage, source, start, stop, preindex=None): def importFrom(self, storage, source):
# TODO: The main difference with BaseStorage implementation is that # TODO: The main difference with BaseStorage implementation is that
# preindex can't be filled with the result 'store' (tid only # preindex can't be filled with the result 'store' (tid only
# known after 'tpc_finish'. This method could be dropped if we # known after 'tpc_finish'. This method could be dropped if we
# implemented IStorageRestoreable (a wrapper around source would # implemented IStorageRestoreable (a wrapper around source would
# still be required for partial import). # still be required for partial import).
if preindex is None: preindex = {}
preindex = {} for transaction in source.iterator():
for transaction in source.iterator(start, stop):
tid = transaction.tid tid = transaction.tid
self.tpc_begin(storage, transaction, tid, transaction.status) self.tpc_begin(storage, transaction, tid, transaction.status)
for r in transaction: for r in transaction:
oid = r.oid oid = r.oid
pre = preindex.get(oid) try:
pre = preindex[oid]
except KeyError:
try:
pre = self.load(oid)[1]
except NEOStorageNotFoundError:
pre = ZERO_TID
self.store(oid, pre, r.data, r.version, transaction) self.store(oid, pre, r.data, r.version, transaction)
preindex[oid] = tid preindex[oid] = tid
conflicted = self.tpc_vote(transaction) conflicted = self.tpc_vote(transaction)
......
...@@ -14,10 +14,14 @@ ...@@ -14,10 +14,14 @@
Give the name of the cluster Give the name of the cluster
</description> </description>
</key> </key>
<key name="compress" datatype="boolean"> <key name="compress" datatype=".compress">
<description> <description>
If true, data is automatically compressed (unless compressed size is The value is either of 'boolean' type or an explicit algorithm that
not smaller). This is the default behaviour. matches the regex 'zlib(=\d+)?', where the optional number is
the compression level.
Any record that is not smaller once compressed is stored uncompressed.
True is the default and its meaning may change over time:
currently, it is the same as 'zlib'.
</description> </description>
</key> </key>
<key name="read-only" datatype="boolean"> <key name="read-only" datatype="boolean">
......
...@@ -23,3 +23,11 @@ class NeoStorage(BaseConfig): ...@@ -23,3 +23,11 @@ class NeoStorage(BaseConfig):
config = self.config config = self.config
return Storage(**{k: getattr(config, k) return Storage(**{k: getattr(config, k)
for k in config.getSectionAttributes()}) for k in config.getSectionAttributes()})
def compress(value):
from ZConfig.datatypes import asBoolean
try:
return asBoolean(value)
except ValueError:
from neo.lib.compress import parseOption
return parseOption(value)
...@@ -14,10 +14,10 @@ ...@@ -14,10 +14,10 @@
# You should have received a copy of the GNU General Public License # You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>. # along with this program. If not, see <http://www.gnu.org/licenses/>.
from zlib import decompress
from ZODB.TimeStamp import TimeStamp from ZODB.TimeStamp import TimeStamp
from neo.lib import logging from neo.lib import logging
from neo.lib.compress import decompress_list
from neo.lib.protocol import Packets, uuid_str from neo.lib.protocol import Packets, uuid_str
from neo.lib.util import dump, makeChecksum from neo.lib.util import dump, makeChecksum
from neo.lib.exception import NodeNotReady from neo.lib.exception import NodeNotReady
...@@ -129,8 +129,7 @@ class StorageAnswersHandler(AnswerBaseHandler): ...@@ -129,8 +129,7 @@ class StorageAnswersHandler(AnswerBaseHandler):
'wrong checksum while getting back data for' 'wrong checksum while getting back data for'
' object %s during rebase of transaction %s' ' object %s during rebase of transaction %s'
% (dump(oid), dump(txn_context.ttid))) % (dump(oid), dump(txn_context.ttid)))
if compression: data = decompress_list[compression](data)
data = decompress(data)
size = len(data) size = len(data)
txn_context.data_size += size txn_context.data_size += size
if cached: if cached:
......
...@@ -47,7 +47,7 @@ class ConnectionPool(object): ...@@ -47,7 +47,7 @@ class ConnectionPool(object):
conn = MTClientConnection(app, app.storage_event_handler, node, conn = MTClientConnection(app, app.storage_event_handler, node,
dispatcher=app.dispatcher) dispatcher=app.dispatcher)
p = Packets.RequestIdentification(NodeTypes.CLIENT, p = Packets.RequestIdentification(NodeTypes.CLIENT,
app.uuid, None, app.name, app.id_timestamp) app.uuid, None, app.name, (), app.id_timestamp)
try: try:
app._ask(conn, p, handler=app.storage_bootstrap_handler) app._ask(conn, p, handler=app.storage_bootstrap_handler)
except ConnectionClosed: except ConnectionClosed:
......
...@@ -164,3 +164,17 @@ elif IF == 'frames': ...@@ -164,3 +164,17 @@ elif IF == 'frames':
write("Thread %s:\n" % thread_id) write("Thread %s:\n" % thread_id)
traceback.print_stack(frame) traceback.print_stack(frame)
write("End of dump\n") write("End of dump\n")
elif IF == 'profile':
DURATION = 60
def stop(prof, path):
prof.disable()
prof.dump_stats(path)
@defer
def profile(app):
import cProfile, threading, time
from .lib.protocol import uuid_str
path = 'neo-%s-%s.prof' % (uuid_str(app.uuid), time.time())
prof = cProfile.Profile()
threading.Timer(DURATION, stop, (prof, path)).start()
prof.enable()
...@@ -26,13 +26,14 @@ class BootstrapManager(EventHandler): ...@@ -26,13 +26,14 @@ class BootstrapManager(EventHandler):
Manage the bootstrap stage, lookup for the primary master then connect to it Manage the bootstrap stage, lookup for the primary master then connect to it
""" """
def __init__(self, app, node_type, server=None): def __init__(self, app, node_type, server=None, devpath=()):
""" """
Manage the bootstrap stage of a non-master node, it lookup for the Manage the bootstrap stage of a non-master node, it lookup for the
primary master node, connect to it then returns when the master node primary master node, connect to it then returns when the master node
is ready. is ready.
""" """
self.server = server self.server = server
self.devpath = devpath
self.node_type = node_type self.node_type = node_type
self.num_replicas = None self.num_replicas = None
self.num_partitions = None self.num_partitions = None
...@@ -43,7 +44,7 @@ class BootstrapManager(EventHandler): ...@@ -43,7 +44,7 @@ class BootstrapManager(EventHandler):
def connectionCompleted(self, conn): def connectionCompleted(self, conn):
EventHandler.connectionCompleted(self, conn) EventHandler.connectionCompleted(self, conn)
conn.ask(Packets.RequestIdentification(self.node_type, self.uuid, conn.ask(Packets.RequestIdentification(self.node_type, self.uuid,
self.server, self.app.name, None)) self.server, self.app.name, self.devpath, None))
def connectionFailed(self, conn): def connectionFailed(self, conn):
EventHandler.connectionFailed(self, conn) EventHandler.connectionFailed(self, conn)
......
#
# Copyright (C) 2018 Nexedi SA
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import zlib
decompress_list = (
lambda data: data,
zlib.decompress,
)
def parseOption(value):
x = value.split('=', 1)
try:
alg = ('zlib',).index(x[0])
if len(x) == 1:
return alg, None
level = int(x[1])
except Exception:
raise ValueError("not a valid 'compress' option: %r" % value)
if 0 < level <= zlib.Z_BEST_COMPRESSION:
return alg, level
raise ValueError("invalid compression level: %r" % level)
def getCompress(value):
if value:
alg, level = (0, None) if value is True else value
_compress = zlib.compress
if level:
zlib_compress = _compress
_compress = lambda data: zlib_compress(data, level)
alg += 1
assert 0 < alg < len(decompress_list), 'invalid compression algorithm'
def compress(data):
size = len(data)
compressed = _compress(data)
if len(compressed) < size:
return size, alg, compressed
return size, 0, data
compress._compress = _compress # for testBasicStore
return compress
return lambda data: (len(data), 0, data)
...@@ -34,6 +34,7 @@ class SocketConnector(object): ...@@ -34,6 +34,7 @@ class SocketConnector(object):
is_closed = is_server = None is_closed = is_server = None
connect_limit = {} connect_limit = {}
CONNECT_LIMIT = 1 # XXX actually this is (RE-)CONNECT_THROTTLE CONNECT_LIMIT = 1 # XXX actually this is (RE-)CONNECT_THROTTLE
SOMAXCONN = 5 # for threaded tests
def __new__(cls, addr, s=None): def __new__(cls, addr, s=None):
if s is None: if s is None:
...@@ -78,7 +79,8 @@ class SocketConnector(object): ...@@ -78,7 +79,8 @@ class SocketConnector(object):
def queue(self, data): def queue(self, data):
was_empty = not self.queued was_empty = not self.queued
self.queued += data self.queued += data
self.queue_size += len(data) for data in data:
self.queue_size += len(data)
return was_empty return was_empty
def _error(self, op, exc=None): def _error(self, op, exc=None):
...@@ -123,7 +125,7 @@ class SocketConnector(object): ...@@ -123,7 +125,7 @@ class SocketConnector(object):
try: try:
self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
self._bind(self.addr) self._bind(self.addr)
self.socket.listen(5) self.socket.listen(self.SOMAXCONN)
except socket.error, e: except socket.error, e:
self.socket.close() self.socket.close()
self._error('listen', e) self._error('listen', e)
......
...@@ -26,9 +26,6 @@ class PrimaryFailure(NeoException): ...@@ -26,9 +26,6 @@ class PrimaryFailure(NeoException):
class StoppedOperation(NeoException): class StoppedOperation(NeoException):
pass pass
class DatabaseFailure(NeoException):
pass
class NodeNotReady(NeoException): class NodeNotReady(NeoException):
pass pass
...@@ -22,14 +22,13 @@ def check_signature(reference, function): ...@@ -22,14 +22,13 @@ def check_signature(reference, function):
a, b, c, d = inspect.getargspec(function) a, b, c, d = inspect.getargspec(function)
x = len(A) - len(a) x = len(A) - len(a)
if x < 0: # ignore extra default parameters if x < 0: # ignore extra default parameters
if x + len(d) < 0: if B or x + len(d) < 0:
return False return False
del a[x:] del a[x:]
d = d[:x] or None d = d[:x] or None
elif x: # different signature elif x: # different signature
# We have no need yet to support methods with default parameters. return a == A[:-x] and (b or a and c) and (d or ()) == (D or ())[:-x]
return a == A[:-x] and (b or a and c) and not (d or D) return a == A and (b or not B) and (c or not C) and d == D
return a == A and b == B and c == C and d == D
def implements(obj, ignore=()): def implements(obj, ignore=()):
ignore = set(ignore) ignore = set(ignore)
...@@ -55,7 +54,7 @@ def implements(obj, ignore=()): ...@@ -55,7 +54,7 @@ def implements(obj, ignore=()):
while 1: while 1:
name, func = base.pop() name, func = base.pop()
x = getattr(obj, name) x = getattr(obj, name)
if x.im_class is tobj: if type(getattr(x, '__self__', None)) is tobj:
x = x.__func__ x = x.__func__
if x is func: if x is func:
try: try:
......
...@@ -290,3 +290,16 @@ class NEOLogger(Logger): ...@@ -290,3 +290,16 @@ class NEOLogger(Logger):
logging = NEOLogger() logging = NEOLogger()
signal.signal(signal.SIGRTMIN, lambda signum, frame: logging.flush()) signal.signal(signal.SIGRTMIN, lambda signum, frame: logging.flush())
signal.signal(signal.SIGRTMIN+1, lambda signum, frame: logging.reopen()) signal.signal(signal.SIGRTMIN+1, lambda signum, frame: logging.reopen())
def patch():
def fork():
with logging:
pid = os_fork()
if not pid:
logging._setup()
return pid
os_fork = os.fork
os.fork = fork
patch()
del patch
...@@ -28,6 +28,7 @@ class Node(object): ...@@ -28,6 +28,7 @@ class Node(object):
_connection = None _connection = None
_identified = False _identified = False
devpath = ()
id_timestamp = None id_timestamp = None
def __init__(self, manager, address=None, uuid=None, state=NodeStates.DOWN): def __init__(self, manager, address=None, uuid=None, state=NodeStates.DOWN):
......
...@@ -25,6 +25,7 @@ def speedupFileStorageTxnLookup(): ...@@ -25,6 +25,7 @@ def speedupFileStorageTxnLookup():
from array import array from array import array
from bisect import bisect from bisect import bisect
from collections import defaultdict from collections import defaultdict
from neo.lib import logging
from ZODB.FileStorage.FileStorage import FileStorage, FileIterator from ZODB.FileStorage.FileStorage import FileStorage, FileIterator
typecode = 'L' if array('I').itemsize < 4 else 'I' typecode = 'L' if array('I').itemsize < 4 else 'I'
...@@ -44,6 +45,8 @@ def speedupFileStorageTxnLookup(): ...@@ -44,6 +45,8 @@ def speedupFileStorageTxnLookup():
try: try:
index = self._tidindex index = self._tidindex
except AttributeError: except AttributeError:
logging.info("Building index for faster lookup of"
" transactions in the FileStorage DB.")
# Cache a sorted list of all the file pos from oid index. # Cache a sorted list of all the file pos from oid index.
# To reduce memory usage, the list is splitted in arrays of # To reduce memory usage, the list is splitted in arrays of
# low order 32-bit words. # low order 32-bit words.
...@@ -52,10 +55,10 @@ def speedupFileStorageTxnLookup(): ...@@ -52,10 +55,10 @@ def speedupFileStorageTxnLookup():
tindex[x >> 32].append(x & 0xffffffff) tindex[x >> 32].append(x & 0xffffffff)
index = self._tidindex = [] index = self._tidindex = []
for h, l in sorted(tindex.iteritems()): for h, l in sorted(tindex.iteritems()):
x = array('I') l = array(typecode, sorted(l))
x.fromlist(sorted(l)) x = self._read_data_header(h << 32 | l[0])
l = self._read_data_header(h << 32 | x[0]) index.append((x.tid, h, l))
index.append((l.tid, h, x)) logging.info("... index built")
x = bisect(index, (start,)) - 1 x = bisect(index, (start,)) - 1
if x >= 0: if x >= 0:
x, h, index = index[x] x, h, index = index[x]
......
This diff is collapsed.
...@@ -15,12 +15,12 @@ ...@@ -15,12 +15,12 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>. # along with this program. If not, see <http://www.gnu.org/licenses/>.
import socket import os, socket
from binascii import a2b_hex, b2a_hex from binascii import a2b_hex, b2a_hex
from datetime import timedelta, datetime from datetime import timedelta, datetime
from hashlib import sha1 from hashlib import sha1
from Queue import deque from Queue import deque
from struct import pack, unpack from struct import pack, unpack, Struct
from time import gmtime from time import gmtime
TID_LOW_OVERFLOW = 2**32 TID_LOW_OVERFLOW = 2**32
...@@ -102,11 +102,10 @@ def addTID(ptid, offset): ...@@ -102,11 +102,10 @@ def addTID(ptid, offset):
higher = (d.year, d.month, d.day, d.hour, d.minute) higher = (d.year, d.month, d.day, d.hour, d.minute)
return packTID(higher, lower) return packTID(higher, lower)
def u64(s): p64, u64 = (lambda unpack: (
return unpack('!Q', s)[0] unpack.__self__.pack,
lambda s: unpack(s)[0]
def p64(n): ))(Struct('!Q').unpack)
return pack('!Q', n)
def add64(packed, offset): def add64(packed, offset):
"""Add a python number to a 64-bits packed value""" """Add a python number to a 64-bits packed value"""
...@@ -226,3 +225,25 @@ class cached_property(object): ...@@ -226,3 +225,25 @@ class cached_property(object):
if obj is None: return self if obj is None: return self
value = obj.__dict__[self.func.__name__] = self.func(obj) value = obj.__dict__[self.func.__name__] = self.func(obj)
return value return value
# This module is always imported before multiprocessing is used, and the
# main process does not want to change name when task are run in threads.
spt_pid = os.getpid()
def setproctitle(title):
global spt_pid
pid = os.getpid()
if spt_pid == pid:
return
spt_pid = pid
# Try using https://pypi.org/project/setproctitle/
try:
# On Linux, this is done by clobbering argv, and the main process
# usually has a longer command line than the title of subprocesses.
os.environ['SPT_NOENV'] = '1'
from setproctitle import setproctitle
except ImportError:
return
finally:
del os.environ['SPT_NOENV']
setproctitle(title)
...@@ -24,7 +24,7 @@ from ..app import monotonic_time ...@@ -24,7 +24,7 @@ from ..app import monotonic_time
class IdentificationHandler(EventHandler): class IdentificationHandler(EventHandler):
def requestIdentification(self, conn, node_type, uuid, def requestIdentification(self, conn, node_type, uuid,
address, name, id_timestamp): address, name, devpath, id_timestamp):
app = self.app app = self.app
self.checkClusterName(name) self.checkClusterName(name)
if address == app.server: if address == app.server:
...@@ -101,6 +101,8 @@ class IdentificationHandler(EventHandler): ...@@ -101,6 +101,8 @@ class IdentificationHandler(EventHandler):
uuid=uuid, address=address) uuid=uuid, address=address)
else: else:
node.setUUID(uuid) node.setUUID(uuid)
if devpath:
node.devpath = tuple(devpath)
node.id_timestamp = monotonic_time() node.id_timestamp = monotonic_time()
node.setState(state) node.setState(state)
conn.setHandler(handler) conn.setHandler(handler)
...@@ -120,7 +122,7 @@ class IdentificationHandler(EventHandler): ...@@ -120,7 +122,7 @@ class IdentificationHandler(EventHandler):
class SecondaryIdentificationHandler(EventHandler): class SecondaryIdentificationHandler(EventHandler):
def requestIdentification(self, conn, node_type, uuid, def requestIdentification(self, conn, node_type, uuid,
address, name, id_timestamp): address, name, devpath, id_timestamp):
app = self.app app = self.app
self.checkClusterName(name) self.checkClusterName(name)
if address == app.server: if address == app.server:
......
...@@ -38,7 +38,7 @@ class ElectionHandler(MasterHandler): ...@@ -38,7 +38,7 @@ class ElectionHandler(MasterHandler):
super(ElectionHandler, self).connectionCompleted(conn) super(ElectionHandler, self).connectionCompleted(conn)
app = self.app app = self.app
conn.ask(Packets.RequestIdentification(NodeTypes.MASTER, conn.ask(Packets.RequestIdentification(NodeTypes.MASTER,
app.uuid, app.server, app.name, app.election)) app.uuid, app.server, app.name, (), app.election))
def connectionFailed(self, conn): def connectionFailed(self, conn):
super(ElectionHandler, self).connectionFailed(conn) super(ElectionHandler, self).connectionFailed(conn)
......
...@@ -178,7 +178,7 @@ class PartitionTable(neo.lib.pt.PartitionTable): ...@@ -178,7 +178,7 @@ class PartitionTable(neo.lib.pt.PartitionTable):
def tweak(self, drop_list=()): def tweak(self, drop_list=()):
"""Optimize partition table """Optimize partition table
This reassigns cells in 3 ways: This reassigns cells in 4 ways:
- Discard cells of nodes listed in 'drop_list'. For partitions with too - Discard cells of nodes listed in 'drop_list'. For partitions with too
few readable cells, some cells are instead marked as FEEDING. This is few readable cells, some cells are instead marked as FEEDING. This is
a preliminary step to drop these nodes, otherwise the partition table a preliminary step to drop these nodes, otherwise the partition table
...@@ -187,6 +187,8 @@ class PartitionTable(neo.lib.pt.PartitionTable): ...@@ -187,6 +187,8 @@ class PartitionTable(neo.lib.pt.PartitionTable):
- When a transaction creates new objects (oids are roughly allocated - When a transaction creates new objects (oids are roughly allocated
sequentially), we expect better performance by maximizing the number sequentially), we expect better performance by maximizing the number
of involved nodes (i.e. parallelizing writes). of involved nodes (i.e. parallelizing writes).
- For maximum resiliency, cells of each partition are assigned as far
as possible from each other, by checking the topology path of nodes.
Examples of optimal partition tables with np=10, nr=1 and 5 nodes: Examples of optimal partition tables with np=10, nr=1 and 5 nodes:
...@@ -215,6 +217,17 @@ class PartitionTable(neo.lib.pt.PartitionTable): ...@@ -215,6 +217,17 @@ class PartitionTable(neo.lib.pt.PartitionTable):
U. .U U. U. .U U.
.U U. U. .U U. U.
U. U. .U U. U. .U
For the topology, let's consider an example with paths of the form
(room, machine, disk):
- if there are more rooms than the number of replicas, 2 cells of the
same partition must not be assigned in the same room;
- otherwise, topology paths are checked at a deeper depth,
e.g. not on the same machine and distributed evenly
(off by 1) among rooms.
But the topology is expected to be optimal, otherwise it is ignored.
In some cases, we could fall back to a non-optimal topology but
that would cause extra replication if the user wants to fix it.
""" """
# Collect some data in a usable form for the rest of the method. # Collect some data in a usable form for the rest of the method.
node_list = {node: {} for node in self.count_dict node_list = {node: {} for node in self.count_dict
...@@ -242,6 +255,67 @@ class PartitionTable(neo.lib.pt.PartitionTable): ...@@ -242,6 +255,67 @@ class PartitionTable(neo.lib.pt.PartitionTable):
i += 1 i += 1
option_dict = Counter(map(tuple, x)) option_dict = Counter(map(tuple, x))
# Initialize variables/functions to optimize the topology.
devpath_max = []
devpaths = [()] * node_count
if repeats > 1:
_devpaths = [x[0].devpath for x in node_list]
max_depth = min(map(len, _devpaths))
depth = 0
while 1:
if depth < max_depth:
depth += 1
x = Counter(x[:depth] for x in _devpaths)
n = len(x)
x = set(x.itervalues())
# TODO: Prove it works. If the code turns out to be:
# - too pessimistic, the topology is ignored when
# resiliency could be maximized;
# - or worse too optimistic, in which case this
# method raises, possibly after a very long time.
if len(x) == 1 or max(x) * repeats <= node_count:
i, x = divmod(repeats, n)
devpath_max.append((i + 1, x) if x else (i, n))
if n < repeats:
continue
devpaths = [x[:depth] for x in _devpaths]
break
logging.warning("Can't maximize resiliency: fix the topology"
" of your storage nodes and make sure they're all running."
" %s storage device failure(s) may be enough to lose all"
" the database." % (repeats - 1))
break
topology = [{} for _ in xrange(self.np)]
def update_topology():
for offset in option:
n = topology[offset]
for i, (j, k) in zip(devpath, devpath_max):
try:
i, x = n[i]
except KeyError:
n[i] = i, x = [0, {}]
if i == j or i + 1 == j and k == sum(
1 for i in n.itervalues() if i[0] == j):
# Too many cells would be assigned at this topology
# node.
return False
n = x
# The topology may be optimal with this option. Apply it.
for offset in option:
n = topology[offset]
for i in devpath:
n = n[i]
n[0] += 1
n = n[1]
return True
def revert_topology():
for offset in option:
n = topology[offset]
for i in devpath:
n = n[i]
n[0] -= 1
n = n[1]
# Strategies to find the "best" permutation of nodes. # Strategies to find the "best" permutation of nodes.
def node_options(): def node_options():
# The second part of the key goes with the above cosmetic sort. # The second part of the key goes with the above cosmetic sort.
...@@ -291,24 +365,27 @@ class PartitionTable(neo.lib.pt.PartitionTable): ...@@ -291,24 +365,27 @@ class PartitionTable(neo.lib.pt.PartitionTable):
new = [] # the solution new = [] # the solution
stack = [] # data recursion stack = [] # data recursion
def options(): def options():
return iter(node_options[len(new)][-1]) x = node_options[len(new)]
return devpaths[x[-2]], iter(x[-1])
for node_options in node_options(): # for each strategy for node_options in node_options(): # for each strategy
iter_option = options() devpath, iter_option = options()
while 1: while 1:
try: try:
option = next(iter_option) option = next(iter_option)
except StopIteration: # 1st strategy only except StopIteration:
if new: if new:
iter_option = stack.pop() devpath, iter_option = stack.pop()
option_dict[new.pop()] += 1 option = new.pop()
revert_topology()
option_dict[option] += 1
continue continue
break break
if option_dict[option]: if option_dict[option] and update_topology():
new.append(option) new.append(option)
if len(new) == len(node_list): if len(new) == node_count:
break break
stack.append(iter_option) stack.append((devpath, iter_option))
iter_option = options() devpath, iter_option = options()
option_dict[option] -= 1 option_dict[option] -= 1
if new: if new:
break break
...@@ -384,13 +461,18 @@ class PartitionTable(neo.lib.pt.PartitionTable): ...@@ -384,13 +461,18 @@ class PartitionTable(neo.lib.pt.PartitionTable):
if cell.isReadable(): if cell.isReadable():
if cell.getNode().isRunning(): if cell.getNode().isRunning():
lost = None lost = None
else : else:
cell_list.append(cell) cell_list.append(cell)
for cell in cell_list: for cell in cell_list:
if cell.getNode() is not lost: node = cell.getNode()
cell.setState(CellStates.OUT_OF_DATE) if node is not lost:
change_list.append((offset, cell.getUUID(), if cell.isFeeding():
CellStates.OUT_OF_DATE)) self.removeCell(offset, node)
state = CellStates.DISCARDED
else:
state = CellStates.OUT_OF_DATE
cell.setState(state)
change_list.append((offset, node.getUUID(), state))
if fully_readable and change_list: if fully_readable and change_list:
logging.warning(self._first_outdated_message) logging.warning(self._first_outdated_message)
return change_list return change_list
......
...@@ -65,6 +65,7 @@ UNIT_TEST_MODULES = [ ...@@ -65,6 +65,7 @@ UNIT_TEST_MODULES = [
'neo.tests.client.testZODBURI', 'neo.tests.client.testZODBURI',
# light functional tests # light functional tests
'neo.tests.threaded.test', 'neo.tests.threaded.test',
'neo.tests.threaded.testConfig',
'neo.tests.threaded.testImporter', 'neo.tests.threaded.testImporter',
'neo.tests.threaded.testReplication', 'neo.tests.threaded.testReplication',
'neo.tests.threaded.testSSL', 'neo.tests.threaded.testSSL',
......
...@@ -71,6 +71,7 @@ class Application(BaseApplication): ...@@ -71,6 +71,7 @@ class Application(BaseApplication):
self.dm.setup(reset=config.getReset(), dedup=config.getDedup()) self.dm.setup(reset=config.getReset(), dedup=config.getDedup())
self.loadConfiguration() self.loadConfiguration()
self.devpath = self.dm.getTopologyPath()
# force node uuid from command line argument, for testing purpose only # force node uuid from command line argument, for testing purpose only
if config.getUUID() is not None: if config.getUUID() is not None:
...@@ -203,7 +204,8 @@ class Application(BaseApplication): ...@@ -203,7 +204,8 @@ class Application(BaseApplication):
pt = self.pt pt = self.pt
# search, find, connect and identify to the primary master # search, find, connect and identify to the primary master
bootstrap = BootstrapManager(self, NodeTypes.STORAGE, self.server) bootstrap = BootstrapManager(self, NodeTypes.STORAGE, self.server,
self.devpath)
self.master_node, self.master_conn, num_partitions, num_replicas = \ self.master_node, self.master_conn, num_partitions, num_replicas = \
bootstrap.getPrimaryConnection() bootstrap.getPrimaryConnection()
uuid = self.uuid uuid = self.uuid
......
...@@ -51,7 +51,7 @@ class Checker(object): ...@@ -51,7 +51,7 @@ class Checker(object):
else: else:
conn = ClientConnection(app, StorageOperationHandler(app), node) conn = ClientConnection(app, StorageOperationHandler(app), node)
conn.ask(Packets.RequestIdentification(NodeTypes.STORAGE, conn.ask(Packets.RequestIdentification(NodeTypes.STORAGE,
uuid, app.server, name, app.id_timestamp)) uuid, app.server, name, (), app.id_timestamp))
self.conn_dict[conn] = node.isIdentified() self.conn_dict[conn] = node.isIdentified()
conn_set = set(self.conn_dict) conn_set = set(self.conn_dict)
conn_set.discard(None) conn_set.discard(None)
......
...@@ -16,8 +16,6 @@ ...@@ -16,8 +16,6 @@
LOG_QUERIES = False LOG_QUERIES = False
from neo.lib.exception import DatabaseFailure
DATABASE_MANAGER_DICT = { DATABASE_MANAGER_DICT = {
'Importer': 'importer.ImporterDatabaseManager', 'Importer': 'importer.ImporterDatabaseManager',
'MySQL': 'mysqldb.MySQLDatabaseManager', 'MySQL': 'mysqldb.MySQLDatabaseManager',
...@@ -33,3 +31,6 @@ def getAdapterKlass(name): ...@@ -33,3 +31,6 @@ def getAdapterKlass(name):
def buildDatabaseManager(name, args=(), kw={}): def buildDatabaseManager(name, args=(), kw={}):
return getAdapterKlass(name)(*args, **kw) return getAdapterKlass(name)(*args, **kw)
class DatabaseFailure(Exception):
pass
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
...@@ -25,7 +25,7 @@ from . import LOG_QUERIES ...@@ -25,7 +25,7 @@ from . import LOG_QUERIES
from .manager import DatabaseManager, splitOIDField from .manager import DatabaseManager, splitOIDField
from neo.lib import logging, util from neo.lib import logging, util
from neo.lib.interfaces import implements from neo.lib.interfaces import implements
from neo.lib.protocol import CellStates, ZERO_OID, ZERO_TID, ZERO_HASH from neo.lib.protocol import ZERO_OID, ZERO_TID, ZERO_HASH
def unique_constraint_message(table, *columns): def unique_constraint_message(table, *columns):
c = sqlite3.connect(":memory:") c = sqlite3.connect(":memory:")
...@@ -68,7 +68,7 @@ class SQLiteDatabaseManager(DatabaseManager): ...@@ -68,7 +68,7 @@ class SQLiteDatabaseManager(DatabaseManager):
never be used for small requests. never be used for small requests.
""" """
VERSION = 2 VERSION = 3
def _parse(self, database): def _parse(self, database):
self.db = os.path.expanduser(database) self.db = os.path.expanduser(database)
...@@ -86,6 +86,9 @@ class SQLiteDatabaseManager(DatabaseManager): ...@@ -86,6 +86,9 @@ class SQLiteDatabaseManager(DatabaseManager):
q("PRAGMA journal_mode = MEMORY") q("PRAGMA journal_mode = MEMORY")
self._config = {} self._config = {}
def _getDevPath(self):
return self.db
def _commit(self): def _commit(self):
retry_if_locked(self.conn.commit) retry_if_locked(self.conn.commit)
...@@ -113,23 +116,33 @@ class SQLiteDatabaseManager(DatabaseManager): ...@@ -113,23 +116,33 @@ class SQLiteDatabaseManager(DatabaseManager):
if not e.args[0].startswith("no such table:"): if not e.args[0].startswith("no such table:"):
raise raise
def _migrate1(self, *_): def _alterTable(self, schema_dict, table, select="*"):
self._checkNoUnfinishedTransactions()
self.query("DROP TABLE IF EXISTS ttrans")
def _migrate2(self, schema_dict, index_dict):
# BBB: As explained in _setup, no transactional DDL # BBB: As explained in _setup, no transactional DDL
# so let's do the same dance as for MySQL. # so let's do the same dance as for MySQL.
q = self.query q = self.query
if self.nonempty('obj') is None: new = 'new_' + table
if self.nonempty('new_obj') is None: if self.nonempty(table) is None:
if self.nonempty(new) is None:
return return
else: else:
q("DROP TABLE IF EXISTS new_obj") q("DROP TABLE IF EXISTS " + new)
q(schema_dict.pop('obj') % 'new_obj') q(schema_dict.pop(table) % new)
q("INSERT INTO new_obj SELECT * FROM obj") q("INSERT INTO %s SELECT %s FROM %s" % (new, select, table))
q("DROP TABLE obj") q("DROP TABLE " + table)
q("ALTER TABLE new_obj RENAME TO obj") q("ALTER TABLE %s RENAME TO %s" % (new, table))
def _migrate1(self, *_):
self._checkNoUnfinishedTransactions()
self.query("DROP TABLE IF EXISTS ttrans")
def _migrate2(self, schema_dict, index_dict):
self._alterTable(schema_dict, 'obj')
def _migrate3(self, schema_dict, index_dict):
self._alterTable(schema_dict, 'pt', "rid, nid, CASE state"
" WHEN 0 THEN -1" # UP_TO_DATE
" WHEN 2 THEN -2" # FEEDING
" ELSE 1-state END")
def _setup(self, dedup=False): def _setup(self, dedup=False):
# BBB: SQLite has transactional DDL but before Python 3.6, # BBB: SQLite has transactional DDL but before Python 3.6,
...@@ -150,10 +163,10 @@ class SQLiteDatabaseManager(DatabaseManager): ...@@ -150,10 +163,10 @@ class SQLiteDatabaseManager(DatabaseManager):
# The table "pt" stores a partition table. # The table "pt" stores a partition table.
schema_dict['pt'] = """CREATE TABLE %s ( schema_dict['pt'] = """CREATE TABLE %s (
rid INTEGER NOT NULL, partition INTEGER NOT NULL,
nid INTEGER NOT NULL, nid INTEGER NOT NULL,
state INTEGER NOT NULL, tid INTEGER NOT NULL,
PRIMARY KEY (rid, nid)) PRIMARY KEY (partition, nid))
""" """
# The table "trans" stores information on committed transactions. # The table "trans" stores information on committed transactions.
...@@ -223,7 +236,8 @@ class SQLiteDatabaseManager(DatabaseManager): ...@@ -223,7 +236,8 @@ class SQLiteDatabaseManager(DatabaseManager):
for table, schema in schema_dict.iteritems(): for table, schema in schema_dict.iteritems():
q(schema % ('IF NOT EXISTS ' + table)) q(schema % ('IF NOT EXISTS ' + table))
for i, index in enumerate(index_dict.get(table, ()), 1): for table, index in index_dict.iteritems():
for i, index in enumerate(index, 1):
q(index % ('IF NOT EXISTS _%s_i%s' % (table, i), table)) q(index % ('IF NOT EXISTS _%s_i%s' % (table, i), table))
self._uncommitted_data.update(q("SELECT data_id, count(*)" self._uncommitted_data.update(q("SELECT data_id, count(*)"
...@@ -249,42 +263,23 @@ class SQLiteDatabaseManager(DatabaseManager): ...@@ -249,42 +263,23 @@ class SQLiteDatabaseManager(DatabaseManager):
else: else:
q("REPLACE INTO config VALUES (?,?)", (key, str(value))) q("REPLACE INTO config VALUES (?,?)", (key, str(value)))
def getPartitionTable(self, *nid): def _getPartitionTable(self):
if nid:
return self.query("SELECT rid, state FROM pt WHERE nid=?", nid)
return self.query("SELECT * FROM pt") return self.query("SELECT * FROM pt")
# A test with a table of 20 million lines and SQLite 3.8.7.1 shows that def _getLastTID(self, partition, max_tid=None):
# it's not worth changing getLastTID: x = self.query
# - It already returns the result in less than 2 seconds, without reading if max_tid is None:
# the whole table (this is 4-7 times faster than MySQL). x = x("SELECT MAX(tid) FROM trans WHERE partition=?", (partition,))
# - Strangely, a "GROUP BY partition" clause makes SQLite almost twice else:
# slower. x = x("SELECT MAX(tid) FROM trans WHERE partition=? AND tid<=?",
# - Getting MAX(tid) is immediate with a "AND partition=?" condition so one (partition, max_tid))
# way to speed up the following 2 methods is to repeat the queries for return x.next()[0]
# each partition (and finish in Python with max() for getLastTID).
def _getLastIDs(self, *args):
def getLastTID(self, max_tid):
return self.query(
"SELECT MAX(tid) FROM pt, trans"
" WHERE nid=? AND rid=partition AND tid<=?",
(self.getUUID(), max_tid,)).next()[0]
def _getLastIDs(self):
p64 = util.p64
q = self.query q = self.query
args = self.getUUID(), (oid,), = q("SELECT MAX(oid) FROM obj WHERE `partition`=?", args)
trans = {partition: p64(tid) (tid,), = q("SELECT MAX(tid) FROM obj WHERE `partition`=?", args)
for partition, tid in q( return tid, oid
"SELECT partition, MAX(tid) FROM pt, trans"
" WHERE nid=? AND rid=partition GROUP BY partition", args)}
obj = {partition: p64(tid)
for partition, tid in q(
"SELECT partition, MAX(tid) FROM pt, obj"
" WHERE nid=? AND rid=partition GROUP BY partition", args)}
oid = q("SELECT MAX(oid) oid FROM pt, obj"
" WHERE nid=? AND rid=partition", args).next()[0]
return trans, obj, None if oid is None else p64(oid)
def _getDataLastId(self, partition): def _getDataLastId(self, partition):
return self.query("SELECT MAX(id) FROM data WHERE %s <= id AND id < %s" return self.query("SELECT MAX(id) FROM data WHERE %s <= id AND id < %s"
...@@ -352,8 +347,8 @@ class SQLiteDatabaseManager(DatabaseManager): ...@@ -352,8 +347,8 @@ class SQLiteDatabaseManager(DatabaseManager):
# whereas we try to replace only 1 value ? # whereas we try to replace only 1 value ?
# We don't want to remove the 'NOT NULL' constraint # We don't want to remove the 'NOT NULL' constraint
# so we must simulate a "REPLACE OR FAIL". # so we must simulate a "REPLACE OR FAIL".
q("DELETE FROM pt WHERE rid=? AND nid=?", (offset, nid)) q("DELETE FROM pt WHERE partition=? AND nid=?", (offset, nid))
if state != CellStates.DISCARDED: if state is not None:
q("INSERT OR FAIL INTO pt VALUES (?,?,?)", q("INSERT OR FAIL INTO pt VALUES (?,?,?)",
(offset, nid, int(state))) (offset, nid, int(state)))
...@@ -478,10 +473,15 @@ class SQLiteDatabaseManager(DatabaseManager): ...@@ -478,10 +473,15 @@ class SQLiteDatabaseManager(DatabaseManager):
(u64(tid), u64(ttid))) (u64(tid), u64(ttid)))
self.commit() self.commit()
def unlockTransaction(self, tid, ttid): def unlockTransaction(self, tid, ttid, trans, obj):
q = self.query q = self.query
u64 = util.u64 u64 = util.u64
tid = u64(tid) tid = u64(tid)
if trans:
q("INSERT INTO trans SELECT * FROM ttrans WHERE tid=?", (tid,))
q("DELETE FROM ttrans WHERE tid=?", (tid,))
if not obj:
return
ttid = u64(ttid) ttid = u64(ttid)
sql = " FROM tobj WHERE tid=?" sql = " FROM tobj WHERE tid=?"
data_id_list = [x for x, in q("SELECT data_id%s AND data_id IS NOT NULL" data_id_list = [x for x, in q("SELECT data_id%s AND data_id IS NOT NULL"
...@@ -489,8 +489,6 @@ class SQLiteDatabaseManager(DatabaseManager): ...@@ -489,8 +489,6 @@ class SQLiteDatabaseManager(DatabaseManager):
q("INSERT INTO obj SELECT partition, oid, ?, data_id, value_tid" + sql, q("INSERT INTO obj SELECT partition, oid, ?, data_id, value_tid" + sql,
(tid, ttid)) (tid, ttid))
q("DELETE" + sql, (ttid,)) q("DELETE" + sql, (ttid,))
q("INSERT INTO trans SELECT * FROM ttrans WHERE tid=?", (tid,))
q("DELETE FROM ttrans WHERE tid=?", (tid,))
self.releaseData(data_id_list) self.releaseData(data_id_list)
def abortTransaction(self, ttid): def abortTransaction(self, ttid):
...@@ -520,12 +518,12 @@ class SQLiteDatabaseManager(DatabaseManager): ...@@ -520,12 +518,12 @@ class SQLiteDatabaseManager(DatabaseManager):
def _deleteRange(self, partition, min_tid=None, max_tid=None): def _deleteRange(self, partition, min_tid=None, max_tid=None):
sql = " WHERE partition=?" sql = " WHERE partition=?"
args = [partition] args = [partition]
if min_tid: if min_tid is not None:
sql += " AND ? < tid" sql += " AND ? < tid"
args.append(util.u64(min_tid)) args.append(min_tid)
if max_tid: if max_tid is not None:
sql += " AND tid <= ?" sql += " AND tid <= ?"
args.append(util.u64(max_tid)) args.append(max_tid)
q = self.query q = self.query
q("DELETE FROM trans" + sql, args) q("DELETE FROM trans" + sql, args)
sql = " FROM obj" + sql sql = " FROM obj" + sql
...@@ -693,3 +691,24 @@ class SQLiteDatabaseManager(DatabaseManager): ...@@ -693,3 +691,24 @@ class SQLiteDatabaseManager(DatabaseManager):
sha1(','.join(str(x[1]) for x in r)).digest(), sha1(','.join(str(x[1]) for x in r)).digest(),
p64(r[-1][1])) p64(r[-1][1]))
return 0, ZERO_HASH, ZERO_TID, ZERO_HASH, ZERO_OID return 0, ZERO_HASH, ZERO_TID, ZERO_HASH, ZERO_OID
def dump(self):
main = []
data = []
for line in self.conn.iterdump():
if line.startswith('INSERT '):
assert line.endswith(';'), line
data.append(line)
continue
if line.startswith('CREATE TABLE '):
# ALTER TABLE adds quotes.
create, table, name, tail = line.split(' ', 3)
line = ' '.join((create, table, name.strip('"'), tail))
main.append(line)
assert line == 'COMMIT;', line
data.sort()
main[-1:-1] = data
return '\n'.join(main) + '\n'
def restore(self, sql):
self.conn.executescript(sql)
...@@ -42,11 +42,11 @@ class ClientOperationHandler(BaseHandler): ...@@ -42,11 +42,11 @@ class ClientOperationHandler(BaseHandler):
# for read rpc # for read rpc
return self.app.tm.read_queue return self.app.tm.read_queue
def askObject(self, conn, oid, serial, tid): def askObject(self, conn, oid, at, before):
app = self.app app = self.app
if app.tm.loadLocked(oid): if app.tm.loadLocked(oid):
raise DelayEvent raise DelayEvent
o = app.dm.getObject(oid, serial, tid) o = app.dm.getObject(oid, at, before)
try: try:
serial, next_serial, compression, checksum, data, data_serial = o serial, next_serial, compression, checksum, data, data_serial = o
except TypeError: except TypeError:
......
...@@ -32,7 +32,7 @@ class IdentificationHandler(EventHandler): ...@@ -32,7 +32,7 @@ class IdentificationHandler(EventHandler):
return self.app.nm return self.app.nm
def requestIdentification(self, conn, node_type, uuid, address, name, def requestIdentification(self, conn, node_type, uuid, address, name,
id_timestamp): devpath, id_timestamp):
self.checkClusterName(name) self.checkClusterName(name)
app = self.app app = self.app
# reject any incoming connections if not ready # reject any incoming connections if not ready
......
...@@ -28,21 +28,21 @@ class InitializationHandler(BaseMasterHandler): ...@@ -28,21 +28,21 @@ class InitializationHandler(BaseMasterHandler):
raise ProtocolError('Partial partition table received') raise ProtocolError('Partial partition table received')
# Install the partition table into the database for persistence. # Install the partition table into the database for persistence.
cell_list = [] cell_list = []
offset_list = xrange(pt.getPartitions()) unassigned = range(pt.getPartitions())
unassigned_set = set(offset_list) for offset in reversed(unassigned):
for offset in offset_list:
for cell in pt.getCellList(offset): for cell in pt.getCellList(offset):
cell_list.append((offset, cell.getUUID(), cell.getState())) cell_list.append((offset, cell.getUUID(), cell.getState()))
if cell.getUUID() == app.uuid: if cell.getUUID() == app.uuid:
unassigned_set.remove(offset) unassigned.remove(offset)
# delete objects database # delete objects database
dm = app.dm dm = app.dm
if unassigned_set: if unassigned:
if app.disable_drop_partitions: if app.disable_drop_partitions:
logging.info("don't drop data for partitions %r", unassigned_set) logging.info('partitions %r are discarded but actual deletion'
' of data is disabled', unassigned)
else: else:
logging.debug('drop data for partitions %r', unassigned_set) logging.debug('drop data for partitions %r', unassigned)
dm.dropPartitions(unassigned_set) dm.dropPartitions(unassigned)
dm.changePartitionTable(ptid, cell_list, reset=True) dm.changePartitionTable(ptid, cell_list, reset=True)
dm.commit() dm.commit()
...@@ -63,7 +63,7 @@ class InitializationHandler(BaseMasterHandler): ...@@ -63,7 +63,7 @@ class InitializationHandler(BaseMasterHandler):
def askLastIDs(self, conn): def askLastIDs(self, conn):
dm = self.app.dm dm = self.app.dm
dm.truncate() dm.truncate()
ltid, _, _, loid = dm.getLastIDs() ltid, loid = dm.getLastIDs()
conn.answer(Packets.AnswerLastIDs(loid, ltid)) conn.answer(Packets.AnswerLastIDs(loid, ltid))
def askPartitionTable(self, conn): def askPartitionTable(self, conn):
...@@ -77,18 +77,10 @@ class InitializationHandler(BaseMasterHandler): ...@@ -77,18 +77,10 @@ class InitializationHandler(BaseMasterHandler):
def validateTransaction(self, conn, ttid, tid): def validateTransaction(self, conn, ttid, tid):
dm = self.app.dm dm = self.app.dm
dm.lockTransaction(tid, ttid) dm.lockTransaction(tid, ttid)
dm.unlockTransaction(tid, ttid) dm.unlockTransaction(tid, ttid, True, True)
dm.commit() dm.commit()
def startOperation(self, conn, backup): def startOperation(self, conn, backup):
self.app.operational = True
# XXX: see comment in protocol # XXX: see comment in protocol
dm = self.app.dm self.app.operational = True
if backup: self.app.replicator.startOperation(backup)
if dm.getBackupTID():
return
tid = dm.getLastIDs()[0] or ZERO_TID
else:
tid = None
dm._setBackupTID(tid)
dm.commit()
...@@ -26,10 +26,7 @@ class MasterOperationHandler(BaseMasterHandler): ...@@ -26,10 +26,7 @@ class MasterOperationHandler(BaseMasterHandler):
def startOperation(self, conn, backup): def startOperation(self, conn, backup):
# XXX: see comment in protocol # XXX: see comment in protocol
assert self.app.operational and backup assert self.app.operational and backup
dm = self.app.dm self.app.replicator.startOperation(backup)
if not dm.getBackupTID():
dm._setBackupTID(dm.getLastIDs()[0] or ZERO_TID)
dm.commit()
def askLockInformation(self, conn, ttid, tid): def askLockInformation(self, conn, ttid, tid):
self.app.tm.lock(ttid, tid) self.app.tm.lock(ttid, tid)
......
...@@ -75,9 +75,6 @@ class StorageOperationHandler(EventHandler): ...@@ -75,9 +75,6 @@ class StorageOperationHandler(EventHandler):
deleteTransaction(tid) deleteTransaction(tid)
assert not pack_tid, "TODO" assert not pack_tid, "TODO"
if next_tid: if next_tid:
# More than one chunk ? This could be a full replication so avoid
# restarting from the beginning by committing now.
self.app.dm.commit()
self.app.replicator.fetchTransactions(next_tid) self.app.replicator.fetchTransactions(next_tid)
else: else:
self.app.replicator.fetchObjects() self.app.replicator.fetchObjects()
...@@ -97,15 +94,12 @@ class StorageOperationHandler(EventHandler): ...@@ -97,15 +94,12 @@ class StorageOperationHandler(EventHandler):
for serial, oid_list in object_dict.iteritems(): for serial, oid_list in object_dict.iteritems():
for oid in oid_list: for oid in oid_list:
deleteObject(oid, serial) deleteObject(oid, serial)
# XXX: It should be possible not to commit here if it was the last
# chunk, because we'll either commit again when updating
# 'backup_tid' or the partition table.
self.app.dm.commit()
assert not pack_tid, "TODO" assert not pack_tid, "TODO"
if next_tid: if next_tid:
# TODO also provide feedback to master about current replication state (tid) # TODO also provide feedback to master about current replication state (tid)
self.app.replicator.fetchObjects(next_tid, next_oid) self.app.replicator.fetchObjects(next_tid, next_oid)
else: else:
# This will also commit.
self.app.replicator.finish() self.app.replicator.finish()
@checkConnectionIsReplicatorConnection @checkConnectionIsReplicatorConnection
...@@ -267,6 +261,8 @@ class StorageOperationHandler(EventHandler): ...@@ -267,6 +261,8 @@ class StorageOperationHandler(EventHandler):
"partition %u dropped or truncated" "partition %u dropped or truncated"
% partition), msg_id) % partition), msg_id)
return return
if not object[2]: # creation undone
object = object[0], 0, ZERO_HASH, '', object[4]
# Same as in askFetchTransactions. # Same as in askFetchTransactions.
conn.send(Packets.AddObject(oid, *object), msg_id) conn.send(Packets.AddObject(oid, *object), msg_id)
yield conn.buffering yield conn.buffering
......
...@@ -93,7 +93,7 @@ from neo.lib import logging ...@@ -93,7 +93,7 @@ from neo.lib import logging
from neo.lib.protocol import CellStates, NodeTypes, NodeStates, \ from neo.lib.protocol import CellStates, NodeTypes, NodeStates, \
Packets, INVALID_TID, ZERO_TID, ZERO_OID Packets, INVALID_TID, ZERO_TID, ZERO_OID
from neo.lib.connection import ClientConnection, ConnectionClosed from neo.lib.connection import ClientConnection, ConnectionClosed
from neo.lib.util import add64, dump from neo.lib.util import add64, dump, p64
from .handlers.storage import StorageOperationHandler from .handlers.storage import StorageOperationHandler
FETCH_COUNT = 1000 FETCH_COUNT = 1000
...@@ -190,42 +190,51 @@ class Replicator(object): ...@@ -190,42 +190,51 @@ class Replicator(object):
return add64(tid, -1) return add64(tid, -1)
return ZERO_TID return ZERO_TID
def updateBackupTID(self): def updateBackupTID(self, commit=False):
dm = self.app.dm dm = self.app.dm
tid = dm.getBackupTID() tid = dm.getBackupTID()
if tid: if tid:
new_tid = self.getBackupTID() new_tid = self.getBackupTID()
if tid != new_tid: if tid != new_tid:
dm._setBackupTID(new_tid) dm._setBackupTID(new_tid)
dm.commit() if commit:
dm.commit()
def startOperation(self, backup):
dm = self.app.dm
if backup:
if dm.getBackupTID():
assert not hasattr(self, 'partition_dict'), self.partition_dict
return
tid = dm.getLastIDs()[0] or ZERO_TID
else:
tid = None
dm._setBackupTID(tid)
dm.commit()
try:
partition_dict = self.partition_dict
except AttributeError:
return
for offset, next_tid in dm.iterCellNextTIDs():
if type(next_tid) is not bytes: # readable
p = partition_dict[offset]
p.next_trans, p.next_obj = next_tid
def populate(self): def populate(self):
app = self.app
pt = app.pt
uuid = app.uuid
self.partition_dict = {} self.partition_dict = {}
self.replicate_dict = {} self.replicate_dict = {}
self.source_dict = {} self.source_dict = {}
self.ttid_set = set() self.ttid_set = set()
last_tid, last_trans_dict, last_obj_dict, _ = app.dm.getLastIDs()
next_tid = app.dm.getBackupTID() or last_tid
next_tid = add64(next_tid, 1) if next_tid else ZERO_TID
outdated_list = [] outdated_list = []
for offset in xrange(pt.getPartitions()): for offset, next_tid in self.app.dm.iterCellNextTIDs():
for cell in pt.getCellList(offset): self.partition_dict[offset] = p = Partition()
if cell.getUUID() == uuid and not cell.isCorrupted(): if type(next_tid) is bytes: # OUT_OF_DATE
self.partition_dict[offset] = p = Partition() outdated_list.append(offset)
if cell.isOutOfDate(): p.next_trans = p.next_obj = next_tid
outdated_list.append(offset) p.max_ttid = INVALID_TID
try: else: # readable
p.next_trans = add64(last_trans_dict[offset], 1) p.next_trans, p.next_obj = next_tid or (None, None)
except KeyError: p.max_ttid = None
p.next_trans = ZERO_TID
p.next_obj = last_obj_dict.get(offset, ZERO_TID)
p.max_ttid = INVALID_TID
else:
p.next_trans = p.next_obj = next_tid
p.max_ttid = None
if outdated_list: if outdated_list:
self.app.tm.replicating(outdated_list) self.app.tm.replicating(outdated_list)
...@@ -236,7 +245,6 @@ class Replicator(object): ...@@ -236,7 +245,6 @@ class Replicator(object):
discarded_list = [] discarded_list = []
readable_list = [] readable_list = []
app = self.app app = self.app
last_tid, last_trans_dict, last_obj_dict, _ = app.dm.getLastIDs()
for offset, uuid, state in cell_list: for offset, uuid, state in cell_list:
if uuid == app.uuid: if uuid == app.uuid:
if state in (CellStates.DISCARDED, CellStates.CORRUPTED): if state in (CellStates.DISCARDED, CellStates.CORRUPTED):
...@@ -251,11 +259,9 @@ class Replicator(object): ...@@ -251,11 +259,9 @@ class Replicator(object):
elif state == CellStates.OUT_OF_DATE: elif state == CellStates.OUT_OF_DATE:
assert offset not in self.partition_dict assert offset not in self.partition_dict
self.partition_dict[offset] = p = Partition() self.partition_dict[offset] = p = Partition()
try: # New cell. 0 is also what should be stored by the backend.
p.next_trans = add64(last_trans_dict[offset], 1) # Nothing to optimize.
except KeyError: p.next_trans = p.next_obj = ZERO_TID
p.next_trans = ZERO_TID
p.next_obj = last_obj_dict.get(offset, ZERO_TID)
p.max_ttid = INVALID_TID p.max_ttid = INVALID_TID
added_list.append(offset) added_list.append(offset)
else: else:
...@@ -289,7 +295,7 @@ class Replicator(object): ...@@ -289,7 +295,7 @@ class Replicator(object):
next_tid = add64(tid, 1) next_tid = add64(tid, 1)
p.next_trans = p.next_obj = next_tid p.next_trans = p.next_obj = next_tid
if next_tid: if next_tid:
self.updateBackupTID() self.updateBackupTID(True)
self._nextPartition() self._nextPartition()
def _nextPartitionSortKey(self, offset): def _nextPartitionSortKey(self, offset):
...@@ -344,7 +350,7 @@ class Replicator(object): ...@@ -344,7 +350,7 @@ class Replicator(object):
try: try:
conn.ask(Packets.RequestIdentification(NodeTypes.STORAGE, conn.ask(Packets.RequestIdentification(NodeTypes.STORAGE,
None if name else app.uuid, app.server, name or app.name, None if name else app.uuid, app.server, name or app.name,
app.id_timestamp)) (), app.id_timestamp))
except ConnectionClosed: except ConnectionClosed:
if previous_node is self.current_node: if previous_node is self.current_node:
return return
...@@ -360,6 +366,9 @@ class Replicator(object): ...@@ -360,6 +366,9 @@ class Replicator(object):
offset = self.current_partition offset = self.current_partition
p = self.partition_dict[offset] p = self.partition_dict[offset]
if min_tid: if min_tid:
# More than one chunk ? This could be a full replication so avoid
# restarting from the beginning by committing now.
self.app.dm.commit()
p.next_trans = min_tid p.next_trans = min_tid
else: else:
try: try:
...@@ -384,13 +393,17 @@ class Replicator(object): ...@@ -384,13 +393,17 @@ class Replicator(object):
offset = self.current_partition offset = self.current_partition
p = self.partition_dict[offset] p = self.partition_dict[offset]
max_tid = self.replicate_tid max_tid = self.replicate_tid
dm = self.app.dm
if min_tid: if min_tid:
p.next_obj = min_tid p.next_obj = min_tid
self.updateBackupTID()
dm.updateCellTID(offset, add64(min_tid, -1))
dm.commit() # like in fetchTransactions
else: else:
min_tid = p.next_obj min_tid = p.next_obj
p.next_trans = add64(max_tid, 1) p.next_trans = add64(max_tid, 1)
object_dict = {} object_dict = {}
for serial, oid in self.app.dm.getReplicationObjectList(min_tid, for serial, oid in dm.getReplicationObjectList(min_tid,
max_tid, FETCH_COUNT, offset, min_oid): max_tid, FETCH_COUNT, offset, min_oid):
try: try:
object_dict[serial].append(oid) object_dict[serial].append(oid)
...@@ -406,11 +419,14 @@ class Replicator(object): ...@@ -406,11 +419,14 @@ class Replicator(object):
p = self.partition_dict[offset] p = self.partition_dict[offset]
p.next_obj = add64(tid, 1) p.next_obj = add64(tid, 1)
self.updateBackupTID() self.updateBackupTID()
app = self.app
app.dm.updateCellTID(offset, tid)
app.dm.commit()
if p.max_ttid or offset in self.replicate_dict and \ if p.max_ttid or offset in self.replicate_dict and \
offset not in self.source_dict: offset not in self.source_dict:
logging.debug("unfinished transactions: %r", self.ttid_set) logging.debug("unfinished transactions: %r", self.ttid_set)
else: else:
self.app.tm.replicated(offset, tid) app.tm.replicated(offset, tid)
logging.debug("partition %u replicated up to %s from %r", logging.debug("partition %u replicated up to %s from %r",
offset, dump(tid), self.current_node) offset, dump(tid), self.current_node)
self.getCurrentConnection().setReconnectionNoDelay() self.getCurrentConnection().setReconnectionNoDelay()
......
#
# Copyright (C) 2018 Nexedi SA
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
from msgpack import Packer, Unpacker
class Queue(object):
"""Unidirectional pipe for asynchronous and fast exchange of big amounts
of data between 2 processes.
It is implemented using shared memory, a few locks and msgpack
serialization. While the latter is faster than C pickle, it was mainly
chosen for its streaming API while deserializing, which greatly reduces
the locking overhead for the consumer process.
There is no mechanism to end a communication, so this information must be
in the exchanged data, for example by choosing a marker object like None:
- the last object sent by the producer is this marker
- the consumer stops iterating when it gets this marker
As long as there are data being exchanged, the 2 processes can't change
roles (producer/consumer).
"""
def __init__(self, max_size):
from multiprocessing import Lock, RawArray, RawValue
self._max_size = max_size
self._array = RawArray('c', max_size)
self._pos = RawValue('L')
self._size = RawValue('L')
self._locks = Lock(), Lock(), Lock()
def __repr__(self):
return "<%s pos=%s size=%s max_size=%s>" % (self.__class__.__name__,
self._pos.value, self._size.value, self._max_size)
def __iter__(self):
"""Iterate endlessly over all objects sent by the producer
Internally, this method uses a receiving buffer that is lost if
interrupted (GeneratorExit). If this buffer was not empty, the queue
is left in a inconsistent state and this method can't be called again.
So the correct way to split a loop is to first get an iterator
explicitly:
iq = iter(queue)
for x in iq:
if ...:
break
for x in iq:
...
"""
unpacker = Unpacker(use_list=False, raw=True)
feed = unpacker.feed
max_size = self._max_size
array = self._array
pos = self._pos
size = self._size
lock, get_lock, put_lock = self._locks
left = 0
while 1:
for data in unpacker:
yield data
while 1:
with lock:
p = pos.value
s = size.value
if s:
break
get_lock.acquire()
e = p + s
if e < max_size:
feed(array[p:e])
else:
feed(array[p:])
e -= max_size
feed(array[:e])
with lock:
pos.value = e
n = size.value
size.value = n - s
if n == max_size:
put_lock.acquire(0)
put_lock.release()
def __call__(self, iterable):
"""Fill the queue with given objects
Hoping than msgpack.Packer gets a streaming API, 'iterable' should not
be split (i.e. this method should be called only once, like __iter__).
"""
pack = Packer(use_bin_type=True).pack
max_size = self._max_size
array = self._array
pos = self._pos
size = self._size
lock, get_lock, put_lock = self._locks
left = 0
for data in iterable:
data = pack(data)
n = len(data)
i = 0
while 1:
if not left:
while 1:
with lock:
p = pos.value
j = size.value
left = max_size - j
if left:
break
put_lock.acquire()
p += j
if p >= max_size:
p -= max_size
e = min(p + min(n, left), max_size)
j = e - p
array[p:e] = data[i:i+j]
n -= j
i += j
with lock:
p = pos.value
s = size.value
j += s
size.value = j
if not s:
get_lock.acquire(0)
get_lock.release()
p += j
if p >= max_size:
p -= max_size
left = max_size - j
if not n:
break
def test(self):
import multiprocessing, random, sys, threading
from traceback import print_tb
r = range(50)
random.shuffle(r)
for P in threading.Thread, multiprocessing.Process:
q = Queue(23)
def t():
for n in xrange(len(r)):
yield '.' * n
yield
for n in r:
yield '.' * n
i = j = 0
p = P(target=q, args=(t(),))
p.daemon = 1
p.start()
try:
q = iter(q)
for i, x in enumerate(q):
if x is None:
break
self.assertEqual(x, '.' * i)
self.assertEqual(i, len(r))
for j in r:
self.assertEqual(next(q), '.' * j)
except KeyboardInterrupt:
print_tb(sys.exc_info()[2])
self.fail((i, j))
p.join()
if __name__ == '__main__':
import unittest
unittest.TextTestRunner().run(type('', (unittest.TestCase,), {
'runTest': test})())
...@@ -314,12 +314,15 @@ class TransactionManager(EventQueue): ...@@ -314,12 +314,15 @@ class TransactionManager(EventQueue):
Unlock transaction Unlock transaction
""" """
try: try:
tid = self._transaction_dict[ttid].tid transaction = self._transaction_dict[ttid]
except KeyError: except KeyError:
raise ProtocolError("unknown ttid %s" % dump(ttid)) raise ProtocolError("unknown ttid %s" % dump(ttid))
tid = transaction.tid
logging.debug('Unlock TXN %s (ttid=%s)', dump(tid), dump(ttid)) logging.debug('Unlock TXN %s (ttid=%s)', dump(tid), dump(ttid))
dm = self._app.dm dm = self._app.dm
dm.unlockTransaction(tid, ttid) dm.unlockTransaction(tid, ttid,
transaction.voted == 2,
transaction.store_dict)
self._app.em.setTimeout(time() + 1, dm.deferCommit()) self._app.em.setTimeout(time() + 1, dm.deferCommit())
self.abort(ttid, even_if_locked=True) self.abort(ttid, even_if_locked=True)
...@@ -521,7 +524,6 @@ class TransactionManager(EventQueue): ...@@ -521,7 +524,6 @@ class TransactionManager(EventQueue):
assert not even_if_locked assert not even_if_locked
# See how the master processes AbortTransaction from the client. # See how the master processes AbortTransaction from the client.
return return
logging.debug('Abort TXN %s', dump(ttid))
transaction = self._transaction_dict[ttid] transaction = self._transaction_dict[ttid]
locked = transaction.tid locked = transaction.tid
# if the transaction is locked, ensure we can drop it # if the transaction is locked, ensure we can drop it
...@@ -529,6 +531,7 @@ class TransactionManager(EventQueue): ...@@ -529,6 +531,7 @@ class TransactionManager(EventQueue):
if not even_if_locked: if not even_if_locked:
return return
else: else:
logging.debug('Abort TXN %s', dump(ttid))
dm = self._app.dm dm = self._app.dm
dm.abortTransaction(ttid) dm.abortTransaction(ttid)
dm.releaseData([x[1] for x in transaction.store_dict.itervalues()], dm.releaseData([x[1] for x in transaction.store_dict.itervalues()],
......
...@@ -28,6 +28,7 @@ import weakref ...@@ -28,6 +28,7 @@ import weakref
import MySQLdb import MySQLdb
import transaction import transaction
from ConfigParser import SafeConfigParser
from cStringIO import StringIO from cStringIO import StringIO
try: try:
from ZODB._compat import Unpickler from ZODB._compat import Unpickler
...@@ -155,8 +156,22 @@ def setupMySQLdb(db_list, user=DB_USER, password='', clear_databases=True): ...@@ -155,8 +156,22 @@ def setupMySQLdb(db_list, user=DB_USER, password='', clear_databases=True):
conn.commit() conn.commit()
conn.close() conn.close()
def ImporterConfigParser(adapter, zodb, **kw):
cfg = SafeConfigParser()
cfg.add_section("neo")
cfg.set("neo", "adapter", adapter)
for x in kw.iteritems():
cfg.set("neo", *x)
for name, zodb in zodb:
cfg.add_section(name)
for x in zodb.iteritems():
cfg.set(name, *x)
return cfg
class NeoTestBase(unittest.TestCase): class NeoTestBase(unittest.TestCase):
maxDiff = None
def setUp(self): def setUp(self):
logging.name = self.setupLog() logging.name = self.setupLog()
unittest.TestCase.setUp(self) unittest.TestCase.setUp(self)
...@@ -175,6 +190,8 @@ class NeoTestBase(unittest.TestCase): ...@@ -175,6 +190,8 @@ class NeoTestBase(unittest.TestCase):
# Note we don't even abort them because it may require a valid # Note we don't even abort them because it may require a valid
# connection to a master node (see Storage.sync()). # connection to a master node (see Storage.sync()).
transaction.manager.__init__() transaction.manager.__init__()
if logging._max_size is not None:
logging.flush()
class failureException(AssertionError): class failureException(AssertionError):
def __init__(self, msg=None): def __init__(self, msg=None):
......
...@@ -21,6 +21,7 @@ from .. import NeoUnitTestBase, buildUrlFromString ...@@ -21,6 +21,7 @@ from .. import NeoUnitTestBase, buildUrlFromString
from neo.client.app import Application from neo.client.app import Application
from neo.client.cache import test as testCache from neo.client.cache import test as testCache
from neo.client.exception import NEOStorageError from neo.client.exception import NEOStorageError
from neo.lib.util import p64
class ClientApplicationTests(NeoUnitTestBase): class ClientApplicationTests(NeoUnitTestBase):
...@@ -51,9 +52,7 @@ class ClientApplicationTests(NeoUnitTestBase): ...@@ -51,9 +52,7 @@ class ClientApplicationTests(NeoUnitTestBase):
def makeOID(self, value=None): def makeOID(self, value=None):
from random import randint from random import randint
if value is None: return p64(randint(1, 255) if value is None else value)
value = randint(1, 255)
return '\00' * 7 + chr(value)
makeTID = makeOID makeTID = makeOID
def makeTransactionObject(self, user='u', description='d', _extension='e'): def makeTransactionObject(self, user='u', description='d', _extension='e'):
......
...@@ -221,7 +221,7 @@ class ClusterPdb(object): ...@@ -221,7 +221,7 @@ class ClusterPdb(object):
def wait(self, test, timeout): def wait(self, test, timeout):
end_time = time() + timeout end_time = time() + timeout
period = 0.1 period = 0.01
while not test(): while not test():
cluster_dict.acquire() cluster_dict.acquire()
try: try:
...@@ -232,7 +232,6 @@ class ClusterPdb(object): ...@@ -232,7 +232,6 @@ class ClusterPdb(object):
next_sleep = max(last_pdb + timeout, end_time) - time() next_sleep = max(last_pdb + timeout, end_time) - time()
if next_sleep > period: if next_sleep > period:
next_sleep = period next_sleep = period
period *= 1.5
elif next_sleep < 0: elif next_sleep < 0:
return False return False
finally: finally:
......
#
# Copyright (C) 2014-2017 Nexedi SA
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import os, stat, time
from persistent import Persistent
from BTrees.OOBTree import OOBTree
class Inode(OOBTree):
data = None
def __init__(self, up=None, mode=stat.S_IFDIR):
self[os.pardir] = self if up is None else up
self.mode = mode
self.mtime = time.time()
def __getstate__(self):
return Persistent.__getstate__(self), OOBTree.__getstate__(self)
def __setstate__(self, state):
Persistent.__setstate__(self, state[0])
OOBTree.__setstate__(self, state[1])
def edit(self, data=None, mtime=None):
fmt = stat.S_IFMT(self.mode)
if data is None:
assert fmt == stat.S_IFDIR, oct(fmt)
else:
assert fmt == stat.S_IFREG or fmt == stat.S_IFLNK, oct(fmt)
if self.data != data:
self.data = data
if self.mtime != mtime:
self.mtime = mtime or time.time()
def root(self):
try:
self = self[os.pardir]
except KeyError:
return self
return self.root()
def traverse(self, path, followlinks=True):
path = iter(path.split(os.sep) if isinstance(path, basestring) and path
else path)
for d in path:
if not d:
return self.root().traverse(path, followlinks)
if d != os.curdir:
d = self[d]
if followlinks and stat.S_ISLNK(d.mode):
d = self.traverse(d.data, True)
return d.traverse(path, followlinks)
return self
def inodeFromFs(self, path):
s = os.lstat(path)
mode = s.st_mode
name = os.path.basename(path)
try:
i = self[name]
assert stat.S_IFMT(i.mode) == stat.S_IFMT(mode)
changed = False
except KeyError:
i = self[name] = self.__class__(self, mode)
changed = True
i.edit(open(path).read() if stat.S_ISREG(mode) else
os.readlink(p) if stat.S_ISLNK(mode) else
None, s.st_mtime)
return changed or i._p_changed
def treeFromFs(self, path, yield_interval=None, filter=None):
prefix_len = len(path) + len(os.sep)
n = 0
for dirpath, dirnames, filenames in os.walk(path):
inodeFromFs = self.traverse(dirpath[prefix_len:]).inodeFromFs
for names in dirnames, filenames:
skipped = []
for j, name in enumerate(names):
p = os.path.join(dirpath, name)
if filter and not filter(p[prefix_len:]):
skipped.append(j)
elif inodeFromFs(p):
n += 1
if n == yield_interval:
n = 0
yield self
while skipped:
del names[skipped.pop()]
if n:
yield self
def walk(self):
s = [(None, self)]
while s:
top, self = s.pop()
dirs = []
nondirs = []
for name, inode in self.iteritems():
if name != os.pardir:
(dirs if stat.S_ISDIR(inode.mode) else nondirs).append(name)
yield top or os.curdir, dirs, nondirs
for name in dirs:
s.append((os.path.join(top, name) if top else name, self[name]))
...@@ -29,17 +29,16 @@ import tempfile ...@@ -29,17 +29,16 @@ import tempfile
import traceback import traceback
import threading import threading
import psutil import psutil
from ConfigParser import SafeConfigParser
import neo.scripts import neo.scripts
from neo.neoctl.neoctl import NeoCTL, NotReadyException from neo.neoctl.neoctl import NeoCTL, NotReadyException
from neo.lib import logging from neo.lib import logging
from neo.lib.protocol import ClusterStates, NodeTypes, CellStates, NodeStates, \ from neo.lib.protocol import ClusterStates, NodeTypes, CellStates, NodeStates, \
UUID_NAMESPACES UUID_NAMESPACES
from neo.lib.util import dump from neo.lib.util import dump, setproctitle
from .. import (ADDRESS_TYPE, DB_SOCKET, DB_USER, IP_VERSION_FORMAT_DICT, SSL, from .. import (ADDRESS_TYPE, DB_SOCKET, DB_USER, IP_VERSION_FORMAT_DICT, SSL,
buildUrlFromString, cluster, getTempDirectory, NeoTestBase, Patch, buildUrlFromString, cluster, getTempDirectory, setupMySQLdb,
setupMySQLdb) ImporterConfigParser, NeoTestBase, Patch)
from neo.client.Storage import Storage from neo.client.Storage import Storage
from neo.storage.database import manager, buildDatabaseManager from neo.storage.database import manager, buildDatabaseManager
...@@ -116,36 +115,31 @@ class PortAllocator(object): ...@@ -116,36 +115,31 @@ class PortAllocator(object):
__del__ = release __del__ = release
class NEOProcess(object): class Process(object):
_coverage_fd = None _coverage_fd = None
_coverage_prefix = os.path.join(getTempDirectory(), 'coverage-') _coverage_prefix = os.path.join(getTempDirectory(), 'coverage-')
_coverage_index = 0 _coverage_index = 0
pid = 0 pid = 0
def __init__(self, command, uuid, arg_dict): def __init__(self, command, arg_dict={}):
try:
__import__('neo.scripts.' + command, level=0)
except ImportError:
raise NotFound, '%s not found' % (command)
self.command = command self.command = command
self.arg_dict = arg_dict self.arg_dict = arg_dict
self.with_uuid = True
self.setUUID(uuid)
def start(self, with_uuid=True): def _args(self):
# Prevent starting when already forked and wait wasn't called.
if self.pid != 0:
raise AlreadyRunning, 'Already running with PID %r' % (self.pid, )
command = self.command
args = [] args = []
self.with_uuid = with_uuid
for arg, param in self.arg_dict.iteritems(): for arg, param in self.arg_dict.iteritems():
args.append('--' + arg) args.append('--' + arg)
if param is not None: if param is not None:
args.append(str(param)) args.append(str(param))
if with_uuid: return args
args += '--uuid', str(self.uuid)
def start(self):
# Prevent starting when already forked and wait wasn't called.
if self.pid != 0:
raise AlreadyRunning('Already running with PID %r' % self.pid)
command = self.command
args = self._args()
global coverage global coverage
if coverage: if coverage:
cls = self.__class__ cls = self.__class__
...@@ -159,7 +153,7 @@ class NEOProcess(object): ...@@ -159,7 +153,7 @@ class NEOProcess(object):
if args: if args:
os.close(w) os.close(w)
os.kill(os.getpid(), signal.SIGSTOP) os.kill(os.getpid(), signal.SIGSTOP)
self.pid = logging.fork() self.pid = os.fork()
if self.pid: if self.pid:
# Wait that the signal to kill the child is set up. # Wait that the signal to kill the child is set up.
os.close(w) os.close(w)
...@@ -179,7 +173,8 @@ class NEOProcess(object): ...@@ -179,7 +173,8 @@ class NEOProcess(object):
os.close(self._coverage_fd) os.close(self._coverage_fd)
os.write(w, '\0') os.write(w, '\0')
sys.argv = [command] + args sys.argv = [command] + args
getattr(neo.scripts, command).main() setproctitle(self.command)
self.run()
status = 0 status = 0
except SystemExit, e: except SystemExit, e:
status = e.code status = e.code
...@@ -203,6 +198,9 @@ class NEOProcess(object): ...@@ -203,6 +198,9 @@ class NEOProcess(object):
logging.info('pid %u: %s %s', logging.info('pid %u: %s %s',
self.pid, command, ' '.join(map(repr, args))) self.pid, command, ' '.join(map(repr, args)))
def run(self):
raise NotImplementedError
def child_coverage(self): def child_coverage(self):
r = self._coverage_fd r = self._coverage_fd
if r is not None: if r is not None:
...@@ -249,11 +247,32 @@ class NEOProcess(object): ...@@ -249,11 +247,32 @@ class NEOProcess(object):
self.kill() self.kill()
self.wait() self.wait()
def getPID(self): def isAlive(self):
return self.pid try:
return psutil.Process(self.pid).status() != psutil.STATUS_ZOMBIE
except psutil.NoSuchProcess:
return False
class NEOProcess(Process):
def __init__(self, command, uuid, arg_dict):
try:
__import__('neo.scripts.' + command, level=0)
except ImportError:
raise NotFound(command + ' not found')
super(NEOProcess, self).__init__(command, arg_dict)
self.setUUID(uuid)
def _args(self):
args = super(NEOProcess, self)._args()
if self.uuid:
args += '--uuid', str(self.uuid)
return args
def run(self):
getattr(neo.scripts, self.command).main()
def getUUID(self): def getUUID(self):
assert self.with_uuid, 'UUID disabled on this process'
return self.uuid return self.uuid
def setUUID(self, uuid): def setUUID(self, uuid):
...@@ -262,12 +281,6 @@ class NEOProcess(object): ...@@ -262,12 +281,6 @@ class NEOProcess(object):
""" """
self.uuid = uuid self.uuid = uuid
def isAlive(self):
try:
return psutil.Process(self.pid).status() != psutil.STATUS_ZOMBIE
except psutil.NoSuchProcess:
return False
class NEOCluster(object): class NEOCluster(object):
SSL = None SSL = None
...@@ -304,14 +317,8 @@ class NEOCluster(object): ...@@ -304,14 +317,8 @@ class NEOCluster(object):
IP_VERSION_FORMAT_DICT[self.address_type] IP_VERSION_FORMAT_DICT[self.address_type]
self.setupDB(clear_databases) self.setupDB(clear_databases)
if importer: if importer:
cfg = SafeConfigParser() cfg = ImporterConfigParser(adapter, **importer)
cfg.add_section("neo")
cfg.set("neo", "adapter", adapter)
cfg.set("neo", "database", self.db_template(*db_list)) cfg.set("neo", "database", self.db_template(*db_list))
for name, zodb in importer:
cfg.add_section(name)
for x in zodb.iteritems():
cfg.set(name, *x)
importer_conf = os.path.join(temp_dir, 'importer.cfg') importer_conf = os.path.join(temp_dir, 'importer.cfg')
with open(importer_conf, 'w') as f: with open(importer_conf, 'w') as f:
cfg.write(f) cfg.write(f)
......
...@@ -202,9 +202,9 @@ class ClientTests(NEOFunctionalTest): ...@@ -202,9 +202,9 @@ class ClientTests(NEOFunctionalTest):
self.neo.stop() self.neo.stop()
self.neo = NEOCluster(db_list=['test_neo1'], partitions=3, self.neo = NEOCluster(db_list=['test_neo1'], partitions=3,
importer=[("root", { importer={"zodb": [("root", {
"storage": "<filestorage>\npath %s\n</filestorage>" "storage": "<filestorage>\npath %s\n</filestorage>"
% dfs_storage.getName()})], % dfs_storage.getName()})]},
temp_dir=self.getTempDirectory()) temp_dir=self.getTempDirectory())
self.neo.start() self.neo.start()
neo_db, neo_conn = self.neo.getZODBConnection() neo_db, neo_conn = self.neo.getZODBConnection()
......
...@@ -15,7 +15,7 @@ ...@@ -15,7 +15,7 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>. # along with this program. If not, see <http://www.gnu.org/licenses/>.
import random, time, unittest import random, time, unittest
from collections import defaultdict from collections import Counter, defaultdict
from .. import NeoUnitTestBase from .. import NeoUnitTestBase
from neo.lib import logging from neo.lib import logging
from neo.lib.protocol import NodeStates, CellStates from neo.lib.protocol import NodeStates, CellStates
...@@ -291,26 +291,94 @@ class MasterPartitionTableTests(NeoUnitTestBase): ...@@ -291,26 +291,94 @@ class MasterPartitionTableTests(NeoUnitTestBase):
self.update(pt, self.tweak(pt, sn[:1])) self.update(pt, self.tweak(pt, sn[:1]))
self.assertPartitionTable(pt, '.U.|..U|.U.|..U|.U.|..U|.U.') self.assertPartitionTable(pt, '.U.|..U|.U.|..U|.U.|..U|.U.')
def test_18_tweak(self): def test_18_tweakBigPT(self):
s = repr(time.time()) seed = repr(time.time())
logging.info("using seed %r", s) logging.info("using seed %r", seed)
r = random.Random(s)
sn_count = 11 sn_count = 11
sn = [self.createStorage(None, i + 1, NodeStates.RUNNING) sn = [self.createStorage(None, i + 1, NodeStates.RUNNING)
for i in xrange(sn_count)] for i in xrange(sn_count)]
pt = PartitionTable(1000, 2) for topo in 0, 1:
pt.setID(1) r = random.Random(seed)
for offset in xrange(pt.np): if topo:
state = CellStates.UP_TO_DATE for i, s in enumerate(sn, sn_count):
k = r.randrange(1, sn_count) s.devpath = str(i % 5),
for s in r.sample(sn, k): pt = PartitionTable(1000, 2)
pt._setCell(offset, s, state) pt.setID(1)
if k * r.random() < 1: for offset in xrange(pt.np):
state = CellStates.OUT_OF_DATE state = CellStates.UP_TO_DATE
k = r.randrange(1, sn_count)
for s in r.sample(sn, k):
pt._setCell(offset, s, state)
if k * r.random() < 1:
state = CellStates.OUT_OF_DATE
pt.log()
self.tweak(pt)
self.update(pt)
def test_19_topology(self):
sn_count = 16
sn = [self.createStorage(None, i + 1, NodeStates.RUNNING)
for i in xrange(sn_count)]
pt = PartitionTable(48, 2)
pt.make(sn)
pt.log() pt.log()
self.tweak(pt) for i, s in enumerate(sn, sn_count):
s.devpath = tuple(bin(i)[3:-1])
self.assertEqual(Counter(x[2] for x in self.tweak(pt)), {
CellStates.OUT_OF_DATE: 96,
CellStates.FEEDING: 96,
})
self.update(pt) self.update(pt)
x = lambda n, *x: ('|'.join(x[:1]*n), '|'.join(x[1:]*n))
for even, np, i, topo, expected in (
## Optimal topology.
# All nodes have same number of cells.
(1, 2, 2, ("00", "01", "02", "10", "11", "12"), ('UU...U|..UUU.',
'UU.U..|..U.UU')),
(1, 7, 1, "0001122", (
'U.....U|.U.U...|..U.U..|U....U.|.U....U|..UU...|....UU.',
'U..U...|.U...U.|..U.U..|U.....U|.U.U...|..U..U.|....U.U')),
(1, 4, 1, "00011122", ('U......U|.U.U....|..U.U...|.....UU.',
'U..U....|.U..U...|..U...U.|.....U.U')),
(1, 9, 1, "000111222", ('U.......U|.U.U.....|..U.U....|'
'.....UU..|U......U.|.U......U|'
'..UU.....|....U.U..|.....U.U.',
'U..U.....|.U....U..|..U.U....|'
'.....U.U.|U.......U|.U.U.....|'
'..U...U..|....U..U.|.....U..U')),
# Some nodes have a extra cell.
(0, 8, 1, "0001122", ('U.....U|.U.U...|..U.U..|U....U.|'
'.U....U|..UU...|....UU.|U.....U',
'U..U...|.U...U.|..U.U..|U.....U|'
'.U.U...|..U..U.|....U.U|U..U...')),
## Topology ignored.
(1, 6, 1, ("00", "01", "1"), 'UU.|U.U|.UU|UU.|U.U|.UU'),
(1, 5, 2, "01233", 'UUU..|U..UU|.UUU.|UU..U|..UUU'),
):
assert len(topo) <= sn_count
sn2 = sn[:len(topo)]
for s in sn2:
s.devpath = ()
k = (1,7)[even]
pt = PartitionTable(np*k, i)
pt.make(sn2)
for devpath, s in zip(topo, sn2):
s.devpath = tuple(devpath)
if type(expected) is tuple:
self.assertTrue(self.tweak(pt))
self.update(pt)
self.assertPartitionTable(pt, '|'.join(expected[:1]*k))
pt.clear()
pt.make(sn2)
self.assertPartitionTable(pt, '|'.join(expected[1:]*k))
self.assertFalse(pt.tweak())
else:
expected = '|'.join((expected,)*k)
self.assertFalse(pt.tweak())
self.assertPartitionTable(pt, expected)
pt.clear()
pt.make(sn2)
self.assertPartitionTable(pt, expected)
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()
......
#
# Copyright (C) 2018 Nexedi SA
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import hashlib, random
from collections import deque
from itertools import islice
from persistent import Persistent
from BTrees.IOBTree import IOBTree
from .stat_zodb import _DummyData
def generateTree(random=random):
tree = []
N = 5
fifo = deque()
path = ()
size = lambda: max(int(random.gauss(40,30)), 0)
while 1:
tree.extend(path + (i, size())
for i in xrange(-random.randrange(N), 0))
n = N * (1 - len(path)) + random.randrange(N)
for i in xrange(n):
fifo.append(path + (i,))
try:
path = fifo.popleft()
except IndexError:
break
change = tree
while change:
change = [x[:-1] + (size(),) for x in change if random.randrange(2)]
tree += change
random.shuffle(tree)
return tree
class Leaf(Persistent):
pass
Node = IOBTree
def importTree(root, tree, yield_interval=None, filter=None):
n = 0
for path in tree:
node = root
for i, x in enumerate(path[:-1], 1):
if filter and not filter(path[:i]):
break
if x < 0:
try:
node = node[x]
except KeyError:
node[x] = node = Leaf()
node.data = bytes(_DummyData(random.Random(path), path[-1]))
else:
try:
node = node[x]
continue
except KeyError:
node[x] = node = Node()
n += 1
if n == yield_interval:
n = 0
yield root
if n:
yield root
class hashTree(object):
_hash = None
_new = hashlib.md5
def __init__(self, node):
s = [((), node)]
def walk():
h = self._new()
update = h.update
while s:
top, node = s.pop()
try:
update('%s %s %s\n' % (top, len(node.data),
self._new(node.data).hexdigest()))
yield
except AttributeError:
update('%s %s\n' % (top, tuple(node.keys())))
yield
for k, v in reversed(node.items()):
s.append((top + (k,), v))
del self._walk
self._hash = h
self._walk = walk()
def __getattr__(self, attr):
return getattr(self._hash, attr)
def __call__(self, n=None):
if n is None:
return sum(1 for _ in self._walk)
next(islice(self._walk, n - 1, None))
...@@ -19,11 +19,13 @@ PROD1 = lambda random=random: DummyZODB(6.04237779991, 1.55811487853, ...@@ -19,11 +19,13 @@ PROD1 = lambda random=random: DummyZODB(6.04237779991, 1.55811487853,
1.04108991045, 0.906703192546, 1.04108991045, 0.906703192546,
0.810080409164, random) 0.810080409164, random)
def DummyData(random=random): def _DummyData(random, size):
# returns data that gzip at about 28.5 % # returns data that gzip at about 28.5 %
return bytearray(int(random.gauss(0, .8)) % 256 for x in xrange(size))
def DummyData(random=random):
# make sure sample is bigger than dictionary of compressor # make sure sample is bigger than dictionary of compressor
data = ''.join(chr(int(random.gauss(0, .8)) % 256) for x in xrange(100000)) return StringIO(_DummyData(random, 100000))
return StringIO(data)
class DummyZODB(object): class DummyZODB(object):
......
...@@ -89,7 +89,7 @@ class StorageDBTests(NeoUnitTestBase): ...@@ -89,7 +89,7 @@ class StorageDBTests(NeoUnitTestBase):
self.db.lockTransaction(tid, ttid) self.db.lockTransaction(tid, ttid)
yield yield
if commit: if commit:
self.db.unlockTransaction(tid, ttid) self.db.unlockTransaction(tid, ttid, True, objs)
self.db.commit() self.db.commit()
elif commit is not None: elif commit is not None:
self.db.abortTransaction(ttid) self.db.abortTransaction(ttid)
...@@ -227,6 +227,7 @@ class StorageDBTests(NeoUnitTestBase): ...@@ -227,6 +227,7 @@ class StorageDBTests(NeoUnitTestBase):
def test_changePartitionTable(self): def test_changePartitionTable(self):
db = self.getDB() db = self.getDB()
db.setNumPartitions(3)
ptid = 1 ptid = 1
uuid = self.getStorageUUID() uuid = self.getStorageUUID()
cell1 = 0, uuid, CellStates.OUT_OF_DATE cell1 = 0, uuid, CellStates.OUT_OF_DATE
...@@ -253,7 +254,7 @@ class StorageDBTests(NeoUnitTestBase): ...@@ -253,7 +254,7 @@ class StorageDBTests(NeoUnitTestBase):
txn1, objs1 = self.getTransaction([oid1]) txn1, objs1 = self.getTransaction([oid1])
txn2, objs2 = self.getTransaction([oid2]) txn2, objs2 = self.getTransaction([oid2])
# nothing in database # nothing in database
self.assertEqual(self.db.getLastIDs(), (None, {}, {}, None)) self.assertEqual(self.db.getLastIDs(), (None, None))
self.assertEqual(self.db.getUnfinishedTIDDict(), {}) self.assertEqual(self.db.getUnfinishedTIDDict(), {})
self.assertEqual(self.db.getObject(oid1), None) self.assertEqual(self.db.getObject(oid1), None)
self.assertEqual(self.db.getObject(oid2), None) self.assertEqual(self.db.getObject(oid2), None)
...@@ -319,13 +320,17 @@ class StorageDBTests(NeoUnitTestBase): ...@@ -319,13 +320,17 @@ class StorageDBTests(NeoUnitTestBase):
expected = [(t, oid_list[offset+i]) for t in tids for i in (0, np)] expected = [(t, oid_list[offset+i]) for t in tids for i in (0, np)]
self.assertEqual(self.db.getReplicationObjectList(ZERO_TID, self.assertEqual(self.db.getReplicationObjectList(ZERO_TID,
MAX_TID, len(expected) + 1, offset, ZERO_OID), expected) MAX_TID, len(expected) + 1, offset, ZERO_OID), expected)
self.db._deleteRange(0, MAX_TID) def deleteRange(partition, min_tid=None, max_tid=None):
self.db._deleteRange(0, max_tid=ZERO_TID) self.db._deleteRange(partition,
None if min_tid is None else u64(min_tid),
None if max_tid is None else u64(max_tid))
deleteRange(0, MAX_TID)
deleteRange(0, max_tid=ZERO_TID)
check(0, [], t1, t2, t3) check(0, [], t1, t2, t3)
self.db._deleteRange(0); check(0, []) deleteRange(0); check(0, [])
self.db._deleteRange(1, t2); check(1, [t1], t1, t2) deleteRange(1, t2); check(1, [t1], t1, t2)
self.db._deleteRange(2, max_tid=t2); check(2, [], t3) deleteRange(2, max_tid=t2); check(2, [], t3)
self.db._deleteRange(3, t1, t2); check(3, [t3], t1, t3) deleteRange(3, t1, t2); check(3, [t3], t1, t3)
def test_getTransaction(self): def test_getTransaction(self):
oid1, oid2 = self.getOIDs(2) oid1, oid2 = self.getOIDs(2)
......
...@@ -15,17 +15,32 @@ ...@@ -15,17 +15,32 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>. # along with this program. If not, see <http://www.gnu.org/licenses/>.
import unittest import unittest
from MySQLdb import NotSupportedError, OperationalError from contextlib import contextmanager
from MySQLdb import NotSupportedError, OperationalError, ProgrammingError
from MySQLdb.constants.CR import SERVER_GONE_ERROR
from MySQLdb.constants.ER import UNKNOWN_STORAGE_ENGINE from MySQLdb.constants.ER import UNKNOWN_STORAGE_ENGINE
from ..mock import Mock from ..mock import Mock
from neo.lib.exception import DatabaseFailure
from neo.lib.protocol import ZERO_OID from neo.lib.protocol import ZERO_OID
from neo.lib.util import p64 from neo.lib.util import p64
from .. import DB_PREFIX, DB_SOCKET, DB_USER from .. import DB_PREFIX, DB_SOCKET, DB_USER, Patch
from .testStorageDBTests import StorageDBTests from .testStorageDBTests import StorageDBTests
from neo.storage.database import DatabaseFailure
from neo.storage.database.mysqldb import MySQLDatabaseManager from neo.storage.database.mysqldb import MySQLDatabaseManager
class ServerGone(object):
@contextmanager
def __new__(cls, db):
self = object.__new__(cls)
with Patch(db, conn=self) as self._p:
yield self._p
def query(self, *args):
self._p.revert()
raise OperationalError(SERVER_GONE_ERROR, 'this is a test')
class StorageMySQLdbTests(StorageDBTests): class StorageMySQLdbTests(StorageDBTests):
engine = None engine = None
...@@ -67,23 +82,9 @@ class StorageMySQLdbTests(StorageDBTests): ...@@ -67,23 +82,9 @@ class StorageMySQLdbTests(StorageDBTests):
calls[0].checkArgs('SELECT ') calls[0].checkArgs('SELECT ')
def test_query2(self): def test_query2(self):
# test the OperationalError exception with ServerGone(self.db) as p:
# fake object, raise exception during the first call self.assertRaises(ProgrammingError, self.db.query, 'QUERY')
from MySQLdb.constants.CR import SERVER_GONE_ERROR self.assertFalse(p.applied)
class FakeConn(object):
def query(*args):
raise OperationalError(SERVER_GONE_ERROR, 'this is a test')
self.db.conn = FakeConn()
self.connect_called = False
def connect_hook():
# mock object, break raise/connect loop
self.db.conn = Mock()
self.connect_called = True
self.db._connect = connect_hook
# make a query, exception will be raised then connect() will be
# called and the second query will use the mock object
self.db.query('INSERT')
self.assertTrue(self.connect_called)
def test_query3(self): def test_query3(self):
# OperationalError > raise DatabaseFailure exception # OperationalError > raise DatabaseFailure exception
......
...@@ -21,6 +21,8 @@ from neo.lib.util import ReadBuffer, parseNodeAddress ...@@ -21,6 +21,8 @@ from neo.lib.util import ReadBuffer, parseNodeAddress
class UtilTests(NeoUnitTestBase): class UtilTests(NeoUnitTestBase):
from neo.storage.shared_queue import test as testSharedQueue
def test_parseNodeAddress(self): def test_parseNodeAddress(self):
""" Parsing of addresses """ """ Parsing of addresses """
def test(parsed, *args): def test(parsed, *args):
......
...@@ -19,7 +19,6 @@ ...@@ -19,7 +19,6 @@
import os, random, select, socket, sys, tempfile import os, random, select, socket, sys, tempfile
import thread, threading, time, traceback, weakref import thread, threading, time, traceback, weakref
from collections import deque from collections import deque
from ConfigParser import SafeConfigParser
from contextlib import contextmanager from contextlib import contextmanager
from itertools import count from itertools import count
from functools import partial, wraps from functools import partial, wraps
...@@ -37,8 +36,9 @@ from neo.lib.handler import EventHandler ...@@ -37,8 +36,9 @@ from neo.lib.handler import EventHandler
from neo.lib.locking import SimpleQueue from neo.lib.locking import SimpleQueue
from neo.lib.protocol import ClusterStates, Enum, NodeStates, NodeTypes, Packets from neo.lib.protocol import ClusterStates, Enum, NodeStates, NodeTypes, Packets
from neo.lib.util import cached_property, parseMasterList, p64 from neo.lib.util import cached_property, parseMasterList, p64
from .. import NeoTestBase, Patch, getTempDirectory, setupMySQLdb, \ from .. import (getTempDirectory, setupMySQLdb,
ADDRESS_TYPE, IP_VERSION_FORMAT_DICT, DB_PREFIX, DB_SOCKET, DB_USER ImporterConfigParser, NeoTestBase, Patch,
ADDRESS_TYPE, IP_VERSION_FORMAT_DICT, DB_PREFIX, DB_SOCKET, DB_USER)
BIND = IP_VERSION_FORMAT_DICT[ADDRESS_TYPE], 0 BIND = IP_VERSION_FORMAT_DICT[ADDRESS_TYPE], 0
LOCAL_IP = socket.inet_pton(ADDRESS_TYPE, IP_VERSION_FORMAT_DICT[ADDRESS_TYPE]) LOCAL_IP = socket.inet_pton(ADDRESS_TYPE, IP_VERSION_FORMAT_DICT[ADDRESS_TYPE])
...@@ -185,6 +185,8 @@ class Serialized(object): ...@@ -185,6 +185,8 @@ class Serialized(object):
# a single-core CPU, other threads are still busy and haven't # a single-core CPU, other threads are still busy and haven't
# sent anything yet on the network. This causes tic() to # sent anything yet on the network. This causes tic() to
# return prematurely. Passing a non-zero value is a hack. # return prematurely. Passing a non-zero value is a hack.
# We also increase SocketConnector.SOMAXCONN in tests so that
# a connection attempt is never delayed inside the kernel.
timeout=0): timeout=0):
# If you're in a pdb here, 'n' switches to another thread # If you're in a pdb here, 'n' switches to another thread
# (the following lines are not supposed to be debugged into) # (the following lines are not supposed to be debugged into)
...@@ -634,6 +636,7 @@ class NEOCluster(object): ...@@ -634,6 +636,7 @@ class NEOCluster(object):
Patch(BaseConnection, getTimeout=lambda orig, self: None), Patch(BaseConnection, getTimeout=lambda orig, self: None),
Patch(SimpleQueue, __init__=__init__), Patch(SimpleQueue, __init__=__init__),
Patch(SocketConnector, CONNECT_LIMIT=0), Patch(SocketConnector, CONNECT_LIMIT=0),
Patch(SocketConnector, SOMAXCONN=128), # see Serialized.tic comment
Patch(SocketConnector, _bind=lambda orig, self, addr: orig(self, BIND)), Patch(SocketConnector, _bind=lambda orig, self, addr: orig(self, BIND)),
Patch(SocketConnector, _connect = lambda orig, self, addr: Patch(SocketConnector, _connect = lambda orig, self, addr:
orig(self, ServerNode.resolv(addr)))) orig(self, ServerNode.resolv(addr))))
...@@ -674,8 +677,8 @@ class NEOCluster(object): ...@@ -674,8 +677,8 @@ class NEOCluster(object):
adapter=os.getenv('NEO_TESTS_ADAPTER', 'SQLite'), adapter=os.getenv('NEO_TESTS_ADAPTER', 'SQLite'),
storage_count=None, db_list=None, clear_databases=True, storage_count=None, db_list=None, clear_databases=True,
db_user=DB_USER, db_password='', compress=True, db_user=DB_USER, db_password='', compress=True,
importer=None, autostart=None, dedup=False): importer=None, autostart=None, dedup=False, name=None):
self.name = 'neo_%s' % self._allocate('name', self.name = name or 'neo_%s' % self._allocate('name',
lambda: random.randint(0, 100)) lambda: random.randint(0, 100))
self.compress = compress self.compress = compress
self.num_partitions = partitions self.num_partitions = partitions
...@@ -707,14 +710,8 @@ class NEOCluster(object): ...@@ -707,14 +710,8 @@ class NEOCluster(object):
else: else:
assert False, adapter assert False, adapter
if importer: if importer:
cfg = SafeConfigParser() cfg = ImporterConfigParser(adapter, **importer)
cfg.add_section("neo")
cfg.set("neo", "adapter", adapter)
cfg.set("neo", "database", db % tuple(db_list)) cfg.set("neo", "database", db % tuple(db_list))
for name, zodb in importer:
cfg.add_section(name)
for x in zodb.iteritems():
cfg.set(name, *x)
db = os.path.join(getTempDirectory(), '%s.conf') db = os.path.join(getTempDirectory(), '%s.conf')
with open(db % tuple(db_list), "w") as f: with open(db % tuple(db_list), "w") as f:
cfg.write(f) cfg.write(f)
...@@ -813,7 +810,7 @@ class NEOCluster(object): ...@@ -813,7 +810,7 @@ class NEOCluster(object):
else NodeStates.RUNNING) else NodeStates.RUNNING)
for node in self.storage_list if storage_list is None else storage_list: for node in self.storage_list if storage_list is None else storage_list:
state = self.getNodeState(node) state = self.getNodeState(node)
assert state == expected_state, (node, state) assert state == expected_state, (repr(node), state)
def stop(self, clear_database=False, __print_exc=traceback.print_exc, **kw): def stop(self, clear_database=False, __print_exc=traceback.print_exc, **kw):
if self.started: if self.started:
...@@ -933,10 +930,9 @@ class NEOCluster(object): ...@@ -933,10 +930,9 @@ class NEOCluster(object):
if dummy_zodb is None: if dummy_zodb is None:
from ..stat_zodb import PROD1 from ..stat_zodb import PROD1
dummy_zodb = PROD1(random) dummy_zodb = PROD1(random)
preindex = {}
as_storage = dummy_zodb.as_storage as_storage = dummy_zodb.as_storage
return lambda count: self.getZODBStorage().importFrom( return lambda count: self.getZODBStorage().copyTransactionsFrom(
as_storage(count), preindex=preindex) as_storage(count))
def populate(self, transaction_list, tid=lambda i: p64(i+1), def populate(self, transaction_list, tid=lambda i: p64(i+1),
oid=lambda i: p64(i+1)): oid=lambda i: p64(i+1)):
...@@ -1061,8 +1057,12 @@ class NEOThreadedTest(NeoTestBase): ...@@ -1061,8 +1057,12 @@ class NEOThreadedTest(NeoTestBase):
with Patch(client, _getFinalTID=lambda *_: None): with Patch(client, _getFinalTID=lambda *_: None):
self.assertRaises(ConnectionClosed, txn.commit) self.assertRaises(ConnectionClosed, txn.commit)
def assertPartitionTable(self, cluster, expected, pt_node=None): def assertPartitionTable(self, cluster, expected, pt_node=None,
index = [x.uuid for x in cluster.storage_list].index sort_by_nid=False):
if sort_by_nid:
index = lambda x: x
else:
index = [x.uuid for x in cluster.storage_list].index
super(NEOThreadedTest, self).assertPartitionTable( super(NEOThreadedTest, self).assertPartitionTable(
(pt_node or cluster.admin).pt, expected, (pt_node or cluster.admin).pt, expected,
lambda x: index(x.getUUID())) lambda x: index(x.getUUID()))
......
...@@ -23,7 +23,6 @@ import unittest ...@@ -23,7 +23,6 @@ import unittest
from collections import defaultdict from collections import defaultdict
from contextlib import contextmanager from contextlib import contextmanager
from thread import get_ident from thread import get_ident
from zlib import compress
from persistent import Persistent, GHOST from persistent import Persistent, GHOST
from transaction.interfaces import TransientError from transaction.interfaces import TransientError
from ZODB import DB, POSException from ZODB import DB, POSException
...@@ -31,7 +30,7 @@ from ZODB.DB import TransactionalUndo ...@@ -31,7 +30,7 @@ from ZODB.DB import TransactionalUndo
from neo.storage.transactions import TransactionManager, ConflictError from neo.storage.transactions import TransactionManager, ConflictError
from neo.lib.connection import ConnectionClosed, \ from neo.lib.connection import ConnectionClosed, \
ServerConnection, MTClientConnection ServerConnection, MTClientConnection
from neo.lib.exception import DatabaseFailure, StoppedOperation from neo.lib.exception import StoppedOperation
from neo.lib.handler import DelayEvent, EventHandler from neo.lib.handler import DelayEvent, EventHandler
from neo.lib import logging from neo.lib import logging
from neo.lib.protocol import (CellStates, ClusterStates, NodeStates, NodeTypes, from neo.lib.protocol import (CellStates, ClusterStates, NodeStates, NodeTypes,
...@@ -43,6 +42,7 @@ from neo.lib.util import add64, makeChecksum, p64, u64 ...@@ -43,6 +42,7 @@ from neo.lib.util import add64, makeChecksum, p64, u64
from neo.client.exception import NEOPrimaryMasterLost, NEOStorageError from neo.client.exception import NEOPrimaryMasterLost, NEOStorageError
from neo.client.transactions import Transaction from neo.client.transactions import Transaction
from neo.master.handlers.client import ClientServiceHandler from neo.master.handlers.client import ClientServiceHandler
from neo.storage.database import DatabaseFailure
from neo.storage.handlers.client import ClientOperationHandler from neo.storage.handlers.client import ClientOperationHandler
from neo.storage.handlers.identification import IdentificationHandler from neo.storage.handlers.identification import IdentificationHandler
from neo.storage.handlers.initialization import InitializationHandler from neo.storage.handlers.initialization import InitializationHandler
...@@ -60,12 +60,13 @@ class PCounterWithResolution(PCounter): ...@@ -60,12 +60,13 @@ class PCounterWithResolution(PCounter):
class Test(NEOThreadedTest): class Test(NEOThreadedTest):
@with_cluster() def testBasicStore(self, dedup=False):
def testBasicStore(self, cluster): with NEOCluster(dedup=dedup) as cluster:
if 1: cluster.start()
storage = cluster.getZODBStorage() storage = cluster.getZODBStorage()
storage.sync() storage.sync()
storage.app.max_reconnection_to_master = 0 storage.app.max_reconnection_to_master = 0
compress = storage.app.compress._compress
data_info = {} data_info = {}
compressible = 'x' * 20 compressible = 'x' * 20
compressed = compress(compressible) compressed = compress(compressible)
...@@ -137,27 +138,6 @@ class Test(NEOThreadedTest): ...@@ -137,27 +138,6 @@ class Test(NEOThreadedTest):
self.assertRaises(POSException.POSKeyError, self.assertRaises(POSException.POSKeyError,
storage.load, oid, '') storage.load, oid, '')
@with_cluster()
def testCreationUndoneHistory(self, cluster):
if 1:
storage = cluster.getZODBStorage()
oid = storage.new_oid()
txn = transaction.Transaction()
storage.tpc_begin(txn)
storage.store(oid, None, 'foo', '', txn)
storage.tpc_vote(txn)
tid1 = storage.tpc_finish(txn)
storage.tpc_begin(txn)
storage.undo(tid1, txn)
tid2 = storage.tpc_finish(txn)
storage.tpc_begin(txn)
storage.undo(tid2, txn)
tid3 = storage.tpc_finish(txn)
expected = [(tid1, 3), (tid2, 0), (tid3, 3)]
for x in storage.history(oid, 10):
self.assertEqual((x['tid'], x['size']), expected.pop())
self.assertFalse(expected)
def _testUndoConflict(self, cluster, *inc): def _testUndoConflict(self, cluster, *inc):
def waitResponses(orig, *args): def waitResponses(orig, *args):
orig(*args) orig(*args)
...@@ -738,8 +718,9 @@ class Test(NEOThreadedTest): ...@@ -738,8 +718,9 @@ class Test(NEOThreadedTest):
@with_cluster() @with_cluster()
def testStorageUpgrade1(self, cluster): def testStorageUpgrade1(self, cluster):
if 1: storage = cluster.storage
storage = cluster.storage # Disable migration steps that aren't idempotent.
with Patch(storage.dm.__class__, _migrate3=lambda *_: None):
t, c = cluster.getTransaction() t, c = cluster.getTransaction()
storage.dm.setConfiguration("version", None) storage.dm.setConfiguration("version", None)
c.root()._p_changed = 1 c.root()._p_changed = 1
...@@ -1309,7 +1290,7 @@ class Test(NEOThreadedTest): ...@@ -1309,7 +1290,7 @@ class Test(NEOThreadedTest):
s1.resetNode() s1.resetNode()
with Patch(s1.dm, truncate=dieFirst(1)): with Patch(s1.dm, truncate=dieFirst(1)):
s1.start() s1.start()
self.assertEqual(s0.dm.getLastIDs()[0], truncate_tid) self.assertFalse(s0.dm.getLastIDs()[0])
self.assertEqual(s1.dm.getLastIDs()[0], r._p_serial) self.assertEqual(s1.dm.getLastIDs()[0], r._p_serial)
self.tic() self.tic()
self.assertEqual(calls, [1, 2]) self.assertEqual(calls, [1, 2])
...@@ -1723,7 +1704,18 @@ class Test(NEOThreadedTest): ...@@ -1723,7 +1704,18 @@ class Test(NEOThreadedTest):
x.value += 1 x.value += 1
c2.root()['x'].value += 2 c2.root()['x'].value += 2
TransactionalResource(t1, 1, tpc_begin=begin1) TransactionalResource(t1, 1, tpc_begin=begin1)
s1m, = s1.getConnectionList(cluster.master) # BUG: Very rarely, getConnectionList returns more that 1
# connection ("too many values to unpack"), which is
# a mystery and impossible to reproduce:
# - 1st time: v1.8.1 on a test machine (no SSL)
# - last: current revision on my laptop (SSL),
# at the first iteration of this loop
_sm = list(s1.getConnectionList(cluster.master))
try:
s1m, = _sm
except ValueError:
self.fail((_sm, list(
s1.getConnectionList(cluster.master))))
try: try:
s1.em.removeReader(s1m) s1.em.removeReader(s1m)
with ConnectionFilter() as f, \ with ConnectionFilter() as f, \
...@@ -2371,7 +2363,7 @@ class Test(NEOThreadedTest): ...@@ -2371,7 +2363,7 @@ class Test(NEOThreadedTest):
oid, tid = big_id_list[i] oid, tid = big_id_list[i]
for j, expected in ( for j, expected in (
(1 - i, (dm.getLastTID(u64(MAX_TID)), dm.getLastIDs())), (1 - i, (dm.getLastTID(u64(MAX_TID)), dm.getLastIDs())),
(i, (u64(tid), (tid, {}, {}, oid)))): (i, (u64(tid), (tid, oid)))):
oid, tid = big_id_list[j] oid, tid = big_id_list[j]
# Somehow we abuse 'storeTransaction' because we ask it to # Somehow we abuse 'storeTransaction' because we ask it to
# write data for unassigned partitions. This is not checked # write data for unassigned partitions. This is not checked
...@@ -2381,6 +2373,44 @@ class Test(NEOThreadedTest): ...@@ -2381,6 +2373,44 @@ class Test(NEOThreadedTest):
self.assertEqual(expected, self.assertEqual(expected,
(dm.getLastTID(u64(MAX_TID)), dm.getLastIDs())) (dm.getLastTID(u64(MAX_TID)), dm.getLastIDs()))
def testStorageUpgrade(self):
path = os.path.join(os.path.dirname(__file__),
self._testMethodName + '-%s',
's%s.sql')
dump_dict = {}
def switch(s):
dm = s.dm
dm.commit()
dump_dict[s.uuid] = dm.dump()
dm.erase()
with open(path % (s.getAdapter(), s.uuid)) as f:
dm.restore(f.read())
with NEOCluster(storage_count=3, partitions=3, replicas=1,
name=self._testMethodName) as cluster:
s1, s2, s3 = cluster.storage_list
cluster.start(storage_list=(s1,))
for s in s2, s3:
s.start()
self.tic()
cluster.neoctl.enableStorageList([s.uuid])
cluster.neoctl.tweakPartitionTable()
self.tic()
nid_list = [s.uuid for s in cluster.storage_list]
switch(s3)
s3.stop()
storage = cluster.getZODBStorage()
txn = transaction.Transaction()
storage.tpc_begin(txn, p64(85**9)) # partition 1
storage.store(p64(0), None, 'foo', '', txn)
storage.tpc_vote(txn)
storage.tpc_finish(txn)
self.tic()
switch(s1)
switch(s2)
cluster.stop()
for i, s in zip(nid_list, cluster.storage_list):
self.assertMultiLineEqual(s.dm.dump(), dump_dict[i])
if __name__ == "__main__": if __name__ == "__main__":
unittest.main() unittest.main()
#
# Copyright (C) 2018 Nexedi SA
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import unittest
from contextlib import contextmanager
from ZConfig import ConfigurationSyntaxError
from ZODB.config import databaseFromString
from .. import Patch
from . import ClientApplication, NEOThreadedTest, with_cluster
from neo.client import Storage
def databaseFromDict(**kw):
return databaseFromString("%%import neo.client\n"
"<zodb>\n <NEOStorage>\n%s </NEOStorage>\n</zodb>\n"
% ''.join(' %s %s\n' % x for x in kw.iteritems()))
class ConfigTests(NEOThreadedTest):
dummy_required = {'name': 'cluster', 'master_nodes': '127.0.0.1:10000'}
@contextmanager
def _db(self, cluster, **kw):
kw['name'] = cluster.name
kw['master_nodes'] = cluster.master_nodes
def newClient(_, *args, **kw):
client = ClientApplication(*args, **kw)
t.append(client.poll_thread)
return client
t = []
with Patch(Storage, Application=newClient):
db = databaseFromDict(**kw)
try:
yield db
finally:
db.close()
cluster.join(t)
@with_cluster()
def testCompress(self, cluster):
kw = self.dummy_required.copy()
valid = ['false', 'true', 'zlib', 'zlib=9']
for kw['compress'] in '9', 'best', 'zlib=0', 'zlib=100':
self.assertRaises(ConfigurationSyntaxError, databaseFromDict, **kw)
for compress in valid:
with self._db(cluster, compress=compress) as db:
self.assertEqual((0,0,''), db.storage.app.compress(''))
if __name__ == "__main__":
unittest.main()
This diff is collapsed.
This diff is collapsed.
...@@ -37,6 +37,11 @@ class SSLTests(SSLMixin, test.Test): ...@@ -37,6 +37,11 @@ class SSLTests(SSLMixin, test.Test):
testStorageDataLock2 = None testStorageDataLock2 = None
testUndoConflictDuringStore = None testUndoConflictDuringStore = None
# With MySQL, this test is expensive.
# Let's check deduplication of big oids here.
def testBasicStore(self):
super(SSLTests, self).testBasicStore(True)
def testAbortConnection(self, after_handshake=1): def testAbortConnection(self, after_handshake=1):
with self.getLoopbackConnection() as conn: with self.getLoopbackConnection() as conn:
conn.ask(Packets.Ping()) conn.ask(Packets.Ping())
......
CREATE TABLE `bigdata` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`value` mediumblob NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `config` (
`name` varbinary(255) NOT NULL,
`value` varbinary(255) DEFAULT NULL,
PRIMARY KEY (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `config` VALUES ('name','testStorageUpgrade'),('nid','1'),('partitions','3'),('ptid','9'),('replicas','1');
CREATE TABLE `data` (
`id` bigint(20) unsigned NOT NULL,
`hash` binary(20) NOT NULL,
`compression` tinyint(3) unsigned DEFAULT NULL,
`value` mediumblob NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `data` VALUES (0,0x0BEEC7B5EA3F0FDBC95D0DD47F3C5BC275DA8A33,0,0x666F6F);
CREATE TABLE `obj` (
`partition` smallint(5) unsigned NOT NULL,
`oid` bigint(20) unsigned NOT NULL,
`tid` bigint(20) unsigned NOT NULL,
`data_id` bigint(20) unsigned DEFAULT NULL,
`value_tid` bigint(20) unsigned DEFAULT NULL,
PRIMARY KEY (`partition`,`tid`,`oid`),
KEY `partition` (`partition`,`oid`,`tid`),
KEY `data_id` (`data_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `obj` VALUES (0,0,231616946283203125,0,NULL);
CREATE TABLE `pt` (
`rid` int(10) unsigned NOT NULL,
`nid` int(11) NOT NULL,
`state` tinyint(3) unsigned NOT NULL,
PRIMARY KEY (`rid`,`nid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `pt` VALUES (0,1,0),(0,2,0),(1,1,0),(1,3,1),(2,2,0),(2,3,1);
CREATE TABLE `tobj` (
`partition` smallint(5) unsigned NOT NULL,
`oid` bigint(20) unsigned NOT NULL,
`tid` bigint(20) unsigned NOT NULL,
`data_id` bigint(20) unsigned DEFAULT NULL,
`value_tid` bigint(20) unsigned DEFAULT NULL,
PRIMARY KEY (`tid`,`oid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `trans` (
`partition` smallint(5) unsigned NOT NULL,
`tid` bigint(20) unsigned NOT NULL,
`packed` tinyint(1) NOT NULL,
`oids` mediumblob NOT NULL,
`user` blob NOT NULL,
`description` blob NOT NULL,
`ext` blob NOT NULL,
`ttid` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`partition`,`tid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `trans` VALUES (1,231616946283203125,0,'\0\0\0\0\0\0\0\0','','','',231616946283203125);
CREATE TABLE `ttrans` (
`partition` smallint(5) unsigned NOT NULL,
`tid` bigint(20) unsigned NOT NULL,
`packed` tinyint(1) NOT NULL,
`oids` mediumblob NOT NULL,
`user` blob NOT NULL,
`description` blob NOT NULL,
`ext` blob NOT NULL,
`ttid` bigint(20) unsigned NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `bigdata` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`value` mediumblob NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `config` (
`name` varbinary(255) NOT NULL,
`value` varbinary(255) DEFAULT NULL,
PRIMARY KEY (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `config` VALUES ('name','testStorageUpgrade'),('nid','2'),('partitions','3'),('ptid','9'),('replicas','1');
CREATE TABLE `data` (
`id` bigint(20) unsigned NOT NULL,
`hash` binary(20) NOT NULL,
`compression` tinyint(3) unsigned DEFAULT NULL,
`value` mediumblob NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `data` VALUES (0,0x0BEEC7B5EA3F0FDBC95D0DD47F3C5BC275DA8A33,0,0x666F6F);
CREATE TABLE `obj` (
`partition` smallint(5) unsigned NOT NULL,
`oid` bigint(20) unsigned NOT NULL,
`tid` bigint(20) unsigned NOT NULL,
`data_id` bigint(20) unsigned DEFAULT NULL,
`value_tid` bigint(20) unsigned DEFAULT NULL,
PRIMARY KEY (`partition`,`tid`,`oid`),
KEY `partition` (`partition`,`oid`,`tid`),
KEY `data_id` (`data_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `obj` VALUES (0,0,231616946283203125,0,NULL);
CREATE TABLE `pt` (
`rid` int(10) unsigned NOT NULL,
`nid` int(11) NOT NULL,
`state` tinyint(3) unsigned NOT NULL,
PRIMARY KEY (`rid`,`nid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `pt` VALUES (0,1,0),(0,2,0),(1,1,0),(1,3,1),(2,2,0),(2,3,1);
CREATE TABLE `tobj` (
`partition` smallint(5) unsigned NOT NULL,
`oid` bigint(20) unsigned NOT NULL,
`tid` bigint(20) unsigned NOT NULL,
`data_id` bigint(20) unsigned DEFAULT NULL,
`value_tid` bigint(20) unsigned DEFAULT NULL,
PRIMARY KEY (`tid`,`oid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `trans` (
`partition` smallint(5) unsigned NOT NULL,
`tid` bigint(20) unsigned NOT NULL,
`packed` tinyint(1) NOT NULL,
`oids` mediumblob NOT NULL,
`user` blob NOT NULL,
`description` blob NOT NULL,
`ext` blob NOT NULL,
`ttid` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`partition`,`tid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `ttrans` (
`partition` smallint(5) unsigned NOT NULL,
`tid` bigint(20) unsigned NOT NULL,
`packed` tinyint(1) NOT NULL,
`oids` mediumblob NOT NULL,
`user` blob NOT NULL,
`description` blob NOT NULL,
`ext` blob NOT NULL,
`ttid` bigint(20) unsigned NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
...@@ -38,7 +38,7 @@ extras_require = { ...@@ -38,7 +38,7 @@ extras_require = {
'master': [], 'master': [],
'storage-sqlite': [], 'storage-sqlite': [],
'storage-mysqldb': ['mysqlclient'], 'storage-mysqldb': ['mysqlclient'],
'storage-importer': zodb_require, 'storage-importer': zodb_require + ['msgpack>=0.5.6', 'setproctitle'],
} }
extras_require['tests'] = ['coverage', 'zope.testing', 'psutil>=2', extras_require['tests'] = ['coverage', 'zope.testing', 'psutil>=2',
'neoppod[%s]' % ', '.join(extras_require)] 'neoppod[%s]' % ', '.join(extras_require)]
...@@ -60,7 +60,7 @@ else: ...@@ -60,7 +60,7 @@ else:
setup( setup(
name = 'neoppod', name = 'neoppod',
version = '1.9', version = '1.10',
description = __doc__.strip(), description = __doc__.strip(),
author = 'Nexedi SA', author = 'Nexedi SA',
author_email = 'neo-dev@erp5.org', author_email = 'neo-dev@erp5.org',
......
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment