neoppod:bff5c82f4bc3215e60e6f3efe0ce5b52bb578222 commits
https://lab.nexedi.com/nexedi/neoppod/-/commits/bff5c82f4bc3215e60e6f3efe0ce5b52bb578222
2015-10-05T17:09:59+02:00

https://lab.nexedi.com/nexedi/neoppod/-/commit/bff5c82f4bc3215e60e6f3efe0ce5b52bb578222
Add SSL support
2015-10-05T17:09:59+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/107ca7dfb6981f8c963a6c9d5fcacaa5605b6d9f
neoctl: make -l option log everything on disk automatically
2015-10-05T17:01:48+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/9d26bb513f294ef4b77bccf590e2ebef40d44acd
In importer.conf example, explain why the source DB can't be open read-only
2015-10-02T20:01:33+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/f2babb121fc098033364c1c9253d31215adf049e
Expand ~(user) construction for all paths in configuration
2015-10-02T18:40:58+02:00 Julien Muchembled jm@nexedi.com
Before, it was only done for 'logfile'.

https://lab.nexedi.com/nexedi/neoppod/-/commit/57481c35685240af62660959e92688ce8436f96a
Review API between connections and connectors
2015-10-01T16:20:46+02:00 Julien Muchembled jm@nexedi.com
- Review error handling. Only 2 exceptions remain in connector.py:
  - Drop useless exception handling for EAGAIN since it should not happen
    if the kernel says the socket is ready.
  - Do not distinguish other socket errors. Just close and log in a generic way.
  - No need to raise a specific exception for EOF.
  - Make 'connect' return a boolean instead of raising an exception.
  - Raise appropriate exception when answer/ask/notify is called on a closed
    non-MT connection.
- Add support for more complex connectors, which may need to write for a read
  operation, or to read when there's pending data to send. This will be
  required for SSL support (more exactly, the handshake will be done in
  a transparent way):
  - Move write buffer to connector.
  - Make 'receive' fill the read buffer, instead of returning the read data.
  - Make 'receive' & 'send' return a boolean to switch polling for writing.
  - Tolerate that sockets return 0 as number of bytes sent.
- In testConnection, simply delete all failing tests, as announced
  in commit 71e30fb9.

https://lab.nexedi.com/nexedi/neoppod/-/commit/36a32f23904d5594387c02b0e79de53f03a72448
tests: add "last" symlink to last temporary directory
2015-09-30T19:28:40+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/518c7588fd6daf13c087978fc4af58d30a40b62d
Allow to specify log file in configuration file, and expand ~(user) construction
2015-09-24T20:22:17+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/32b2d173d920797bc7b9fdf8dd9fbd2ab17ab190
Move common command-line options to neo.lib.config
2015-09-24T20:22:17+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/727899e2cab3460f8ddb1bac026ed8e5f549d33a
Pass app as first parameter of (*Client|Listening)Connection
2015-09-24T20:22:17+02:00 Julien Muchembled jm@nexedi.com
Application will hold SSL parameters.

https://lab.nexedi.com/nexedi/neoppod/-/commit/7d5b155980afbc07eed092acc92f4d841ca7265b
Fix remaining memory leaks and make handler instances become singletons
2015-09-24T20:22:16+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/9fdd750fbaa5510042094eb7a6bdec3cf4974ef0
Simplify cleanup of HandlerSwitcher when closing a connection
2015-09-23T19:24:09+02:00 Julien Muchembled jm@nexedi.com
This frees a reference to the last handler and there's no need to make the
instance reusable.

https://lab.nexedi.com/nexedi/neoppod/-/commit/aaf2251e72d272f391cb21b2a05876afc9e254de
client: do nothing (instead of raising) if a closed Storage is closed again
2015-09-23T19:24:02+02:00 Julien Muchembled jm@nexedi.com
This follows the behaviour of FileStorage.

https://lab.nexedi.com/nexedi/neoppod/-/commit/d75fcc597fbfca501e0104fb9fa497809a3346f3
Fix leak of file descriptors in unit tests
2015-09-23T18:57:07+02:00 Julien Muchembled jm@nexedi.com
Only one leak remains, in ClientApplicationTests.test_connectToPrimaryNode,
because of Mock objects.

https://lab.nexedi.com/nexedi/neoppod/-/commit/c88c6ac5a3412f09664f2dc9d16c8301c1cbe11a
TODO: document which mock library we should use
2015-09-15T18:14:35+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/a72ddfb30d0cc9e5a3cc95ad6fc5ac517798cd4c
admin: do not reset the list of known masters from configuration (or command...
2015-09-15T17:03:08+02:00 Julien Muchembled jm@nexedi.com
admin: do not reset the list of known masters from configuration (or command line) when reconnecting
This is questionable, but a large part of NodeManager would have to be reviewed
to do it differently. At least, admin nodes now behave like clients.

https://lab.nexedi.com/nexedi/neoppod/-/commit/6f6d071daba4d9f7f3989af044651c8c658cc8c9
Simplify setup of monkey-patches in threaded tests
2015-09-15T16:58:37+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/3e1ed6a43b18fed8f2b4c5be2f0a6de9ea1ae42a
Simplify polling thread in threaded apps
2015-09-15T16:53:46+02:00 Julien Muchembled jm@nexedi.com
For a long time now, the polling thread has never ended and has not needed to
be restarted. On the other hand, the admin will need to define a different
polling loop, hence the move from threaded_poll to threaded_app.

https://lab.nexedi.com/nexedi/neoppod/-/commit/f5f42522a9b4ceb54bd80d34e345df2b9993001a
Move code from neo.client to neo.lib, since admins will be also multi-threaded
2015-09-15T16:48:55+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/50d25d007dc49edc9d2c11d71e9c2f9d4c7eda9a
Drop 'background' mode completely in threaded tests
2015-09-15T16:41:09+02:00 Julien Muchembled jm@nexedi.com
It was still used to stop a cluster.

https://lab.nexedi.com/nexedi/neoppod/-/commit/4253d24fe0c01b5ae66c87b6dcf709cbe1506573
Stop using 'background' mode in threaded tests
2015-09-15T16:37:44+02:00 Julien Muchembled jm@nexedi.com
This makes tests easier to write, with more determinism.
If only I had had the idea to monkey-patch SimpleQueue several years ago.

https://lab.nexedi.com/nexedi/neoppod/-/commit/7025db52513639f881e5996c8a87850cdc4c3fa5
Rewrite of scheduler for threaded tests
2015-09-15T15:53:38+02:00 Julien Muchembled jm@nexedi.com
The previous implementation was built around a 'pending' global variable that
was set by a few monkey-patches when some network activity was pending between
nodes. All this is replaced by an extra epoll object that is used to wait for nodes
that have pending network events: this is simpler, and faster since it
significantly reduces the number of context switches.

https://lab.nexedi.com/nexedi/neoppod/-/commit/610093411e2c24d68c40f5b2696660000754f407
Thread.isAlive is deprecated
2015-09-14T18:04:47+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/af06676a2404e688f7e4f30f3b593d083aab6057
Fix potential deadlock when connecting to primary master
2015-09-07T11:30:05+02:00 Julien Muchembled jm@nexedi.com
This is a regression caused by commit eef52c27
("Tickless poll loop, for lowest latency and cpu usage"), affecting:
- admins
- storages
- primary masters of backup clusters

https://lab.nexedi.com/nexedi/neoppod/-/commit/9531c9cb70e5f86747fd980822995bb3a2a526dd
client: drop now useless wrapper to log safely in poll thread during shutdown
2015-08-28T20:52:09+02:00 Julien Muchembled jm@nexedi.com
Recent Python already catches exceptions due to garbage collection on exit.

https://lab.nexedi.com/nexedi/neoppod/-/commit/e27358d130e1222d5244d9bb686a769ed22a979d
storage: fix history() not waiting oid to be unlocked
2015-08-28T20:52:09+02:00 Julien Muchembled jm@nexedi.com
This fixes a random failure in testClientReconnection:
Traceback (most recent call last):
  File "neo/tests/threaded/test.py", line 754, in testClientReconnection
    self.assertTrue(cluster.client.history(x1._p_oid))
failureException: None is not true

https://lab.nexedi.com/nexedi/neoppod/-/commit/79be7787add43ed9cd2d100cf87586466cb0cf6e
Fix random failure in testRecycledClientUUID
2015-08-28T20:52:09+02:00 Julien Muchembled jm@nexedi.com
Traceback (most recent call last):
  File "neo/tests/threaded/test.py", line 838, in testRecycledClientUUID
    x = client.load(ZERO_TID)
  [...]
  File "neo/tests/threaded/test.py", line 822, in notReady
    m2s.remove(delayNotifyInformation)
  File "neo/tests/threaded/__init__.py", line 482, in remove
    del self.filter_dict[filter]
KeyError: <function delayNotifyInformation at 0x7f511063a578>

https://lab.nexedi.com/nexedi/neoppod/-/commit/c4ac45a8e8059c2af887742863c9c926139bca0b
Fix several random failures in tests that didn't wait for transaction to be u...
2015-08-28T20:52:09+02:00 Julien Muchembled jm@nexedi.com
NEOCluster.tic() gets a new 'slave' parameter that must be True when a client
node is in 'master' mode (i.e. setPoll(True)). In this case, tic() will wait
until all nodes finish their work and the client polls with a non-zero timeout.
Here, tic(slave=1) is used to wait for the storage to process the
NotifyUnlockInformation notification from the master.
Traceback (most recent call last):
  File "neo/tests/threaded/test.py", line 80, in testBasicStore
    self.assertEqual(data_info, cluster.storage.getDataLockInfo())
  File "neo/tests/__init__.py", line 170, in assertEqual
    return super(NeoTestBase, self).assertEqual(first, second, msg=msg)
failureException: {('\x0b\xee\xc7\xb5\xea?\x0f\xdb\xc9]\r\xd4\x7f<[\xc2u\xda\x8a3', 0): 0} != {('\x0b\xee\xc7\xb5\xea?\x0f\xdb\xc9]\r\xd4\x7f<[\xc2u\xda\x8a3', 0): 1}

https://lab.nexedi.com/nexedi/neoppod/-/commit/5dc1f06cc6f20c547e3bcd8c8d49f4832a5042fd
Several improvements to verbose locks
2015-08-28T20:51:55+02:00 Julien Muchembled jm@nexedi.com
All these changes were useful to debug deadlocks in threaded tests:
- New verbose Semaphore.
- Logs with numerical 'ident' were too annoying to read so revert to thread
  name (before commit 5b69d553), with an
  exception for threaded tests. There remains one case where the result is not
  unique: when several client apps are instantiated.
- Make deadlock detection optional.
- Make it possible to name locks.
- Make output more compact.
- Remove useless 'debug_lock' option.
- Add timing information.
- Make exception more verbose when an un-acquired lock is released.
Here is how I used 'locking':
--- a/neo/tests/threaded/__init__.py
+++ b/neo/tests/threaded/__init__.py
@@ -37,0 +38 @@
+from neo.lib.locking import VerboseSemaphore
@@ -71 +72,2 @@ def init(cls):
- cls._global_lock = threading.Semaphore(0)
+ cls._global_lock = VerboseSemaphore(0, check_owner=False,
+ name="Serialized._global_lock")
@@ -265 +267,2 @@ def start(self):
- self.em._lock = l = threading.Semaphore(0)
+ self.em._lock = l = VerboseSemaphore(0, check_owner=False,
+ name=self.node_name)
@@ -346 +349,2 @@ def __init__(self, master_nodes, name, **kw):
- self.em._lock = threading.Semaphore(0)
+ self.em._lock = VerboseSemaphore(0, check_owner=False,
+ name=repr(self))

https://lab.nexedi.com/nexedi/neoppod/-/commit/0b93b1fb4f8418fc898a6660933daad1b01a1246
Fix occasional deadlocks in threaded tests
2015-08-28T20:13:52+02:00 Julien Muchembled jm@nexedi.com
Deadlocks mainly happened while stopping a cluster, hence the complete review
of NEOCluster.stop().
A major change is to make the client node handle its lock like other nodes
(i.e. in the polling thread itself) to better know when to call
Serialized.background() (there was a race condition with the test of
'self.poll_thread.isAlive()' in ClientApplication.close).

https://lab.nexedi.com/nexedi/neoppod/-/commit/1ab594b412834706e502250493e0de9cedce64af
Remove useless assert in a private method of MTClientConnection
2015-08-14T12:01:48+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/d898a83d51e6d7701c14fb5cb6bb33b4869ad8a7
Do not reconnect too quickly to a node after an error
2015-08-14T12:01:16+02:00 Julien Muchembled jm@nexedi.com
For example, a backup storage node that was rejected because the upstream
cluster was not ready could reconnect in a loop without delay, using 100% CPU
and flooding logs.
A new 'setReconnectionNoDelay' method on Connection can be used for cases where
it's legitimate to quickly reconnect.
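
The idea of delaying reconnection after an error can be sketched as follows.
This is only an illustration under stated assumptions, not NEO's code: the
class name ReconnectionGate, its methods and the one-second default delay are
hypothetical; only the 'setReconnectionNoDelay' name comes from this commit
message. The sketch targets Python 3 (time.monotonic).

```python
import time


class ReconnectionGate:
    """Hypothetical helper: tells the caller how long to wait before
    reconnecting to a node after an error."""

    def __init__(self, delay=1.0, clock=time.monotonic):
        self.delay = delay          # assumed delay, in seconds
        self._clock = clock         # injectable clock, handy for tests
        self._next_allowed = 0.0    # earliest time a reconnection may start

    def on_error(self):
        # After a failure, forbid reconnection until the delay has elapsed.
        self._next_allowed = self._clock() + self.delay

    def set_no_delay(self):
        # Loosely mirrors the role of 'setReconnectionNoDelay': the caller
        # knows that an immediate reconnection is legitimate.
        self._next_allowed = 0.0

    def time_to_wait(self):
        # 0.0 means "reconnect now"; otherwise the remaining delay, which a
        # poll loop could use as its timeout instead of calling time.sleep().
        return max(0.0, self._next_allowed - self._clock())
```

Returning the remaining time, rather than sleeping, lets an event loop keep
serving other connections while the delay runs out.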
With this new delayed reconnection, it's possible to remove the remaining
time.sleep().

https://lab.nexedi.com/nexedi/neoppod/-/commit/71e30fb9b8941b200dc1768d59a58e73fe5b354f
Remove useless testEvent
2015-08-12T19:18:46+02:00 Julien Muchembled jm@nexedi.com
This kind of test has never helped to detect regressions, and any bug in
EpollEventManager would be quickly reported by other tests.
testConnection may go the same way if it keeps annoying me too much.

https://lab.nexedi.com/nexedi/neoppod/-/commit/f9df31be57e13a47f49448a26e784594fe09261f
client: do not wait for the remote to close the connection if it's not ready
2015-08-12T19:18:46+02:00 Julien Muchembled jm@nexedi.com
This is currently not an issue because the 'time.sleep(1)' in iterateForObject
(storage) and _connectToPrimaryNode (master) leave enough time. What could
happen is a new connection attempt for a node that already has a connection
(causing an assertion failure in Node.setConnection).

https://lab.nexedi.com/nexedi/neoppod/-/commit/a4731a0c7b8a0e8d938d114b9ca0151f3f21e98d
Fix invalid processing of unregistered connections
2015-08-12T19:18:46+02:00 Julien Muchembled jm@nexedi.com
This could happen if a file descriptor was reallocated by the kernel.

https://lab.nexedi.com/nexedi/neoppod/-/commit/ed50edca14d92fbb332594cb8f80ea25e533f028
Simplify API to establish connections and accept mix of IPv4/IPv6
2015-08-12T19:18:46+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/c2c97752b0809a4bcf19be09467bcf30d3cc7e46
Rename parameter of polling methods now that _poll computes the timeout itself
2015-08-12T19:18:46+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/eef52c27bc9955f8e68f0442089afb8fc03987f7
Tickless poll loop, for lowest latency and cpu usage
2015-08-12T19:18:46+02:00 Julien Muchembled jm@nexedi.com
With this patch, the epoll object is no longer awoken every second to check
if a timeout has expired. The API of Connection is changed to get the smallest
timeout.

https://lab.nexedi.com/nexedi/neoppod/-/commit/fd0b9c98384184a675dc45c1bfe33ab9782250bd
tests: make Patch usable as a context manager
2015-08-12T15:55:52+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/91c663569c6b77bdff6cb82663173bb2e5cd84d7
Add file descriptor and aborted flag to __repr__ of connections
2015-08-12T15:55:52+02:00 Julien Muchembled jm@nexedi.com

https://lab.nexedi.com/nexedi/neoppod/-/commit/cb8a5a88bc9660e6eb4aaa1be5889e028f21df6c
client: replace Event by a pipe as a way to stop the poll loop
2015-08-12T15:55:52+02:00 Julien Muchembled jm@nexedi.com
This is a prerequisite for tickless poll loops.
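
Why a pipe helps can be shown with a small sketch. It is not NEO's code: the
class StoppablePoller and its methods are hypothetical, and select.select is
used instead of epoll to keep the example short and portable across Unix-like
systems. The point is that a loop blocking without any timeout cannot notice
a threading.Event being set, whereas a byte written to a pipe makes a watched
file descriptor readable and wakes the loop immediately.

```python
import os
import select


class StoppablePoller:
    """Hypothetical poll loop that sleeps with no timeout and is stopped
    by writing to a pipe (the so-called self-pipe trick)."""

    def __init__(self):
        # Read end is watched by the loop; write end is used by stop().
        self._r, self._w = os.pipe()
        self.stopped = False

    def run(self):
        while True:
            # No timeout argument: block until some descriptor is readable.
            # A real loop would also watch its network sockets here.
            readable, _, _ = select.select([self._r], [], [])
            if self._r in readable:
                os.read(self._r, 1)  # consume the wake-up byte
                break
        self.stopped = True

    def stop(self):
        # Callable from any thread: makes the read end readable.
        os.write(self._w, b"\0")

    def close(self):
        os.close(self._r)
        os.close(self._w)
```

A typical use would run `run()` in a dedicated thread and call `stop()` from
the main thread, then join the thread and call `close()`.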