1. 29 Sep, 2002 3 commits
  2. 28 Sep, 2002 1 commit
  3. 27 Sep, 2002 9 commits
    • 8cba5055 · Guido van Rossum authored
      In wait(), when there's no asyncore main loop, we called
      asyncore.poll() with a timeout of 10 seconds.  Change this to a
      variable timeout starting at 1 msec and doubling until 1 second.
      
      While debugging Win2k crashes in the check4ExtStorageThread test from
      ZODB/tests/MTStorage.py, Tim noticed that there were frequent 10
      second gaps in the log file where *nothing* happens.  These were caused
      by the following scenario.
      
      Suppose a ZEO client process has two threads using the same connection
      to the ZEO server, and there's no asyncore loop active.  T1 makes a
      synchronous call, and enters the wait() function.  Then T2 makes
      another synchronous call, and enters the wait() function.  At this
      point, both are blocked in the select() call in asyncore.poll(), with
      a timeout of 10 seconds (in the old version).  Now the replies for
      both calls arrive.  Say T1 wakes up.  The handle_read() method in
      smac.py calls self.recv(8096), so it gets both replies in its buffer,
      decodes both, and calls self.message_input() for both, which sticks
      both replies in the self.replies dict.  Now T1 finds its response, its
      wait() call returns with it.  But T2 is still stuck in
      asyncore.poll(): its select() call never wakes up, and it has to "sit out"
      the whole timeout of 10 seconds.  (Good thing I added timeouts to
      everything!  Or perhaps not, since it masked the problem.)
      
      One other condition must be satisfied before this becomes a disaster:
      T2 must have started a transaction, and all other threads must be
      waiting to start another transaction.  This is what I saw in the log.
      (Hmm, maybe a message should be logged when a thread is waiting to
      start a transaction this way.)
      
      In a real Zope application, this won't happen, because there's a
      centralized asyncore loop in a separate thread (probably the client's
      main thread) and the various threads would be waiting on the condition
      variable; whenever a reply is inserted in the replies dict, all
      threads are notified.  But in the test suite there's no asyncore loop,
      and I don't feel like adding one.  So the exponential backoff seems
      the easiest "solution".
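
      A minimal sketch of the variable-timeout polling described above, with
      hypothetical names (the real wait() in ZEO does more bookkeeping):

        import asyncore

        def wait_for_reply(replies, msgid):
            # Poll with an exponentially growing timeout -- 1 msec, 2, 4, ...
            # capped at 1 second -- instead of a fixed 10-second select().
            delay = 0.001
            while msgid not in replies:
                asyncore.poll(delay)
                delay = min(delay * 2, 1.0)
            return replies.pop(msgid)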
    • 4a981c73 · Guido van Rossum authored
      Whitespace normalization.
    • 7597373d · Guido van Rossum authored
      Add a log msg "closing troubled socket <address>" when we receive an
      'x' event for a wrapper and then close it.
    • ba7fae18 · Guido van Rossum authored
      While we're at it, show the length of the message output as well. Get
      rid of the silly "smac" word.
    • b70bef7f · Guido van Rossum authored
      Use the zrpc.log module's log() method so the process identity is
      logged with the message_output.
    • 3b4b2508 · Guido van Rossum authored
      If you're going to patch __builtin__, at least do it right, by
      importing __builtin__, rather than using __main__.__builtins__.
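
      For illustration only (Python 2 spelling, hypothetical helper name):

        import __builtin__            # the real builtins module in Python 2

        def helper(msg):              # hypothetical function being installed
            return msg

        # Right: patch the module itself; every module sees the new builtin.
        __builtin__.helper = helper

        # Fragile: __main__.__builtins__ happens to be the __builtin__ module
        # only in the main script; in imported modules __builtins__ is a plain
        # dict, so this spelling is unreliable.
        #   __main__.__builtins__.helper = helper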
    • 19ece134 · Guido van Rossum authored
      Add missing import of sys, needed for error logging in except clause
      in load_class().  Found by pychecker.
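
      Roughly what the fix enables (illustrative signature and body, not the
      actual zrpc code):

        import sys

        def load_class(module, name):
            try:
                mod = __import__(module, {}, {}, [name])
                return getattr(mod, name)
            except:
                # This is the line that needs sys: format the exception info.
                print("can't load %s.%s: %s" % (module, name, sys.exc_info()[1]))
                return None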
    • e1dbb1bf · Guido van Rossum authored
    • 02c7e4a8 · Guido van Rossum authored
      When using textwrap, don't break long words. Occasionally a line will
      be too long, but breaking these at an arbitrary character looks wrong
      (and can occasionally prevent you from finding a search string).
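
      The setting in question, as a minimal illustration (not the actual call
      site):

        import textwrap

        text = "Wrap aVeryLongIdentifierNameThatShouldNotBeSplit in a paragraph."
        # break_long_words=False keeps long words intact, so searching for the
        # full identifier still works, at the cost of an occasional long line.
        print("\n".join(textwrap.wrap(text, width=30, break_long_words=False)))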
  4. 26 Sep, 2002 5 commits
  5. 25 Sep, 2002 6 commits
  6. 24 Sep, 2002 2 commits
  7. 23 Sep, 2002 5 commits
  8. 20 Sep, 2002 3 commits
    • 24afe7ac · Guido van Rossum authored
      I set out making wait=1 work for fallback connections, i.e. the
      ClientStorage constructor called with both wait=1 and
      read_only_fallback=1 should return, indicating its readiness, when a
      read-only connection was made.  This is done by calling
      connect(sync=1).  Previously this waited for the ConnectThread to
      finish, but that thread doesn't finish until it's made a read-write
      connection, so a different mechanism is needed.
      
      I ended up doing a major overhaul of the interfaces between
      ClientStorage, ConnectionManager, ConnectThread/ConnectWrapper, and
      even ManagedConnection.  Changes:
      
      ClientStorage.py:
      
        ClientStorage:
      
        - testConnection() now returns just the preferred flag; stubs are
          cheap and I like to have the notifyConnected() signature be the
          same for clients and servers.
      
        - notifyConnected() now takes a connection (to match the signature
          of this method in StorageServer), and creates a new stub.  It also
          takes care of the reconnect business if the client was already
          connected, rather than the ClientManager.  It stores the
          connection as self._connection so it can close the previous one.
          This is also reset by notifyDisconnected().
      
      zrpc/client.py:
      
        ConnectionManager:
      
        - Changed self.thread_lock into a condition variable.  It now also
          protects self.connection.  The condition is notified when
          self.connection is set to a non-None value in connect_done();
          connect(sync=1) waits for it.  The self.connected variable is no
          more; we test "self.connection is not None" instead.
      
        - Tried to make close() reentrant.  (There's a trick: you can't set
          self.connection to None, conn.close() ends up calling close_conn()
          which does this.)
      
        - Renamed notify_closed() to close_conn(), for symmetry with the
          StorageServer API.
      
        - Added an is_connected() method so ConnectThread.try_connect()
          doesn't have to dig inside the manager's guts to find out if the
          manager is connected (important for the disposition of fallback
          wrappers).
      
        ConnectThread and ConnectWrapper:
      
        - Follow above changes in the ClientStorage and ConnectionManager
          APIs: don't close the manager's connection when reconnecting, but
          leave that up to notifyConnected(); ConnectWrapper no longer
          manages the stub.
      
        - ConnectWrapper sets self.sock to None once it's created a
          ManagedConnection -- from there on the connection is in charge of
          closing the socket.
      
      zrpc/connection.py:
      
        ManagedServerConnection:
      
        - Changed the order in which close() calls things; super_close()
          should be last.
      
        ManagedConnection:
      
        - Ditto, and call the manager's close_conn() instead of
          notify_closed().
      
      tests/testZEO.py:
      
        - In checkReconnectSwitch(), we can now open the client storage with
          wait=1 and read_only_fallback=1.
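
      A bare-bones sketch (modern threading spelling, hypothetical details) of
      the ConnectionManager synchronization described above: connect_done()
      publishes the connection and notifies the condition, connect(sync=1)
      waits on it, and is_connected() just checks self.connection.

        import threading

        class ConnectionManagerSketch:
            def __init__(self):
                self.cond = threading.Condition()   # replaces thread_lock
                self.connection = None              # protected by self.cond

            def connect_done(self, conn):
                with self.cond:
                    self.connection = conn
                    self.cond.notify_all()  # wake threads in connect(sync=1)

            def connect(self, sync=0):
                # ... start the ConnectThread here ...
                if sync:
                    with self.cond:
                        while self.connection is None:
                            self.cond.wait()

            def is_connected(self):
                with self.cond:
                    return self.connection is not None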
    • f8411024 · Guido van Rossum authored
      Address Chris McDonough's request: make the ClientStorage()
      constructor signature backwards compatible with ZEO 1.  This means
      adding wait_for_server_on_startup and debug options.
      wait_for_server_on_startup is an alias for wait, which makes the
      argument decoding for these two a little tricky.  debug is ignored.
      
      Also change the default of wait to True, like it was in ZEO 1.  This
      is less likely to screw naive customers.
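
      One way to picture the "tricky" decoding (hypothetical sketch, not the
      actual constructor):

        _marker = object()   # distinguishes "not passed" from a passed 0/None

        def decode_wait(wait=_marker, wait_for_server_on_startup=_marker,
                        debug=0):
            # 'debug' is accepted for ZEO 1 compatibility but ignored.
            if wait is not _marker:
                return wait                          # the new name wins
            if wait_for_server_on_startup is not _marker:
                return wait_for_server_on_startup    # ZEO 1 alias
            return 1                                 # default: wait, as in ZEO 1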
  9. 19 Sep, 2002 6 commits
    • 9eb5bb8a · Guido van Rossum authored
      This test (with ZEO) failed frequently on my Win98 box with a timeout
      of 30 seconds.  There's nothing wrong with the code, it's just slow.
      So increase the timeout to 60 seconds.
    • e0300a10 · Guido van Rossum authored
      Change the random port generator to only generate even port numbers.
      On Windows, port+1 is used as well, so we don't want accidentally to
      allocate two adjacent ports when we ask for multiple ports.
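
      An illustrative version of such a generator (the range is an assumption,
      not the one used in the tests):

        import random

        def get_port():
            # A step of 2 yields only even numbers, so a test that also uses
            # port+1 (as the Windows code does) can't collide with the next
            # port handed out.
            return random.randrange(25000, 30000, 2)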
    • 5aa4c348 · Guido van Rossum authored
    • 0b9bb581 · Guido van Rossum authored
      pack()'s 'wait' argument is a boolean, not an object, so test it using
      "if wait" rather than "if wait is not None".  Also change the default
      to 0.
    • da28b620 · Guido van Rossum authored
      The mystery of the Win98 hangs in the checkReconnectSwitch() test --
      hangs that persisted until I added an is_connected() test to
      testConnection() -- is solved.
      
      After the ConnectThread has switched the client to the new, read-write
      connection, it closes the read-only connection(s) that it was saving
      up in case there was no read-write connection.  But closing a
      ManagedConnection calls notify_closed() on the manager, which
      disconnects the manager and the client from their brand new read-write
      connection.  The mistake here is that this should only be done when
      closing the manager's current connection!
      
      The fix was to add an argument to notify_closed() that passes the
      connection object being closed; notify_closed() returns without doing
      a thing when that is not the current connection.
      
      I presume this didn't happen on Linux because there the sockets
      happened to connect in a different order, and there was no read-only
      connection to close yet (just a socket trying to connect).
      
      I'm taking out the previous "fix" to ClientStorage, because that only
      masked the problem in this relatively simple test case.  The problem
      could still occur when both a read-only and a read-write server are up
      initially, and the read-only server connects first; once the
      read-write server connects, the read-write connection is installed,
      and then the saved read-only connection is closed which would again
      mistakenly disconnect the read-write connection.
      
      Another (related) fix is not to call self.mgr.notify_closed() but to
      call self.mgr.connection.close() when reconnecting.  (Hmm, I wonder if
      it would make more sense to have an explicit reconnect callback to the
      manager and the client?  Later.)
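
      The shape of the fix, as a sketch with hypothetical class and attribute
      names:

        class ConnectionManagerSketch:
            def __init__(self, client):
                self.client = client
                self.connection = None          # the current connection, if any

            def notify_closed(self, conn):
                if conn is not self.connection:
                    return                      # stale fallback connection; ignore
                # Only closing the current connection tears down client state.
                self.connection = None
                self.client.notifyDisconnected()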