Commits · 401163753a309ccc93fc696594cf75349f09f16c · nexedi / ZODB

20 Nov, 2020 2 commits

[ZODB4] IStorage: Require lastTransaction() to invalidate DB before returning ... · 40116375

Kirill Smelkov authored Jun 08, 2020

... and to provide correct "current" view for load().

This is ZODB4 backport of https://github.com/zopefoundation/ZODB/pull/313,
which itself is just a more explicit language of
https://github.com/zopefoundation/ZODB/commit/4a6b0283#diff-881ceb274f9e538d4144950eefce8682R685-R695

It has been amended with ZODB4-specific

      It is guaranteed that after lastTransaction returns, "current" view of
      the storage as observed by load() is ≥ returned tid.

because on ZODB<5 - contrary to ZODB5 - there is "load current" for
which correct semantic must be also provided:
https://github.com/zopefoundation/ZODB/pull/307#discussion_r436662348

Original description follows:

---- 8< ----

Because if lastTransaction() returns tid, for which local database
handle has not yet been updated with invalidations, it could lead to
data corruption due to concurrency issues similar to
https://github.com/zopefoundation/ZODB/issues/290:

- DB refreshes a Connection for new transaction;
- zstor.lastTransaction() is called to obtain database view for this connection.
- objects in live-cache for this Connection are invalidated with
  invalidations that were queued through DB.invalidate() calls from
  storage.
- if lastTransaction does not guarantee that all DB invalidations for
  transactions with ID ≤ returned tid have been completed, it can be
  that:

	an incomplete set of objects is invalidated in the live cache

  i.e. data corruption.
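
The ordering requirement above can be illustrated with a toy model. This is only a sketch, not ZODB, ZEO or NEO code: `ToyDB`, `ToyStorage`, `on_remote_commit` and their fields are hypothetical names. It shows a storage client that delivers invalidations to the DB first and publishes the new tid last, under one lock, so that `lastTransaction()` cannot return a tid whose invalidations are still pending.

```python
import threading


class ToyDB:
    """Hypothetical stand-in for ZODB.DB: it only queues invalidations."""

    def __init__(self):
        self.invalidated = []          # (tid, oid) pairs, in arrival order

    def invalidate(self, tid, oids):
        self.invalidated.extend((tid, oid) for oid in oids)


class ToyStorage:
    """Hypothetical storage client that learns about commits made elsewhere."""

    def __init__(self, db):
        self._db = db
        self._lock = threading.Lock()
        self._last_tid = 0

    def on_remote_commit(self, tid, oids):
        # Deliver the invalidations first and publish the tid last, both
        # under one lock, so lastTransaction() can never return a tid whose
        # invalidations have not reached the DB yet.
        with self._lock:
            self._db.invalidate(tid, oids)
            self._last_tid = tid

    def lastTransaction(self):
        with self._lock:
            return self._last_tid


db = ToyDB()
storage = ToyStorage(db)
storage.on_remote_commit(1, ["a", "b"])
storage.on_remote_commit(2, ["c"])

head = storage.lastTransaction()
# Every transaction up to `head` has already been handed to the DB.
seen = {tid for tid, _ in db.invalidated}
```

Swapping the two statements inside `on_remote_commit` (and dropping the lock) gives the ordering that opens the window described in the list above: a reader could observe the new tid before the matching invalidations are queued.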

This particular data corruption has been hit when working on core of
ZODB and was not immediately noticed:

https://github.com/zopefoundation/ZODB/pull/307#pullrequestreview-423017996

this fact justifies the importance of explicitly stating what IStorage
guarantees are / must be in the interface.

This guarantee

- already holds for FileStorage (no database mutations from outside of
  single process);
- is already true for ZEO4 and ZEO5
  https://github.com/zopefoundation/ZODB/pull/307#pullrequestreview-423017996
  https://github.com/zopefoundation/ZODB/pull/307#discussion_r434166238
- holds for RelStorage because it implements IMVCCStorage natively;
- is *not* currently true for NEO because NEO sets zstor.last_tid before
  calling DB.invalidate:

  https://lab.nexedi.com/nexedi/neoppod/blob/fc58c089/neo/client/handlers/master.py#L109-124

However NEO is willing to change and already prepared the fix to provide
this guarantee because of data corruption scenario that can happen
without it:

  https://github.com/zopefoundation/ZODB/pull/307#discussion_r436662348
  nexedi/neoppod@a7d101ec
  nexedi/neoppod@a7d101ec (comment 112238)
  nexedi/neoppod@a7d101ec (comment 113331)

In other words all storages that, to my knowledge, are in current use
are either already providing specified semantic, or will be shortly
fixed to provide it.

This way we can fix up the interface and make the semantic clear.

/cc @jamadden, @vpelletier, @arnaud-fontaine, @jwolf083, @klawlf82, @gitzit, @jimfulton

40116375

[ZODB4] setup: tests_require += zope.testrunner · 998c8f86

Kirill Smelkov authored Nov 20, 2020

zope.testrunner is used to run ZODB tests (see e.g. tox.ini). It was
specified as an explicit dependency in tox.ini->deps and in buildout.cfg.
However, if in SlapOS we install just ZODB[test], zope.testrunner won't be
installed.

-> Fix this by moving the zope.testrunner dependency into setup, so that
ZODB[test] installs it. This is the same as what ZODB5 does:

https://github.com/zopefoundation/ZODB/blob/5.6.0-14-g0eae10cd0/setup.py#L48
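
As a rough sketch of what such a change looks like in a setup.py: the test requirements are listed once and reused for the `test` extra. The variable `setup_kwargs` and the `zope.testing` entry are illustrative assumptions, not a copy of the actual ZODB setup.py.

```python
# Sketch of the relevant part of a setup.py. Entries other than
# zope.testrunner are placeholders for illustration.
tests_require = [
    "zope.testing",
    "zope.testrunner",   # previously listed only in tox.ini deps / buildout.cfg
]

# These keyword arguments would be passed to setuptools.setup(**setup_kwargs).
setup_kwargs = dict(
    name="ZODB",
    tests_require=tests_require,
    # `pip install ZODB[test]` resolves the "test" extra defined here.
    extras_require={"test": tests_require},
)
```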

998c8f86

31 Jul, 2020 1 commit

[ZODB4] Start of 4-nxd branch · d644e63b

Kirill Smelkov authored Jul 31, 2020

Upstream considers 4 flavour to be "dead":

https://github.com/zopefoundation/ZEO/pull/161#pullrequestreview-447245642

Let's keep the 4-nxd branch in our fork for as long as ZODB4 is used on our side.

d644e63b

31 Mar, 2020 1 commit

FileStorage: Save committed transaction to disk even if changed data is empty · fdf9e7a2

Kirill Smelkov authored Mar 13, 2020

[ This is ZODB4 backport of commit bb9bf539
(https://github.com/zopefoundation/ZODB/pull/298) ]

ZODB tries to avoid saving empty transactions to storage on
`transaction.commit()`. The way it works is: if no objects were changed
during ongoing transaction, ZODB.Connection does not join current
TransactionManager, and transaction.commit() performs two-phase commit
protocol only on joined DataManagers. In other words if no objects were
changed, no tpc_*() methods are called at all on ZODB.Connection at
transaction.commit() time.

This way application servers like Zope/ZServer/ERP5/... can have
something as

    try:
        # process incoming request
        transaction.commit()    # processed ok
    except:
        transaction.abort()
        # problem: log + reraise

in top-level code to process requests without creating many on-disk
transactions with empty data changes just because read-only requests
were served.

Everything is working as intended.
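
The lazy-join mechanism described above can be sketched with a toy model. `ToyTransaction` and `ToyConnection` are hypothetical names and deliberately simplified; they are not the `transaction` or `ZODB.Connection` implementations. The point is only that a data manager which never joined is never asked to run any `tpc_*()` phase.

```python
class ToyTransaction:
    """Hypothetical transaction: runs 2PC only on data managers that joined."""

    def __init__(self):
        self.joined = []

    def join(self, data_manager):
        self.joined.append(data_manager)

    def commit(self):
        for dm in self.joined:
            dm.tpc_begin()
        for dm in self.joined:
            dm.tpc_vote()
        for dm in self.joined:
            dm.tpc_finish()


class ToyConnection:
    """Hypothetical connection that joins lazily, on the first changed object."""

    def __init__(self, txn):
        self._txn = txn
        self._changed = []
        self.calls = []                # records which tpc_*() methods ran

    def register(self, obj):
        if not self._changed:          # first modification: join now
            self._txn.join(self)
        self._changed.append(obj)

    def tpc_begin(self):
        self.calls.append("tpc_begin")

    def tpc_vote(self):
        self.calls.append("tpc_vote")

    def tpc_finish(self):
        self.calls.append("tpc_finish")


# Read-only request: nothing is registered, so commit() touches nothing.
txn = ToyTransaction()
reader = ToyConnection(txn)
txn.commit()

# Writing request: the connection joined, so all three phases run.
txn2 = ToyTransaction()
writer = ToyConnection(txn2)
writer.register("some object")
txn2.commit()
```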

However at storage level, FileStorage currently also checks whether the
transaction being committed comes with empty data changes,
and _skips_ saving the transaction to disk *at all* in such cases, even
if it has been explicitly told to commit the transaction via two-phase
commit protocol calls done at storage level.

This creates the situation, where contrary to promise in
ZODB/interfaces.py(*), after successful tpc_begin/tpc_vote/tpc_finish()
calls made at storage level, transaction is _not_ made permanent,
despite tid of "committed" transaction being returned to caller. In other
words FileStorage, when asked to commit a transaction, even if one with
empty data changes, reports "ok" and gives transaction ID to the caller,
without creating corresponding transaction record on disk.

This behaviour is

a) redundant to application-level avoidance to create empty transaction
on storage described in the beginning, and

b) creates problems:

The first problem is that application that works at storage-level might
be interested in persisting transaction, even with empty changes to
data, just because it wants to save the metadata similarly to e.g.
`git commit --allow-empty`.

The other problem is that an application view and data in database
become inconsistent: an application is told that a transaction was
created with corresponding transaction ID, but if the storage is
actually inspected, e.g. by iteration, the transaction is not there.
This, in particular, can create problems if TID of committed transaction
is reported elsewhere and that second database client does not find the
transaction it was told should exist.
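
The inconsistency can be made concrete with a small toy storage. This is a sketch under stated assumptions, not FileStorage code: `ToyFileStorage`, its `skip_empty` switch and `commit_empty` are hypothetical, and the method signatures are simplified compared to the real IStorage interface.

```python
class ToyFileStorage:
    """Hypothetical storage; `skip_empty` selects the old or the new behaviour."""

    def __init__(self, skip_empty):
        self._skip_empty = skip_empty
        self._log = []                 # "on-disk" records: (tid, metadata, data)
        self._next_tid = 1
        self._pending = None

    def tpc_begin(self, metadata):
        self._pending = (metadata, [])

    def store(self, oid, data):
        self._pending[1].append((oid, data))

    def tpc_vote(self):
        pass

    def tpc_finish(self):
        metadata, data = self._pending
        self._pending = None
        tid = self._next_tid
        self._next_tid += 1
        if data or not self._skip_empty:
            self._log.append((tid, metadata, data))
        return tid                     # reported to the caller either way

    def iterator(self):
        return iter(self._log)


def commit_empty(storage):
    """Commit a transaction that carries metadata but no object data."""
    storage.tpc_begin({"description": "metadata only"})
    storage.tpc_vote()
    return storage.tpc_finish()


old = ToyFileStorage(skip_empty=True)
old_tid = commit_empty(old)
old_tids = [tid for tid, _, _ in old.iterator()]   # tid was returned, record is missing

new = ToyFileStorage(skip_empty=False)
new_tid = commit_empty(new)
new_tids = [tid for tid, _, _ in new.iterator()]   # tid was returned and can be found
```

With `skip_empty=True` the caller holds a tid that iteration never yields, which is the situation a second client runs into when it is told to look for that transaction.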

I hit this particular problem with wendelin.core. In wendelin.core,
there is custom virtual memory layer that keeps memory in sync with
data in ZODB. At commit time, the memory is inspected for being dirtied,
and if a page was changed, virtual memory layer joins current
transaction _and_ forces corresponding ZODB.Connection - via which it
will be saving data into ZODB objects - to join the transaction too,
because it would be too late to join ZODB.Connection after 2PC process
has begun(+). One of the formats in which data are saved tries to
optimize disk space usage, and it actually might happen, that even if
data in RAM were dirtied, the data itself stayed the same and so nothing
should be saved into ZODB. However ZODB.Connection is already joined
into transaction and it is hard not to join it because joining a
DataManager when the 2PC is already ongoing does not work.

This used to work ok with wendelin.core 1, but with wendelin.core 2 -
where a separate virtual filesystem is also connected to the database to
provide the base layer for array mappings - this creates a problem, because
when wcfs (the filesystem) is told to synchronize to view the database
@tid of the committed transaction, it can wait forever for that, or a
later, transaction to appear on disk in the database, creating an
application-level deadlock.

I agree that some more effort might be made at wendelin.core side to
avoid committing transactions with empty data at storage level.

However the cleanest way to fix this problem, in my view, is to fix
FileStorage itself, because if at storage level it was asked to commit
something, it should not silently skip doing so, dropping even non-empty
metadata, while returning ok and a committed transaction ID to the caller.

As described in the beginning this should not create problems for
application-level ZODB users, while at storage-level the implementation
is now consistently matching interface and common sense.

----

(*) tpc_finish: Finish the transaction, making any transaction changes permanent.
Changes must be made permanent at this point.
...

https://github.com/zopefoundation/ZODB/blob/5.5.1-35-gb5895a5c2/src/ZODB/interfaces.py#L828-L831

(+) https://lab.nexedi.com/kirr/wendelin.core/blob/9ff5ed32/bigfile/file_zodb.py#L788-822

fdf9e7a2

30 Apr, 2018 1 commit
- Appveyor (#157) including a fix for Python 3.6 · 7a1a4911
  Jim Fulton authored Apr 09, 2017
```
Tests running and passing on windows.

(cherry picked from commit 87fd29eb)
```
  7a1a4911
08 Apr, 2017 2 commits
- Fix date in changelog · 5b6c92a2
  Julien Muchembled authored Apr 08, 2017
  
  5b6c92a2
- Changelog for PR #153 · c9f750f3
  Jim Fulton authored Apr 08, 2017
```
This backports the following commits:
  6d673057
  244bb92b
```
  c9f750f3
03 Apr, 2017 1 commit

FileStorage: Report problem on read-only open of non-existent file · 56c96a11

Kirill Smelkov authored Apr 02, 2017

... instead of silently creating empty database on such opens.

Use-case for this are utilities like e.g. zodbdump and zodbcmp which
expect such storage opens to fail so that the tool can know there is no
such storage and report it to user.

In contrast current state is: read-only opens get created-on-the-fly
empty storage with no content, but which can be iterated over without
getting any error.

This way e.g. `zodbdump non-existent.fs` produces empty output _and_
exit code 0 which is not what caller expects.
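
A minimal sketch of the intended behaviour, using plain file opens rather than FileStorage itself. `open_storage_file` is a hypothetical helper; the exact exception type raised by the real storage is not stated in this commit message, so the sketch just relies on what `open()` does.

```python
import os
import tempfile


def open_storage_file(path, read_only=False):
    """Hypothetical helper mirroring the behaviour described above."""
    if read_only:
        # "rb" never creates the file: a missing path raises an error that
        # the caller (for example a dump tool) can report to the user.
        return open(path, "rb")
    if not os.path.exists(path):
        # Read-write opens may still create a fresh, empty file.
        return open(path, "w+b")
    return open(path, "r+b")


with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "non-existent.fs")
    try:
        open_storage_file(missing, read_only=True)
        read_only_failed = False
    except FileNotFoundError:
        read_only_failed = True
    created_by_read_only = os.path.exists(missing)

    with open_storage_file(missing) as f:
        created_by_read_write = os.path.exists(missing)
```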

(cherry picked from commit 30bbabf1)

56c96a11

06 Feb, 2017 1 commit
- 4.4.5 · 60ce5795
  Julien Muchembled authored Feb 06, 2017
  
  60ce5795
02 Feb, 2017 5 commits

Changelog for PR #136 · 7c0d963f
Julien Muchembled authored Jan 14, 2017

7c0d963f

Fix deadlock with storages that "sync" on a new transaction · 9821696f

Julien Muchembled authored Jan 13, 2017

This backports a change from commit 227953b9.

NEO, as well as ZEO+server_sync (ERP5 backports this feature with a
monkey-patch), pings the server (primary master node in the case of NEO) on
new transactions. However, this round-trip is actually performed by the thread
that also does tasks requiring to lock the DB, like processing of invalidations.

Since transaction 1.6.1 (more precisely commit e581a120a6), IStorage.sync()
is called indirectly by DB.open() when a transaction has already begun,
and the DB must not be locked when this happens.
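
The deadlock can be sketched with a toy model. All names here (`ToyDB`, `ToyStorage`, `open_holding_lock`, `open_without_lock`) are hypothetical, and a timeout stands in for what would otherwise be an indefinite hang. The storage's poll thread must take the DB lock to process an invalidation before it can answer the ping, so a `sync()` issued while the DB lock is held cannot complete.

```python
import threading


class ToyDB:
    def __init__(self):
        self.lock = threading.Lock()   # non-reentrant DB lock
        self.invalidations = 0

    def invalidate(self):
        with self.lock:
            self.invalidations += 1


class ToyStorage:
    """The poll thread processes an invalidation, then answers the ping."""

    def __init__(self, db):
        self._db = db
        self._pong = threading.Event()

    def _poll(self):
        self._db.invalidate()          # needs the DB lock
        self._pong.set()

    def sync(self, timeout):
        self._pong.clear()
        threading.Thread(target=self._poll, daemon=True).start()
        return self._pong.wait(timeout)    # False means "no answer in time"


def open_holding_lock(db, storage):
    # Problematic ordering: the poll thread blocks on db.lock while we wait.
    with db.lock:
        return storage.sync(timeout=0.2)


def open_without_lock(db, storage):
    with db.lock:
        pass                           # bookkeeping that really needs the lock
    return storage.sync(timeout=5.0)   # lock released before the round-trip


db1 = ToyDB()
stuck = open_holding_lock(db1, ToyStorage(db1))

db2 = ToyDB()
ok = open_without_lock(db2, ToyStorage(db2))
```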

9821696f

Don't manipulate Connection state after it has been returned to the pool. · 465b3502

Jason Madden authored Feb 01, 2017

Doing so leads to race conditions.

In particular, there can be an AttributeError.

See https://github.com/zodb/zodbshootout/issues/26 for details.

(cherry picked from commit f8cf23ec)

465b3502

Don't require persistent at setup time. We don't build native code that needs... · 98612054

Jason Madden authored Jan 28, 2017

Don't require persistent at setup time. We don't build native code that needs those headers anymore. Fixes #119.

(cherry picked from commit d7dae8b1)

98612054

use buildout to build so we don't have 2 environments to manage · 6a7d20fc

Jim Fulton authored Sep 08, 2016

This is also a workaround for pypa/setuptools#864, since the default pypy3 on
Travis is quite old.

(cherry picked from commit def73970
                       and 5dcca55e)

6a7d20fc

27 Nov, 2016 2 commits
- 4.4.4 · ee291e45
  Jim Fulton authored Nov 27, 2016
  
  ee291e45
- Merge pull request #134 from zopefoundation/transaction-203 · d268d538
  Jim Fulton authored Nov 27, 2016
```
Fixed to work with transaction 2.0.3.
```
  d268d538
25 Nov, 2016 1 commit
- Fixed to work with transaction 2.0.3. · 9a348a89
  Jim Fulton authored Nov 25, 2016
  
  9a348a89
27 Sep, 2016 2 commits

Merge pull request #121 from zopefoundation/4-simplify-README · 977e24bf
Jim Fulton authored Sep 27, 2016
```
Simplify the README file to avoid out of date information
```
977e24bf

Simplify the README file to avoid out of date information · 77f2ff75

Jim Fulton authored Sep 27, 2016

We have the same information scattered around in different places,
which increases the chance that some of it will be out of date.

See the end of:
https://groups.google.com/forum/#!topic/zodb/fy6RRVAF9-s

I want to try to avoid duplicating zodb.org.

77f2ff75

12 Sep, 2016 2 commits
- Merge pull request #116 from NextThought/114-for-4 · f618bffd
  Jim Fulton authored Sep 12, 2016
```
Clear Connection.transaction_manager on close. Fixes #114 for ZODB 4.
```
  f618bffd
- Clear Connection.transaction_manager on close. Fixes #114 · 760e22a9
  Jason Madden authored Sep 12, 2016
  
  760e22a9
09 Sep, 2016 1 commit
- Merge pull request #98 from zopefoundation/issue97 · 27f3a173
  Jim Fulton authored Sep 09, 2016
```
Call _p_resolveConflict() even if a conflicting change doesn't change the state
```
  27f3a173
21 Aug, 2016 2 commits
- Changelog for PR #98 · 7d436f39
  Julien Muchembled authored Aug 20, 2016
  
  7d436f39
- Call _p_resolveConflict() even if a conflicting change doesn't change the state · b74eef76
  Julien Muchembled authored Aug 20, 2016
```
This reverts to the behaviour of 3.10.3 and older.
```
  b74eef76
04 Aug, 2016 1 commit
- 4.4.3 · b53c7019
  Jim Fulton authored Aug 04, 2016
  
  b53c7019
27 Jul, 2016 2 commits
- Merge pull request #96 from zopefoundation/checkTransactionalUndoIterator · a2da8235
  Jim Fulton authored Jul 27, 2016
```
checkTransactionalUndoIterator: do not expect iterator to return sorted oids
```
  a2da8235
- checkTransactionalUndoIterator: do not expect iterator to return sorted oids · 9d418f12
  Julien Muchembled authored Jul 27, 2016
  
  9d418f12
26 Jul, 2016 1 commit
- Merge pull request #88 from zopefoundation/fstail-offset · 37e445a5
  Jim Fulton authored Jul 26, 2016
```
fstail: print the txn offset and header size, instead of only the data offset
```
  37e445a5
13 Jul, 2016 1 commit
- Changelog for PR #88 · 4392f902
  Julien Muchembled authored Jul 13, 2016
  
  4392f902
12 Jul, 2016 8 commits

fstail: print the txn offset and header size, instead of only the data offset · 3807ace8

Julien Muchembled authored Jul 10, 2016

Before:

    2016-07-01 09:41:50.416574: hash=d7101c5ee7b8e412d7b6d54873204421e09b7f34
    user='' description='' length=1629 offset=58990284

After:

    2016-07-01 09:41:50.416574: hash=d7101c5ee7b8e412d7b6d54873204421e09b7f34
    user='' description='' length=1629 offset=58990261 (+23)

The structure of a FileStorage DB is such that it's easy to revert the last
transactions, by truncating the file at the right offset. With the above
change, `fstail` can now be used to get this offset.

In the above example:

    truncate -s 58990261 Data.fs

would delete the transaction and all those after.
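
The arithmetic behind the two offsets, and the truncation itself, can be sketched as follows. The header layout in `TRANS_HDR` is my understanding of the FileStorage on-disk format and should be treated as an assumption; `truncate_at` is a hypothetical helper, demonstrated on a throwaway file rather than on a real Data.fs.

```python
import os
import struct
import tempfile

# Fixed part of a transaction header, as assumed here: tid (8 bytes),
# record length (8), status (1), and the lengths of the user, description
# and extension fields (2 bytes each).
TRANS_HDR = ">8sQcHHH"
TRANS_HDR_LEN = struct.calcsize(TRANS_HDR)

data_offset = 58990284                 # what fstail printed before the change
txn_offset = data_offset - TRANS_HDR_LEN


def truncate_at(path, offset):
    """Drop everything from `offset` on, like `truncate -s offset path`."""
    with open(path, "r+b") as f:
        f.truncate(offset)


# Demonstration on a throwaway file, not on a real Data.fs.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")
    with open(path, "wb") as f:
        f.write(b"x" * 100)
    truncate_at(path, 60)
    size_after = os.path.getsize(path)
```

In the example above user and description are empty, so the header consists of the fixed part only, which is consistent with the `(+23)` printed by the new output.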

3807ace8

Merge pull request #89 from zopefoundation/undo-refactor · 75bae1a6
Jim Fulton authored Jul 12, 2016
```
Refactored FileStorage transactional undo
```
75bae1a6
removed out of date comment · b563487e
Jim Fulton authored Jul 12, 2016

b563487e
Merge pull request #86 from NextThought/handle-serials4 · e080bdcc
Jim Fulton authored Jul 12, 2016
```
Fix handle_all_serials for the new and old protocols.
```
e080bdcc
Update comment. [skip ci] · c13649da
Jason Madden authored Jul 12, 2016

c13649da
changes · d64a0cbf
Jim Fulton authored Jul 12, 2016

d64a0cbf

Refactored FileStorage transactional undo · d717a685

Jim Fulton authored Jul 12, 2016

As part of a project to provide object-level commit locks for ZEO, I'm
refactoring FileStorage to maintain transaction-specific data in
Transaction.data.  This involved undo.  In trying to figure this out, I
found:

- A bug in _undoDataInfo, which I verified with some tests and

- _transactionalUndoRecord was maddeningly difficult to reason about
  (and thus change).

I was concerned less by the bug than my inability to know whether a
change to the code would be correct.

So I refactored the code, mainly transactionalUndoRecord, to make the
code easier to understand, fixing some logic errors (I'm pretty sure)
along the way.  This included lots of comments. (Comments are much
easier to compose when you're working out logic you didn't
understand.)

In addition to making the code cleaner, it allows undo to be handled
in cases that weren't handled before.

d717a685

Long lines. Grrrr. · a6c1713d
Jim Fulton authored Jul 12, 2016

a6c1713d

09 Jul, 2016 1 commit
- Fix handle_all_serials for the new and old protocols. · 87748b15
  Jason Madden authored Jul 09, 2016
  
  87748b15
08 Jul, 2016 1 commit
- 4.4.2 · 45831d69
  Jim Fulton authored Jul 08, 2016
  
  45831d69
05 Jul, 2016 1 commit
- Merge pull request #83 from zopefoundation/only-vote-reports-conflicts · d377efd3
  Jim Fulton authored Jul 05, 2016
```
Only tpc_vote can report resolved conflicts with the new commit protocol
```
  d377efd3