Commit fdf9e7a2 authored by Kirill Smelkov, committed by Julien Muchembled

FileStorage: Save committed transaction to disk even if changed data is empty

[ This is ZODB4 backport of commit bb9bf539
  (https://github.com/zopefoundation/ZODB/pull/298) ]

ZODB tries to avoid saving empty transactions to storage on
`transaction.commit()`. The way it works is: if no objects were changed
during the ongoing transaction, ZODB.Connection does not join the current
TransactionManager, and transaction.commit() performs the two-phase
commit protocol only on joined DataManagers. In other words, if no
objects were changed, no tpc_*() methods are called at all on
ZODB.Connection at transaction.commit() time.
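
The lazy-join behaviour described above can be illustrated with a toy
model. This is only a sketch: `ToyTransaction` and `ToyConnection` are
hypothetical stand-ins invented for illustration, not the real
`transaction` or ZODB.Connection classes.

```python
class ToyTransaction:
    """Hypothetical transaction: runs 2PC only on managers that joined."""

    def __init__(self):
        self.joined = []

    def join(self, manager):
        if manager not in self.joined:
            self.joined.append(manager)

    def commit(self):
        managers, self.joined = self.joined, []
        for phase in ("tpc_begin", "tpc_vote", "tpc_finish"):
            for m in managers:
                getattr(m, phase)(self)


class ToyConnection:
    """Hypothetical connection: joins the transaction on first change."""

    def __init__(self, txn):
        self.txn = txn
        self.changed = {}
        self.calls = []      # names of tpc_*() methods invoked on us

    def modify(self, oid, data):
        self.changed[oid] = data
        self.txn.join(self)  # only a modified connection joins

    def tpc_begin(self, txn):
        self.calls.append("tpc_begin")

    def tpc_vote(self, txn):
        self.calls.append("tpc_vote")

    def tpc_finish(self, txn):
        self.calls.append("tpc_finish")
        self.changed.clear()


txn = ToyTransaction()
reader = ToyConnection(txn)   # serves a read-only request
writer = ToyConnection(txn)
writer.modify("oid-1", b"new state")
txn.commit()

print(reader.calls)   # no tpc_*() calls: the reader never joined
print(writer.calls)   # full two-phase commit sequence
```

In this model the read-only connection never sees a tpc_*() call, so
nothing reaches its storage.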

This way, application servers like Zope/ZServer/ERP5/... can have
something like

    try:
        # process incoming request
        transaction.commit()    # processed ok
    except:
        transaction.abort()
        # problem: log + reraise

in top-level code to process requests without creating many on-disk
transactions with empty data changes just because read-only requests
were served.

Everything is working as intended.

However, at the storage level, FileStorage currently also checks whether
the transaction being committed comes with empty data changes, and
_skips_ saving the transaction to disk *at all* in such cases, even
if it has been explicitly told to commit the transaction via two-phase
commit protocol calls made at the storage level.

This creates a situation where, contrary to the promise in
ZODB/interfaces.py(*), after successful tpc_begin/tpc_vote/tpc_finish()
calls made at the storage level, the transaction is _not_ made permanent,
despite the tid of the "committed" transaction being returned to the
caller. In other words, FileStorage, when asked to commit a transaction,
even one with empty data changes, reports "ok" and gives the transaction
ID to the caller without creating the corresponding transaction record
on disk.
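
The mismatch can be reproduced with a toy in-memory model. This is a
sketch under stated assumptions: `ToyStorage`, its `skip_empty` flag and
`commit_empty()` are hypothetical names invented here, tids are plain
integers, and none of this is FileStorage code.

```python
import itertools


class ToyStorage:
    """Hypothetical in-memory model of the behaviour described above."""

    def __init__(self, skip_empty):
        self.skip_empty = skip_empty     # True models the old behaviour
        self.log = []                    # stands in for on-disk records
        self._tids = itertools.count(1)
        self._pending = None

    def tpc_begin(self, description):
        self._pending = {"description": description, "data": []}

    def store(self, oid, data):
        self._pending["data"].append((oid, data))

    def tpc_vote(self):
        pass

    def tpc_finish(self):
        tid = next(self._tids)
        txn, self._pending = self._pending, None
        if txn["data"] or not self.skip_empty:
            self.log.append((tid, txn))
        return tid                       # reported to the caller either way

    def iterator(self):
        return iter(self.log)


def commit_empty(storage, description):
    """Commit a transaction that carries metadata but no data records."""
    storage.tpc_begin(description)
    storage.tpc_vote()
    return storage.tpc_finish()


old = ToyStorage(skip_empty=True)
old_tid = commit_empty(old, "metadata only")
print(old_tid in [tid for tid, _ in old.iterator()])   # False: tid reported, record missing

new = ToyStorage(skip_empty=False)
new_tid = commit_empty(new, "metadata only")
print(new_tid in [tid for tid, _ in new.iterator()])   # True: record is there
```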

This behaviour is

a) redundant with the application-level avoidance of creating empty
   transactions on storage described at the beginning, and

b) creates problems:

The first problem is that an application that works at the storage level
might be interested in persisting a transaction, even one with empty
changes to data, just because it wants to save the metadata, similarly
to e.g. `git commit --allow-empty`.

The other problem is that the application's view and the data in the
database become inconsistent: the application is told that a transaction
was created with the corresponding transaction ID, but if the storage is
actually inspected, e.g. by iteration, the transaction is not there.
This, in particular, can create problems if the TID of the committed
transaction is reported elsewhere and that second database client does
not find the transaction it was told should exist.

I hit this particular problem with wendelin.core. In wendelin.core,
there is a custom virtual memory layer that keeps memory in sync with
data in ZODB. At commit time, the memory is inspected for being dirtied,
and if a page was changed, the virtual memory layer joins the current
transaction _and_ forces the corresponding ZODB.Connection - via which it
will be saving data into ZODB objects - to join the transaction too,
because it would be too late to join ZODB.Connection after the 2PC
process has begun(+). One of the formats in which data are saved tries to
optimize disk space usage, and it might actually happen that, even if
data in RAM were dirtied, the data itself stayed the same and so nothing
should be saved into ZODB. However, ZODB.Connection is already joined
to the transaction, and it is hard to avoid joining it because joining a
DataManager when the 2PC is already ongoing does not work.

This used to work ok with wendelin.core 1, but with wendelin.core 2 -
where a separate virtual filesystem is also connected to the database to
provide the base layer for array mappings - this creates a problem,
because when wcfs (the filesystem) is told to synchronize to view the
database @tid of the committed transaction, it can wait forever for that,
or a later, transaction to appear on disk in the database, creating an
application-level deadlock.
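
The waiting pattern behind this deadlock can be sketched as follows.
This is a hypothetical illustration only: `wait_for_tid()` and its
parameters are invented here, tids are modelled as integers, and the
sketch does not claim to show how wcfs actually waits. The timeout
exists only so that the example terminates; without it the second call
below would never return.

```python
import time


def wait_for_tid(last_committed, at, timeout, poll=0.01):
    """Wait until a transaction with tid >= `at` is visible.

    `last_committed` is a callable returning the newest tid found in
    the storage, or None if the storage is empty. Returns True once
    the tid is visible and False if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        head = last_committed()
        if head is not None and head >= at:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


# The client was told "tid 5 is committed" and the record is on disk.
print(wait_for_tid(lambda: 5, at=5, timeout=0.05))

# The client was told "tid 5 is committed" but the record was skipped,
# so the newest transaction in the storage is still tid 4.
print(wait_for_tid(lambda: 4, at=5, timeout=0.05))
```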

I agree that some more effort might be made on the wendelin.core side to
avoid committing transactions with empty data at the storage level.

However, the cleanest way to fix this problem, in my view, is to fix
FileStorage itself, because if at the storage level it was asked to
commit something, it should not silently skip doing so, dropping even
non-empty metadata while returning ok and the committed transaction ID
to the caller.

As described at the beginning, this should not create problems for
application-level ZODB users, while at the storage level the
implementation now consistently matches the interface and common sense.

----

(*) tpc_finish: Finish the transaction, making any transaction changes permanent.
    Changes must be made permanent at this point.
    ...

    https://github.com/zopefoundation/ZODB/blob/5.5.1-35-gb5895a5c2/src/ZODB/interfaces.py#L828-L831

(+) https://lab.nexedi.com/kirr/wendelin.core/blob/9ff5ed32/bigfile/file_zodb.py#L788-822
parent 7a1a4911
@@ -706,8 +706,6 @@ class FileStorage(
                 raise StorageTransactionError(
                     "tpc_vote called with wrong transaction")
             dlen = self._tfile.tell()
-            if not dlen:
-                return # No data in this trans
             self._tfile.seek(0)
             user, descr, ext = self._ude
@@ -750,22 +748,19 @@ class FileStorage(
             self._commit_lock_release()

     def _finish(self, tid, u, d, e):
-        # If self._nextpos is 0, then the transaction didn't write any
-        # data, so we don't bother writing anything to the file.
-        if self._nextpos:
-            # Clear the checkpoint flag
-            self._file.seek(self._pos+16)
-            self._file.write(as_bytes(self._tstatus))
-            try:
-                # At this point, we may have committed the data to disk.
-                # If we fail from here, we're in bad shape.
-                self._finish_finish(tid)
-            except:
-                # Ouch. This is bad. Let's try to get back to where we were
-                # and then roll over and die
-                logger.critical("Failure in _finish. Closing.", exc_info=True)
-                self.close()
-                raise
+        # Clear the checkpoint flag
+        self._file.seek(self._pos+16)
+        self._file.write(as_bytes(self._tstatus))
+        try:
+            # At this point, we may have committed the data to disk.
+            # If we fail from here, we're in bad shape.
+            self._finish_finish(tid)
+        except:
+            # Ouch. This is bad. Let's try to get back to where we were
+            # and then roll over and die
+            logger.critical("Failure in _finish. Closing.", exc_info=True)
+            self.close()
+            raise

     def _finish_finish(self, tid):
         # This is a separate method to allow tests to replace it with
......
@@ -317,6 +317,39 @@ class FileStorageTests(
         self._storage._files.flush = lambda: None
         self.checkFlushAfterTruncate(True)

+    def checkCommitWithEmptyData(self):
+        """
+        Verify that transaction is persisted even if it has no data, or even
+        both no data and empty metadata.
+        """
+
+        # verify:
+        # - commit with empty data but non-empty metadata
+        # - commit with empty data and empty metadata
+        #   (the fact of commit carries information by itself)
+        stor = self._storage
+        for description in (u'commit with empty data', u''):
+            t = transaction.Transaction()
+            t.description = description
+            stor.tpc_begin(t)
+            stor.tpc_vote(t)
+            head = stor.tpc_finish(t)
+            if head is None:
+                # in ZODB4 returning tid from tpc_finish is optional
+                head = stor.lastTransaction()
+            self.assertEqual(head, stor.lastTransaction())
+
+            v = list( stor.iterator(start=head, stop=head) )
+            self.assertEqual(len(v), 1)
+            trec = v[0] # FileStorage.TransactionRecord or hexstorage.Transaction
+            self.assertEqual(trec.tid, head)
+            self.assertEqual(trec.user, b'')
+            self.assertEqual(trec.description, description.encode('utf-8'))
+            self.assertEqual(trec.extension, {})
+
+            drecv = list(trec)
+            self.assertEqual(drecv, [])
+
 class FileStorageHexTests(FileStorageTests):

     def open(self, **kwargs):
......