Commit 8a417197 authored Dec 06, 2019 by Kirill Smelkov

.

parent c4c25753

Showing 1 changed file with 37 additions and 34 deletions

bigfile/file_zodb.py    +37 -34
...
@@ -78,22 +78,26 @@ Data format
 Due to weakness of current ZODB storage servers, wendelin.core cannot provide
 at the same time both fast reads and small database size growth on small data
-changes. "Small" here means something like 1-10000 bytes as larger changes
-become comparable to 2M block size and are handled efficiently out of the box.
-Until the problem is fixed on ZODB server side, users have to explicitly
-indicate via environment variable that their workload is "small changes" if
-they prefer to prioritize database size over access speed::
+changes. "Small" here means something like 1-10000 bytes per transaction as
+larger changes become comparable to 2M block size and are handled efficiently
+out of the box. Until the problem is fixed on ZODB server side, wendelin.core
+provides on-client workaround in the form of specialized block format, and
+users have to explicitly indicate via environment variable that their workload
+is "small changes" if they prefer to prioritize database size over access
+speed::
 
     $WENDELIN_CORE_ZBLK_FMT
         ZBlk0   fast reads      (default)
         ZBlk1   small changes
 
 Description of block formats follow:
 
 To represent BigFile as ZODB objects, each file block is represented separately
 either as
 
     1) one ZODB object, or          (ZBlk0)
-    2) group of ZODB objects        (ZBlk1)
+    2) group of ZODB objects        (ZBlk1)
+       XXX wcfs loads in parallel
 
 with top-level BTree directory #blk -> objects representing block.
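
As a usage illustration of the format switch quoted above, the following is a
minimal sketch of selecting the small-changes block format. It assumes that
$WENDELIN_CORE_ZBLK_FMT is read when blocks are written out, and that
ZBigFile(blksize) from bigfile/file_zodb.py is the entry point for storing a
BigFile in ZODB; treat the setup details as assumptions, not a prescription::

    # Sketch: prioritize database size over read speed by selecting ZBlk1.
    # Assumption: the variable must be set before wendelin.core writes blocks.
    import os
    os.environ["WENDELIN_CORE_ZBLK_FMT"] = "ZBlk1"    # default is "ZBlk0" (fast reads)

    from wendelin.bigfile.file_zodb import ZBigFile

    # one ZBigFile covers the whole file; data is stored in 2M blocks
    zf = ZBigFile(blksize=2*1024*1024)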
...
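The "top-level BTree directory #blk -> objects" mentioned in the hunk above can
be pictured with an LOBTree keyed by block number. The Block class below is a
hypothetical stand-in for the real ZBlk0/ZBlk1 classes and does not reflect
their actual storage format::

    # Illustration only: a sparse directory mapping block number -> one
    # persistent object holding that 2M block (ZBlk0-style layout).
    from persistent import Persistent
    from BTrees.LOBTree import LOBTree

    class Block(Persistent):            # hypothetical stand-in for ZBlk0/ZBlk1
        def __init__(self, data=b''):
            self.data = data            # raw block content

    blktab = LOBTree()                  # #blk -> object(s) representing block
    blktab[0] = Block(b'\0' * (2*1024*1024))
    blktab[7] = Block(b'hello')         # only blocks that were ever written exist
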
@@ -118,36 +122,35 @@ On the other hand, if object management is moved to DB *server* side, it is
 possible to deduplicate them there and this way have low-overhead for both
 access-time and DB size with just client storing 1 object per file block. This
 will be our future approach after we teach NEO about object deduplication.
 
-~~~~
-
-As file pages are changed in RAM with changes being managed by virtmem
-subsystem, we need to propagate the changes to ZODB objects back at some time.
-
-Two approaches exist:
-
-    1) on every RAM page dirty, in a callback invoked by virtmem, mark
-       corresponding ZODB object as dirty, and at commit time, in
-       obj.__getstate__ retrieve memory content.
-
-    2) hook into commit process, and before committing, synchronize RAM page
-       state to ZODB objects state, propagating all dirtied pages to ZODB objects
-       and then do the commit process as usual.
-
-"1" is more natural to how ZODB works, but requires tight integration between
-virtmem subsystem and ZODB (to be able to receive callback on a page dirtying).
-
-"2" is less natural to how ZODB works, but requires less-tight integration
-between virtmem subsystem and ZODB, and virtmem->ZODB propagation happens only
-at commit time.
-
-Since, for performance reasons, virtmem subsystem is going away and BigFiles
-will be represented by real FUSE-based filesystem with virtual memory being
-done by kernel, where we cannot get callback on a page-dirtying, it is more
-natural to also use "2" here.
 """
+# FIXME ^^^ doc is horrible - add top-level up->down overview.
+
+# file_zodb organization
+#
+# As file pages are changed in RAM with changes being managed by virtmem
+# subsystem, we need to propagate the changes to ZODB objects back at some time.
+#
+# Two approaches exist:
+#
+#   1) on every RAM page dirty, in a callback invoked by virtmem, mark
+#      corresponding ZODB object as dirty, and at commit time, in
+#      obj.__getstate__ retrieve memory content.
+#
+#   2) hook into commit process, and before committing, synchronize RAM page
+#      state to ZODB objects state, propagating all dirtied pages to ZODB objects
+#      and then do the commit process as usual.
+#
+# "1" is more natural to how ZODB works, but requires tight integration between
+# virtmem subsystem and ZODB (to be able to receive callback on a page dirtying).
+#
+# "2" is less natural to how ZODB works, but requires less-tight integration
+# between virtmem subsystem and ZODB, and virtmem->ZODB propagation happens only
+# at commit time.
+#
+# Since, for performance reasons, virtmem subsystem is going away and BigFiles
+# will be represented by real FUSE-based filesystem with virtual memory being
+# done by kernel, where we cannot get callback on a page-dirtying, it is more
+# natural to also use "2" here.
 
 from wendelin.bigfile import BigFile, WRITEOUT_STORE, WRITEOUT_MARKSTORED
 from wendelin import wcfs
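
For approach "2" from the comment block above, the following sketch shows how
dirtied RAM pages could be pushed into ZODB objects from a before-commit hook
of the transaction package. DirtyPageTracker and its methods are invented
names for illustration and are not wendelin.core's actual integration; only
transaction.get().addBeforeCommitHook() is the real library API::

    # Sketch of approach "2": propagate dirtied pages to ZODB objects right
    # before commit, then let the normal commit machinery run.
    import transaction

    class DirtyPageTracker(object):     # hypothetical helper, not wendelin.core API
        def __init__(self):
            self.dirty = set()          # numbers of blocks whose RAM pages changed

        def mark_dirty(self, blk):
            self.dirty.add(blk)

        def store_block(self, blk):
            # the real code would serialize the RAM block into its ZBlk* object(s)
            print("storing block #%d into ZODB" % blk)

        def flush(self):
            # virtmem -> ZODB propagation happens only here, at commit time
            for blk in sorted(self.dirty):
                self.store_block(blk)
            self.dirty.clear()

    tracker = DirtyPageTracker()
    tracker.mark_dirty(3)

    txn = transaction.get()
    txn.addBeforeCommitHook(tracker.flush)      # "hook into commit process"
    txn.commit()                                # pages flushed, then commit proceeds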
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment