1. 07 Dec, 2010 1 commit
  2. 06 Dec, 2010 8 commits
  3. 03 Dec, 2010 1 commit
  4. 01 Dec, 2010 13 commits
    • Merge branch 'tdb2' · 51a56b52
      Rusty Russell authored
    • tdb2: update documentation · a42bba8e
      Rusty Russell authored
      Specifically, document the linked free tables, and record the status
      of each point of the design document.
    • tdb2: use separate magic constants for chain, htable and ftable entries · 2491b65a
      Rusty Russell authored
      Rather than overloading TDB_USED_MAGIC and the hash value as we do now.
      We also rename "free list" to the more-accurate "free table" everywhere.
    • tdb2: direct access during transactions. · f5087965
      Rusty Russell authored
      Currently we fall back to copying data during a transaction, but we don't
      need to in many cases.  Grant direct access in those cases.
      
      Before:
      $ ./speed --transaction 1000000
      Adding 1000000 records:  2409 ns (59916680 bytes)
      Finding 1000000 records:  1156 ns (59916680 bytes)
      Missing 1000000 records:  604 ns (59916680 bytes)
      Missing 1000000 records:  604 ns (59916680 bytes)
      Traversing 1000000 records:  1226 ns (59916680 bytes)
      Deleting 1000000 records:  1556 ns (119361928 bytes)
      Re-adding 1000000 records:  2326 ns (119361928 bytes)
      Appending 1000000 records:  3269 ns (246656880 bytes)
      Churning 1000000 records:  5613 ns (338235248 bytes)
      
      After:
      $ ./speed --transaction 1000000
      Adding 1000000 records:  1902 ns (59916680 bytes)
      Finding 1000000 records:  1032 ns (59916680 bytes)
      Missing 1000000 records:  606 ns (59916680 bytes)
      Missing 1000000 records:  606 ns (59916680 bytes)
      Traversing 1000000 records:  741 ns (59916680 bytes)
      Deleting 1000000 records:  1347 ns (119361928 bytes)
      Re-adding 1000000 records:  1727 ns (119361928 bytes)
      Appending 1000000 records:  2561 ns (246656880 bytes)
      Churning 1000000 records:  4403 ns (338235248 bytes)
    • tdb2: add write flag to tdb_direct · a56db4a5
      Rusty Russell authored
      This is a precursor to direct access during transactions: they care about
      whether we are going to read or write to the file.
    • tdb2: hash chaining · 2523e67f
      Rusty Russell authored
      If we get enough hash collisions, we can run out of hash bits; this is
      almost certainly caused by a deliberate attempt to break the tdb (hash
      bombing).
      
      Implement chained records for this case; they're slow but will keep the
      rest of the database functioning.
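A minimal sketch of the chained fallback described above (struct layout and names are illustrative, not tdb2's on-disk format): once every hash bit has been consumed by successive subhash levels, colliding records sit on a linear list, and lookup degrades to walking it with direct key comparison.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative only: all entries in a chain share the same (exhausted)
 * full hash, so only a direct key comparison can tell them apart. */
struct chain_entry {
    uint64_t full_hash;        /* identical for every entry in the chain */
    const char *key;
    struct chain_entry *next;  /* linear chain: slow, but always works */
};

/* O(chain length) lookup; correctness does not depend on hash bits. */
static const struct chain_entry *
chain_lookup(const struct chain_entry *head, const char *key)
{
    for (; head != NULL; head = head->next)
        if (strcmp(head->key, key) == 0)
            return head;
    return NULL;
}
```

This keeps a hash-bombed database functioning, at the cost of O(n) access to the bombed bucket only.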
    • tdb2: use magic freetable value rather than different magic for coalescing · 539f1af0
      Rusty Russell authored
      We have to unlock during coalescing, so we mark records specially to indicate
      to tdb_check that they're not on any list, and to prevent other coalescers
      from grabbing them.
      
      Use a special free list number, rather than a new magic.
    • tdb2: compare the extra 11 hash bits in the header. · 11564894
      Rusty Russell authored
      We already have 10 hash bits encoded in the offset itself; we only get
      here incorrectly about 1 time in 1000, so it's a pretty minor
      optimization at best.
      
      Nonetheless, we have the information, so let's check it before
      accessing the key.  This reduces the probability of a false keycmp by
      another factor of 2000.
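The arithmetic above can be sketched as follows (bit positions, mask and names are hypothetical, not tdb2's actual header layout): with 10 bits already implied by the offset, checking 11 more header bits makes a false keycmp 2^11 = 2048 (roughly 2000) times rarer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the offset encodes hash bits 0..9, and the record
 * header carries the next 11 bits (10..20) of the 64-bit hash. */
#define EXTRA_HASH_SHIFT 10
#define EXTRA_HASH_BITS  11
#define EXTRA_HASH_MASK  ((1ULL << EXTRA_HASH_BITS) - 1)

/* Cheap pre-filter run before the (expensive) key comparison. */
static bool extra_bits_match(uint64_t header_bits, uint64_t hash)
{
    return (header_bits & EXTRA_HASH_MASK) ==
           ((hash >> EXTRA_HASH_SHIFT) & EXTRA_HASH_MASK);
}
```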
    • tdb2: add comparison stats · f2c286c9
      Rusty Russell authored
    • tdb2: clean up logging · 4e185ad8
      Rusty Russell authored
      Logged errors should always set tdb->ecode before the log function is
      called, and there's little reason to have a sprintf-style logging
      function since we can do the formatting internally.
      
      Change the tdb_log attribute to just take a "const char *", and create
      a new tdb_logerr() helper which sets ecode and calls it.  As a bonus,
      mark it COLD so the compiler can optimize appropriately knowing that
      it's unlikely to be invoked.
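A sketch of that helper, assuming GCC's cold attribute (the struct, enum values and buffer size here are illustrative, not tdb2's real definitions): tdb_logerr() sets ecode first, does the formatting internally, and hands a finished string to a plain const char * callback.

```c
#include <stdarg.h>
#include <stdio.h>

enum TDB_ERROR { TDB_SUCCESS = 0, TDB_ERR_CORRUPT = -1 };

/* Illustrative context: just the error code and the log callback. */
struct tdb_context {
    enum TDB_ERROR ecode;
    void (*log_fn)(struct tdb_context *tdb, const char *message);
};

/* cold: tells the compiler this path is rare, so callers are laid out
 * for the non-logging fast path. */
static void __attribute__((cold))
tdb_logerr(struct tdb_context *tdb, enum TDB_ERROR ecode,
           const char *fmt, ...)
{
    char buf[256];
    va_list ap;

    tdb->ecode = ecode;          /* always set, even with no logger */
    if (!tdb->log_fn)
        return;
    va_start(ap, fmt);
    vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);
    tdb->log_fn(tdb, buf);       /* callback sees a plain string */
}
```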
    • tdb2: remove truncated bit. · 57680260
      Rusty Russell authored
      There was an idea that we would use a bit to indicate that we didn't have
      the full hash value; this would allow us to move records down when we
      expanded a hash without rehashing them.
      
      There's little evidence that rehashing in this case is particularly
      expensive, so remove the bit.  We use that bit simply to indicate that
      an offset refers to a subhash instead.
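With the truncated meaning gone, the bit's one remaining job can be sketched like this (the bit position and names are hypothetical, not tdb2's real constants):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: the top bit of a stored offset now has exactly one
 * meaning - "this offset points at a subhash table, not a record". */
#define SUBHASH_BIT (1ULL << 63)

static bool is_subhash(uint64_t off)
{
    return (off & SUBHASH_BIT) != 0;
}

static uint64_t subhash_offset(uint64_t off)
{
    return off & ~SUBHASH_BIT;   /* strip the flag to get the real offset */
}
```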
    • tdb2: use direct access for tdb_read_off/tdb_write_off · b44978e1
      Rusty Russell authored
      This is one case where getting rid of tdb_get() cost us.  Also, we
      add more read-only checks.
      
      Before we removed tdb_get:
      Adding 1000000 records:  6480 ns (59900296 bytes)
      Finding 1000000 records:  2839 ns (59900296 bytes)
      Missing 1000000 records:  2485 ns (59900296 bytes)
      Traversing 1000000 records:  2598 ns (59900296 bytes)
      Deleting 1000000 records:  5342 ns (59900296 bytes)
      Re-adding 1000000 records:  5613 ns (59900296 bytes)
      Appending 1000000 records:  12194 ns (93594224 bytes)
      Churning 1000000 records:  14549 ns (93594224 bytes)
      
      Now:
      Adding 1000000 records:  6307 ns (59900296 bytes)
      Finding 1000000 records:  2801 ns (59900296 bytes)
      Missing 1000000 records:  2515 ns (59900296 bytes)
      Traversing 1000000 records:  2579 ns (59900296 bytes)
      Deleting 1000000 records:  5225 ns (59900296 bytes)
      Re-adding 1000000 records:  5878 ns (59900296 bytes)
      Appending 1000000 records:  12665 ns (93594224 bytes)
      Churning 1000000 records:  16090 ns (93594224 bytes)
    • tdb2: remove tdb_get() · 96b169e9
      Rusty Russell authored
      We have four internal helpers for reading data from the database:
      1) tdb_read_convert() - read (and convert) into a buffer.
      2) tdb_read_off() - read (and convert) an offset.
      3) tdb_access_read() - malloc or direct access to the database.
      4) tdb_get() - copy into a buffer or direct access to the database.
      
      The last one doesn't really buy us anything, so remove it (except for
      tdb_read_off/tdb_write_off, see next patch).
      
      Before:
      Adding 1000000 records:  6480 ns (59900296 bytes)
      Finding 1000000 records:  2839 ns (59900296 bytes)
      Missing 1000000 records:  2485 ns (59900296 bytes)
      Traversing 1000000 records:  2598 ns (59900296 bytes)
      Deleting 1000000 records:  5342 ns (59900296 bytes)
      Re-adding 1000000 records:  5613 ns (59900296 bytes)
      Appending 1000000 records:  12194 ns (93594224 bytes)
      Churning 1000000 records:  14549 ns (93594224 bytes)
      
      After:
      Adding 1000000 records:  6497 ns (59900296 bytes)
      Finding 1000000 records:  2854 ns (59900296 bytes)
      Missing 1000000 records:  2563 ns (59900296 bytes)
      Traversing 1000000 records:  2735 ns (59900296 bytes)
      Deleting 1000000 records:  11357 ns (59900296 bytes)
      Re-adding 1000000 records:  8145 ns (59900296 bytes)
      Appending 1000000 records:  10939 ns (93594224 bytes)
      Churning 1000000 records:  18479 ns (93594224 bytes)
  5. 23 Nov, 2010 2 commits
  6. 01 Dec, 2010 1 commit
  7. 23 Nov, 2010 6 commits
  8. 22 Nov, 2010 2 commits
    • tdb2: relax locking to allow two free list locks at once · a5b66d70
      Rusty Russell authored
      As long as they are taken in descending order. This avoids the common
      sequence of:
      
      1) Grab lock for bucket.
      2) Remove entry from bucket.
      3) Drop lock for bucket.
      4) Grab lock for bucket for leftover.
      5) Add leftover entry to bucket.
      6) Drop lock for leftover bucket.
      
      In particular it's quite common for the leftover bucket to be the same
      as the entry bucket (when it's the largest bucket); if it's not, we are
      no worse than before.
      
      Current results of speed test:
      $ ./speed 1000000
      Adding 1000000 records:  13194 ns (60083192 bytes)
      Finding 1000000 records:  2438 ns (60083192 bytes)
      Traversing 1000000 records:  2167 ns (60083192 bytes)
      Deleting 1000000 records:  9265 ns (60083192 bytes)
      Re-adding 1000000 records:  10241 ns (60083192 bytes)
      Appending 1000000 records:  17692 ns (93879992 bytes)
      Churning 1000000 records:  26077 ns (93879992 bytes)
      
      Previous:
      $ ./speed 1000000
      Adding 1000000 records:  23210 ns (59193360 bytes)
      Finding 1000000 records:  2387 ns (59193360 bytes)
      Traversing 1000000 records:  2150 ns (59193360 bytes)
      Deleting 1000000 records:  13392 ns (59193360 bytes)
      Re-adding 1000000 records:  11546 ns (59193360 bytes)
      Appending 1000000 records:  29327 ns (91193360 bytes)
      Churning 1000000 records:  33026 ns (91193360 bytes)
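The ordering rule can be sketched as a predicate (illustrative, not tdb2's actual lock API): a second free-table bucket lock may be taken while one is held only if its bucket number is strictly lower. Since every thread then acquires any pair of buckets in the same (descending) order, no deadlock cycle can form, and the remove-entry / reinsert-leftover sequence above needs only one locking round.

```c
#include <stdbool.h>

/* Hypothetical helper: decide whether a second bucket lock may be taken
 * while bucket `held` is already locked.  Descending order only. */
static bool may_lock_second(unsigned held, unsigned want)
{
    /* the same bucket is already covered by the lock we hold,
     * so no second lock is ever needed for it */
    return want < held;
}
```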
    • tdb2: use expansion heuristics from tdb1 · 20defbbc
      Rusty Russell authored
      This reduces the amount of expansion we do.
      
      Before:
      ./speed 1000000
      Adding 1000000 records:  23210 ns (59193360 bytes)
      Finding 1000000 records:  2387 ns (59193360 bytes)
      Traversing 1000000 records:  2150 ns (59193360 bytes)
      Deleting 1000000 records:  13392 ns (59193360 bytes)
      Re-adding 1000000 records:  11546 ns (59193360 bytes)
      Appending 1000000 records:  29327 ns (91193360 bytes)
      Churning 1000000 records:  33026 ns (91193360 bytes)
      
      After:
      $ ./speed 1000000
      Adding 1000000 records:  17480 ns (61472904 bytes)
      Finding 1000000 records:  2431 ns (61472904 bytes)
      Traversing 1000000 records:  2194 ns (61472904 bytes)
      Deleting 1000000 records:  10948 ns (61472904 bytes)
      Re-adding 1000000 records:  11247 ns (61472904 bytes)
      Appending 1000000 records:  21826 ns (96051424 bytes)
      Churning 1000000 records:  27242 ns (96051424 bytes)
  9. 01 Dec, 2010 1 commit
    • tdb2: shrink free header from 32 to 24 bytes. · 5e30abc6
      Rusty Russell authored
      This reduces our minimum key+data length to 8 bytes; we do this by packing
      the prev pointer where we used to put the flist pointer, and storing the
      flist as an 8 bit index (meaning we can only have 256 free tables).
      
      Note that this has a perverse effect on the size of the database, as our
      4-byte key and 4-byte data now fit perfectly in a minimal record, so
      appending causes us to allocate new records which are 50% larger,
      since we detect that the record is growing.
      
      Current results of speed test:
      $ ./speed 1000000
      Adding 1000000 records:  23210 ns (59193360 bytes)
      Finding 1000000 records:  2387 ns (59193360 bytes)
      Traversing 1000000 records:  2150 ns (59193360 bytes)
      Deleting 1000000 records:  13392 ns (59193360 bytes)
      Re-adding 1000000 records:  11546 ns (59193360 bytes)
      Appending 1000000 records:  29327 ns (91193360 bytes)
      Churning 1000000 records:  33026 ns (91193360 bytes)
      
      Previous:
      $ ./speed 1000000
      Adding 1000000 records:  28324 ns (67232528 bytes)
      Finding 1000000 records:  2468 ns (67232528 bytes)
      Traversing 1000000 records:  2200 ns (67232528 bytes)
      Deleting 1000000 records:  13083 ns (67232528 bytes)
      Re-adding 1000000 records:  16433 ns (67232528 bytes)
      Appending 1000000 records:  2511 ns (67232528 bytes)
      Churning 1000000 records:  31068 ns (67570448 bytes)
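A sketch of the 24-byte header under those constraints (field names and bit assignment are hypothetical, not the real on-disk layout): the separate flist pointer is gone; an 8-bit table index is packed into the word holding the prev offset, which is exactly why the number of free tables is capped at 256.

```c
#include <stdint.h>

/* Hypothetical packing: the low 8 bits of the first word hold the
 * free-table index; the remaining bits hold magic/prev information. */
struct free_record {
    uint64_t magic_and_prev;  /* magic + prev offset + 8-bit ftable index */
    uint64_t len;             /* size of this free region */
    uint64_t next;            /* next free record in the same bucket */
};

#define FTABLE_MASK 0xffULL

static unsigned ftable_of(const struct free_record *r)
{
    return (unsigned)(r->magic_and_prev & FTABLE_MASK);
}
```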
  10. 23 Nov, 2010 1 commit
  11. 01 Dec, 2010 4 commits
    • tdb2: Add speed test to tdb and tdb2 · 076c398e
      Rusty Russell authored
      Current results of speed test:
      $ ./speed 1000000
      Adding 1000000 records:  14726 ns (67244816 bytes)
      Finding 1000000 records:  2844 ns (67244816 bytes)
      Missing 1000000 records:  2528 ns (67244816 bytes)
      Traversing 1000000 records:  2572 ns (67244816 bytes)
      Deleting 1000000 records:  5358 ns (67244816 bytes)
      Re-adding 1000000 records:  9176 ns (67244816 bytes)
      Appending 1000000 records:  3035 ns (67244816 bytes)
      Churning 1000000 records:  18139 ns (67565840 bytes)
      $ ./speed 100000
      Adding 100000 records:  13270 ns (14349584 bytes)
      Finding 100000 records:  2769 ns (14349584 bytes)
      Missing 100000 records:  2422 ns (14349584 bytes)
      Traversing 100000 records:  2595 ns (14349584 bytes)
      Deleting 100000 records:  5331 ns (14349584 bytes)
      Re-adding 100000 records:  5875 ns (14349584 bytes)
      Appending 100000 records:  2751 ns (14349584 bytes)
      Churning 100000 records:  20666 ns (25771280 bytes)
      
      vs tdb1 (with hashsize 100003):
      $ ./speed 1000000
      Adding 1000000 records:  8547 ns (44306432 bytes)
      Finding 1000000 records:  5595 ns (44306432 bytes)
      Missing 1000000 records:  3469 ns (44306432 bytes)
      Traversing 1000000 records:  4571 ns (44306432 bytes)
      Deleting 1000000 records:  12115 ns (44306432 bytes)
      Re-adding 1000000 records:  10505 ns (44306432 bytes)
      Appending 1000000 records:  10610 ns (44306432 bytes)
      Churning 1000000 records:  28697 ns (44306432 bytes)
      $ ./speed 100000
      Adding 100000 records:  6030 ns (4751360 bytes)
      Finding 100000 records:  3141 ns (4751360 bytes)
      Missing 100000 records:  3143 ns (4751360 bytes)
      Traversing 100000 records:  4659 ns (4751360 bytes)
      Deleting 100000 records:  7891 ns (4751360 bytes)
      Re-adding 100000 records:  5913 ns (4751360 bytes)
      Appending 100000 records:  4242 ns (4751360 bytes)
      Churning 100000 records:  15300 ns (4751360 bytes)
    • tdb2: fix tdb_check() return when free table entries missing. · dbbde019
      Rusty Russell authored
      It mistakenly returned -1 meaning "success".
    • c84c6577
    • tdb2: make summary command handle recovery "dead zone" · ec88af5d
      Rusty Russell authored
      We can run summary with a recovery area, or a dead zone.