- 01 Dec, 2010 7 commits
-
-
Rusty Russell authored
We have to unlock during coalescing, so we mark records specially to indicate to tdb_check that they're not on any list, and to prevent other coalescers from grabbing them. Use a special free list number, rather than a new magic.
-
Rusty Russell authored
We already have 10 hash bits encoded in the offset itself; we only get here incorrectly about 1 time in 1000, so it's a pretty minor optimization at best. Nonetheless, we have the information, so let's check it before accessing the key. This reduces the probability of a false keycmp by another factor of 2000.
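The check described above can be sketched as follows. This is a minimal illustration, not the tdb2 code: the struct layout, the names `sketch_rec`, `extra_bits()` and `rec_matches()`, and the choice of 11 stored bits (picked only because 2^11 = 2048 is close to the "factor of 2000" quoted) are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: number of hash bits kept in each record header. */
#define EXTRA_HASH_BITS 11

/* Hypothetical in-memory view of a record; not the tdb2 on-disk layout. */
struct sketch_rec {
	uint16_t hash_bits;	/* low EXTRA_HASH_BITS of the key's hash */
	const void *key;
	size_t klen;
};

/* Extract the bits of the hash that the header remembers. */
static uint16_t extra_bits(uint64_t h)
{
	return (uint16_t)(h & ((UINT64_C(1) << EXTRA_HASH_BITS) - 1));
}

/* Reject on the cheap header comparison before touching the key bytes. */
static bool rec_matches(const struct sketch_rec *r, uint64_t h,
			const void *key, size_t klen)
{
	assert(r != NULL);
	if (r->hash_bits != extra_bits(h))
		return false;	/* no key access needed */
	if (r->klen != klen)
		return false;
	return memcmp(r->key, key, klen) == 0;
}
```

The key comparison is only reached when the stored bits agree, so a mismatch in the header avoids reading the key at all.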
-
Rusty Russell authored
-
Rusty Russell authored
We should always set tdb->ecode before calling the log function, and there's little reason to have a sprintf-style logging function since we can do the formatting internally. Change the tdb_log attribute to just take a "const char *", and create a new tdb_logerr() helper which sets ecode and calls it. As a bonus, mark it COLD so the compiler can optimize appropriately, knowing that it's unlikely to be invoked.
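A rough sketch of the helper pattern described above, under stated assumptions: `struct sketch_ctx`, `sketch_logerr()` and `remember_log()` are hypothetical names, and the real tdb_logerr() signature may differ. The COLD macro here expands to GCC's `cold` function attribute where available.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Assumption: COLD maps onto the GCC/Clang "cold" attribute. */
#if defined(__GNUC__)
#define COLD __attribute__((cold))
#else
#define COLD
#endif

/* Hypothetical context; the log callback only takes a finished string. */
struct sketch_ctx {
	int ecode;
	void (*log)(struct sketch_ctx *ctx, const char *msg);
	char last[128];
};

/* Set the error code first, then format internally and hand over a string. */
static void COLD sketch_logerr(struct sketch_ctx *ctx, int ecode,
			       const char *fmt, ...)
{
	char buf[128];
	va_list ap;

	assert(ctx != NULL && fmt != NULL);
	ctx->ecode = ecode;
	if (!ctx->log)
		return;

	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	ctx->log(ctx, buf);
}

/* Example callback: remember the most recent message. */
static void remember_log(struct sketch_ctx *ctx, const char *msg)
{
	snprintf(ctx->last, sizeof(ctx->last), "%s", msg);
}
```

Because the helper sets the error code itself, a caller cannot forget to set it before logging.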
-
Rusty Russell authored
There was an idea that we would use a bit to indicate that we didn't have the full hash value; this would allow us to move records down when we expanded a hash without rehashing them. There's little evidence that rehashing in this case is particularly expensive, so remove the bit. We use that bit simply to indicate that an offset refers to a subhash instead.
-
Rusty Russell authored
This is one case where getting rid of tdb_get() cost us. Also, we add more read-only checks.

Before we removed tdb_get:
Adding 1000000 records: 6480 ns (59900296 bytes)
Finding 1000000 records: 2839 ns (59900296 bytes)
Missing 1000000 records: 2485 ns (59900296 bytes)
Traversing 1000000 records: 2598 ns (59900296 bytes)
Deleting 1000000 records: 5342 ns (59900296 bytes)
Re-adding 1000000 records: 5613 ns (59900296 bytes)
Appending 1000000 records: 12194 ns (93594224 bytes)
Churning 1000000 records: 14549 ns (93594224 bytes)

Now:
Adding 1000000 records: 6307 ns (59900296 bytes)
Finding 1000000 records: 2801 ns (59900296 bytes)
Missing 1000000 records: 2515 ns (59900296 bytes)
Traversing 1000000 records: 2579 ns (59900296 bytes)
Deleting 1000000 records: 5225 ns (59900296 bytes)
Re-adding 1000000 records: 5878 ns (59900296 bytes)
Appending 1000000 records: 12665 ns (93594224 bytes)
Churning 1000000 records: 16090 ns (93594224 bytes)
-
Rusty Russell authored
We have four internal helpers for reading data from the database:
1) tdb_read_convert() - read (and convert) into a buffer.
2) tdb_read_off() - read (and convert) an offset.
3) tdb_access_read() - malloc or direct access to the database.
4) tdb_get() - copy into a buffer or direct access to the database.

The last one doesn't really buy us anything, so remove it (except for tdb_read_off/tdb_write_off, see next patch).

Before:
Adding 1000000 records: 6480 ns (59900296 bytes)
Finding 1000000 records: 2839 ns (59900296 bytes)
Missing 1000000 records: 2485 ns (59900296 bytes)
Traversing 1000000 records: 2598 ns (59900296 bytes)
Deleting 1000000 records: 5342 ns (59900296 bytes)
Re-adding 1000000 records: 5613 ns (59900296 bytes)
Appending 1000000 records: 12194 ns (93594224 bytes)
Churning 1000000 records: 14549 ns (93594224 bytes)

After:
Adding 1000000 records: 6497 ns (59900296 bytes)
Finding 1000000 records: 2854 ns (59900296 bytes)
Missing 1000000 records: 2563 ns (59900296 bytes)
Traversing 1000000 records: 2735 ns (59900296 bytes)
Deleting 1000000 records: 11357 ns (59900296 bytes)
Re-adding 1000000 records: 8145 ns (59900296 bytes)
Appending 1000000 records: 10939 ns (93594224 bytes)
Churning 1000000 records: 18479 ns (93594224 bytes)
-
- 23 Nov, 2010 1 commit
-
-
Rusty Russell authored
We currently only have one, so shortcut the case where we want our current one.
-
- 01 Dec, 2010 1 commit
-
-
Rusty Russell authored
This is good for deep debugging.
-
- 22 Nov, 2010 2 commits
-
-
Rusty Russell authored
As long as they are in descending order. This prevents the common case of:
1) Grab lock for bucket.
2) Remove entry from bucket.
3) Drop lock for bucket.
4) Grab lock for bucket for leftover.
5) Add leftover entry to bucket.
6) Drop lock for leftover bucket.

In particular it's quite common for the leftover bucket to be the same as the entry bucket (when it's the largest bucket); if it's not, we are no worse than before.

Current results of speed test:
$ ./speed 1000000
Adding 1000000 records: 13194 ns (60083192 bytes)
Finding 1000000 records: 2438 ns (60083192 bytes)
Traversing 1000000 records: 2167 ns (60083192 bytes)
Deleting 1000000 records: 9265 ns (60083192 bytes)
Re-adding 1000000 records: 10241 ns (60083192 bytes)
Appending 1000000 records: 17692 ns (93879992 bytes)
Churning 1000000 records: 26077 ns (93879992 bytes)

Previous:
$ ./speed 1000000
Adding 1000000 records: 23210 ns (59193360 bytes)
Finding 1000000 records: 2387 ns (59193360 bytes)
Traversing 1000000 records: 2150 ns (59193360 bytes)
Deleting 1000000 records: 13392 ns (59193360 bytes)
Re-adding 1000000 records: 11546 ns (59193360 bytes)
Appending 1000000 records: 29327 ns (91193360 bytes)
Churning 1000000 records: 33026 ns (91193360 bytes)
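The ordering rule above might be modelled like this. It is only a bookkeeping sketch: `struct held_locks` and `lock_bucket()` are hypothetical names, no real file locking is done, and the assumption is that a new bucket lock is allowed only when its index is lower than the most recently taken one, while re-requesting a bucket that is already held costs nothing.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_HELD 4	/* hypothetical limit for the sketch */

struct held_locks {
	unsigned bucket[MAX_HELD];	/* held buckets, in acquisition order */
	int num;
	int acquisitions;		/* how many real lock operations we did */
};

static bool holds(const struct held_locks *h, unsigned b)
{
	for (int i = 0; i < h->num; i++)
		if (h->bucket[i] == b)
			return true;
	return false;
}

/* Returns false if taking the lock would break the descending order. */
static bool lock_bucket(struct held_locks *h, unsigned b)
{
	assert(h->num <= MAX_HELD);
	if (holds(h, b))
		return true;		/* same bucket: no unlock/relock cycle */
	if (h->num == MAX_HELD)
		return false;
	if (h->num > 0 && b > h->bucket[h->num - 1])
		return false;		/* would not be descending */
	h->bucket[h->num++] = b;
	h->acquisitions++;
	return true;
}
```

When the leftover lands in the bucket the entry came from, the second request is satisfied by the lock already held, which is the saving the six-step list describes.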
-
Rusty Russell authored
This reduces the amount of expansion we do.

Before:
./speed 1000000
Adding 1000000 records: 23210 ns (59193360 bytes)
Finding 1000000 records: 2387 ns (59193360 bytes)
Traversing 1000000 records: 2150 ns (59193360 bytes)
Deleting 1000000 records: 13392 ns (59193360 bytes)
Re-adding 1000000 records: 11546 ns (59193360 bytes)
Appending 1000000 records: 29327 ns (91193360 bytes)
Churning 1000000 records: 33026 ns (91193360 bytes)

After:
$ ./speed 1000000
Adding 1000000 records: 17480 ns (61472904 bytes)
Finding 1000000 records: 2431 ns (61472904 bytes)
Traversing 1000000 records: 2194 ns (61472904 bytes)
Deleting 1000000 records: 10948 ns (61472904 bytes)
Re-adding 1000000 records: 11247 ns (61472904 bytes)
Appending 1000000 records: 21826 ns (96051424 bytes)
Churning 1000000 records: 27242 ns (96051424 bytes)
-
- 01 Dec, 2010 1 commit
-
-
Rusty Russell authored
This reduces our minimum key+data length to 8 bytes; we do this by packing the prev pointer where we used to put the flist pointer, and storing the flist as an 8 bit index (meaning we can only have 256 free tables).

Note that this has a perverse result on the size of the database, as our 4-byte key and 4-byte data now fit perfectly in a minimal record, so appending causes us to allocate new records which are 50% larger, since we detect growing.

Current results of speed test:
$ ./speed 1000000
Adding 1000000 records: 23210 ns (59193360 bytes)
Finding 1000000 records: 2387 ns (59193360 bytes)
Traversing 1000000 records: 2150 ns (59193360 bytes)
Deleting 1000000 records: 13392 ns (59193360 bytes)
Re-adding 1000000 records: 11546 ns (59193360 bytes)
Appending 1000000 records: 29327 ns (91193360 bytes)
Churning 1000000 records: 33026 ns (91193360 bytes)

Previous:
$ ./speed 1000000
Adding 1000000 records: 28324 ns (67232528 bytes)
Finding 1000000 records: 2468 ns (67232528 bytes)
Traversing 1000000 records: 2200 ns (67232528 bytes)
Deleting 1000000 records: 13083 ns (67232528 bytes)
Re-adding 1000000 records: 16433 ns (67232528 bytes)
Appending 1000000 records: 2511 ns (67232528 bytes)
Churning 1000000 records: 31068 ns (67570448 bytes)
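To illustrate the 8 bit index, here is a minimal sketch of packing a free-table number into the top byte of a 64-bit word alongside a 56-bit length. Which field the index actually shares a word with, and the names `pack_flist_len()`, `unpack_flist()` and `unpack_len()`, are assumptions made for the example rather than the tdb2 layout.

```c
#include <assert.h>
#include <stdint.h>

#define FLIST_BITS 8				/* 8 bit index: at most 256 free tables */
#define LEN_BITS   (64 - FLIST_BITS)
#define LEN_MASK   ((UINT64_C(1) << LEN_BITS) - 1)

/* Combine a small free-table index and a length into one word. */
static uint64_t pack_flist_len(unsigned flist, uint64_t len)
{
	assert(flist < (1u << FLIST_BITS));
	assert(len <= LEN_MASK);
	return ((uint64_t)flist << LEN_BITS) | len;
}

/* Recover the free-table index from the top byte. */
static unsigned unpack_flist(uint64_t word)
{
	return (unsigned)(word >> LEN_BITS);
}

/* Recover the length from the low 56 bits. */
static uint64_t unpack_len(uint64_t word)
{
	return word & LEN_MASK;
}
```

Storing an index instead of a full 8-byte pointer is what frees a whole word in the free record for the prev pointer.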
-
- 23 Nov, 2010 1 commit
-
-
Rusty Russell authored
-
- 01 Dec, 2010 5 commits
-
-
Rusty Russell authored
Current results of speed test:
$ ./speed 1000000
Adding 1000000 records: 14726 ns (67244816 bytes)
Finding 1000000 records: 2844 ns (67244816 bytes)
Missing 1000000 records: 2528 ns (67244816 bytes)
Traversing 1000000 records: 2572 ns (67244816 bytes)
Deleting 1000000 records: 5358 ns (67244816 bytes)
Re-adding 1000000 records: 9176 ns (67244816 bytes)
Appending 1000000 records: 3035 ns (67244816 bytes)
Churning 1000000 records: 18139 ns (67565840 bytes)
$ ./speed 100000
Adding 100000 records: 13270 ns (14349584 bytes)
Finding 100000 records: 2769 ns (14349584 bytes)
Missing 100000 records: 2422 ns (14349584 bytes)
Traversing 100000 records: 2595 ns (14349584 bytes)
Deleting 100000 records: 5331 ns (14349584 bytes)
Re-adding 100000 records: 5875 ns (14349584 bytes)
Appending 100000 records: 2751 ns (14349584 bytes)
Churning 100000 records: 20666 ns (25771280 bytes)

vs tdb1 (with hashsize 100003):
$ ./speed 1000000
Adding 1000000 records: 8547 ns (44306432 bytes)
Finding 1000000 records: 5595 ns (44306432 bytes)
Missing 1000000 records: 3469 ns (44306432 bytes)
Traversing 1000000 records: 4571 ns (44306432 bytes)
Deleting 1000000 records: 12115 ns (44306432 bytes)
Re-adding 1000000 records: 10505 ns (44306432 bytes)
Appending 1000000 records: 10610 ns (44306432 bytes)
Churning 1000000 records: 28697 ns (44306432 bytes)
$ ./speed 100000
Adding 100000 records: 6030 ns (4751360 bytes)
Finding 100000 records: 3141 ns (4751360 bytes)
Missing 100000 records: 3143 ns (4751360 bytes)
Traversing 100000 records: 4659 ns (4751360 bytes)
Deleting 100000 records: 7891 ns (4751360 bytes)
Re-adding 100000 records: 5913 ns (4751360 bytes)
Appending 100000 records: 4242 ns (4751360 bytes)
Churning 100000 records: 15300 ns (4751360 bytes)
-
Rusty Russell authored
It mistakenly returned -1 meaning "success".
-
Rusty Russell authored
-
Rusty Russell authored
We can run summary with a recovery area, or a dead zone.
-
Rusty Russell authored
Otherwise we leak memory.
-
- 23 Nov, 2010 6 commits
-
-
Rusty Russell authored
-
Rusty Russell authored
It's problematic for transaction commit to get the expansion lock, but in fact we always grab a hash lock before the transaction lock, so it doesn't really need it (the transaction locks the entire database). Assert that this is true, and fix up a few lowlevel tests where it wasn't.
-
Rusty Russell authored
I left much tdb1 code in various files for inspiration, and in case I needed it later. Now we have all the major features implemented, remove it.
-
Rusty Russell authored
This adds transactions to tdb2; the code is taken from tdb1 with minimal modifications, as are the unit
-
Rusty Russell authored
If we have a write lock and ask for a read lock, that's OK, but not the other way around. tdb_nest_lock() allowed both, tdb_allrecord_lock() allowed neither.
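The rule stated above reduces to a small predicate. This is a sketch only: `enum sketch_lock` and `nest_ok()` are hypothetical names, and the real code presumably works on fcntl lock types rather than this enum.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock kinds for the sketch. */
enum sketch_lock { SKETCH_READ, SKETCH_WRITE };

/*
 * May a request for `wanted` be nested inside a lock we already hold
 * as `held`?  A write lock covers a read request; a read lock does
 * not cover a write request.
 */
static bool nest_ok(enum sketch_lock held, enum sketch_lock wanted)
{
	return held == SKETCH_WRITE || wanted == SKETCH_READ;
}
```

Under this predicate, the old tdb_nest_lock() behaviour corresponds to always returning true, and the old tdb_allrecord_lock() behaviour to rejecting any mismatch.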
-
Rusty Russell authored
This wasn't fixed when we converted to ccan/opt in 8d706678. Unfortunately, unistd.h defines optarg, so the compiler didn't catch this.
-
- 17 Nov, 2010 5 commits
-
-
Rusty Russell authored
This adds chains of free tables: we choose one at random at the start and we iterate through them when they are exhausted. Currently there is no code to actually add a new free table, but we test that we can handle it if we add one in future.
-
Rusty Russell authored
Zones were a bad idea. They mean we can't simply add stuff to the end of the file (which transactions relied upon), and there's no good heuristic for how to size them. This patch reverts us to a single free table.
-
Rusty Russell authored
We were previously jumping straight from the first bucket to the end.
-
Rusty Russell authored
We were adding 50% to datalen twice, so move it out of adjust_size and make the callers do it. We also add a test that the heuristic is working at all.
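A sketch of the split described above, assuming a hypothetical minimum length and 8-byte rounding: the sizing helper no longer adds any growth padding, and a caller that expects the record to grow adds the 50% exactly once. `SKETCH_MIN_DATA_LEN` and `size_for_growing()` are illustrative names, and this `adjust_size()` is a stand-in for the real one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical minimum record payload for the sketch. */
#define SKETCH_MIN_DATA_LEN 8

/* Pure sizing: enforce the minimum and round up to an 8-byte boundary. */
static uint64_t adjust_size(uint64_t keylen, uint64_t datalen)
{
	uint64_t size = keylen + datalen;

	if (size < SKETCH_MIN_DATA_LEN)
		size = SKETCH_MIN_DATA_LEN;
	return (size + 7) & ~UINT64_C(7);
}

/* The caller applies the growth heuristic once, before sizing. */
static uint64_t size_for_growing(uint64_t keylen, uint64_t datalen)
{
	return adjust_size(keylen, datalen + datalen / 2);
}
```

With a 4-byte key and 100 bytes of data, `adjust_size()` asks for 104 bytes, and the growing variant asks for 160 (154 rounded up), rather than compounding the padding.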
-
Rusty Russell authored
We don't actually need it.
-
- 15 Nov, 2010 9 commits
-
-
Rusty Russell authored
We can't enlarge the lock without risking deadlock, so tdb_chainlock() can't simply grab a one-byte lock; it needs to grab the lock we'd need to protect the hash. In theory, tdb_chainlock_read() could use a one-byte lock though.
-
Rusty Russell authored
When we're coalescing, we need to drop the lock on the current free list, as we've enlarged the block and it may now belong in a different list. Unfortunately (as shown by repeated tdbtorture -n 8) another coalescing run can do the coalescing while we've dropped the lock. So for this case, we use the TDB_COALESCING_MAGIC flag so it doesn't look free.
-
Rusty Russell authored
A new special record marker to indicate coalescing is in progress.
-
Rusty Russell authored
When we find a free block, we need to mark it as used before we drop the free lock, even though we've removed it from the list. Otherwise the coalescing code can find it. This means handing the information we need down to lock_and_alloc, which also means we know when we're grabbing a "growing" entry, and can relax the heuristics about finding a good-sized block in that case.
-
Rusty Russell authored
When coalescing, we check the adjacent entry then lock its free list: we need to *recheck* after locking, to make sure it's still in that free list.
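The recheck pattern can be sketched as below. Everything here is illustrative: `struct free_rec`, `lock_free_neighbour()` and the callback type are hypothetical, and the two small lock functions exist only to simulate a quiet lock and one where another process moved the record while we were waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the adjacent record. */
struct free_rec {
	bool is_free;
	unsigned list;		/* which free list it currently sits in */
};

typedef void (*list_lock_fn)(unsigned list, void *arg);

/* Peek, lock the list we saw, then verify the peek still holds. */
static bool lock_free_neighbour(struct free_rec *r, list_lock_fn lock,
				list_lock_fn unlock, void *arg)
{
	unsigned list;

	assert(r != NULL && lock != NULL && unlock != NULL);
	if (!r->is_free)
		return false;
	list = r->list;			/* unlocked read: may go stale */
	lock(list, arg);
	if (!r->is_free || r->list != list) {
		unlock(list, arg);	/* it moved or was taken: back off */
		return false;
	}
	return true;			/* still free, and we hold its list */
}

/* Simulated lock where nothing changes underneath us. */
static void quiet_lock(unsigned list, void *arg)
{
	(void)list;
	(void)arg;
}

/* Simulated lock where the record changes list while we wait. */
static void racing_lock(unsigned list, void *arg)
{
	struct free_rec *r = arg;

	(void)list;
	r->list++;
}
```

The second check is the point: the first read happens without the lock, so it can only be treated as a hint.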
-
Rusty Russell authored
We actually only need the bottom 5 bits of the hash value, so don't waste 8 bytes passing it.
-
Rusty Russell authored
We're going to want it in get_free() in the next patch, so move it upwards. Trivial changes, too: add to size before min length check, and rename growing to want_extra.
-
Rusty Russell authored
-
Rusty Russell authored
-
- 17 Nov, 2010 2 commits
-
-
Rusty Russell authored
This replaces the previous Fails: section with a more general set of lines of the form: <testname> <option>... The special <option> "FAIL" means we know we fail this test. We accept options to valgrind-tests; in particular tdb2 wants --partial-loads-ok=yes passed to valgrind.
-
Rusty Russell authored
I wanted to see what happened with tdb2's valgrind test (suppressed in the _info file).
-