- 21 Apr, 2011 1 commit
-
Rusty Russell authored
We don't have tailers in tdb2, so it's just 8 bytes of confusing wastage.
-
- 27 Apr, 2011 4 commits
-
Rusty Russell authored
Instead of walking the entire free list, walk 8 entries, or more if we are successful: the reward is scaled by the size coalesced. We also move previously-examined records to the end of the list. This reduces file size with very little speed penalty.

Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real    1m17.022s
user    0m27.206s
sys     0m3.920s
-rw------- 1 rusty rusty 570130576 2011-04-27 21:17 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK
real    1m27.355s
user    0m0.296s
sys     0m0.516s
-rw------- 1 rusty rusty 617352 2011-04-27 21:18 torture.tdb
Adding 2000000 records: 890 ns (110556088 bytes)
Finding 2000000 records: 565 ns (110556088 bytes)
Missing 2000000 records: 390 ns (110556088 bytes)
Traversing 2000000 records: 410 ns (110556088 bytes)
Deleting 2000000 records: 8623 ns (244003768 bytes)
Re-adding 2000000 records: 7089 ns (244003768 bytes)
Appending 2000000 records: 33708 ns (244003768 bytes)
Churning 2000000 records: 2029 ns (268404160 bytes)

After:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real    1m7.096s
user    0m15.637s
sys     0m3.812s
-rw------- 1 rusty rusty 561270928 2011-04-27 21:22 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK
real    1m13.850s
user    0m0.268s
sys     0m0.492s
-rw------- 1 rusty rusty 429768 2011-04-27 21:23 torture.tdb
Adding 2000000 records: 892 ns (110556088 bytes)
Finding 2000000 records: 570 ns (110556088 bytes)
Missing 2000000 records: 390 ns (110556088 bytes)
Traversing 2000000 records: 407 ns (110556088 bytes)
Deleting 2000000 records: 706 ns (244003768 bytes)
Re-adding 2000000 records: 822 ns (244003768 bytes)
Appending 2000000 records: 1262 ns (268404160 bytes)
Churning 2000000 records: 2320 ns (268404160 bytes)
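
A minimal sketch of the bounded walk, as a toy model rather than the real tdb2 code: it assumes a singly-linked free list that happens to be ordered by file offset, and uses 64 bytes as a stand-in for the minimum record size.

    #include <stddef.h>

    struct fblock { size_t off, len; struct fblock *next; };

    /* Examine up to 8 entries; a successful merge extends the budget in
     * proportion to the bytes recovered, so big wins buy a longer walk. */
    static size_t coalesce_walk(struct fblock *list)
    {
            unsigned int budget = 8;
            size_t gained = 0;
            struct fblock *b = list;

            while (b && budget--) {
                    struct fblock *n = b->next;
                    if (n && b->off + b->len == n->off) {
                            /* physically adjacent: merge n into b */
                            b->len += n->len;
                            b->next = n->next;
                            gained += n->len;
                            budget += n->len / 64; /* reward scaled by size */
                    } else {
                            /* unmergeable for now; the real code also moves
                             * it to the tail so it isn't re-examined soon */
                            b = b->next;
                    }
            }
            return gained;
    }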
-
Rusty Russell authored
This simply uses a 7-bit counter which gets incremented on each addition to the list (but not decremented on removals). When it wraps, we walk the entire list looking for things to coalesce. This causes performance problems, especially when appending records, so we limit it in the next patch:

Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real    0m59.687s
user    0m11.593s
sys     0m4.100s
-rw------- 1 rusty rusty 752004064 2011-04-27 21:14 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK
real    1m17.738s
user    0m0.348s
sys     0m0.580s
-rw------- 1 rusty rusty 663360 2011-04-27 21:15 torture.tdb
Adding 2000000 records: 926 ns (110556088 bytes)
Finding 2000000 records: 592 ns (110556088 bytes)
Missing 2000000 records: 416 ns (110556088 bytes)
Traversing 2000000 records: 422 ns (110556088 bytes)
Deleting 2000000 records: 741 ns (244003768 bytes)
Re-adding 2000000 records: 799 ns (244003768 bytes)
Appending 2000000 records: 1147 ns (295244592 bytes)
Churning 2000000 records: 1827 ns (568411440 bytes)

After:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real    1m17.022s
user    0m27.206s
sys     0m3.920s
-rw------- 1 rusty rusty 570130576 2011-04-27 21:17 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK
real    1m27.355s
user    0m0.296s
sys     0m0.516s
-rw------- 1 rusty rusty 617352 2011-04-27 21:18 torture.tdb
Adding 2000000 records: 890 ns (110556088 bytes)
Finding 2000000 records: 565 ns (110556088 bytes)
Missing 2000000 records: 390 ns (110556088 bytes)
Traversing 2000000 records: 410 ns (110556088 bytes)
Deleting 2000000 records: 8623 ns (244003768 bytes)
Re-adding 2000000 records: 7089 ns (244003768 bytes)
Appending 2000000 records: 33708 ns (244003768 bytes)
Churning 2000000 records: 2029 ns (268404160 bytes)
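
A sketch of the trigger, with illustrative names (the real tdb2 structure differs):

    #include <stdint.h>
    #include <stdbool.h>

    struct flist_head { uint8_t adds; /* ...plus the list itself */ };

    /* Bump a 7-bit counter on every addition (never on removal); when
     * it wraps to zero, the caller does a full coalescing pass. */
    static bool flist_add_noted(struct flist_head *fl)
    {
            fl->adds = (fl->adds + 1) & 0x7f;
            return fl->adds == 0; /* true roughly once per 128 additions */
    }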
-
Rusty Russell authored
I noticed a counter-intuitive phenomenon as I tweaked the coalescing code: the more coalescing we did, the larger the tdb grew! This was measured using "growtdb-bench 250000 10". The cause: more coalescing means larger transactions, and every time we do a larger transaction, we need to allocate a larger recovery area. The only way to do this is to append to the file, so the file keeps growing, even though it's mainly unused!

Overallocating by 25% seems reasonable, and gives better results in such benchmarks. The real fix is to reduce the transaction to a run-length based format rather than the naive block system used now.

Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real    0m57.403s
user    0m11.361s
sys     0m4.056s
-rw------- 1 rusty rusty 689536976 2011-04-27 21:10 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK
real    1m24.901s
user    0m0.380s
sys     0m0.512s
-rw------- 1 rusty rusty 655368 2011-04-27 21:12 torture.tdb
Adding 2000000 records: 941 ns (110551992 bytes)
Finding 2000000 records: 603 ns (110551992 bytes)
Missing 2000000 records: 428 ns (110551992 bytes)
Traversing 2000000 records: 416 ns (110551992 bytes)
Deleting 2000000 records: 741 ns (199517112 bytes)
Re-adding 2000000 records: 819 ns (199517112 bytes)
Appending 2000000 records: 1228 ns (376542552 bytes)
Churning 2000000 records: 2042 ns (553641304 bytes)

After:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real    0m59.687s
user    0m11.593s
sys     0m4.100s
-rw------- 1 rusty rusty 752004064 2011-04-27 21:14 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK
real    1m17.738s
user    0m0.348s
sys     0m0.580s
-rw------- 1 rusty rusty 663360 2011-04-27 21:15 torture.tdb
Adding 2000000 records: 926 ns (110556088 bytes)
Finding 2000000 records: 592 ns (110556088 bytes)
Missing 2000000 records: 416 ns (110556088 bytes)
Traversing 2000000 records: 422 ns (110556088 bytes)
Deleting 2000000 records: 741 ns (244003768 bytes)
Re-adding 2000000 records: 799 ns (244003768 bytes)
Appending 2000000 records: 1147 ns (295244592 bytes)
Churning 2000000 records: 1827 ns (568411440 bytes)
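
The headroom rule itself fits in one line; a sketch, with only the 25% figure taken from the commit (the function name is invented):

    #include <stddef.h>

    /* Keep the existing recovery area if it still fits; otherwise size
     * the new one 25% larger than needed, so a slightly bigger next
     * transaction doesn't force yet another append to the file. */
    static size_t new_recovery_size(size_t needed, size_t current)
    {
            return needed <= current ? current : needed + needed / 4;
    }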
-
Rusty Russell authored
We currently start walking the free list again when we coalesce any record; this is overzealous, as we only care about the next record being blatted, or the record we currently consider "best". We can also opportunistically try to add the coalesced record into the new free list: if it fails, we go back to the old "mark record, unlock, re-lock" code.

Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real    1m0.243s
user    0m13.677s
sys     0m4.336s
-rw------- 1 rusty rusty 683302864 2011-04-27 21:03 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK
real    1m24.074s
user    0m0.344s
sys     0m0.468s
-rw------- 1 rusty rusty 836040 2011-04-27 21:04 torture.tdb
Adding 2000000 records: 1015 ns (110551992 bytes)
Finding 2000000 records: 641 ns (110551992 bytes)
Missing 2000000 records: 445 ns (110551992 bytes)
Traversing 2000000 records: 439 ns (110551992 bytes)
Deleting 2000000 records: 807 ns (199517112 bytes)
Re-adding 2000000 records: 851 ns (199517112 bytes)
Appending 2000000 records: 1301 ns (376542552 bytes)
Churning 2000000 records: 2423 ns (553641304 bytes)

After:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real    0m57.403s
user    0m11.361s
sys     0m4.056s
-rw------- 1 rusty rusty 689536976 2011-04-27 21:10 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK
real    1m24.901s
user    0m0.380s
sys     0m0.512s
-rw------- 1 rusty rusty 655368 2011-04-27 21:12 torture.tdb
Adding 2000000 records: 941 ns (110551992 bytes)
Finding 2000000 records: 603 ns (110551992 bytes)
Missing 2000000 records: 428 ns (110551992 bytes)
Traversing 2000000 records: 416 ns (110551992 bytes)
Deleting 2000000 records: 741 ns (199517112 bytes)
Re-adding 2000000 records: 819 ns (199517112 bytes)
Appending 2000000 records: 1228 ns (376542552 bytes)
Churning 2000000 records: 2042 ns (553641304 bytes)
-
- 25 Mar, 2011 1 commit
-
Rusty Russell authored
This makes life easier for the next patch.
-
- 27 Apr, 2011 1 commit
-
Rusty Russell authored
We took the original expansion heuristic from TDB1, and they just fixed theirs, so copy that.

Before:

After:
time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
growtdb-bench.c: In function ‘main’:
growtdb-bench.c:74:8: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result
growtdb-bench.c:108:9: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result
real    1m0.243s
user    0m13.677s
sys     0m4.336s
-rw------- 1 rusty rusty 683302864 2011-04-27 21:03 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK
real    1m24.074s
user    0m0.344s
sys     0m0.468s
-rw------- 1 rusty rusty 836040 2011-04-27 21:04 torture.tdb
Adding 2000000 records: 1015 ns (110551992 bytes)
Finding 2000000 records: 641 ns (110551992 bytes)
Missing 2000000 records: 445 ns (110551992 bytes)
Traversing 2000000 records: 439 ns (110551992 bytes)
Deleting 2000000 records: 807 ns (199517112 bytes)
Re-adding 2000000 records: 851 ns (199517112 bytes)
Appending 2000000 records: 1301 ns (376542552 bytes)
Churning 2000000 records: 2423 ns (553641304 bytes)
-
- 19 Apr, 2011 1 commit
-
Rusty Russell authored
And testing reveals a latent bug on 32-bit systems.
-
- 21 Apr, 2011 1 commit
-
Rusty Russell authored
-
- 07 Apr, 2011 1 commit
-
Rusty Russell authored
This is definitely a bad idea in general, but SAMBA uses nested transactions in many and varied ways (some of them probably reflect real bugs) and it's far easier to support them inside tdb2 with a flag. We already have part of the TDB1 infrastructure in place, so this patch just completes it and fixes one place where I'd messed it up.
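
A usage sketch, assuming the flag keeps its TDB1 name TDB_ALLOW_NESTING and the usual tdb2 entry points; error handling omitted:

    #include <fcntl.h>
    #include "tdb2.h"

    void nested_demo(void)
    {
            struct tdb_context *tdb = tdb_open("demo.tdb", TDB_ALLOW_NESTING,
                                               O_RDWR|O_CREAT, 0600, NULL);
            tdb_transaction_start(tdb);  /* outer transaction */
            tdb_transaction_start(tdb);  /* nested: only legal with the flag */
            tdb_transaction_commit(tdb); /* pops the nested level */
            tdb_transaction_commit(tdb); /* outer commit actually hits disk */
            tdb_close(tdb);
    }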
-
- 27 Apr, 2011 2 commits
-
Rusty Russell authored
It's probably not a good idea, because it's a recipe for deadlocks if anyone else grabs any *other* two chainlocks, or the allrecord lock, but SAMBA definitely does it, so allow it as TDB1 does.
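
A sketch of the now-permitted pattern; to dodge the deadlock the message warns about, every locker should take the two chains in a consistent order (TDB_DATA shown in its TDB1-style layout, return values ignored):

    #include "tdb2.h"

    void lock_two(struct tdb_context *tdb)
    {
            TDB_DATA k1 = { (unsigned char *)"jane", 4 };
            TDB_DATA k2 = { (unsigned char *)"john", 4 };

            tdb_chainlock(tdb, k1);
            tdb_chainlock(tdb, k2);  /* second chainlock: allowed, as in TDB1 */
            /* ... update both records together ... */
            tdb_chainunlock(tdb, k2);
            tdb_chainunlock(tdb, k1);
    }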
-
Rusty Russell authored
Now we have tdb_get_attribute, it makes sense to make that the method of accessing statistics. That way they are always available, and it's probably cheaper to do the direct increment than even the unlikely() branch.
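
A reading sketch; the stats member names here (size, allocs) are assumptions about the attribute layout, not verified declarations:

    #include <stdio.h>
    #include <string.h>
    #include "tdb2.h"

    void show_stats(struct tdb_context *tdb)
    {
            union tdb_attribute stats;

            memset(&stats, 0, sizeof(stats));
            stats.base.attr = TDB_ATTRIBUTE_STATS;
            stats.stats.size = sizeof(stats); /* bounds what the library fills */
            if (tdb_get_attribute(tdb, &stats) == TDB_SUCCESS)
                    printf("allocs: %llu\n",
                           (unsigned long long)stats.stats.allocs);
    }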
-
- 07 Apr, 2011 4 commits
-
Rusty Russell authored
Otherwise tdb_name() can be NULL in log functions. We might as well allocate it with the tdb, too.
-
Rusty Russell authored
-
Rusty Russell authored
It makes sense for some attributes to be manipulated after tdb_open, so allow that.
-
Rusty Russell authored
This allows overriding of the low-level locking functions, enabling special effects such as non-blocking operations and lock proxying. We rename some local lock variables to "l" to avoid -Wshadow warnings.
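
A sketch of what an override might look like; the attribute and member names are assumptions based on this description, and the hooks are stubs:

    #include <stdbool.h>
    #include <sys/types.h>
    #include "tdb2.h"

    static int my_lock(int fd, int rw, off_t off, off_t len,
                       bool waitflag, void *data)
    {
            /* non-blocking: use F_SETLK rather than F_SETLKW when
             * !waitflag, or forward the request to a lock proxy */
            return 0; /* stub: pretend success */
    }

    static int my_unlock(int fd, int rw, off_t off, off_t len, void *data)
    {
            return 0; /* stub */
    }

    void install_lock_hooks(union tdb_attribute *attr)
    {
            attr->base.attr = TDB_ATTRIBUTE_FLOCK;
            attr->flock.lock = my_lock;
            attr->flock.unlock = my_unlock;
            attr->flock.data = NULL; /* handed back to the callbacks */
    }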
-
- 06 Apr, 2011 1 commit
-
Rusty Russell authored
And don't double-log. Both of these cause problems if we want to do tdb_transaction_prepare_commit non-blocking (and have it fail so we can try again).
-
- 07 Apr, 2011 1 commit
-
Rusty Russell authored
This allows the caller to implement clear-if-first semantics as per TDB1. The flag was removed for good reasons: performance and unreliability, but SAMBA3 still uses it widely, so this allows them to reimplement it themselves. (There is no way to do it without help like this from tdb2, since it has to be done under the open lock).
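
A sketch of how a caller might rebuild clear-if-first on top of the hook; attribute and field names are assumptions, and the wipe itself is elided:

    #include "tdb2.h"

    /* Runs while tdb2 holds the open lock, so no other opener can race. */
    static enum TDB_ERROR wipe_if_first(int fd, void *data)
    {
            /* if we are the first opener: truncate fd and reinitialize */
            return TDB_SUCCESS;
    }

    void setup_openhook(union tdb_attribute *attr)
    {
            attr->base.attr = TDB_ATTRIBUTE_OPENHOOK;
            attr->openhook.fn = wipe_if_first;
            attr->openhook.data = NULL;
    }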
-
- 10 May, 2011 1 commit
-
Rusty Russell authored
1) The logging function needs to append a \n.
2) The transaction start code should be after the comment and print.
3) We should run tdb_check to make sure the database is OK after each op.
-
- 07 Apr, 2011 3 commits
-
Rusty Russell authored
Also, rename private logfn to log_fn for consistency with other members.
-
Rusty Russell authored
We use underscores everywhere else, so be consistent.
-
Rusty Russell authored
It's redundant to call hash.hash_fn, for example. Standardize on fn and data as names ("private" conflicts with C++).
-
- 29 Mar, 2011 2 commits
-
Rusty Russell authored
This gives us more locks for future use, plus allows a clear-if-first-style hack to be implemented.
-
Rusty Russell authored
-
- 19 Apr, 2011 1 commit
-
Rusty Russell authored
1) Fix the bogus reporting of uncoalesced runs: there has to be more than 1 free record to make a "run", and any other record interrupts the run.
2) Fix the bogus data count in the top line (which was the number of records, not bytes).
3) Remove the count of free buckets: it's now a constant.
-
- 28 Apr, 2011 4 commits
-
Rusty Russell authored
Douglas Bagnall noted that it didn't specify.
-
Rusty Russell authored
It's common when integrating CCAN into existing projects that they define such constants for themselves. In an ideal world, the entire project switches to one set of definitions, but for large projects (e.g. SAMBA) that's not realistic and conflicts with the aim of making CCAN modules easy to "drop in" to existing code. (This is a generalization of Douglas Bagnall's patch sent in Message-ID: <4DB8D00D.8000800@paradise.net.nz>).
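
The guard pattern being generalized, with MIN/MAX standing in for whatever a module defines:

    /* If the embedding project (e.g. SAMBA) already defines these, its
     * definitions win and ours compile away. */
    #ifndef MIN
    #define MIN(a, b) ((a) < (b) ? (a) : (b))
    #endif
    #ifndef MAX
    #define MAX(a, b) ((a) > (b) ? (a) : (b))
    #endif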
-
Douglas Bagnall authored
Makes it easier to reuse this code in other projects.
-
Douglas Bagnall authored
-
- 27 Apr, 2011 5 commits
-
Rusty Russell authored
It's useful for developers, but not so much for casual users. For example, RHEL 5.6 has qsort_r, but no prototype, which causes a warning.
-
Rusty Russell authored
Otherwise, _GNU_SOURCE isn't defined (for example) so prototypes such as isblank can be missing.
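
The ordering rule this enforces, in miniature:

    #include "config.h" /* may #define _GNU_SOURCE; must come first */
    #include <ctype.h>  /* isblank()'s prototype only appears once the
                           feature macros are set, on older libcs */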
-
Rusty Russell authored
-
Andreas Schlick authored
If the line consists solely of whitespace, it all gets removed and we end up with len set to -1. Therefore we need an additional check to avoid dereferencing line[-1].
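
The shape of the fix, reconstructed as a standalone helper:

    #include <ctype.h>
    #include <string.h>

    /* Trim trailing whitespace.  The "len > 0" test is the fix: on a
     * line that is all whitespace, it stops us reading line[-1]. */
    static size_t trimmed_len(const char *line)
    {
            size_t len = strlen(line);
            while (len > 0 && isspace((unsigned char)line[len - 1]))
                    len--;
            return len;
    }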
-
Andreas Schlick authored
-
- 19 Apr, 2011 2 commits
-
Rusty Russell authored
Inspired by patch from Volker.
-
Rusty Russell authored
-
- 06 Apr, 2011 1 commit
-
Rusty Russell authored
Get rid of many variants, which were just confusing for most people. Keep typesafe_cb(), typesafe_cb_preargs() and typesafe_cb_postargs(), and rework cast_if_type() into typesafe_cb_cast() so we stay in our namespace. I should have done this as soon as I discovered the limitation that the types have to be defined if I want const-taking callbacks.
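
An illustration of what the survivors are for, assuming the documented four-argument form typesafe_cb(rtype, atype, fn, arg):

    #include <ccan/typesafe_cb/typesafe_cb.h>

    void register_cb(void (*fn)(void *), void *arg);

    /* Cast fn to the void *-taking type the registration API wants,
     * but only if fn really accepts arg's type. */
    #define register_typed_cb(fn, arg) \
            register_cb(typesafe_cb(void, void *, (fn), (arg)), (arg))

    static void on_count(int *counter) { ++*counter; }
    /* register_typed_cb(on_count, &my_counter) compiles;
     * register_typed_cb(on_count, "oops") is rejected at compile time. */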
-
- 19 Apr, 2011 3 commits
-
Rusty Russell authored
We don't check the whole source, but it's nice for headers to be C++-clean.
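
What "C++-clean" amounts to for a header, sketched:

    /* No identifiers that are C++ keywords (new, class, private, ...),
     * and linkage guards so C++ callers link against the C symbols. */
    #ifdef __cplusplus
    extern "C" {
    #endif

    void some_api(void *priv); /* "priv", not "private" */

    #ifdef __cplusplus
    }
    #endif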
-
Rusty Russell authored
Defines whether it's useful to do #define _FILE_OFFSET_BITS 64 to get a larger off_t.
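
What the test probes, in miniature:

    #define _FILE_OFFSET_BITS 64 /* must precede every #include */
    #include <sys/types.h>
    /* On 32-bit glibc, off_t is now 64 bits, so >2GB files work. */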
-
Rusty Russell authored
-