When programming with the ZODB, Python dictionaries aren't always what you need. The most important case is where you want to store a very large mapping. When a Python dictionary is accessed in a ZODB, the whole dictionary has to be unpickled and brought into memory. If you're storing something very large, such as a 100,000-entry user database, unpickling such a large object will be slow. A BTree is a balanced tree data structure that behaves like a mapping but distributes its keys throughout a number of tree nodes. The nodes are stored in sorted order. Nodes are only unpickled and brought into memory as they're accessed, so the entire tree doesn't have to occupy memory (unless you really are touching every single key).
The BTrees package provides a large collection of related data structures. There are variants of the data structures specialized to handle integer values, which are faster and use less memory. There are four modules that handle the different variants. The first two letters of the module name specify the types of the keys and values in mappings: O for any object and I for integer. For example, the BTrees.IOBTree module provides a mapping that accepts integer keys and arbitrary objects as values.
The four data structures provided by each module are a btree, a bucket, a tree set, and a set. The btree and bucket types are mappings and support all the usual mapping methods, e.g. update() and keys(). The tree set and set types are similar to mappings but they have no values; they support the methods that make sense for a mapping with no values, e.g. keys() but not items(). The bucket and set types are the individual building blocks for btrees and tree sets, respectively. A bucket or set can be used when you are sure that it will have few elements. If the data structure will grow large, you should use a btree or tree set.
The four modules are named OOBTree, IOBTree, OIBTree, and IIBTree. The two-letter prefixes are repeated in the names of the data types. The BTrees.OOBTree module defines the following types: OOBTree, OOBucket, OOSet, and OOTreeSet.
The keys(), values(), and items() methods do not materialize a list with all of the data. Instead, they return lazy sequences that fetch data from the BTree as needed. They also support optional arguments to specify the minimum and maximum values to return.
A BTree object supports all the methods you would expect of a mapping, with a few extensions that exploit the fact that the keys are sorted. The example below demonstrates how some of the methods work. The extra methods are minKey() and maxKey(), which find the minimum and maximum key value subject to an optional bound argument, and byValue(), which returns (value, key) pairs in reverse sorted order subject to an optional minimum bound argument.
    >>> from BTrees.OOBTree import OOBTree
    >>> t = OOBTree()
    >>> t.update({ 1: "red", 2: "green", 3: "blue", 4: "spades" })
    >>> len(t)
    4
    >>> t[2]
    'green'
    >>> t.keys()
    <OOBTreeItems object at 0x40269098>
    >>> [k for k in t.keys()] # use a listcomp to get a printable sequence
    [1, 2, 3, 4]
    >>> [k for k in t.values()]
    ['red', 'green', 'blue', 'spades']
    >>> [k for k in t.values(1, 2)]
    ['red', 'green']
    >>> [k for k in t.values(2)]
    ['green', 'blue', 'spades']
    >>> t.byValue("glue") # all values > "glue"
    [('spades', 4), ('red', 1), ('green', 2)]
    >>> t.minKey(1.5)
    2
Each of the modules also defines some functions that operate on BTrees: difference(), union(), and intersection(). The difference() function returns a bucket, while the other two functions return a set. If the keys are integers, then the module also defines multiunion(). If the values are integers, then the module also defines weightedIntersection() and weightedUnion(). The function doc strings describe each function briefly.