Commit e20bccc1 authored by John Esmet

fixes #46 Add dynamic-value omt clone (dmt) and use it to implement basement nodes

parent 45954010
Notes from 2014-01-08 (Leif/Yoni):
- Should verify (dmt? omt? bndata?) crash or return an error when verification fails?
DECISIONS:
Replace dmt_functor with an implicit interface only. Instead of requiring (for data type x) that the writer be named dmt_functor<x>, just pass the writer's class name into the dmt's template as a new parameter.
Replace dmt_functor<default> with comments explaining the "interface".
-==========================================-
See wiki:
https://github.com/Tokutek/ft-index/wiki/Improving-in-memory-query-performance---Design
ft/bndata.{cc,h} The basement node was heavily modified to split keys from values and to store the keys inline.
bn_data::initialize_from_separate_keys_and_vals
This is effectively the deserialization routine.
The bn_data::omt_* functions (probably badly named) more or less treat the basement node as an omt of key+leafentry pairs.
There are many references to 'omt' that could be renamed to 'dmt' if it's worth it.
util/dmt.{cc,h} The new DMT structure
Possible questions:
1-Should we merge dmt<> & omt<>? (delete omt entirely)
2-Should omt<> become a wrapper for dmt<>?
3-Should we just keep both around?
If we plan to keep both around for a while, should we get rid of any scaffolding that exists only to make 1 or 2 easier?
The dmt is basically an omt with dynamically sized nodes/values.
There are two representations: an array of values, or a tree of nodes.
The high-level algorithm is basically the same for dmt and omt, except that the dmt tries not to move values around when in tree form.
Instead, it moves the nodes' metadata around.
Insertion into a dmt requires a functor that can provide information about size, since the value is expected to be (at least potentially) dynamically sized.
The dmt does not revert to array form when rebalancing the root, but it CAN revert to array form when it prepares for serialization (if it notices that everything is fixed length).
The dmt can also serialize and deserialize the (set of) values it represents. It saves no information about the dmt itself, just the values.
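To make the "implicit interface" idea concrete, here is a minimal sketch of a container that takes the writer class as a template parameter and only requires that it provide a size and a way to copy itself into place. The names toy_pool, string_writer, get_size and write_to are assumptions made for illustration; this is not the real dmt and it has no tree form or rebalancing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical writer: knows how large its value is and how to copy it
// into storage owned by the container.
class string_writer {
public:
    explicit string_writer(const std::string &s) : s_(s) {}
    size_t get_size() const { return s_.size(); }
    void write_to(char *dest) const { std::memcpy(dest, s_.data(), s_.size()); }
private:
    std::string s_;  // held by value so temporaries are safe to pass in
};

// Toy container (NOT the real dmt): appends dynamically sized values to a
// single byte pool and keeps (offset, length) metadata for each value.
// Any Writer with get_size() and write_to(char *) satisfies the interface;
// no base class or specially named functor is required.
template <typename Writer>
class toy_pool {
public:
    uint32_t append(const Writer &w) {
        const size_t len = w.get_size();
        const size_t offset = bytes_.size();
        bytes_.resize(offset + len);
        if (len > 0) {
            w.write_to(&bytes_[offset]);
        }
        entries_.push_back({offset, len});
        return static_cast<uint32_t>(entries_.size() - 1);
    }
    std::string fetch(uint32_t idx) const {
        const entry &e = entries_.at(idx);
        return std::string(bytes_.data() + e.offset, e.len);
    }
    size_t size() const { return entries_.size(); }
private:
    struct entry { size_t offset; size_t len; };
    std::vector<char> bytes_;
    std::vector<entry> entries_;
};
```

Because the container asks the writer for its size before reserving space, values of different lengths can share one pool, which is the property the size-providing functor is there to support.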
Some comments about what's in each file.
ft/CMakeLists.txt
add dmt-wrapper (test wrapper, nearly identical to ft/omt.cc which is also a test wrapper)
ft/dmt-wrapper.cc/h
Just like ft/omt.{cc,h}: a test wrapper for the dmt that implements a version of the old (non-templated) omt tests.
ft/ft-internal.h
Additional engine status
ft/ft-ops.cc/h
Additional engine status
in ftnode_memory_size()
fix a minor bug where we didn't count all the memory.
comments
ft/ft_layout_version.h
Update comment describing version change.
NOTE: May need to add version 26 if 25 is sent to customers before this goes live.
Adding 26 requires additional code changes (limited to a subset of places where version 24/25 are referred to)
ft/ft_node-serialize.cc
Changes calculation of size of a leaf node to include basement-node header
Adds optimized serialization for basement nodes with fixed-length keys
Maintains old method when not using fixed-length keys.
rebalance_ftnode_leaf()
Minor changes since key/leafentries are separated
deserialize_ftnode_partition()
Minor changes, including passing rbuf directly to child function (so ndone calculation is done by child)
ft/memarena.cc
Changes so that toku_memory_footprint is more accurate. (Not strictly part of this project.)
ft/rollback.cc
Just uses new memarena function for memory footprint
ft/tests/dmt-test.cc
"clone" of old omt-test (non templated) ported to dmt
Basically not worth looking at except to make sure it imports dmt instead of omt.
ft/tests/dmt-test2.cc
New dmt tests.
You might decide not enough new tests were implemented.
ft/tests/ft-serialize-benchmark.cc
Minor improvements so that you can take an average over a number of runs.
ft/tests/ft-serialize-test.cc
Just ported to changed api
ft/tests/test-pick-child-to-flush.cc
The new basement-node headers reduce available memory, so reduce the max size of the test accordingly.
ft/wbuf.h
Added wbuf_nocrc_reserve_literal_bytes()
Gives you a pointer to write to the wbuf, but notes the memory was used.
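The "reserve" pattern described for wbuf_nocrc_reserve_literal_bytes() can be sketched as follows. This is an illustration under assumptions: toy_wbuf and toy_wbuf_reserve are hypothetical names, and only the ndone/size bookkeeping is modelled after the fields visible in the diff, not the real struct wbuf or its signature.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Toy write buffer for illustration only.
struct toy_wbuf {
    unsigned char *buf;
    uint32_t size;   // capacity in bytes
    uint32_t ndone;  // bytes already written or reserved
};

// Reserve nbytes: hand back the address the caller should fill in, and
// account for those bytes immediately so later writes land after them.
static inline unsigned char *toy_wbuf_reserve(toy_wbuf *w, uint32_t nbytes) {
    assert(nbytes <= w->size - w->ndone);  // the caller must not overrun
    unsigned char *dest = w->buf + w->ndone;
    w->ndone += nbytes;
    return dest;
}
```

The point of the pattern is that a serializer can write a large region directly into the buffer (for example with one memcpy) instead of building it elsewhere and copying it in.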
util/mempool.cc
Made mempool allocations aligned to cachelines
Minor 'const' changes to help compilation
Some utility functions to get/give offsets
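As a rough illustration of cacheline-aligned allocation in an offset-based pool, the sketch below rounds each allocation's starting offset up to a multiple of the line size. The 64-byte line size, align_up and toy_mempool are assumptions for the example; the actual mempool code may align differently.

```cpp
#include <cassert>
#include <cstddef>

// Assumed cache line size; 64 bytes is common on x86-64 but not universal.
constexpr size_t kCacheLine = 64;

// Round an offset up to the next multiple of `alignment`, which must be a
// power of two for the mask trick to be valid.
constexpr size_t align_up(size_t offset, size_t alignment = kCacheLine) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Toy bump allocator over a fixed-size region. alloc() returns the aligned
// offset of the new allocation, or (size_t)-1 if it does not fit.
struct toy_mempool {
    size_t size;         // total bytes in the region
    size_t free_offset;  // first byte not yet handed out
    size_t alloc(size_t nbytes) {
        const size_t start = align_up(free_offset);
        if (start > size || nbytes > size - start) {
            return static_cast<size_t>(-1);
        }
        free_offset = start + nbytes;
        return start;
    }
};
```

Aligned offsets only yield aligned addresses if the pool's base address is itself aligned to the same boundary, and the padding between allocations counts as used space.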
......@@ -31,6 +31,7 @@ set(FT_SOURCES
checkpoint
compress
dbufio
dmt-wrapper
fifo
ft
ft-cachetable-wrappers
......
This diff is collapsed.
......@@ -689,16 +689,16 @@ ftleaf_get_split_loc(
switch (split_mode) {
case SPLIT_LEFT_HEAVY: {
*num_left_bns = node->n_children;
*num_left_les = BLB_DATA(node, *num_left_bns - 1)->omt_size();
*num_left_les = BLB_DATA(node, *num_left_bns - 1)->num_klpairs();
if (*num_left_les == 0) {
*num_left_bns = node->n_children - 1;
*num_left_les = BLB_DATA(node, *num_left_bns - 1)->omt_size();
*num_left_les = BLB_DATA(node, *num_left_bns - 1)->num_klpairs();
}
goto exit;
}
case SPLIT_RIGHT_HEAVY: {
*num_left_bns = 1;
*num_left_les = BLB_DATA(node, 0)->omt_size() ? 1 : 0;
*num_left_les = BLB_DATA(node, 0)->num_klpairs() ? 1 : 0;
goto exit;
}
case SPLIT_EVENLY: {
......@@ -707,8 +707,8 @@ ftleaf_get_split_loc(
uint64_t sumlesizes = ftleaf_disk_size(node);
uint32_t size_so_far = 0;
for (int i = 0; i < node->n_children; i++) {
BN_DATA bd = BLB_DATA(node, i);
uint32_t n_leafentries = bd->omt_size();
bn_data* bd = BLB_DATA(node, i);
uint32_t n_leafentries = bd->num_klpairs();
for (uint32_t j=0; j < n_leafentries; j++) {
size_t size_this_le;
int rr = bd->fetch_klpair_disksize(j, &size_this_le);
......@@ -725,7 +725,7 @@ ftleaf_get_split_loc(
(*num_left_les)--;
} else if (*num_left_bns > 1) {
(*num_left_bns)--;
*num_left_les = BLB_DATA(node, *num_left_bns - 1)->omt_size();
*num_left_les = BLB_DATA(node, *num_left_bns - 1)->num_klpairs();
} else {
// we are trying to split a leaf with only one
// leafentry in it
......@@ -754,7 +754,8 @@ move_leafentries(
)
//Effect: move leafentries in the range [lbi, upe) from src_omt to newly created dest_omt
{
src_bn->data_buffer.move_leafentries_to(&dest_bn->data_buffer, lbi, ube);
invariant(ube == src_bn->data_buffer.num_klpairs());
src_bn->data_buffer.split_klpairs(&dest_bn->data_buffer, lbi);
}
static void ftnode_finalize_split(FTNODE node, FTNODE B, MSN max_msn_applied_to_node) {
......@@ -851,7 +852,7 @@ ftleaf_split(
ftleaf_get_split_loc(node, split_mode, &num_left_bns, &num_left_les);
{
// did we split right on the boundary between basement nodes?
const bool split_on_boundary = (num_left_les == 0) || (num_left_les == (int) BLB_DATA(node, num_left_bns - 1)->omt_size());
const bool split_on_boundary = (num_left_les == 0) || (num_left_les == (int) BLB_DATA(node, num_left_bns - 1)->num_klpairs());
// Now we know where we are going to break it
// the two nodes will have a total of n_children+1 basement nodes
// and n_children-1 pivots
......@@ -912,7 +913,7 @@ ftleaf_split(
move_leafentries(BLB(B, curr_dest_bn_index),
BLB(node, curr_src_bn_index),
num_left_les, // first row to be moved to B
BLB_DATA(node, curr_src_bn_index)->omt_size() // number of rows in basement to be split
BLB_DATA(node, curr_src_bn_index)->num_klpairs() // number of rows in basement to be split
);
BLB_MAX_MSN_APPLIED(B, curr_dest_bn_index) = BLB_MAX_MSN_APPLIED(node, curr_src_bn_index);
curr_dest_bn_index++;
......@@ -954,10 +955,10 @@ ftleaf_split(
toku_destroy_dbt(&node->childkeys[num_left_bns - 1]);
}
} else if (splitk) {
BN_DATA bd = BLB_DATA(node, num_left_bns - 1);
bn_data* bd = BLB_DATA(node, num_left_bns - 1);
uint32_t keylen;
void *key;
int rr = bd->fetch_le_key_and_len(bd->omt_size() - 1, &keylen, &key);
int rr = bd->fetch_key_and_len(bd->num_klpairs() - 1, &keylen, &key);
invariant_zero(rr);
toku_memdup_dbt(splitk, key, keylen);
}
......@@ -1168,11 +1169,11 @@ merge_leaf_nodes(FTNODE a, FTNODE b)
a->dirty = 1;
b->dirty = 1;
BN_DATA a_last_bd = BLB_DATA(a, a->n_children-1);
bn_data* a_last_bd = BLB_DATA(a, a->n_children-1);
// this bool states if the last basement node in a has any items or not
// If it does, then it stays in the merge. If it does not, the last basement node
// of a gets eliminated because we do not have a pivot to store for it (because it has no elements)
const bool a_has_tail = a_last_bd->omt_size() > 0;
const bool a_has_tail = a_last_bd->num_klpairs() > 0;
// move each basement node from b to a
// move the pivots, adding one of what used to be max(a)
......@@ -1199,7 +1200,7 @@ merge_leaf_nodes(FTNODE a, FTNODE b)
if (a_has_tail) {
uint32_t keylen;
void *key;
int rr = a_last_bd->fetch_le_key_and_len(a_last_bd->omt_size() - 1, &keylen, &key);
int rr = a_last_bd->fetch_key_and_len(a_last_bd->num_klpairs() - 1, &keylen, &key);
invariant_zero(rr);
toku_memdup_dbt(&a->childkeys[a->n_children-1], key, keylen);
a->totalchildkeylens += keylen;
......
......@@ -1184,6 +1184,8 @@ typedef enum {
FT_PRO_NUM_STOP_LOCK_CHILD,
FT_PRO_NUM_STOP_CHILD_INMEM,
FT_PRO_NUM_DIDNT_WANT_PROMOTE,
FT_BASEMENT_DESERIALIZE_FIXED_KEYSIZE, // how many basement nodes were deserialized with a fixed keysize
FT_BASEMENT_DESERIALIZE_VARIABLE_KEYSIZE, // how many basement nodes were deserialized with a variable keysize
FT_STATUS_NUM_ROWS
} ft_status_entry;
......
This diff is collapsed.
......@@ -358,4 +358,6 @@ extern bool garbage_collection_debug;
void toku_ft_set_direct_io(bool direct_io_on);
void toku_ft_set_compress_buffers_before_eviction(bool compress_buffers);
void toku_note_deserialized_basement_node(bool fixed_key_size);
#endif
......@@ -462,6 +462,7 @@ serialize_ft_min_size (uint32_t version) {
size_t size = 0;
switch(version) {
case FT_LAYOUT_VERSION_26:
case FT_LAYOUT_VERSION_25:
case FT_LAYOUT_VERSION_24:
case FT_LAYOUT_VERSION_23:
......
......@@ -152,7 +152,7 @@ verify_msg_in_child_buffer(FT_HANDLE ft_handle, enum ft_msg_type type, MSN msn,
static DBT
get_ith_key_dbt (BASEMENTNODE bn, int i) {
DBT kdbt;
int r = bn->data_buffer.fetch_le_key_and_len(i, &kdbt.size, &kdbt.data);
int r = bn->data_buffer.fetch_key_and_len(i, &kdbt.size, &kdbt.data);
invariant_zero(r); // this is a bad failure if it happens.
return kdbt;
}
......@@ -422,7 +422,7 @@ toku_verify_ftnode_internal(FT_HANDLE ft_handle,
}
else {
BASEMENTNODE bn = BLB(node, i);
for (uint32_t j = 0; j < bn->data_buffer.omt_size(); j++) {
for (uint32_t j = 0; j < bn->data_buffer.num_klpairs(); j++) {
VERIFY_ASSERTION((rootmsn.msn >= this_msn.msn), 0, "leaf may have latest msn, but cannot be greater than root msn");
DBT kdbt = get_ith_key_dbt(bn, j);
if (curr_less_pivot) {
......
......@@ -1077,8 +1077,8 @@ garbage_helper(BLOCKNUM blocknum, int64_t UU(size), int64_t UU(address), void *e
goto exit;
}
for (int i = 0; i < node->n_children; ++i) {
BN_DATA bd = BLB_DATA(node, i);
r = bd->omt_iterate<struct garbage_helper_extra, garbage_leafentry_helper>(info);
bn_data* bd = BLB_DATA(node, i);
r = bd->iterate<struct garbage_helper_extra, garbage_leafentry_helper>(info);
if (r != 0) {
goto exit;
}
......
......@@ -119,6 +119,7 @@ enum ft_layout_version_e {
FT_LAYOUT_VERSION_23 = 23, // Ming: Fix upgrade path #5902
FT_LAYOUT_VERSION_24 = 24, // Riddler: change logentries that log transactions to store TXNID_PAIRs instead of TXNIDs
FT_LAYOUT_VERSION_25 = 25, // SecretSquirrel: ROLLBACK_LOG_NODES (on disk and in memory) now just use blocknum (instead of blocknum + hash) to point to other log nodes. same for xstillopen log entry
FT_LAYOUT_VERSION_26 = 26, // Hojo: basements store key/vals separately on disk for fixed klpair length BNs
FT_NEXT_VERSION, // the version after the current version
FT_LAYOUT_VERSION = FT_NEXT_VERSION-1, // A hack so I don't have to change this line.
FT_LAYOUT_MIN_SUPPORTED_VERSION = FT_LAYOUT_VERSION_13, // Minimum version supported
......
......@@ -284,32 +284,7 @@ serialize_node_header(FTNODE node, FTNODE_DISK_DATA ndd, struct wbuf *wbuf) {
invariant(wbuf->ndone == wbuf->size);
}
static int
wbufwriteleafentry(const void* key, const uint32_t keylen, const LEAFENTRY &le, const uint32_t UU(idx), struct wbuf * const wb) {
// need to pack the leafentry as it was in versions
// where the key was integrated into it
uint32_t begin_spot UU() = wb->ndone;
uint32_t le_disk_size = leafentry_disksize(le);
wbuf_nocrc_uint8_t(wb, le->type);
wbuf_nocrc_uint32_t(wb, keylen);
if (le->type == LE_CLEAN) {
wbuf_nocrc_uint32_t(wb, le->u.clean.vallen);
wbuf_nocrc_literal_bytes(wb, key, keylen);
wbuf_nocrc_literal_bytes(wb, le->u.clean.val, le->u.clean.vallen);
}
else {
paranoid_invariant(le->type == LE_MVCC);
wbuf_nocrc_uint32_t(wb, le->u.mvcc.num_cxrs);
wbuf_nocrc_uint8_t(wb, le->u.mvcc.num_pxrs);
wbuf_nocrc_literal_bytes(wb, key, keylen);
wbuf_nocrc_literal_bytes(wb, le->u.mvcc.xrs, le_disk_size - (1 + 4 + 1));
}
uint32_t end_spot UU() = wb->ndone;
paranoid_invariant((end_spot - begin_spot) == keylen + sizeof(keylen) + le_disk_size);
return 0;
}
static uint32_t
static uint32_t
serialize_ftnode_partition_size (FTNODE node, int i)
{
uint32_t result = 0;
......@@ -320,14 +295,14 @@ serialize_ftnode_partition_size (FTNODE node, int i)
result += toku_bnc_nbytesinbuf(BNC(node, i));
}
else {
result += 4; // n_entries in buffer table
result += 4 + bn_data::HEADER_LENGTH; // n_entries in buffer table + basement header
result += BLB_NBYTESINDATA(node, i);
}
result += 4; // checksum
return result;
}
#define FTNODE_PARTITION_OMT_LEAVES 0xaa
#define FTNODE_PARTITION_DMT_LEAVES 0xaa
#define FTNODE_PARTITION_FIFO_MSG 0xbb
static void
......@@ -374,16 +349,13 @@ serialize_ftnode_partition(FTNODE node, int i, struct sub_block *sb) {
serialize_nonleaf_childinfo(BNC(node, i), &wb);
}
else {
unsigned char ch = FTNODE_PARTITION_OMT_LEAVES;
BN_DATA bd = BLB_DATA(node, i);
unsigned char ch = FTNODE_PARTITION_DMT_LEAVES;
bn_data* bd = BLB_DATA(node, i);
wbuf_nocrc_char(&wb, ch);
wbuf_nocrc_uint(&wb, bd->omt_size());
wbuf_nocrc_uint(&wb, bd->num_klpairs());
//
// iterate over leafentries and place them into the buffer
//
bd->omt_iterate<struct wbuf, wbufwriteleafentry>(&wb);
bd->serialize_to_wbuf(&wb);
}
uint32_t end_to_end_checksum = x1764_memory(sb->uncompressed_ptr, wbuf_get_woffset(&wb));
wbuf_nocrc_int(&wb, end_to_end_checksum);
......@@ -546,7 +518,7 @@ rebalance_ftnode_leaf(FTNODE node, unsigned int basementnodesize)
// Count number of leaf entries in this leaf (num_le).
uint32_t num_le = 0;
for (uint32_t i = 0; i < num_orig_basements; i++) {
num_le += BLB_DATA(node, i)->omt_size();
num_le += BLB_DATA(node, i)->num_klpairs();
}
uint32_t num_alloc = num_le ? num_le : 1; // simplify logic below by always having at least one entry per array
......@@ -571,10 +543,10 @@ rebalance_ftnode_leaf(FTNODE node, unsigned int basementnodesize)
uint32_t curr_le = 0;
for (uint32_t i = 0; i < num_orig_basements; i++) {
BN_DATA bd = BLB_DATA(node, i);
bn_data* bd = BLB_DATA(node, i);
struct array_info ai {.offset = curr_le, .le_array = leafpointers, .key_sizes_array = key_sizes, .key_ptr_array = key_pointers };
bd->omt_iterate<array_info, array_item>(&ai);
curr_le += bd->omt_size();
bd->iterate<array_info, array_item>(&ai);
curr_le += bd->num_klpairs();
}
// Create an array that will store indexes of new pivots.
......@@ -592,9 +564,14 @@ rebalance_ftnode_leaf(FTNODE node, unsigned int basementnodesize)
// Create an array that will store the size of each basement.
// This is the sum of the leaf sizes of all the leaves in that basement.
// We don't know how many basements there will be, so we use num_le as the upper bound.
toku::scoped_malloc bn_sizes_buf(sizeof(size_t) * num_alloc);
size_t *bn_sizes = reinterpret_cast<size_t *>(bn_sizes_buf.get());
bn_sizes[0] = 0;
// Sum of all le sizes in a single basement
toku::scoped_calloc bn_le_sizes_buf(sizeof(size_t) * num_alloc);
size_t *bn_le_sizes = reinterpret_cast<size_t *>(bn_le_sizes_buf.get());
// Sum of all key sizes in a single basement
toku::scoped_calloc bn_key_sizes_buf(sizeof(size_t) * num_alloc);
size_t *bn_key_sizes = reinterpret_cast<size_t *>(bn_key_sizes_buf.get());
// TODO 4050: All these arrays should be combined into a single array of some bn_info struct (pivot, msize, num_les).
// Each entry is the number of leafentries in this basement. (Again, num_le is overkill upper baound.)
......@@ -611,7 +588,7 @@ rebalance_ftnode_leaf(FTNODE node, unsigned int basementnodesize)
for (uint32_t i = 0; i < num_le; i++) {
uint32_t curr_le_size = leafentry_disksize((LEAFENTRY) leafpointers[i]);
le_sizes[i] = curr_le_size;
if ((bn_size_so_far + curr_le_size > basementnodesize) && (num_le_in_curr_bn != 0)) {
if ((bn_size_so_far + curr_le_size + sizeof(uint32_t) + key_sizes[i] > basementnodesize) && (num_le_in_curr_bn != 0)) {
// cap off the current basement node to end with the element before i
new_pivots[curr_pivot] = i-1;
curr_pivot++;
......@@ -620,8 +597,9 @@ rebalance_ftnode_leaf(FTNODE node, unsigned int basementnodesize)
}
num_le_in_curr_bn++;
num_les_this_bn[curr_pivot] = num_le_in_curr_bn;
bn_le_sizes[curr_pivot] += curr_le_size;
bn_key_sizes[curr_pivot] += sizeof(uint32_t) + key_sizes[i]; // uint32_t le_offset
bn_size_so_far += curr_le_size + sizeof(uint32_t) + key_sizes[i];
bn_sizes[curr_pivot] = bn_size_so_far;
}
// curr_pivot is now the total number of pivot keys in the leaf node
int num_pivots = curr_pivot;
......@@ -688,17 +666,15 @@ rebalance_ftnode_leaf(FTNODE node, unsigned int basementnodesize)
uint32_t num_les_to_copy = num_les_this_bn[i];
invariant(num_les_to_copy == num_in_bn);
// construct mempool for this basement
size_t size_this_bn = bn_sizes[i];
BN_DATA bd = BLB_DATA(node, i);
bd->replace_contents_with_clone_of_sorted_array(
bn_data* bd = BLB_DATA(node, i);
bd->set_contents_as_clone_of_sorted_array(
num_les_to_copy,
&key_pointers[baseindex_this_bn],
&key_sizes[baseindex_this_bn],
&leafpointers[baseindex_this_bn],
&le_sizes[baseindex_this_bn],
size_this_bn
bn_key_sizes[i], // Total key sizes
bn_le_sizes[i] // total le sizes
);
BP_STATE(node,i) = PT_AVAIL;
......@@ -1541,15 +1517,14 @@ deserialize_ftnode_partition(
BP_WORKDONE(node, childnum) = 0;
}
else {
assert(ch == FTNODE_PARTITION_OMT_LEAVES);
assert(ch == FTNODE_PARTITION_DMT_LEAVES);
BLB_SEQINSERT(node, childnum) = 0;
uint32_t num_entries = rbuf_int(&rb);
// we are now at the first byte of first leafentry
data_size -= rb.ndone; // remaining bytes of leafentry data
BASEMENTNODE bn = BLB(node, childnum);
bn->data_buffer.initialize_from_data(num_entries, &rb.buf[rb.ndone], data_size);
rb.ndone += data_size;
bn->data_buffer.deserialize_from_rbuf(num_entries, &rb, data_size, node->layout_version_read_from_disk);
}
assert(rb.ndone == rb.size);
exit:
......@@ -2086,13 +2061,18 @@ deserialize_and_upgrade_leaf_node(FTNODE node,
assert_zero(r);
// Copy the pointer value straight into the OMT
LEAFENTRY new_le_in_bn = nullptr;
void *maybe_free;
bn->data_buffer.get_space_for_insert(
i,
key,
keylen,
new_le_size,
&new_le_in_bn
&new_le_in_bn,
&maybe_free
);
if (maybe_free) {
toku_free(maybe_free);
}
memcpy(new_le_in_bn, new_le, new_le_size);
toku_free(new_le);
}
......@@ -2101,8 +2081,7 @@ deserialize_and_upgrade_leaf_node(FTNODE node,
if (has_end_to_end_checksum) {
data_size -= sizeof(uint32_t);
}
bn->data_buffer.initialize_from_data(n_in_buf, &rb->buf[rb->ndone], data_size);
rb->ndone += data_size;
bn->data_buffer.deserialize_from_rbuf(n_in_buf, rb, data_size, node->layout_version_read_from_disk);
}
// Whatever this is must be less than the MSNs of every message above
......
......@@ -2917,7 +2917,7 @@ static void add_pair_to_leafnode (struct leaf_buf *lbuf, unsigned char *key, int
// #3588 TODO just make a clean ule and append it to the omt
// #3588 TODO can do the rebalancing here and avoid a lot of work later
FTNODE leafnode = lbuf->node;
uint32_t idx = BLB_DATA(leafnode, 0)->omt_size();
uint32_t idx = BLB_DATA(leafnode, 0)->num_klpairs();
DBT thekey = { .data = key, .size = (uint32_t) keylen };
DBT theval = { .data = val, .size = (uint32_t) vallen };
FT_MSG_S msg = { .type = FT_INSERT,
......
......@@ -230,7 +230,7 @@ typedef struct cachetable *CACHETABLE;
typedef struct cachefile *CACHEFILE;
typedef struct ctpair *PAIR;
typedef class checkpointer *CHECKPOINTER;
typedef class bn_data *BN_DATA;
class bn_data;
/* tree command types */
enum ft_msg_type {
......
......@@ -98,6 +98,7 @@ struct memarena {
char *buf;
size_t buf_used, buf_size;
size_t size_of_other_bufs; // the buf_size of all the other bufs.
size_t footprint_of_other_bufs; // the footprint of all the other bufs.
char **other_bufs;
int n_other_bufs;
};
......@@ -108,6 +109,7 @@ MEMARENA memarena_create_presized (size_t initial_size) {
result->buf_used = 0;
result->other_bufs = NULL;
result->size_of_other_bufs = 0;
result->footprint_of_other_bufs = 0;
result->n_other_bufs = 0;
XMALLOC_N(result->buf_size, result->buf);
return result;
......@@ -128,6 +130,7 @@ void memarena_clear (MEMARENA ma) {
// But reuse the main buffer
ma->buf_used = 0;
ma->size_of_other_bufs = 0;
ma->footprint_of_other_bufs = 0;
}
static size_t
......@@ -151,6 +154,7 @@ void* malloc_in_memarena (MEMARENA ma, size_t size) {
ma->other_bufs[old_n]=ma->buf;
ma->n_other_bufs = old_n+1;
ma->size_of_other_bufs += ma->buf_size;
ma->footprint_of_other_bufs += toku_memory_footprint(ma->buf, ma->buf_used);
}
// Make a new one
{
......@@ -217,7 +221,9 @@ void memarena_move_buffers(MEMARENA dest, MEMARENA source) {
#endif
dest ->size_of_other_bufs += source->size_of_other_bufs + source->buf_size;
dest ->footprint_of_other_bufs += source->footprint_of_other_bufs + toku_memory_footprint(source->buf, source->buf_used);
source->size_of_other_bufs = 0;
source->footprint_of_other_bufs = 0;
assert(other_bufs);
dest->other_bufs = other_bufs;
......@@ -246,4 +252,12 @@ size_t
memarena_total_size_in_use (MEMARENA m)
{
return m->size_of_other_bufs + m->buf_used;
}
}
size_t
memarena_total_footprint (MEMARENA m)
{
return m->footprint_of_other_bufs + toku_memory_footprint(m->buf, m->buf_used) +
sizeof(*m) +
m->n_other_bufs * sizeof(*m->other_bufs);
}
......@@ -129,5 +129,6 @@ size_t memarena_total_memory_size (MEMARENA);
size_t memarena_total_size_in_use (MEMARENA);
size_t memarena_total_footprint (MEMARENA);
#endif
......@@ -146,7 +146,7 @@ PAIR_ATTR
rollback_memory_size(ROLLBACK_LOG_NODE log) {
size_t size = sizeof(*log);
if (log->rollentry_arena) {
size += memarena_total_memory_size(log->rollentry_arena);
size += memarena_total_footprint(log->rollentry_arena);
}
return make_rollback_pair_attr(size);
}
......
This diff is collapsed.
......@@ -115,13 +115,18 @@ le_add_to_bn(bn_data* bn, uint32_t idx, const char *key, int keylen, const char
{
LEAFENTRY r = NULL;
uint32_t size_needed = LE_CLEAN_MEMSIZE(vallen);
void *maybe_free = nullptr;
bn->get_space_for_insert(
idx,
key,
keylen,
size_needed,
&r
&r,
&maybe_free
);
if (maybe_free) {
toku_free(maybe_free);
}
resource_assert(r);
r->type = LE_CLEAN;
r->u.clean.vallen = vallen;
......
......@@ -105,13 +105,18 @@ le_add_to_bn(bn_data* bn, uint32_t idx, char *key, int keylen, char *val, int va
{
LEAFENTRY r = NULL;
uint32_t size_needed = LE_CLEAN_MEMSIZE(vallen);
void *maybe_free = nullptr;
bn->get_space_for_insert(
idx,
key,
keylen,
size_needed,
&r
&r,
&maybe_free
);
if (maybe_free) {
toku_free(maybe_free);
}
resource_assert(r);
r->type = LE_CLEAN;
r->u.clean.vallen = vallen;
......@@ -127,7 +132,7 @@ long_key_cmp(DB *UU(e), const DBT *a, const DBT *b)
}
static void
test_serialize_leaf(int valsize, int nelts, double entropy) {
test_serialize_leaf(int valsize, int nelts, double entropy, int ser_runs, int deser_runs) {
// struct ft_handle source_ft;
struct ftnode *sn, *dn;
......@@ -214,47 +219,76 @@ test_serialize_leaf(int valsize, int nelts, double entropy) {
assert(size == 100);
}
struct timeval total_start;
struct timeval total_end;
total_start.tv_sec = total_start.tv_usec = 0;
total_end.tv_sec = total_end.tv_usec = 0;
struct timeval t[2];
gettimeofday(&t[0], NULL);
FTNODE_DISK_DATA ndd = NULL;
r = toku_serialize_ftnode_to(fd, make_blocknum(20), sn, &ndd, true, ft->ft, false);
assert(r==0);
gettimeofday(&t[1], NULL);
for (int i = 0; i < ser_runs; i++) {
gettimeofday(&t[0], NULL);
ndd = NULL;
sn->dirty = 1;
r = toku_serialize_ftnode_to(fd, make_blocknum(20), sn, &ndd, true, ft->ft, false);
assert(r==0);
gettimeofday(&t[1], NULL);
total_start.tv_sec += t[0].tv_sec;
total_start.tv_usec += t[0].tv_usec;
total_end.tv_sec += t[1].tv_sec;
total_end.tv_usec += t[1].tv_usec;
toku_free(ndd);
}
double dt;
dt = (t[1].tv_sec - t[0].tv_sec) + ((t[1].tv_usec - t[0].tv_usec) / USECS_PER_SEC);
printf("serialize leaf: %0.05lf\n", dt);
dt = (total_end.tv_sec - total_start.tv_sec) + ((total_end.tv_usec - total_start.tv_usec) / USECS_PER_SEC);
dt *= 1000;
dt /= ser_runs;
printf("serialize leaf(ms): %0.05lf (average of %d runs)\n", dt, ser_runs);
//reset
total_start.tv_sec = total_start.tv_usec = 0;
total_end.tv_sec = total_end.tv_usec = 0;
struct ftnode_fetch_extra bfe;
fill_bfe_for_full_read(&bfe, ft_h);
gettimeofday(&t[0], NULL);
FTNODE_DISK_DATA ndd2 = NULL;
r = toku_deserialize_ftnode_from(fd, make_blocknum(20), 0/*pass zero for hash*/, &dn, &ndd2, &bfe);
assert(r==0);
gettimeofday(&t[1], NULL);
dt = (t[1].tv_sec - t[0].tv_sec) + ((t[1].tv_usec - t[0].tv_usec) / USECS_PER_SEC);
printf("deserialize leaf: %0.05lf\n", dt);
printf("io time %lf decompress time %lf deserialize time %lf\n",
tokutime_to_seconds(bfe.io_time),
tokutime_to_seconds(bfe.decompress_time),
tokutime_to_seconds(bfe.deserialize_time)
for (int i = 0; i < deser_runs; i++) {
fill_bfe_for_full_read(&bfe, ft_h);
gettimeofday(&t[0], NULL);
FTNODE_DISK_DATA ndd2 = NULL;
r = toku_deserialize_ftnode_from(fd, make_blocknum(20), 0/*pass zero for hash*/, &dn, &ndd2, &bfe);
assert(r==0);
gettimeofday(&t[1], NULL);
total_start.tv_sec += t[0].tv_sec;
total_start.tv_usec += t[0].tv_usec;
total_end.tv_sec += t[1].tv_sec;
total_end.tv_usec += t[1].tv_usec;
toku_ftnode_free(&dn);
toku_free(ndd2);
}
dt = (total_end.tv_sec - total_start.tv_sec) + ((total_end.tv_usec - total_start.tv_usec) / USECS_PER_SEC);
dt *= 1000;
dt /= deser_runs;
printf("deserialize leaf(ms): %0.05lf (average of %d runs)\n", dt, deser_runs);
printf("io time(ms) %lf decompress time(ms) %lf deserialize time(ms) %lf (average of %d runs)\n",
tokutime_to_seconds(bfe.io_time)*1000,
tokutime_to_seconds(bfe.decompress_time)*1000,
tokutime_to_seconds(bfe.deserialize_time)*1000,
deser_runs
);
toku_ftnode_free(&dn);
toku_ftnode_free(&sn);
toku_block_free(ft_h->blocktable, BLOCK_ALLOCATOR_TOTAL_HEADER_RESERVE);
toku_blocktable_destroy(&ft_h->blocktable);
toku_free(ft_h->h);
toku_free(ft_h);
toku_free(ft);
toku_free(ndd);
toku_free(ndd2);
toku_free(ft_h);
r = close(fd); assert(r != -1);
}
static void
test_serialize_nonleaf(int valsize, int nelts, double entropy) {
test_serialize_nonleaf(int valsize, int nelts, double entropy, int ser_runs, int deser_runs) {
// struct ft_handle source_ft;
struct ftnode sn, *dn;
......@@ -353,7 +387,8 @@ test_serialize_nonleaf(int valsize, int nelts, double entropy) {
gettimeofday(&t[1], NULL);
double dt;
dt = (t[1].tv_sec - t[0].tv_sec) + ((t[1].tv_usec - t[0].tv_usec) / USECS_PER_SEC);
printf("serialize nonleaf: %0.05lf\n", dt);
dt *= 1000;
printf("serialize nonleaf(ms): %0.05lf (IGNORED RUNS=%d)\n", dt, ser_runs);
struct ftnode_fetch_extra bfe;
fill_bfe_for_full_read(&bfe, ft_h);
......@@ -363,11 +398,13 @@ test_serialize_nonleaf(int valsize, int nelts, double entropy) {
assert(r==0);
gettimeofday(&t[1], NULL);
dt = (t[1].tv_sec - t[0].tv_sec) + ((t[1].tv_usec - t[0].tv_usec) / USECS_PER_SEC);
printf("deserialize nonleaf: %0.05lf\n", dt);
printf("io time %lf decompress time %lf deserialize time %lf\n",
tokutime_to_seconds(bfe.io_time),
tokutime_to_seconds(bfe.decompress_time),
tokutime_to_seconds(bfe.deserialize_time)
dt *= 1000;
printf("deserialize nonleaf(ms): %0.05lf (IGNORED RUNS=%d)\n", dt, deser_runs);
printf("io time(ms) %lf decompress time(ms) %lf deserialize time(ms) %lf (IGNORED RUNS=%d)\n",
tokutime_to_seconds(bfe.io_time)*1000,
tokutime_to_seconds(bfe.decompress_time)*1000,
tokutime_to_seconds(bfe.deserialize_time)*1000,
deser_runs
);
toku_ftnode_free(&dn);
......@@ -394,19 +431,32 @@ test_serialize_nonleaf(int valsize, int nelts, double entropy) {
int
test_main (int argc __attribute__((__unused__)), const char *argv[] __attribute__((__unused__))) {
long valsize, nelts;
const int DEFAULT_RUNS = 5;
long valsize, nelts, ser_runs = DEFAULT_RUNS, deser_runs = DEFAULT_RUNS;
double entropy = 0.3;
if (argc != 3) {
fprintf(stderr, "Usage: %s <valsize> <nelts>\n", argv[0]);
if (argc != 3 && argc != 5) {
fprintf(stderr, "Usage: %s <valsize> <nelts> [<serialize_runs> <deserialize_runs>]\n", argv[0]);
fprintf(stderr, "Default (and min) runs is %d\n", DEFAULT_RUNS);
return 2;
}
valsize = strtol(argv[1], NULL, 0);
nelts = strtol(argv[2], NULL, 0);
if (argc == 5) {
ser_runs = strtol(argv[3], NULL, 0);
deser_runs = strtol(argv[4], NULL, 0);
}
if (ser_runs <= 0) {
ser_runs = DEFAULT_RUNS;
}
if (deser_runs <= 0) {
deser_runs = DEFAULT_RUNS;
}
initialize_dummymsn();
test_serialize_leaf(valsize, nelts, entropy);
test_serialize_nonleaf(valsize, nelts, entropy);
test_serialize_leaf(valsize, nelts, entropy, ser_runs, deser_runs);
test_serialize_nonleaf(valsize, nelts, entropy, ser_runs, deser_runs);
return 0;
}
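The benchmark change above averages timings by summing the start and end timeval fields over all runs and dividing the difference by the run count. A compact alternative way to express "average of N runs in milliseconds" is sketched below with std::chrono; average_ms is a hypothetical helper, not part of the test suite, and it assumes runs > 0.

```cpp
#include <cassert>
#include <chrono>

// Run fn `runs` times and return the mean wall-clock time per run in ms.
// steady_clock is used because it is monotonic, so a wall-clock adjustment
// during the benchmark cannot produce a negative interval.
template <typename Fn>
double average_ms(int runs, Fn fn) {
    using clock = std::chrono::steady_clock;
    std::chrono::duration<double, std::milli> total(0);
    for (int i = 0; i < runs; i++) {
        const auto start = clock::now();
        fn();
        total += clock::now() - start;
    }
    return total.count() / runs;
}
```

Accumulating per-run durations rather than separate start and end sums keeps each intermediate value small and avoids depending on the sums of tv_usec staying within range.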
This diff is collapsed.
......@@ -119,7 +119,7 @@ append_leaf(FTNODE leafnode, void *key, size_t keylen, void *val, size_t vallen)
DBT theval; toku_fill_dbt(&theval, val, vallen);
// get an index that we can use to create a new leaf entry
uint32_t idx = BLB_DATA(leafnode, 0)->omt_size();
uint32_t idx = BLB_DATA(leafnode, 0)->num_klpairs();
MSN msn = next_dummymsn();
......
......@@ -96,13 +96,18 @@ le_add_to_bn(bn_data* bn, uint32_t idx, const char *key, int keysize, const cha
{
LEAFENTRY r = NULL;
uint32_t size_needed = LE_CLEAN_MEMSIZE(valsize);
void *maybe_free = nullptr;
bn->get_space_for_insert(
idx,
key,
keysize,
size_needed,
&r
&r,
&maybe_free
);
if (maybe_free) {
toku_free(maybe_free);
}
resource_assert(r);
r->type = LE_CLEAN;
r->u.clean.vallen = valsize;
......@@ -113,14 +118,19 @@ static void
le_overwrite(bn_data* bn, uint32_t idx, const char *key, int keysize, const char *val, int valsize) {
LEAFENTRY r = NULL;
uint32_t size_needed = LE_CLEAN_MEMSIZE(valsize);
void *maybe_free = nullptr;
bn->get_space_for_overwrite(
idx,
key,
keysize,
size_needed, // old_le_size
size_needed,
&r
&r,
&maybe_free
);
if (maybe_free) {
toku_free(maybe_free);
}
resource_assert(r);
r->type = LE_CLEAN;
r->u.clean.vallen = valsize;
......
......@@ -734,7 +734,7 @@ flush_to_leaf(FT_HANDLE t, bool make_leaf_up_to_date, bool use_flush) {
int total_messages = 0;
for (i = 0; i < 8; ++i) {
total_messages += BLB_DATA(child, i)->omt_size();
total_messages += BLB_DATA(child, i)->num_klpairs();
}
assert(total_messages <= num_parent_messages + num_child_messages);
......@@ -747,7 +747,7 @@ flush_to_leaf(FT_HANDLE t, bool make_leaf_up_to_date, bool use_flush) {
memset(parent_messages_present, 0, sizeof parent_messages_present);
memset(child_messages_present, 0, sizeof child_messages_present);
for (int j = 0; j < 8; ++j) {
uint32_t len = BLB_DATA(child, j)->omt_size();
uint32_t len = BLB_DATA(child, j)->num_klpairs();
for (uint32_t idx = 0; idx < len; ++idx) {
LEAFENTRY le;
DBT keydbt, valdbt;
......@@ -969,7 +969,7 @@ flush_to_leaf_with_keyrange(FT_HANDLE t, bool make_leaf_up_to_date) {
int total_messages = 0;
for (i = 0; i < 8; ++i) {
total_messages += BLB_DATA(child, i)->omt_size();
total_messages += BLB_DATA(child, i)->num_klpairs();
}
assert(total_messages <= num_parent_messages + num_child_messages);
......@@ -1145,10 +1145,10 @@ compare_apply_and_flush(FT_HANDLE t, bool make_leaf_up_to_date) {
toku_ftnode_free(&parentnode);
for (int j = 0; j < 8; ++j) {
BN_DATA first = BLB_DATA(child1, j);
BN_DATA second = BLB_DATA(child2, j);
uint32_t len = first->omt_size();
assert(len == second->omt_size());
bn_data* first = BLB_DATA(child1, j);
bn_data* second = BLB_DATA(child2, j);
uint32_t len = first->num_klpairs();
assert(len == second->num_klpairs());
for (uint32_t idx = 0; idx < len; ++idx) {
LEAFENTRY le1, le2;
DBT key1dbt, val1dbt, key2dbt, val2dbt;
......
......@@ -352,7 +352,7 @@ doit (int state) {
assert(node->height == 0);
assert(!node->dirty);
assert(node->n_children == 1);
assert(BLB_DATA(node, 0)->omt_size() == 1);
assert(BLB_DATA(node, 0)->num_klpairs() == 1);
toku_unpin_ftnode(c_ft->ft, node);
toku_pin_ftnode_with_dep_nodes(
......@@ -369,7 +369,7 @@ doit (int state) {
assert(node->height == 0);
assert(!node->dirty);
assert(node->n_children == 1);
assert(BLB_DATA(node, 0)->omt_size() == 1);
assert(BLB_DATA(node, 0)->num_klpairs() == 1);
toku_unpin_ftnode(c_ft->ft, node);
}
else if (state == ft_flush_aflter_merge || state == flt_flush_before_unpin_remove) {
......@@ -387,7 +387,7 @@ doit (int state) {
assert(node->height == 0);
assert(!node->dirty);
assert(node->n_children == 1);
assert(BLB_DATA(node, 0)->omt_size() == 2);
assert(BLB_DATA(node, 0)->num_klpairs() == 2);
toku_unpin_ftnode(c_ft->ft, node);
}
else {
......
......@@ -355,7 +355,7 @@ doit (int state) {
assert(node->height == 0);
assert(!node->dirty);
assert(node->n_children == 1);
assert(BLB_DATA(node, 0)->omt_size() == 2);
assert(BLB_DATA(node, 0)->num_klpairs() == 2);
toku_unpin_ftnode(c_ft->ft, node);
toku_pin_ftnode(
......@@ -370,10 +370,9 @@ doit (int state) {
assert(node->height == 0);
assert(!node->dirty);
assert(node->n_children == 1);
assert(BLB_DATA(node, 0)->omt_size() == 2);
assert(BLB_DATA(node, 0)->num_klpairs() == 2);
toku_unpin_ftnode(c_ft->ft, node);
DBT k;
struct check_pair pair1 = {2, "a", 0, NULL, 0};
r = toku_ft_lookup(c_ft, toku_fill_dbt(&k, "a", 2), lookup_checkf, &pair1);
......
......@@ -338,7 +338,7 @@ doit (bool after_split) {
assert(node->height == 0);
assert(!node->dirty);
assert(node->n_children == 1);
assert(BLB_DATA(node, 0)->omt_size() == 1);
assert(BLB_DATA(node, 0)->num_klpairs() == 1);
toku_unpin_ftnode(c_ft->ft, node);
toku_pin_ftnode(
......@@ -353,7 +353,7 @@ doit (bool after_split) {
assert(node->height == 0);
assert(!node->dirty);
assert(node->n_children == 1);
assert(BLB_DATA(node, 0)->omt_size() == 1);
assert(BLB_DATA(node, 0)->num_klpairs() == 1);
toku_unpin_ftnode(c_ft->ft, node);
}
else {
......@@ -369,7 +369,7 @@ doit (bool after_split) {
assert(node->height == 0);
assert(!node->dirty);
assert(node->n_children == 1);
assert(BLB_DATA(node, 0)->omt_size() == 2);
assert(BLB_DATA(node, 0)->num_klpairs() == 2);
toku_unpin_ftnode(c_ft->ft, node);
}
......
......@@ -213,7 +213,7 @@ test_le_offsets (void) {
static void
test_ule_packs_to_nothing (ULE ule) {
LEAFENTRY le;
int r = le_pack(ule, NULL, 0, NULL, 0, 0, &le);
int r = le_pack(ule, NULL, 0, NULL, 0, 0, &le, nullptr);
assert(r==0);
assert(le==NULL);
}
......@@ -319,7 +319,7 @@ test_le_pack_committed (void) {
size_t memsize;
LEAFENTRY le;
int r = le_pack(&ule, nullptr, 0, nullptr, 0, 0, &le);
int r = le_pack(&ule, nullptr, 0, nullptr, 0, 0, &le, nullptr);
assert(r==0);
assert(le!=NULL);
memsize = le_memsize_from_ule(&ule);
......@@ -329,7 +329,7 @@ test_le_pack_committed (void) {
verify_ule_equal(&ule, &tmp_ule);
LEAFENTRY tmp_le;
size_t tmp_memsize;
r = le_pack(&tmp_ule, nullptr, 0, nullptr, 0, 0, &tmp_le);
r = le_pack(&tmp_ule, nullptr, 0, nullptr, 0, 0, &tmp_le, nullptr);
tmp_memsize = le_memsize_from_ule(&tmp_ule);
assert(r==0);
assert(tmp_memsize == memsize);
......@@ -377,7 +377,7 @@ test_le_pack_uncommitted (uint8_t committed_type, uint8_t prov_type, int num_pla
size_t memsize;
LEAFENTRY le;
int r = le_pack(&ule, nullptr, 0, nullptr, 0, 0, &le);
int r = le_pack(&ule, nullptr, 0, nullptr, 0, 0, &le, nullptr);
assert(r==0);
assert(le!=NULL);
memsize = le_memsize_from_ule(&ule);
......@@ -387,7 +387,7 @@ test_le_pack_uncommitted (uint8_t committed_type, uint8_t prov_type, int num_pla
verify_ule_equal(&ule, &tmp_ule);
LEAFENTRY tmp_le;
size_t tmp_memsize;
r = le_pack(&tmp_ule, nullptr, 0, nullptr, 0, 0, &tmp_le);
r = le_pack(&tmp_ule, nullptr, 0, nullptr, 0, 0, &tmp_le, nullptr);
tmp_memsize = le_memsize_from_ule(&tmp_ule);
assert(r==0);
assert(tmp_memsize == memsize);
......@@ -448,7 +448,7 @@ test_le_apply(ULE ule_initial, FT_MSG msg, ULE ule_expected) {
LEAFENTRY le_expected;
LEAFENTRY le_result;
r = le_pack(ule_initial, nullptr, 0, nullptr, 0, 0, &le_initial);
r = le_pack(ule_initial, nullptr, 0, nullptr, 0, 0, &le_initial, nullptr);
CKERR(r);
size_t result_memsize = 0;
......@@ -467,7 +467,7 @@ test_le_apply(ULE ule_initial, FT_MSG msg, ULE ule_expected) {
}
size_t expected_memsize = 0;
r = le_pack(ule_expected, nullptr, 0, nullptr, 0, 0, &le_expected);
r = le_pack(ule_expected, nullptr, 0, nullptr, 0, 0, &le_expected, nullptr);
CKERR(r);
if (le_expected) {
expected_memsize = leafentry_memsize(le_expected);
......@@ -749,7 +749,7 @@ test_le_apply_messages(void) {
static bool ule_worth_running_garbage_collection(ULE ule, TXNID oldest_referenced_xid_known) {
LEAFENTRY le;
int r = le_pack(ule, nullptr, 0, nullptr, 0, 0, &le); CKERR(r);
int r = le_pack(ule, nullptr, 0, nullptr, 0, 0, &le, nullptr); CKERR(r);
invariant_notnull(le);
txn_gc_info gc_info(nullptr, oldest_referenced_xid_known, oldest_referenced_xid_known, true);
bool worth_running = toku_le_worth_running_garbage_collection(le, &gc_info);
......
......@@ -189,7 +189,7 @@ doit (void) {
r = toku_testsetup_root(t, node_root);
assert(r==0);
char filler[900];
char filler[900-2*bn_data::HEADER_LENGTH];
memset(filler, 0, sizeof(filler));
// now we insert filler data so that a merge does not happen
r = toku_testsetup_insert_to_leaf (
......
......@@ -119,13 +119,18 @@ le_add_to_bn(bn_data* bn, uint32_t idx, const char *key, int keysize, const cha
{
LEAFENTRY r = NULL;
uint32_t size_needed = LE_CLEAN_MEMSIZE(valsize);
void *maybe_free = nullptr;
bn->get_space_for_insert(
idx,
key,
keysize,
size_needed,
&r
&r,
&maybe_free
);
if (maybe_free) {
toku_free(maybe_free);
}
resource_assert(r);
r->type = LE_CLEAN;
r->u.clean.vallen = valsize;
......
......@@ -122,7 +122,7 @@ append_leaf(FTNODE leafnode, void *key, size_t keylen, void *val, size_t vallen)
DBT theval; toku_fill_dbt(&theval, val, vallen);
// get an index that we can use to create a new leaf entry
uint32_t idx = BLB_DATA(leafnode, 0)->omt_size();
uint32_t idx = BLB_DATA(leafnode, 0)->num_klpairs();
MSN msn = next_dummymsn();
......
......@@ -111,7 +111,7 @@ append_leaf(FTNODE leafnode, void *key, size_t keylen, void *val, size_t vallen)
DBT theval; toku_fill_dbt(&theval, val, vallen);
// get an index that we can use to create a new leaf entry
uint32_t idx = BLB_DATA(leafnode, 0)->omt_size();
uint32_t idx = BLB_DATA(leafnode, 0)->num_klpairs();
// apply an insert to the leaf node
MSN msn = next_dummymsn();
......
......@@ -112,7 +112,7 @@ append_leaf(FTNODE leafnode, void *key, size_t keylen, void *val, size_t vallen)
DBT theval; toku_fill_dbt(&theval, val, vallen);
// get an index that we can use to create a new leaf entry
uint32_t idx = BLB_DATA(leafnode, 0)->omt_size();
uint32_t idx = BLB_DATA(leafnode, 0)->num_klpairs();
// apply an insert to the leaf node
MSN msn = next_dummymsn();
......
......@@ -111,7 +111,7 @@ append_leaf(FTNODE leafnode, void *key, size_t keylen, void *val, size_t vallen)
DBT theval; toku_fill_dbt(&theval, val, vallen);
// get an index that we can use to create a new leaf entry
uint32_t idx = BLB_DATA(leafnode, 0)->omt_size();
uint32_t idx = BLB_DATA(leafnode, 0)->num_klpairs();
// apply an insert to the leaf node
MSN msn = next_dummymsn();
......
......@@ -112,7 +112,7 @@ append_leaf(FTNODE leafnode, void *key, size_t keylen, void *val, size_t vallen)
DBT theval; toku_fill_dbt(&theval, val, vallen);
// get an index that we can use to create a new leaf entry
uint32_t idx = BLB_DATA(leafnode, 0)->omt_size();
uint32_t idx = BLB_DATA(leafnode, 0)->num_klpairs();
// apply an insert to the leaf node
MSN msn = next_dummymsn();
......
......@@ -114,7 +114,7 @@ append_leaf(FTNODE leafnode, void *key, size_t keylen, void *val, size_t vallen)
toku_fill_dbt(&theval, val, vallen);
// get an index that we can use to create a new leaf entry
uint32_t idx = BLB_DATA(leafnode, 0)->omt_size();
uint32_t idx = BLB_DATA(leafnode, 0)->num_klpairs();
// apply an insert to the leaf node
MSN msn = next_dummymsn();
......
......@@ -111,7 +111,7 @@ append_leaf(FTNODE leafnode, void *key, size_t keylen, void *val, size_t vallen)
DBT theval; toku_fill_dbt(&theval, val, vallen);
// get an index that we can use to create a new leaf entry
uint32_t idx = BLB_DATA(leafnode, 0)->omt_size();
uint32_t idx = BLB_DATA(leafnode, 0)->num_klpairs();
// apply an insert to the leaf node
MSN msn = next_dummymsn();
......
......@@ -315,9 +315,9 @@ dump_node (int f, BLOCKNUM blocknum, FT h) {
}
} else {
printf(" n_bytes_in_buffer= %" PRIu64 "", BLB_DATA(n, i)->get_disk_size());
printf(" items_in_buffer=%u\n", BLB_DATA(n, i)->omt_size());
printf(" items_in_buffer=%u\n", BLB_DATA(n, i)->num_klpairs());
if (dump_data) {
BLB_DATA(n, i)->omt_iterate<void, print_le>(NULL);
BLB_DATA(n, i)->iterate<void, print_le>(NULL);
}
}
}
......
......@@ -149,7 +149,8 @@ le_pack(ULE ule, // data to be packed into new leafentry
void* keyp,
uint32_t keylen,
uint32_t old_le_size,
LEAFENTRY * const new_leafentry_p // this is what this function creates
LEAFENTRY * const new_leafentry_p, // this is what this function creates
void **const maybe_free
);
......
This diff is collapsed.
......@@ -187,6 +187,13 @@ static inline void wbuf_uint (struct wbuf *w, uint32_t i) {
wbuf_int(w, (int32_t)i);
}
static inline uint8_t* wbuf_nocrc_reserve_literal_bytes(struct wbuf *w, uint32_t nbytes) {
assert(w->ndone + nbytes <= w->size);
uint8_t * dest = w->buf + w->ndone;
w->ndone += nbytes;
return dest;
}
static inline void wbuf_nocrc_literal_bytes(struct wbuf *w, bytevec bytes_bv, uint32_t nbytes) {
const unsigned char *bytes = (const unsigned char *) bytes_bv;
#if 0
......
This diff is collapsed.
This diff is collapsed.
......@@ -130,8 +130,10 @@ void toku_mempool_init(struct mempool *mp, void *base, size_t free_offset, size_
*/
void toku_mempool_construct(struct mempool *mp, size_t data_size) {
if (data_size) {
mp->base = toku_xmalloc(data_size);
mp->size = data_size;
// add 25% slack
size_t mp_size = data_size + (data_size / 4);
mp->base = toku_xmalloc_aligned(64, mp_size);
mp->size = mp_size;
mp->free_offset = 0;
mp->frag_size = 0;
}
......@@ -140,6 +142,22 @@ void toku_mempool_construct(struct mempool *mp, size_t data_size) {
}
}
void toku_mempool_reset(struct mempool *mp) {
mp->free_offset = 0;
mp->frag_size = 0;
}
void toku_mempool_realloc_larger(struct mempool *mp, size_t data_size) {
invariant(data_size >= mp->free_offset);
size_t mpsize = data_size + (data_size/4); // allow 1/4 room for expansion (would be wasted if read-only)
void* newmem = toku_xmalloc_aligned(64, mpsize); // allocate new buffer for mempool
memcpy(newmem, mp->base, mp->free_offset); // Copy old info
toku_free(mp->base);
mp->base = newmem;
mp->size = mpsize;
}
void toku_mempool_destroy(struct mempool *mp) {
// printf("mempool_destroy %p %p %lu %lu\n", mp, mp->base, mp->size, mp->frag_size);
......@@ -148,27 +166,44 @@ void toku_mempool_destroy(struct mempool *mp) {
toku_mempool_zero(mp);
}
void *toku_mempool_get_base(struct mempool *mp) {
void *toku_mempool_get_base(const struct mempool *mp) {
return mp->base;
}
size_t toku_mempool_get_size(struct mempool *mp) {
void *toku_mempool_get_pointer_from_base_and_offset(const struct mempool *mp, size_t offset) {
return reinterpret_cast<void*>(reinterpret_cast<char*>(mp->base) + offset);
}
size_t toku_mempool_get_offset_from_pointer_and_base(const struct mempool *mp, const void* p) {
paranoid_invariant(p >= mp->base);
return reinterpret_cast<const char*>(p) - reinterpret_cast<const char*>(mp->base);
}
size_t toku_mempool_get_size(const struct mempool *mp) {
return mp->size;
}
size_t toku_mempool_get_frag_size(struct mempool *mp) {
size_t toku_mempool_get_frag_size(const struct mempool *mp) {
return mp->frag_size;
}
size_t toku_mempool_get_used_space(struct mempool *mp) {
size_t toku_mempool_get_used_size(const struct mempool *mp) {
return mp->free_offset - mp->frag_size;
}
size_t toku_mempool_get_free_space(struct mempool *mp) {
void* toku_mempool_get_next_free_ptr(const struct mempool *mp) {
return toku_mempool_get_pointer_from_base_and_offset(mp, mp->free_offset);
}
size_t toku_mempool_get_offset_limit(const struct mempool *mp) {
return mp->free_offset;
}
size_t toku_mempool_get_free_size(const struct mempool *mp) {
return mp->size - mp->free_offset;
}
size_t toku_mempool_get_allocated_space(struct mempool *mp) {
size_t toku_mempool_get_allocated_size(const struct mempool *mp) {
return mp->free_offset;
}
......@@ -209,10 +244,10 @@ size_t toku_mempool_footprint(struct mempool *mp) {
return rval;
}
void toku_mempool_clone(struct mempool* orig_mp, struct mempool* new_mp) {
void toku_mempool_clone(const struct mempool* orig_mp, struct mempool* new_mp) {
new_mp->frag_size = orig_mp->frag_size;
new_mp->free_offset = orig_mp->free_offset;
new_mp->size = orig_mp->free_offset; // only make the cloned mempool store what is needed
new_mp->base = toku_xmalloc(new_mp->size);
new_mp->base = toku_xmalloc_aligned(64, new_mp->size);
memcpy(new_mp->base, orig_mp->base, new_mp->size);
}
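
Review note: the mempool hunks above add 25% slack on construction, add a grow-by-copy path (`toku_mempool_realloc_larger`), and add helpers that convert between pointers and offsets from `base`. The sketch below is a minimal, self-contained illustration of how those pieces fit together. It is not the ft-index code: `sketch_mempool` and every `sketch_*` function are hypothetical names, `std::malloc` stands in for `toku_xmalloc_aligned(64, ...)`, and the zero-size branch of `toku_mempool_construct` is omitted (the sketch assumes `data_size > 0`). The point it tries to show is that an offset stays valid after the pool is reallocated, whereas a raw pointer into the old buffer does not, which is presumably why the offset helpers were introduced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for struct mempool; field names follow the diff.
struct sketch_mempool {
    void  *base;         // start of the backing buffer
    size_t free_offset;  // everything below this offset has been handed out
    size_t frag_size;    // bytes handed out and later abandoned
    size_t size;         // total capacity of the backing buffer
};

// Mirrors the new construct path: requested size plus 25% slack.
// Assumes data_size > 0; the real function zeroes the pool otherwise.
static void sketch_construct(sketch_mempool *mp, size_t data_size) {
    size_t mp_size = data_size + (data_size / 4);
    mp->base = std::malloc(mp_size);  // stand-in for toku_xmalloc_aligned(64, mp_size)
    assert(mp->base != nullptr);
    mp->size = mp_size;
    mp->free_offset = 0;
    mp->frag_size = 0;
}

// Offset -> pointer, relative to the current base.
static void *sketch_ptr_from_offset(const sketch_mempool *mp, size_t offset) {
    return static_cast<char *>(mp->base) + offset;
}

// Pointer -> offset, relative to the current base.
static size_t sketch_offset_from_ptr(const sketch_mempool *mp, const void *p) {
    assert(p >= mp->base);
    return static_cast<size_t>(static_cast<const char *>(p) -
                               static_cast<const char *>(mp->base));
}

// Mirrors the grow path: allocate a larger buffer (again with 25% slack),
// copy the used prefix, and free the old buffer. free_offset and frag_size
// are left unchanged, so stored offsets remain meaningful.
static void sketch_realloc_larger(sketch_mempool *mp, size_t data_size) {
    assert(data_size >= mp->free_offset);
    size_t mp_size = data_size + (data_size / 4);
    void *newmem = std::malloc(mp_size);
    assert(newmem != nullptr);
    std::memcpy(newmem, mp->base, mp->free_offset);
    std::free(mp->base);
    mp->base = newmem;
    mp->size = mp_size;
}
```

A caller that keeps offsets would re-derive pointers with `sketch_ptr_from_offset` after any call that may grow the pool, rather than caching pointers across such calls.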
This diff is collapsed.