Use a __thread cache for the GC's thread-local ThreadBlockCache
__thread seems quite a bit faster than pthread_getspecific, so if we give up on having multiple Heap objects, we can store a reference to the current thread's ThreadBlockCache in a static __thread variable. It looks like this ends up mattering (5% average speedup), since SmallArena::_alloc() is so hot.
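
A rough sketch of the idea, not the actual patch: the ThreadBlockCache struct below is a hypothetical stand-in, and getCacheSlow(), getCacheFast(), cache_key and cached_cache are illustrative names only. The slow path looks the cache up through a pthread key; the fast path keeps the pointer in a static __thread variable, which only works if there is a single heap per process.

```cpp
#include <pthread.h>

// Hypothetical stand-in for the GC's per-thread cache; the real class
// holds per-thread block lists rather than a counter.
struct ThreadBlockCache {
    int allocations = 0;
};

// Slow path: per-thread lookup through a pthread key.
static pthread_key_t cache_key;
static pthread_once_t cache_key_once = PTHREAD_ONCE_INIT;

// Runs at thread exit for any thread that stored a non-null value.
static void destroyCache(void* p) {
    delete static_cast<ThreadBlockCache*>(p);
}

static void makeKey() {
    pthread_key_create(&cache_key, destroyCache);
}

ThreadBlockCache* getCacheSlow() {
    pthread_once(&cache_key_once, makeKey);
    auto* cache = static_cast<ThreadBlockCache*>(pthread_getspecific(cache_key));
    if (!cache) {
        cache = new ThreadBlockCache();
        pthread_setspecific(cache_key, cache);
    }
    return cache;
}

// Fast path: remember the pointer in a static __thread variable
// (a GCC/Clang extension). A single static slot assumes one heap.
static __thread ThreadBlockCache* cached_cache = nullptr;

ThreadBlockCache* getCacheFast() {
    if (!cached_cache)
        cached_cache = getCacheSlow();  // first call on this thread only
    return cached_cache;
}
```

In this sketch the pthread key still owns the object, so it is freed at thread exit; the __thread variable only caches the pointer, and a hot allocation path would call getCacheFast() instead of doing the key lookup each time.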