- 17 Jul, 2015 19 commits
-
-
Kevin Modzelewski authored
The main capi calling convention is to box all the positional arguments into a tuple, and then pass the tuple to PyArg_ParseTuple along with a format string that describes how to parse out the arguments. This ends up being pretty wasteful and misses all of the fast argument-rearrangement that we are able to JIT out. These unicode functions are particularly egregious, since they use a helper function that ends up having to dynamically generate the format string to include the function name.

This commit is a very simple change that gets some of the common cases: in addition to the existing METH_O calling convention ('self' plus one positional arg), add the METH_O2 and METH_O3 calling conventions. Plus add METH_D1/D2/D3 as additional flags that can be or'd into the calling convention flags, which specify that there should be some number of default arguments.

This is pretty limited:
- only handles up to 3 arguments / defaults
- only handles "O" type specifiers (ie no unboxing of ints)
- only allows NULL as the default value
- doesn't give as much diagnostic info on error

The first two could be handled by passing the format string as part of the function metadata instead of using it in the function body, though this would mean having to add the ability to understand the format strings. The last two issues are tricky from an API perspective since they would require a larger change to pass through variable-length data structures. So anyway, punt on those issues for now, and just use the simple flag approach.

This cuts the function call overhead by about 4x for the functions that it's applied to, which are some common ones: string.count, unicode.count, unicode.startswith. (endswith, [r]find, and [r]index should all get updated as well)
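The flag scheme described above can be modeled roughly as follows. This is a minimal Python sketch, not the actual C implementation: the numeric flag values, the `call_with_flags` dispatcher, and the `str_count` helper are all hypothetical, and `None` stands in for the NULL default.

```python
# Hypothetical flag values, chosen only for illustration.
METH_O, METH_O2, METH_O3 = 0x01, 0x02, 0x04   # 'self' plus 1, 2 or 3 positional args
METH_D1, METH_D2, METH_D3 = 0x10, 0x20, 0x30  # 1, 2 or 3 trailing args may be omitted

ARG_MASK = METH_O | METH_O2 | METH_O3
DEFAULT_MASK = 0x30
ARG_COUNT = {METH_O: 1, METH_O2: 2, METH_O3: 3}
DEFAULT_COUNT = {0: 0, METH_D1: 1, METH_D2: 2, METH_D3: 3}


def call_with_flags(func, flags, self_obj, *args):
    """Dispatch without building a tuple plus format string (hypothetical helper)."""
    max_args = ARG_COUNT[flags & ARG_MASK]
    min_args = max_args - DEFAULT_COUNT[flags & DEFAULT_MASK]
    if not (min_args <= len(args) <= max_args):
        # Less diagnostic detail than a format string would give.
        raise TypeError("wrong number of arguments")
    # Missing trailing arguments are filled with None (standing in for NULL).
    padded = args + (None,) * (max_args - len(args))
    return func(self_obj, *padded)


def str_count(self, sub, start, end):
    """Stand-in for a count implementation that receives NULL defaults."""
    start = 0 if start is None else start
    end = len(self) if end is None else end
    return self.count(sub, start, end)


print(call_with_flags(str_count, METH_O3 | METH_D2, "banana", "a"))
print(call_with_flags(str_count, METH_O3 | METH_D2, "banana", "a", 2))
```

The callee always sees a fixed number of positional slots, which is what lets the caller skip the boxing and parsing step.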
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
ie django_template minus the lexing. We are faster now on the lexing, but the parsing is where most of the time gets spent. Also, change this benchmark and django_lexing to have a unicode template. Usually django does that conversion automatically, but the templates bypass where that happens, and we end up doing a lot of extra unicode decoding.
-
Kevin Modzelewski authored
Convert "a in (b, c)" to "a == b or a == c"
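As a rough illustration of the rewrite named in this commit title, the sketch below compares the two forms in plain Python. The helper name `contains_expanded` is hypothetical. Note one known difference in standard Python semantics: tuple containment checks identity before equality, so the two forms can disagree for values such as NaN that are not equal to themselves.

```python
def contains_expanded(a, b, c):
    """Expanded form of `a in (b, c)` using only equality comparisons."""
    return a == b or a == c


# For ordinary values the two forms agree.
for a in (1, 2, 3, "x"):
    assert (a in (1, "x")) == contains_expanded(a, 1, "x")

# Containment tests identity first, so NaN is "in" a tuple holding the same object,
# even though NaN != NaN under plain equality.
nan = float("nan")
print(nan in (nan, 1.0))
print(contains_expanded(nan, nan, 1.0))
```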
-
Kevin Modzelewski authored
optimize regex handling
-
Kevin Modzelewski authored
some fixes and cleanups
-
Kevin Modzelewski authored
Another 2.7.9 compatibility fix
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Do this by adding "contains" to our codegen type system, and implement a special contains on the unboxedtuple type.

This makes this operation quite a lot faster, but it looks like largely because we don't implement a couple optimizations that we should:
- we create a new tuple object every time we hit that line
- our generic contains code goes through compare(), which returns a box (since "<" and friends can return non-bools), but contains will always return a bool, so we have a bunch of extra boxing/unboxing

We probably should separate out the contains logic from the rest of the comparisons, since it works quite differently and doesn't gain anything by being there.
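The difference between comparisons and containment noted above is visible at the Python level. In this small sketch the `Vec` class is a made-up example: its `__lt__` returns a list, while the result of its `__contains__` is coerced to a bool by the `in` operator.

```python
class Vec:
    """Hypothetical container used only to illustrate the two protocols."""

    def __init__(self, items):
        self.items = list(items)

    def __lt__(self, other):
        # Rich comparisons may return any object, here an elementwise list.
        return [x < other for x in self.items]

    def __contains__(self, item):
        # Returns an int, but `in` always converts the result to a bool.
        return self.items.count(item)


v = Vec([1, 2, 2])
print(v < 2)   # a list, not a bool
print(2 in v)  # always a bool
```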
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
- copy CPython's implementation (that uses C slots)
- implement the C slots for str and list
- avoid doing a division for non-step slices
-
Kevin Modzelewski authored
It was unused
-
Kevin Modzelewski authored
Particularly for string slicing, where we would always memset the string data to zero, and then immediately memcpy it.
-
Kevin Modzelewski authored
- put it into a header file (and start including it)
- move the grow-the-array part into a separate function to encourage the fast-path to get inlined.
-
Kevin Modzelewski authored
This division is expensive; the divisor is always sizeof(char) or sizeof(Py_UNICODE), and it seems to be faster to do a branch and then possibly a shift.
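The branch-then-shift idea can be sketched as below. This is only an illustrative Python model of what would be C code; the function name `bytes_to_items` is hypothetical, and it assumes the item size is a power of two (1 for char, 2 or 4 for Py_UNICODE).

```python
def bytes_to_items(nbytes, itemsize):
    """Convert a byte count to an item count without a general division."""
    if itemsize == 1:
        # sizeof(char) case: nothing to do.
        return nbytes
    # Power-of-two item size: divide by shifting right by log2(itemsize).
    shift = itemsize.bit_length() - 1
    return nbytes >> shift


print(bytes_to_items(12, 1), bytes_to_items(12, 2), bytes_to_items(12, 4))
```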
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
int(str) and int(float) don't always return ints (they can return longs, doh). If we call int() on an instance of a subclass of int, we should call its __int__ method in case the subclass overrode it.
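The subclass case can be shown with a short example. The `MyInt` class is hypothetical, and the snippet is written to run on current Python, where the int/long split mentioned in the message (a Python 2 detail) no longer exists.

```python
class MyInt(int):
    """Hypothetical int subclass that overrides __int__."""

    def __int__(self):
        return 42


x = MyInt(7)
print(x + 0)   # arithmetic still uses the underlying value 7
print(int(x))  # int() honors the overridden __int__
```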
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
And some other small cleanups
-
- 16 Jul, 2015 6 commits
-
-
Kevin Modzelewski authored
This only does the lexing portion of the process. Further cut that down into a re.split ubench
-
Chris Toshok authored
Some refactors in GC code + class-freed-before-instance bug fix.
-
Rudi Chen authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Add sre_compile_test.expected
-
Rudi Chen authored
In some rare instances, class objects can be freed before the last instance of that class, causing a problem in the sweep phase where we look at the class of the object being freed. So we keep unreachable classes around for an extra collection to be safe.
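A toy model of the deferral described above might look like this. It is a sketch under simplifying assumptions, not the actual collector: the `Obj` record, the `sweep` function, and the way deferred classes are tracked are all hypothetical.

```python
class Obj:
    """Toy heap object; `cls` points at the object's class (another Obj)."""

    def __init__(self, name, cls=None, is_class=False):
        self.name, self.cls, self.is_class = name, cls, is_class
        self.marked = False
        self.freed = False


def sweep(heap, previously_deferred):
    """Free unmarked objects, keeping unreachable classes for one extra collection."""
    survivors, deferred = [], set()
    for obj in heap:
        if obj.marked:
            survivors.append(obj)
        elif obj.is_class and obj not in previously_deferred:
            # Unreachable class seen for the first time: keep it one more cycle.
            survivors.append(obj)
            deferred.add(obj)
        else:
            # Freeing an object inspects its class, so the class must still be alive.
            if obj.cls is not None:
                assert not obj.cls.freed, "class freed before its instance"
            obj.freed = True
    for obj in survivors:
        obj.marked = False  # reset marks for the next collection
    return survivors, deferred


klass = Obj("C", is_class=True)
inst = Obj("c", cls=klass)
heap, deferred = sweep([klass, inst], set())  # class comes first in sweep order
heap2, deferred2 = sweep(heap, deferred)
```

With the class ordered before its instance, the first sweep frees only the instance and the second sweep frees the class.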
-
- 15 Jul, 2015 15 commits
-
-
Rudi Chen authored
-
Rudi Chen authored
-
Kevin Modzelewski authored
Ie, manually specify the reference output for sre_compile_test.py instead of running CPython to generate it. CPython changed the behavior (and interface) of _optimize_charset at some point between 2.7.7 (where we copied from) and 2.7.9. We should probably copy their new implementation (seems to have a few more optimizations), but for now this should fix #707.
-
Kevin Modzelewski authored
jemalloc
-
Daniel Agar authored
-
Kevin Modzelewski authored
some misc microoptimizations
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Yes they exist, for example "try getting this attribute and if it exists call it, otherwise do something else". Probably not a huge perf improvement since the exception-throwing will probably dominate. Use the same "only do this for immortal strings" trick to get around gc issues.
-
Kevin Modzelewski authored
Only do this for calls with immortally-interned strings so that we can side-step the track-gc-references-in-ics issue for now.
-
Kevin Modzelewski authored
type speculation support
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
I have a whole bunch of mechanical changes to allow speculation, but I think our pre-existing speculation rules end up hurting the macrobenchmarks. I'd like to get those in and then separately work on making it beneficial.
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-