Experimental: speed up calling of capi code
The main capi calling convention is to box all the positional arguments into a tuple, and then pass the tuple to PyArg_ParseTuple along with a format string that describes how to parse out the arguments. This ends up being pretty wasteful and misses all of the fast argument-rearrangement that we are able to JIT out. These unicode functions are particularly egregious, since they use a helper function that ends up having to dynamically generate the format string to include the function name.

This commit is a very simple change that gets some of the common cases: in addition to the existing METH_O calling convention ('self' plus one positional arg), add the METH_O2 and METH_O3 calling conventions. Plus add METH_D1/D2/D3 as additional flags that can be or'd into the calling convention flags, which specify that there should be some number of default arguments.

This is pretty limited:
- only handles up to 3 arguments / defaults
- only handles "O" type specifiers (i.e. no unboxing of ints)
- only allows NULL as the default value
- doesn't give as much diagnostic info on error

The first two could be handled by passing the format string as part of the function metadata instead of using it in the function body, though this would mean having to add the ability to understand the format strings. The last two issues are tricky from an API perspective since they would require a larger change to pass through variable-length data structures.

So anyway, punt on those issues for now, and just use the simple flag approach.

This cuts the function call overhead by about 4x for the functions that it's applied to, which are some common ones: string.count, unicode.count, unicode.startswith. (endswith, [r]find, and [r]index should all get updated as well)
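To make the flag scheme concrete, here is a minimal, self-contained sketch of how a caller might turn the flags into a fixed three-slot argument array, padding missing trailing arguments with NULL. It is not the code from this commit: the SK_* flag values, the function names, and the use of void * as a stand-in for PyObject * are all assumptions made for illustration, and it does not include Python.h.

```c
#include <stddef.h>

/* Hypothetical flag values, for illustration only. The real METH_O2/O3 and
 * METH_D1/D2/D3 values are not given in the commit message. */
enum {
    SK_METH_O  = 0x01, /* 'self' plus one positional arg */
    SK_METH_O2 = 0x02, /* 'self' plus two positional args */
    SK_METH_O3 = 0x04, /* 'self' plus three positional args */
    SK_METH_D1 = 0x10, /* last one arg may be omitted */
    SK_METH_D2 = 0x20, /* last two args may be omitted */
    SK_METH_D3 = 0x40  /* last three args may be omitted */
};

/* Maximum number of positional args implied by the flags, or -1 if none. */
static int sk_max_args(int flags)
{
    if (flags & SK_METH_O3) return 3;
    if (flags & SK_METH_O2) return 2;
    if (flags & SK_METH_O)  return 1;
    return -1;
}

/* Number of trailing args that may be left out. */
static int sk_num_defaults(int flags)
{
    if (flags & SK_METH_D3) return 3;
    if (flags & SK_METH_D2) return 2;
    if (flags & SK_METH_D1) return 1;
    return 0;
}

/* Fill out[0..2] from args, using NULL for every omitted argument (the only
 * default value the scheme allows). Returns the number of slots the callee
 * expects, or -1 on an argument-count mismatch. A bare -1 mirrors the
 * "less diagnostic info on error" limitation. */
int sk_bind_args(int flags, void *const *args, int nargs, void *out[3])
{
    int max = sk_max_args(flags);
    int ndef = sk_num_defaults(flags);
    int i;

    if (max < 0 || ndef > max)
        return -1; /* unsupported or inconsistent flag combination */
    if (nargs > max || nargs < max - ndef)
        return -1; /* too many or too few arguments */

    for (i = 0; i < 3; i++)
        out[i] = (i < nargs) ? args[i] : NULL;
    return max;
}
```

Under this reading, a method such as unicode.startswith(prefix[, start[, end]]) would presumably be registered as O3 combined with D2: one required argument and two that arrive as NULL when omitted, with no tuple and no format string involved.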