- 04 Feb, 2015 3 commits
-
-
Marius Wachtler authored
Speeds up the interpreter by about 10-15% when the higher tiers are disabled
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Most importantly, intern all the strings we put into the AST* nodes (the AST_Module* owns them). This should save us some memory, but it also improves performance pretty substantially, since we can now do string comparisons very cheaply. Performance of the interpreter tier is up by something like 30%, and JIT-compilation times are down as well (though not by as much as I was hoping). The overall effect on perf is more muted since we tier out of the interpreter pretty quickly; to see more benefit, we'll have to retune the OSR/reopt thresholds.

For better or worse (mostly better IMO), the interned-ness is encoded in the type system, and things will not automatically convert between an InternedString and a std::string. This makes the diff quite large, but it also makes it much clearer where we are making string copies or have other room for optimization.
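To illustrate why interning makes comparisons cheap and why the type-system split surfaces string copies, here is a minimal sketch. It is not the code from this commit: `InternedStringPool`, `get`, `str` and `size` are hypothetical names, and the pool stands in for whatever the AST_Module* actually owns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

class InternedStringPool;  // hypothetical owner of the string storage

// Hypothetical sketch of an interned string handle.
class InternedString {
public:
    // Equal contents share one pooled std::string, so equality is a pointer compare.
    bool operator==(const InternedString& rhs) const { return str_ == rhs.str_; }
    bool operator!=(const InternedString& rhs) const { return str_ != rhs.str_; }

    // No implicit conversion to std::string: callers must ask for it explicitly,
    // which makes every potential copy visible at the call site.
    const std::string& str() const {
        assert(str_ != nullptr);
        return *str_;
    }

private:
    friend class InternedStringPool;
    explicit InternedString(const std::string* s) : str_(s) {}
    const std::string* str_;
};

class InternedStringPool {
public:
    // Returns the handle for s, adding s to the pool on first use.
    InternedString get(const std::string& s) {
        // Pointers to unordered_set elements stay valid across rehashing.
        auto it = pool_.insert(s).first;
        return InternedString(&*it);
    }

    std::size_t size() const { return pool_.size(); }

private:
    std::unordered_set<std::string> pool_;
};
```

With this shape, two handles obtained from the same pool for equal text compare equal without touching the characters, and handles are only valid while the pool is alive.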
-
- 03 Feb, 2015 4 commits
-
-
Kevin Modzelewski authored
In certain cases we behaved badly when we were sure that a type error would occur (e.g. indexing into a value we know is None): we would raise an error during codegen instead of generating code that throws the error at runtime. (Also sneaks in another travis.yml attempt.)
-
Kevin Modzelewski authored
I'm sure there's a better way to test the travis build than committing to master, but why bother when this time will obviously work!
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Our previous travis build steps had a circular dependency between cmake and llvm: we need to run cmake to update llvm to our picked revision, but we need to be on our specific llvm revision in order to run cmake (newer LLVM revisions are incompatible with our build scripts). Break the dependency by manually calling git_svn_gotorev.py. Hopefully this syntax works.
-
- 02 Feb, 2015 7 commits
-
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
The goal is to not continually call functions that deopt every time, since the deopt is expensive. Right now the threshold is simple: if a function deopts 4 (configurable) times, then mark that function version as invalid and force a recompilation on the next call.
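The threshold policy described above can be sketched roughly as follows. Only the policy (count deopts, invalidate at a configurable threshold of 4) comes from the commit message; `CompiledVersion`, `recordDeopt` and `needsRecompile` are hypothetical names for illustration.

```cpp
#include <cassert>

// Hypothetical per-version bookkeeping.
struct CompiledVersion {
    int times_deopted = 0;
    bool invalid = false;
};

// Default taken from the commit message; described there as configurable.
const int DEOPT_THRESHOLD = 4;

// Called each time a speculation in this version fails.
// Returns true only on the call that invalidates the version.
inline bool recordDeopt(CompiledVersion& v, int threshold = DEOPT_THRESHOLD) {
    assert(threshold > 0);
    v.times_deopted++;
    if (!v.invalid && v.times_deopted >= threshold) {
        v.invalid = true;
        return true;
    }
    return false;
}

// Checked on the next call: an invalid version forces a recompilation.
inline bool needsRecompile(const CompiledVersion& v) {
    return v.invalid;
}
```

In this sketch the first three deopts leave the version usable, and the fourth marks it invalid so that the next call recompiles instead of deopting again.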
-
Kevin Modzelewski authored
Old deopt worked by compiling two copies of every BB, one with speculations and one without, and stitching the two together. This has a number of issues:
  - doubles the amount of code LLVM has to jit
  - can't ever get back on the optimized path
  - doesn't support 'deopt if branch taken'
  - horrifically complex
  - doesn't support deopt from within try blocks

We actually ran into that last issue (see test from previous commit). So rather than wade in and try to fix old-deopt, just start switching to new-deopt.

(new) deopt works by using the frame introspection features, gathering up all the locals, and passing them to the interpreter.
-
Kevin Modzelewski authored
We currently can't deopt from inside an exception block.
-
Kevin Modzelewski authored
You can imagine what happens if the variable is undefined and we try to return it.
-
Kevin Modzelewski authored
Mark enumerate_cls as safe for type call rewriting
-
- 29 Jan, 2015 11 commits
-
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
The macros would cast to PyObject*, but our functions would just take a PyObject* -- which is an issue if the argument is something else (like a PyListObject*). We need to have wrapper macros that cast and then call the underlying function.
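A rough illustration of the cast-then-call pattern, using stand-in types, not the real C API: `Object`, `ListObject`, `incref_impl` and `INCREF` are all hypothetical names chosen for this sketch.

```cpp
#include <cassert>

// Hypothetical stand-ins for PyObject and PyListObject.
struct Object {
    int refcount;
};

// Embeds the base header as its first member, but is not derived from Object,
// so a ListObject* does not convert implicitly to Object*.
struct ListObject {
    Object base;
    int size;
};

// The underlying function only accepts the base pointer type.
inline void incref_impl(Object* o) {
    assert(o != nullptr);
    o->refcount++;
}

// Wrapper macro: cast first, then call the underlying function, so callers
// can pass any object struct whose first member is the Object header.
#define INCREF(op) incref_impl((Object*)(op))
```

Calling `incref_impl(&some_list)` directly would not compile here, while `INCREF(&some_list)` does, which mirrors the mismatch described in the commit message.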
-
Kevin Modzelewski authored
Many of them originally failed due to a missing language feature, but were now failing for simple reasons, like printing out the repr of an object which doesn't define __repr__ and getting something like "<C object at 0x1234567890>" (i.e. nondeterministic output).
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
More gc perf changes
-
Kevin Modzelewski authored
-
- 28 Jan, 2015 14 commits
-
-
Chris Toshok authored
add PRECISE and HIDDEN_CLASS GCKinds, each with special visit behavior (visitRange for PRECISE, HiddenClass::gc_visit for HIDDEN_CLASS). don't scan HiddenClasses or HCAttrs conservatively.
-
Chris Toshok authored
store frequently used values in the Block. also give scanForNext an out from entering the loop at all
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
use a vector of chunks for the TraceStack, instead of a vector of individual pointers.
-
Kevin Modzelewski authored
My theory is that this is because it overflows a signed int in 32-bit builds. This should hopefully fix #272
-
Travis Hance authored
Str just funcs
-
Travis Hance authored
-
Chris Toshok authored
also keep a free list of chunks around to make subsequent collections faster. results in TraceStack::pop and ::push being inlined and disappearing from the perf report (::push was at 3.99% before). Also drops aggregate GC times for ray trace by ~5%.
  before: gc_collections_us: 2151827
  after: gc_collections_us: 2023809
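The chunked stack with a chunk free list might look something like the sketch below. This is an assumption-laden illustration, not the actual TraceStack: the class name `ChunkedStack`, the chunk size of 256 and the `freeChunkCount` helper are all made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical chunk size; the real value is not given in the commit message.
constexpr std::size_t kChunkSize = 256;

using Chunk = std::unique_ptr<void*[]>;

// Stack of pointers stored in fixed-size chunks instead of one growing vector.
class ChunkedStack {
public:
    ChunkedStack() { startChunk(); }
    ~ChunkedStack() {
        // Hand every chunk back to the shared free list for the next stack.
        releaseChunk(std::move(current_));
        for (Chunk& c : full_chunks_)
            releaseChunk(std::move(c));
    }
    ChunkedStack(const ChunkedStack&) = delete;
    ChunkedStack& operator=(const ChunkedStack&) = delete;

    // Fast path is a store and an increment; a new chunk is needed only
    // once every kChunkSize pushes.
    void push(void* p) {
        current_[used_++] = p;
        if (used_ == kChunkSize) {
            full_chunks_.push_back(std::move(current_));
            startChunk();
        }
    }

    // Returns nullptr when the stack is empty.
    void* pop() {
        if (used_ == 0) {
            if (full_chunks_.empty())
                return nullptr;
            releaseChunk(std::move(current_));
            current_ = std::move(full_chunks_.back());
            full_chunks_.pop_back();
            used_ = kChunkSize;
        }
        return current_[--used_];
    }

    static std::size_t freeChunkCount() { return freeList().size(); }

private:
    static std::vector<Chunk>& freeList() {
        static std::vector<Chunk> list;
        return list;
    }
    static void releaseChunk(Chunk c) {
        assert(c != nullptr);
        freeList().push_back(std::move(c));
    }
    void startChunk() {
        std::vector<Chunk>& fl = freeList();
        if (!fl.empty()) {
            current_ = std::move(fl.back());  // reuse instead of allocating
            fl.pop_back();
        } else {
            current_.reset(new void*[kChunkSize]);
        }
        used_ = 0;
    }

    Chunk current_;
    std::vector<Chunk> full_chunks_;
    std::size_t used_ = 0;
};
```

Because chunks return to the free list when a stack is destroyed, a later collection can start pushing without hitting the allocator, which is the effect the commit message attributes to the free list.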
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Had to modify tempfile to not depend on io, which is a big module (due to importing _io). tempfile seems to barely even use it, and I think I was able to replace its usage with a simpler os.write call.
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
- 27 Jan, 2015 1 commit
-
-
Kevin Modzelewski authored
-