This release 0.3.0 is the first release to focus on performance. In the 0.2.x series Nuitka achieved feature parity with CPython 2.6 and that was very important, but now it is time to make it really useful.
Optimization has been one of the main points, although I was also a bit forward looking to Python 2.7 language constructs. This release is the first where I really started to measure things and removed the most important bottlenecks.
Added option --debug to control debug information. With this option the C++ debug information is present in the file, otherwise it is not. This will give much smaller ".so" and ".exe" files than before.
Added option --no-optimization to disable all optimization. It enables C++ asserts and compiles with less aggressive C++ compiler optimization, so it can be used for debugging purposes.
Support for Python 2.7 set literals has been added.
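For readers who have not met the syntax yet, here is a small sketch of what a Python 2.7 set literal looks like next to the spelling that also works on 2.6. The variable names and values are made up for illustration.

```python
# Python 2.7 set literal syntax; names and values are illustrative only.
primes = {2, 3, 5, 7}

# Equivalent spelling that also works on Python 2.6.
primes_26 = set([2, 3, 5, 7])

# Duplicates collapse, as with any other way of building a set.
deduplicated = {1, 1, 2}
```

Both spellings produce the same set object type, so code using the literal form behaves the same as code calling set() on a list.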
Fast global variables: Reads of global variables were fast already. This was due to a trick that is now also used to check them and to do a much quicker update if they are already set.
continue statements: To make sure these statements execute the finally handlers if inside a try, these used C++ exceptions that were caught by the enclosing loop. This was very slow. Now it is checked if this is at all necessary, and it's only done for the rare case where a continue really is inside the tried block. Otherwise it is now translated to a C++ continue, which the C++ compiler handles more efficiently.
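The two cases can be seen at the Python level. This is only a sketch of the language semantics the compiler has to preserve, not of the generated C++, and the function names are hypothetical.

```python
def continue_inside_try(values):
    """continue sits inside the tried block, so finally must run first."""
    log = []
    for value in values:
        try:
            if value % 2:
                continue  # leaves the try block: the finally handler runs
            log.append(("even", value))
        finally:
            log.append(("finally", value))
    return log


def continue_outside_try(values):
    """No try block around continue: a plain jump to the next iteration."""
    log = []
    for value in values:
        if value % 2:
            continue
        log.append(("even", value))
    return log
```

Only the first function needs the finally handler to run on the way out of the iteration. The second is the common case, where a plain loop jump is enough.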
Added unlikely() compiler hints to all error handling cases to allow the C++ compiler to generate more efficient branch code.
The for loop code was using an exception handler to make sure the iterated value was released. It now uses PyObjectTemporary for that instead, which should lead to better generated code.
Using constant dictionaries and copying from them, instead of building them at run time even when the contents were constant.
Merged some bits from the CPython 2.7 test suite that do not harm 2.6, but generally it's a lot due to some unittest module interface changes.
Added CPython 2.7 tests test_dictviews.py which both pass when using Python 2.7.
Added another benchmark extract from "PyStone" which uses a while loop with break.
Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second

Pystone(1.1) time for 50000 passes = 0.52
This machine benchmarks at 96153.8 pystones/second
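The relation between the two results above can be checked with a little arithmetic. The sketch below only reuses the pass count and timings printed in the output; the variable names are mine.

```python
passes = 50000

# Timings in seconds, taken from the two Pystone runs above.
slower_time = 0.65
faster_time = 0.52

# Pystones per second is simply passes divided by elapsed time.
slower_rate = passes / slower_time
faster_rate = passes / faster_time

# Ratio of the two rates, equivalently slower_time / faster_time.
speedup = faster_rate / slower_rate
```

The rates round to the 76923.1 and 96153.8 pystones/second shown in the output, and the ratio works out to 1.25.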
That's a 25% speedup now and clearly a good start. It's not yet in the range of where I want it to be, but there is always room for more. And