Thursday, February 16, 2012

Py3k status update

Thank to all the people who donated to the py3k proposal, we managed to collect enough money to start to work on the first step. This is a quick summary of what I did since I began working on this.
First of all, many thanks to Amaury Forgeot d'Arc, who started the py3k branch months ago, and already implemented lots of features including e.g. switching to "unicode everywhere" and the int/long unification, making my job considerably easier :-)
I started to work on the branch at the last Leysin sprint together with Romain Guillebert, where we worked on various syntactical changes such as extended tuple unpacking and keyword-only arguments. Working on such features is a good way to learn about a lot of the layers which the PyPy Python interpreter is composed of, because often you have to touch the tokenizer, the parser, the ast builder, the compiler and finally the interpreter.
Then I worked on improving our test machinery in various way, e.g. by optimizing the initialization phase of the object space created by tests, which considerably speeds up small test runs, and adding the possibility to automatically run our tests against CPython 3, to ensure that what we are not trying to fix a test which is meant to fail :-). I also setup our buildbot to run the py3k tests nightly, so that we can have an up to date overview of what is left to do.
Finally I started to look at all the tests in the interpreter/ directory, trying to unmangle the mess of failing tests. Lots of tests were failing because of simple syntax errors (e.g., by using the no longer valid except Exception, e syntax or the old print statement), others for slightly more complex reasons like unicode vs bytes or the now gone int/long distinction. Others were failing simply because they relied on new features, such as the new lexical exception handlers.
To give some numbers, at some point in january we had 1621 failing tests in the branch, while today we are under 1000 (to be exact: 999, and this is why I've waited until today to post the status update :-)).
Before ending this blog post, I would like to thank once again all the people who donated to PyPy, who let me to do this wonderful job. That's all for now, I'll post more updates soon.
cheers, Antonio


Piotr HusiatyƄski said...

Will the py3 branch be finally merged into main branch and the pypy will be albe to run both 2.x and 3.x depending on the boot switch or the 2.x support will be dropped or the will be no merge?

Antonio Cuni said...

@Piotr: the work in the py3k branch is destroying some of the python2 semantics, so we won't merge the twos as long as we support python2 (and we'll do it for a long time, because pypy itself is written in python2).
The current plan is just to keep the development in parallel, and regularly merge "default" into "py3k".

Alberto Berti said...

Ciao Antonio,

very good news, i'm glad that my little monetary contribution allowed you to be paid to work on that. Keep us posted!



Seb said...

So, it has to be asked: any plan of rewriting PyPy in python 3? :D

Antonio Cuni said...

@Alberto: thank you very much :-)

@Seb: not in the short/middle term (and it's unclear whether we want to go there at all)

Echo said...

Good luck preventing the python2 and python3 branches from convergng; it's what merges do, and the alternative is a lot of cherry-picking. A lot of codebases use feature-flipping instead.

Antonio Cuni said...

@Echo: no, the plan is to regularly merge "default" (i.e. python2) into "py3k". Note that in "default" we are mostly developing other parts than the Python interpreter (e.g., the JIT compiler), so this should not be too much of a problem, although it will be annoying sometimes