Thursday, April 30, 2009

4 weeks of GDB

Hello.

So, according to our jit plan we're mostly done with point 1, that is to provide a JIT that compiles python code to assembler in the most horrible manner possible but doesn't break. That meant mostly 4 weeks of glaring at GDB and megabytess of assembler generated by C code generated from python code. The figure of 4 weeks proves that our approach is by far superior to the one of psyco, since Armin says it's "only 4 weeks" :-)

Right now, pypy compiled with JIT can run the whole CPython test suite without crashing, which means we're done with obvious bugs and the only ones waiting for us are really horrible. (Or they really don't exist. At least they should never be about obscure Python corner cases: they can only be in the 10'000 lines of relatively clear code that is our JIT generator.)

But... the fun thing is that we can actually concentrate on optimizations! So the next step is to provide a JIT that is correct *and* actually speeds up python. Stay tuned for more :-)

Cheers,
fijal, armin & benjamin

UPDATE: for those of you blessed with no knowledge of C, gdb stands for GNU debugger, a classic debugger for C. (It's also much more powerful than python debugger, pdb, which is kind of surprising).

16 comments:

Alexander Kellett said...

*bow*

Luis said...

I love this kind of posts. Keep'em coming!

Unknown said...

This is probably the most exciting thing I've heard since I started tracking PyPy. Can't wait to see how fast JIT Python flies. :-)

René Dudfield said...

nice one! Really looking forward to it.

Is this for just i386? Or is this for amd64/ppc etc?

Maciej Fijalkowski said...

amd64 and ppc are only available in enterprise version :-)

We cannot really solve all problems at once, it's one-by-one approach.

Armin Rigo said...

illume: if you are comparing with Psyco, then it's definitely "any platform provided someone writes a backend for it". Writing a backend is really much easier than porting the whole of Psyco...

Our vague plans include an AMD64 backend and an LLVM-JIT one, the latter being able to target any platform that LLVM targets.

DSM said...

Nice!

I assume that it would be (relatively, as these things go) straightforward for those us interested to turn the x86 assembly backend into a C backend?

I know that even mentioning number-crunching applications gets certain members of the pypy team beating their heads against the wall (lurkers can read the grumbling on irc too!). But with a delegate-to-C backend, those of us who have unimplemented architectures and are in the happy regime where we don't care about compilation overhead can get the benefits of icc's excellent optimizations without having to do any of the work. We'd just need to make sure that the generated C is code that icc can handle. (There are unfortunately idioms that gcc and icc don't do very well with.)

To be clear, I'm not suggesting that the pypy team itself go this route: at the moment it feels like the rest of us should stay out of your way.. laissez les bon temps roulez! :^)

I'm asking instead if there are any obvious gotchas involved in doing so.

Tim Parkin said...

Congrats for stage one... exciting times for python..

Luis said...
This comment has been removed by the author.
proteusguy said...

Nice job guys! Once you announce that PyPy is pretty much of comparable (90% or better) speed to that of CPython then we will be happy to start running capacity tests of our web services environment on top of it and report back our results.

Given the growing number of python implementations has there ever been a discussion of PyPy replacing CPython as the canonical implementation of python once it consistently breaks performance & reliability issues? I don't know enough of the details to advocate such a position - just curious if there's been any official thought to the possibility.

Armin Rigo said...

DSM, Proteusguy: I'd be happy to answer your questions on the pypy-dev mailing list. I think that there is no answer short enough to fit a blog post comment.

Jacob Hallén said...

proteusguy: It is our hope that PyPy can one day replace CPython as the reference implementation, but this depends on many factors. Most of them are way out of our control. It will depend very much on the level of PyPy uptake in the community, but this is just a first step. With enough adoption, the Python developers (the people actually making new versions of CPython) need to be convinced that working from PyPy as a base to develop the language makes sense. If they are convinced, Guido may decide that it is a good idea and make the switch, but not before then.

Anonymous said...

See http://moderator.appspot.com/#9/e=c9&t=pypy for Guido's opinion about PyPy.

Anonymous said...

Surely gdb is more powerful than pdb because many more people are forced to used gdb. c code is much harder to debug than python code, and needs debugging more often than python code.

Cacas Macas said...

Good day.
I use Python for about 3 years and i am following your blog almost every day to see news.
I am very excited to see more Pypy, though i don't understand how to use it (?!?!) and i never managed to install it!
Wikipedia says "PyPy is a followup to the Psyco project" and i use Psyco, so Pypy must be a very good thing. I use Psyco very intense, in all my applications, but it's very easy to use.
I have Windows and Windows document is incomplete "http://codespeak.net/pypy/dist/pypy/doc/windows.html". I have MinGW compiler.
Pypy is not very friendly with users. I think more help documents would be very useful. When i will understand how to install Pypy, i will use it.
Keep up the good work!

stracin said...

"""Rumors have it that the secret goal is being faster-than-C which is nonsense, isn't it?"""

what does this statement from the pypy homepage mean?

that c-pypy will be faster than cpython?

or that code run in c-pypy will be faster than compiled C code? :o

because of the "nonsense" i think you mean the latter? but isn't it nonsense? :) would be awesome though.