Monday, November 30, 2009

Using CPython extension modules with PyPy, or: PyQt on PyPy

If you have ever wanted to use CPython extension modules on PyPy, we want to announce that there is a solution that should be compatible with quite a few of the available modules. It is neither new nor written by us, but it nevertheless works great with PyPy.

The trick is to use RPyC, a transparent, symmetric remote procedure call library written in Python. The idea is to start a CPython process that hosts the PyQt libraries and connect to it via TCP to send RPC commands to it.
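To make the idea concrete, here is a minimal sketch of the same pattern using only the standard library's XML-RPC modules instead of RPyC. It is not RPyC's API and it is far less transparent (it only forwards plain function calls in one direction), but it shows one side hosting a library and the other side calling into it over TCP. The function name start_host and the choice of math as the "hosted library" are assumptions made for illustration.

```python
import math
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def start_host(port=0):
    """Start a server that stands in for the CPython process hosting the library.

    Port 0 asks the operating system for any free port.
    """
    server = SimpleXMLRPCServer(("127.0.0.1", port), logRequests=False)
    # Expose two functions of the "hosted library" (here simply the math module).
    server.register_function(math.sqrt, "sqrt")
    server.register_function(math.hypot, "hypot")
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


# The "CPython side": host the library.
server = start_host()
host, port = server.server_address

# The "PyPy side": connect over TCP and call the library remotely.
client = ServerProxy("http://%s:%d" % (host, port))
result = client.hypot(3, 4)

server.shutdown()
server.server_close()
```

RPyC goes much further than this: it proxies whole objects, classes and modules, and lets both sides call each other, which is what makes it usable for something as large as PyQt.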

I tried to run PyQt applications using it on PyPy and could get quite a bit of the functionality of these working. Remaining problems include regular segfaults of CPython because of PyQt-induced memory corruption and bugs because classes like StandardButtons behave incorrectly when it comes to arithmetical operations.

RPyC needed a few changes to support remote unbound __init__ methods, shallow call by value for list and dict types (PyQt4 methods want real lists and dicts as parameters), and callbacks to methods (all remote method objects are wrapped into small lambda functions to make them easier for PyQt4 to call).
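The following sketch illustrates two of these ideas in isolation. It is not the code from the patch: ListProxy, DictProxy, shallow_by_value and wrap_callback are hypothetical names, and the proxy classes are simple local stand-ins for RPyC's remote object references.

```python
class ListProxy:
    """Stand-in for a remote reference to a list: iterable, but not a real list."""

    def __init__(self, target):
        self._target = target

    def __iter__(self):
        return iter(self._target)


class DictProxy:
    """Stand-in for a remote reference to a dict."""

    def __init__(self, target):
        self._target = target

    def keys(self):
        return self._target.keys()

    def __getitem__(self, key):
        return self._target[key]


def shallow_by_value(arg):
    """Turn a proxied list or dict into a real one, copying only the top level."""
    if isinstance(arg, ListProxy):
        return list(arg)
    if isinstance(arg, DictProxy):
        return {key: arg[key] for key in arg.keys()}
    return arg  # everything else is still passed by reference


def wrap_callback(remote_method):
    """Wrap a (remote) method object in a plain function."""
    return lambda *args, **kwargs: remote_method(*args, **kwargs)


class Handler:
    def __init__(self):
        self.clicks = 0

    def on_click(self, amount=1):
        self.clicks += amount
        return self.clicks


inner = [2, 3]
real_list = shallow_by_value(ListProxy([1, inner]))
real_dict = shallow_by_value(DictProxy({"a": 1}))

handler = Handler()
callback = wrap_callback(handler.on_click)
clicks_after_call = callback(2)
```

The copy is shallow on purpose: the container itself becomes a real list or dict, while its elements are left as they are.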

If you want to try RPyC to run the PyQt application of your choice, you just need to follow these steps. Please report your experience here in the blog comments or on our mailing list.

  1. Download RPyC from the RPyC download page.
  2. Download this patch and apply it to RPyC by running patch -p1 < rpyc-3.0.7-pyqt4-compat.patch in the RPyC directory.
  3. Install RPyC by running python setup.py install as root.
  4. Run the file rpyc/servers/classic_server.py using CPython.
  5. Execute your PyQt application on PyPy.

PyPy will automatically connect to CPython and use its PyQt libraries.

Note that this scheme works with nearly every extension library. Look at pypy/lib/sip.py to see how to add new libraries (you need to create such a file for every proxied extension module).
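As a rough illustration of what such a per-module proxy file has to achieve, the sketch below installs a module object that forwards attribute lookups to a remote counterpart. This is not the content of pypy/lib/sip.py; ProxyModule, install_proxy and the demo module name are hypothetical, and a local namespace object stands in for the remote module. In a real setup the remote module would presumably come from the RPyC connection (for example via the modules attribute of a classic RPyC connection).

```python
import sys
import types


class ProxyModule(types.ModuleType):
    """Module object that forwards unknown attribute lookups to a remote module."""

    def __init__(self, name, remote_module):
        super().__init__(name)
        # Store the remote reference directly in the module's namespace.
        self.__dict__["_remote"] = remote_module

    def __getattr__(self, attr):
        # Only called when normal lookup on the local module object fails.
        return getattr(self.__dict__["_remote"], attr)


def install_proxy(name, remote_module):
    """Register a proxy under `name` so that `import name` returns it."""
    module = ProxyModule(name, remote_module)
    sys.modules[name] = module
    return module


# A local stand-in for the module living in the CPython process.
fake_remote = types.SimpleNamespace(version="demo", add=lambda a, b: a + b)
install_proxy("fake_extension_demo", fake_remote)

import fake_extension_demo  # resolved from sys.modules, no file needed

total = fake_extension_demo.add(2, 3)
```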

Have fun with PyQt

Alexander Schremmer

10 comments:

intgr said...

OT: you should separate labels by commas, so that Blogspot recognizes them as distinct labels.

Carl Friedrich Bolz said...

intgr: Thanks, done.

Anonymous said...

"regular segfaults of CPython because of PyQt-induced memory corruption and bugs because classes like StandardButtons behave incorrectly when it comes to arithmetical operations."

These sound interesting. Could you please elaborate? A link would suffice, if these are already documented by non-pypy people. Thanks!

holger krekel said...

cool stuff, alexander! Generic access to all CPython-provided extensions could remove an important blocker for PyPy usage and allow incremental migrations.

Besides, I wonder if having two processes, one for application and one for bindings can have benefits to stability.

Alexander Schremmer said...

Dear anonymous,

the StandardButtons bug was already communicated to a Nokia employee.
If you are interested in the segfaults, contact me and I will give you the source code that I used for testing.

Zemantic dreams said...

This is an important step forward!

There are probably two reasons why people use extensions: bindings to libraries and performance.

Unfortunately this specific approach does not address performance. Is there anything on the horizon that would allow a near-CPython API for extensions, so that modules would just need to be recompiled against PyPy bindings for the CPython API? Probably not 100% compatible, but close?

Any chances of that happening?

Andraz Tori, Zemanta

Alexander Schremmer said...

Any chances of that happening?

In theory, this is possible, but a lot of work. Nobody has stepped up to implement it, yet.

Andrew said...

Isn't the exposure of refcounts in the CPython C API going to be a bit of a problem for implementing the API on pypy? perhaps a "fake" refcount could be associated with an object when it is first passed to an extension? This could still be problematic if the extension code expects to usefully manipulate the refcount, or to learn anything by examining it...

Alexander Schremmer said...

Isn't the exposure of refcounts in the CPython C API going to be a bit of a problem for implementing the API on pypy?

Indeed, it would be part of the task to introduce support in the GCs for such refcounted objects. Note that real refcounting is necessary because the object could be stored in a C array, invisible to the GC.

Andrew said...

I'm trying to think of ways around that, but any API change to make objects held only in extensions trackable by the GC would probably be much worse than adding refcounted objects, wouldn't it, unless the extension were written in rpython...