Tuesday, August 30, 2011

Wrapping C++ Libraries with Reflection — Status Report One Year Later

Well over a year ago, work was started on the cppyy module, which lives in the reflex-support branch. Since then, work has progressed at a varying pace, including a recent sprint in Düsseldorf last July.

Let's first take a step back and recap why we're interested in doing this, given that it is perfectly possible to use C++ through generated bindings and cpyext. cppyy makes use of reflection information generated for the C++ classes of interest, and has that reflection information available at run time. Therefore, it is able to open up complex C++ types to the JIT in a manner conceptually similar to how simple types are already open to it. This means that it is possible to get rid of a lot of the marshalling layers when making cross-language calls, resulting in much lower call overhead than is possible when going through the CPython API or other methods of wrapping.

There are two problems that need to be solved: C++ language constructs need to be presented on the Python side in a natural way, and cross-language impedance mismatches need to be minimized, with some hints from the user if need be. For the former, the list of mapped features has grown to a set that is sufficient to do real work. There is now support for (a small usage sketch follows the list):

  • builtin, pointer, and array types
  • namespaces, classes, and inner classes
  • global functions, global data
  • static/instance data members and methods
  • default arguments, object return by value
  • single and multiple (virtual) inheritance
  • templated classes
  • basic STL support and pythonizations
  • basic (non-global) operator mapping

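As a taste of what these features look like on the Python side, here is a small, hypothetical sketch; the C++ names (the Example namespace, the Counter class and its members) and the dictionary file name are invented for illustration, and a Reflex dictionary is assumed to have been generated for them:

    # Hypothetical sketch of how the features listed above appear from Python.
    # "Example", "Counter", "value", "s_count" and the dictionary file name are
    # invented for illustration; a Reflex dictionary is assumed to exist for them.
    import cppyy

    cppyy.load_reflection_info("libExampleDict.so")   # assumed dictionary name

    ns = cppyy.gbl.Example                 # namespaces become Python scopes
    c = ns.Counter(42)                     # classes instantiate like Python classes
    c.increment()                          # instance methods
    print(c.value, ns.Counter.s_count)     # instance and static data members

    v = cppyy.gbl.std.vector('int')()      # templated classes, e.g. std::vector<int>
    v.push_back(1)
    print(len(v), v[0])                    # basic STL pythonizations (len, indexing)
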
The second problem is harder and will always be an on-going process. But one of the more important issues has been solved at the recent Düsseldorf sprint, namely that of having the garbage collector reclaim C++ objects instantiated from the Python side.

Performance has also improved, especially that of the nicer "pythonized" interface that the user actually sees, although it is still about a factor of 2.5 slower than the lower-level interface (which has gotten uglier, so you really don't want to use that). Most of this improvement is due to restructuring so that it plays nicer with the JIT and libffi, both of which have themselves seen improvements.

Work is currently concentrated on the back-ends: a CINT back-end is underway and an LLVM/Clang pre-compiled headers (PCH) back-end is planned. The latter is needed before this code can be released in the wild rather than used only in high energy physics (HEP), as it would be much easier to support. Also, within HEP, Clang's PCH is foreseen to be the future format of reflection information.

At the end of the Düsseldorf sprint, we tried a little code that did something actually "useful," namely filling a histogram with some random values. We did get it to work, but trying cppyy on a large class library showed that a good warning system for things like missing classes was sorely needed. That has been added since, and revisiting the histogram example later yields an interesting note: the pypy-c run takes 1.5x the time of the compiled, optimized C++ code. The run was timed start to finish, including the reflection library loading and JIT warm-up that is needed in the case of Python, but not for the compiled C++ code. However, in HEP, scientists run many short jobs while developing their analysis codes, before submitting larger jobs to the GRID to run during lunch time or overnight. Thus, a more realistic comparison is to include the compilation time needed for the C++ code, and with that, the Python code needs only 55% of the time required by C++.
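
For illustration, here is a minimal sketch of the kind of histogram-filling code described above; it is not the actual benchmark, and it assumes a reflection dictionary exposing a ROOT-style TH1F histogram class has been generated under the invented name libHistDict.so:

    # Illustrative sketch only, not the actual benchmark: fill a histogram with
    # random values through cppyy. Assumes a reflection dictionary providing a
    # ROOT-style TH1F class has been generated as "libHistDict.so" (name invented).
    import random
    import cppyy

    cppyy.load_reflection_info("libHistDict.so")

    TH1F = cppyy.gbl.TH1F
    hist = TH1F("gauss", "gaussian fill", 100, -4., 4.)

    for _ in range(1000000):
        hist.Fill(random.gauss(0., 1.))    # a cross-language call in the hot loop

    print(hist.GetMean(), hist.GetRMS())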

The choice of a programming language is often a personal one, and arguments such as "C++ is hard to use" typically do not carry much weight with the in-crowd that studies quantum field dynamics for fun. However, getting the prompt with your analysis results back faster is a sure winner. We hope that cppyy will soon have progressed far enough to be useful first to particle physicists, and then to others who need to wrap C++ libraries.

Wim Lavrijsen, Carl Friedrich Bolz, Armin Rigo

5 comments:

René Dudfield said...

Hi,

nice result. Wrapping C++ code can be even more tiresome than C, especially with large code bases. This will be a very welcome tool.


This question has probably been answered before... but I ask anyway since I couldn't find the answer.

Can the JIT information be saved, so it does not need to be worked out again, assuming all of the dependencies have not changed (.py files, pypy itself, .so files, etc.)? Maybe if location-independent code cannot be saved, then trace hints or some higher-level structure could be saved to inform the JIT about which traces to compile? That sounds like a solution to JIT warm-up for code that is used repeatedly.

cu.

Wim Lavrijsen said...

Hi,

thanks! :)

There was a recent thread on saving JIT information on pypy-dev:

http://mail.python.org/pipermail/pypy-dev/2011-August/008073.html

and the conclusion there was that it is too hard to be of benefit because too many parts contain addresses or calculated variables that were turned into constants.

For our (HEP) purposes, it would be of limited benefit: in the development cycle, the .py's would change all the time, and it is a safe assumption that the user codes that are being developed are the most "hot." If there is anything in the supporting code that is "hot" (most likely in the framework) it'd be in C/C++ at that point anyway.

Rather, I'd like to have an easy way of letting the user determine which portions of the code will be hot. Not having to run a hot loop 1000x in interpreted mode before the JIT kicks in is going to be more valuable in scientific codes, where the hot loops tend to be blatantly obvious.

Cheers,
Wim

Anonymous said...

This is great. I have been looking for just such a tool to wrap C++ numerical code.

I guess I have two questions:
1. Is there any documentation on how to use it?
2. It is very important for me to be able to translate between NumPy data structures and C++ data structures, so is there any plan to make this easy?

Thanks for the great work.

Wim Lavrijsen said...

Hi,

thanks! :)

ad 1) it's not at the level of being usable in a production environment. I have two known issues to resolve and probably some more unknowns. I've posted a description on pypy-dev, and I'm helping a few patient, very friendly users along. But actual documentation would suggest a level of support that currently can't be offered, because all the current (and soon-to-disappear) caveats would need documenting as well.

ad 2) not sure what data translation you're thinking of, but in the CPython equivalent, support was added for the buffer interface and MemoryView. Those, or something similar, will be there so that numpy arrays etc. can be built from return values and from public data members, and passed into function calls as arguments. Those are not translations, but rather extractions of the data pointers (which is typically intended and the most efficient, to be sure).
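
To illustrate the kind of zero-copy use meant here, a hypothetical sketch: the DataBlock class, its data() and size() methods, and the dictionary file name are all invented for the example; the point is simply that the buffer interface hands the underlying data pointer to numpy.frombuffer without a translation step:

    # Hypothetical sketch of viewing C++ data as a NumPy array without copying.
    # "DataBlock", its data()/size() methods and the dictionary name are invented
    # for illustration; numpy.frombuffer wraps the exposed buffer rather than
    # translating or copying the data.
    import numpy as np
    import cppyy

    cppyy.load_reflection_info("libDataBlockDict.so")

    block = cppyy.gbl.DataBlock(1000)
    arr = np.frombuffer(block.data(), dtype=np.float64, count=block.size())
    print(arr.mean())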

Cheers,
Wim
