Thursday, March 8, 2012

Call for donations for Software Transactional Memory

Hi all,

The Software Transactional Memory call for donations is up. From the proposal:

Previous attempts on Hardware Transactional Memory focused on parallelizing existing programs written using the thread or threading modules. However, as argued here, this may not be the most practical way to achieve real multithreading; it seems that better alternatives would offer good scalability too. Notably, Transactional Memory could benefit any event-based system that is written to dispatch events serially (Twisted-based, most GUI toolkits, Stackless, gevent, and so on). The events would internally be processed in parallel, while maintaining the illusion of serial execution, with all the corresponding benefits of safety. This should be possible with minimal changes to the event dispatchers. This approach has been described by the Automatic Mutual Exclusion work at Microsoft Research, but has not been implemented anywhere (to the best of our knowledge).

Note that, yes, this gives you both sides of the coin: you keep using your non-thread-based program (without worrying about locks and their drawbacks like deadlocks, races, and friends), and your programs benefit from all your cores.

In more detail, a low-level built-in module will provide the basics to start transactions in parallel; but this module will only be used internally in a tweaked version of, say, a Twisted reactor. Using this reactor will be enough for your existing Twisted-based programs to actually run on multiple cores. You, as a developer of the Twisted-based program, only have to care about improving the parallelizability of your program (e.g. by splitting time-consuming transactions into several parts; the exact rules will be published in detail once they are known).

The point is that your program is always correct, and can be tweaked to improve performance. This is the opposite of what explicit threads and locks give you, which is a performant program that you need to tweak to remove bugs. Arguably, this approach is the reason why you use Python in the first place :-)

Armin

10 comments:

Konstantine Rybnikov said...

Great news, really looking forward to experimenting with that, good luck!

My question is: will it map to an OS thread being created on each event dispatch, or can it potentially be optimized somehow? I mean, you can potentially end up with code that has tons of small events, and creating an OS thread on each event would slow down your program.

Anonymous said...

@k_bx it's not like that at all. There are links in the proposal that may enlighten, depending on what you already know.

Armin Rigo said...

Indeed, it is creating a pool of N threads and reusing them, where N is configurable. Ideally it should default to the number of cores you have, detected in some (sadly non-portable) way.
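A rough sketch of such a pool, not the actual implementation: the function names are made up for illustration, and a plain lock stands in for the atomicity that transactions would provide, so this version gains no real parallelism.

```python
import os
import queue
import threading


def default_num_threads():
    # os.cpu_count() may return None if the count cannot be determined.
    return os.cpu_count() or 1


def run_in_pool(tasks, num_threads=None):
    """Run (function, args) pairs on a fixed pool of reused threads."""
    n = num_threads or default_num_threads()
    todo = queue.Queue()
    for task in tasks:
        todo.put(task)
    # Stand-in for transactions: each task runs atomically under a lock.
    atomic = threading.Lock()

    def worker():
        # Each thread keeps pulling tasks until the queue is empty,
        # so no thread is created per event.
        while True:
            try:
                f, args = todo.get_nowait()
            except queue.Empty:
                return
            with atomic:
                f(*args)

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


counter = {"n": 0}


def bump(k):
    counter["n"] += k


run_in_pool([(bump, (1,))] * 50, num_threads=4)
```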

Anonymous said...

Are any of you affiliated with a university? Since this is research, maybe you can get a grant for a post-doc or a PhD position.

Anonymous said...

Trivial comment - on the donation page in the "What is Transactional Memory?" section, I think a (TM) has been turned into a superscript TM (as in trademark).

Steve Phillips said...

This sounds exciting for the kinds of Python programs that would benefit from TM, but can anyone give a ballpark estimate of what percentage of programs that might be?

Personally, I write various (non-evented) Python scripts (replacements for Bash scripts, IRC bot, etc) and do a lot of Django web dev. It's not clear that I or similar people would benefit from Transactional Memory.

Is that correct?

Anonymous said...

Could you update the donation page? It doesn't seem to be tallying the amounts.

I am really excited to see this work even if it is pure research (I donated $200). It would be awesome if there were syntax like:

stm:
....pre:
........# init transaction state
....trans:
........# parallel stuff

So it would be easy to retry failed transactions or be able to reorder them for contention or perf.

kurdakov said...

offtopic:

there is a project to help bring C# and C++ together

https://github.com/mono/cxxi
and fork https://github.com/kthompson/cxxi

in essence: there is a generation step which then makes it easy to use C++ objects in C# and vice versa.

considering that ctypes is very much like P/Invoke, it looks like pypy might be able to have something similar for python/C++ environments; this might make it much easier to port, for example, Blender to use pypy as a scripting language.

Arne Babenhauserheide said...

Could you post an example snippet of code which would benefit from that?

I ask because I have trouble really imagining example code.

Something minimal, with the fewest possible extension modules, which I could just throw into the pypy and pypy-tm interpreters to see the difference.

Armin Rigo said...

I wrote a minimal example here:

https://bitbucket.org/pypy/pypy/raw/stm-gc/lib_pypy/transaction.py