<p><b>PyPy Status Blog</b> (Carl Friedrich Bolz-Tereick)</p>
<p><b>PyPy's blog has moved</b> (2021-03-10)</p>
<p>For many years, PyPy has been publishing blog posts here at
<a href="https://morepypy.blogspot.com">https://morepypy.blogspot.com</a>. From now on, the posts will be at
<a href="https://pypy.org/blog">https://pypy.org/blog</a>. The RSS feed has moved to <a href="https://pypy.org/rss.xml">https://pypy.org/rss.xml</a>. The original
content has been migrated to the newer site, including comments.
</p><div class="e-content entry-content" itemprop="articleBody text"><div><p>Among the motivations for the move were:</p>
<h4>One site to rule them all</h4>
<p>Hosting the blog posts on pypy.org seems like a natural extension of the web site,
rather than outsourcing them to a third party. Since the site is generated using
the static site generator <a href="https://getnikola.com/">nikola</a> from the github repo
<a href="https://github.com/pypy/pypy.org">https://github.com/pypy/pypy.org</a>, we now have good source control for the
content.</p>
<h4>CI previews, and github</h4>
<p>Those of you who follow PyPy may note something new in the URL for the repo:
until now PyPy has been using <a href="https://mercurial-scm.org">mercurial</a> as hosted
on <a href="https://foss.heptapod.net">https://foss.heptapod.net</a>. While <a href="https://heptapod.net/">heptapod</a> (a
community driven effort to bring mercurial support to GitLab™) does provide a
GitLab CI runner for the open source offering, it is easier to use <a href="http://netlify.com">netlify</a> for
previews on github. Hopefully the move to the more popular github platform will encourage
new contributors to publish their success stories around using PyPy and
the RPython toolchain.</p>
<h4>Comments</h4>
<p>Comments to blog posts are generated via the <a href="https://utteranc.es/">utterances</a>
javascript plugin. The comments appear as issues in the repo.
When viewing the site, a query is made to fetch the comments to the issue with
that name. To comment, users must authorize the utterances app to post on their
behalf using the <a href="https://developer.github.com/v3/oauth/#web-application-flow">GitHub
OAuth</a> flow.
Alternatively, users can comment on the GitHub issue directly. The interaction
with github for authentication and moderation seems more natural than the
manual moderation required on blogspot.</p>
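<p>As a rough illustration of the lookup described above, the sketch below builds the two GitHub REST API URLs such a flow might use: one to search for the issue whose title matches a post, and one to list that issue's comments. This is not the utterances implementation. The helper names and the example title are assumptions, and only the public <code>search/issues</code> and <code>issues/{number}/comments</code> endpoints are relied on.</p>

```python
from urllib.parse import urlencode

API_ROOT = "https://api.github.com"


def issue_search_url(repo, issue_title):
    """Build a search URL for issues in `repo` whose title contains `issue_title`.

    `repo` is an "owner/name" string; the query uses GitHub's documented
    `in:title`, `repo:` and `type:issue` search qualifiers.
    """
    query = f'"{issue_title}" in:title repo:{repo} type:issue'
    return f"{API_ROOT}/search/issues?{urlencode({'q': query})}"


def issue_comments_url(repo, issue_number):
    """Build the URL that lists the comments of one issue."""
    return f"{API_ROOT}/repos/{repo}/issues/{issue_number}/comments"


if __name__ == "__main__":
    # Hypothetical post title and issue number, for illustration only.
    print(issue_search_url("pypy/pypy.org", "posts/2021/03/pypys-blog-has-moved"))
    print(issue_comments_url("pypy/pypy.org", 1))
```

<p>Fetching either URL returns JSON; unauthenticated requests are rate limited, which is one reason a real widget would cache or authenticate.</p>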
<h4>Please prove to us that the move is worth it</h4>
<p>Help us with guest blog posts, and PRs to improve the styling of the site. One
already open issue is that the navbar needlessly uses javascript; help keeping
the responsive style in pure CSS is welcome.</p>
<p>The PyPy Team</p>
</div>
</div>
<p>Posted by mattip</p>
<p><b>Mac meets Arm64</b> (2020-12-31)</p>
<b>Looking for sponsorship</b>
<p>Apple now ships Macs built on an arm64 variant, the M1, running the
latest version of macOS, Big Sur. We are getting requests for PyPy to
support this new architecture. Here is our position on this topic (or at least
mine, Armin Rigo's), and how you can help.</p>
<p>Porting PyPy is harder than just re-running the compiler, because PyPy contains
a few big architecture-dependent "details", like the JIT compiler and the
foreign function interfaces (CFFI and ctypes).</p>
<p>Fixing the JIT compiler should not be too much work: we already support arm64,
just the Linux one. But Apple made various details different (like the calling
conventions). A few other parts need to be fixed too, notably CFFI and ctypes,
again because of the calling conventions.</p>
<p>Fixing that would be a reasonable amount of work. I would do it myself for a
small amount of money. However, the story doesn't finish here. Obviously, the
<b>start</b> of the story would be to get ssh access to a Big Sur M1 machine. (If at
this point you're thinking "sure, I can give you ssh access for three months",
then please read on.) The <b>next</b> part of the story is that we need a machine
available long term. It can be either a machine provided and maintained by a
third party, or alternatively a pot of money big enough to support the
acquisition of a machine and ongoing work of one of us.</p>
<p>If we go with the provided-machine solution: What we need isn't a lot of
resources. Our CI requires maybe 10 GB of disk space, and a few hours of CPU
per run. It should fit into 8 GB of RAM. We normally do a run every night but
we can certainly lower the frequency a bit if that would help. However, we'd
ideally like some kind of assurance that you are invested in maintaining the
machine for the next 3-5 years (I guess, see below). We had far too many
machines that disappeared after a few months.</p>
<p>If we go with the money-supported solution: it's likely that after 3-5 years
the whole Mac base will have switched to arm64, we'll drop x86-64 support for
Mac, and we'll be back to the situation of the past where there was only one
kind of Mac machine to care about. In the meantime, we are looking at 3-5
years of lightweight extra maintenance. We have someone that has said he would
do it, but not for free.</p>
<p>If either of these two solutions occurs, we'll still have, I quote, "probably
some changes in distutils-type stuff to make python happy", and then some
packaging/deployment changes to support the "universal2" architecture, i.e.
including both versions inside a single executable (which will <b>not</b> be just an
extra switch to clang, because the two versions need a different JIT backend
and so must be translated separately).</p>
<p>So, now all the factors are on the table. We won't do the minimal "just the
JIT compiler fixes" if we don't have a plan that goes farther. Either we get
sufficient money, and maybe support, and then we can do it quickly; or PyPy
will just remain not natively available on M1 hardware for the next 3-5 years.
We are looking forward to supporting M1, and view resources contributed by
the community as a vote of confidence in assuring the future of PyPy on this
hardware. Contact us: <a href="mailto:pypy-dev@python.org">pypy-dev@python.org</a>, or our private mailing
list <a href="mailto:pypy-z@python.org">pypy-z@python.org</a>.</p>
<p>Thanks for reading!</p>
<p>Armin Rigo</p>
<p><b>PyPy 7.3.3 triple release: python 3.7, 3.6, and 2.7</b> (2020-11-21)</p>
<p>The PyPy team is proud to release version 7.3.3 of PyPy, which includes
three different interpreters:
</p><blockquote>
<div><ul class="simple"><li>PyPy2.7, which is an interpreter supporting the syntax and the features of
Python 2.7 including the stdlib for CPython 2.7.18 (updated from the
previous version)</li><li>PyPy3.6: which is an interpreter supporting the syntax and the features of
Python 3.6, including the stdlib for CPython 3.6.12 (updated from the
previous version).</li><li>PyPy3.7 beta: which is our second release of an interpreter supporting the
syntax and the features of Python 3.7, including the stdlib for CPython
3.7.9. We call this beta-quality software: there may be compatibility issues
with new and changed features in CPython 3.7.
Please let us know what is broken or missing. We have not implemented the
<a class="reference external" href="https://docs.python.org/3/whatsnew/3.7.html#re">documented changes</a> in the <code class="docutils literal notranslate"><span class="pre">re</span></code> module, and a few other pieces are also
missing. For more information, see the <a class="reference external" href="https://foss.heptapod.net/pypy/pypy/-/wikis/py3.7%20status">PyPy 3.7 wiki</a> page</li></ul>
</div></blockquote>
<p>The interpreters are based on much the same codebase, thus the multiple
release. This is a micro release: all APIs are compatible with the 7.3
releases, but read on to find out what is new.</p>
<p>Several issues found in the 7.3.2 release were fixed. Many of them came from the
great work by <a class="reference external" href="https://conda-forge.org/blog//2020/03/10/pypy">conda-forge</a> to ship PyPy binary packages. A big shout out
to them for taking this on.</p>
<p>Development of PyPy has moved to <a class="reference external" href="https://foss.heptapod.net/pypy/pypy">https://foss.heptapod.net/pypy/pypy</a>.
This was covered more extensively in this <a class="reference external" href="https://morepypy.blogspot.com/2020/02/pypy-and-cffi-have-moved-to-heptapod.html">blog post</a>. We have seen an
increase in the number of drive-by contributors who are able to use gitlab +
mercurial to create merge requests.</p>
<p>The <a class="reference external" href="https://cffi.readthedocs.io">CFFI</a> backend has been updated to version 1.14.3. We recommend using CFFI
rather than c-extensions to interact with C, and using <a class="reference external" href="https://cppyy.readthedocs.io">cppyy</a> for performant
wrapping of C++ code for Python.</p>
<p>A new contributor took us up on the challenge to get <a href="https://doc.pypy.org/en/latest/windows.html#what-is-missing-for-a-full-64-bit-translation">windows 64-bit</a> support.
The work is proceeding on the <code class="docutils literal notranslate"><span class="pre">win64</span></code> branch; more help in coding or
sponsorship is welcome. In anticipation of merging this large change, we fixed
many test failures on windows.</p>
<p>As always, this release fixed several issues and bugs. We strongly recommend
updating. Many of the fixes are the direct result of end-user bug reports, so
please continue reporting issues as they crop up.</p>
<p>You can find links to download the v7.3.3 releases here:</p>
<blockquote>
<div><a class="reference external" href="https://pypy.org/download.html">https://pypy.org/download.html</a></div></blockquote>
<p>We would like to thank our <a href="https://opencollective.com/pypy">donors</a> for the continued support of the PyPy
project. If PyPy is not quite good enough for your needs, we are available for
direct consulting work.</p>
<p>We would also like to thank our contributors and encourage new people to join
the project. PyPy has many layers and we need help with all of them: <a class="reference external" href="https://doc.pypy.org/en/latest/index.html">PyPy</a>
and <a class="reference external" href="https://rpython.readthedocs.org">RPython</a> documentation improvements, tweaking popular modules to run
on pypy, or general <a class="reference external" href="https://doc.pypy.org/en/latest/project-ideas.html">help</a> with making RPython’s JIT even better. Since the
previous release, we have accepted contributions from 2 new contributors,
thanks for pitching in.</p>
<p>If you are a python library maintainer and use c-extensions, please consider
making a cffi / cppyy version of your library that would be performant on PyPy.
In any case both <a class="reference external" href="https://github.com/joerick/cibuildwheel">cibuildwheel</a> and the <a class="reference external" href="https://github.com/matthew-brett/multibuild">multibuild system</a> support
building wheels for PyPy.</p>
<div class="section" id="what-is-pypy">
<h2 style="text-align: center;"><span style="font-size: x-large;">What is PyPy?</span></h2>
<p>PyPy is a Python interpreter, a drop-in replacement for CPython 2.7, 3.6, and
3.7. It’s fast (<a class="reference external" href="https://speed.pypy.org">PyPy and CPython 3.7.4</a> performance
comparison) due to its integrated tracing JIT compiler.</p>
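<p>Because PyPy aims to be a drop-in replacement, the same script can run under either interpreter. A minimal sketch, using only the standard library, of how a script might report which one it is running on (the function name <code>interpreter_summary</code> is an assumption made for this example):</p>

```python
import platform
import sys


def interpreter_summary():
    """Describe the running interpreter, adding the PyPy version when present."""
    impl = platform.python_implementation()   # e.g. "CPython" or "PyPy"
    py_version = platform.python_version()    # language version, e.g. "3.7.9"
    # sys.pypy_version_info only exists when running under PyPy.
    pypy = getattr(sys, "pypy_version_info", None)
    if pypy is not None:
        return f"{impl} {pypy.major}.{pypy.minor}.{pypy.micro} (Python {py_version})"
    return f"{impl} {py_version}"


if __name__ == "__main__":
    print(interpreter_summary())
```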
<p>We also welcome developers of other <a class="reference external" href="https://rpython.readthedocs.io/en/latest/examples.html">dynamic languages</a> to see what RPython
can do for them.</p>
<p>This PyPy release supports:</p>
<blockquote>
<div><ul class="simple"><li><b>x86</b> machines on most common operating systems
(Linux 32/64 bits, Mac OS X 64 bits, Windows 32 bits, OpenBSD, FreeBSD)</li><li>big- and little-endian variants of <b>PPC64</b> running Linux,</li><li><b>s390x</b> running Linux</li><li>64-bit <b>ARM</b> machines running Linux.</li></ul>
</div></blockquote>
<p>PyPy does support ARM 32 bit processors, but does not release binaries.</p><p> </p><h2 style="text-align: center;">
<span style="font-size: x-large;">What else is new?</span></h2>
For more information about the 7.3.3 release, see the full <a href="https://doc.pypy.org/en/latest/release-v7.3.3.html">changelog</a>.<br />
<br />
Please update, and continue to help us make PyPy better.<br />
<br />
Cheers,<br />
The PyPy team
<p> </p>
</div>
<p>Posted by mattip</p>
<p><b>PyPy 7.3.2 triple release: python 2.7, 3.6, and 3.7</b> (2020-09-25)</p>
<p> </p><div style="text-align: left;">The PyPy team is proud to release version 7.3.2 of PyPy, which includes
three different interpreters:
</div><blockquote>
</blockquote><ul style="text-align: left;"><li>PyPy2.7, which is an interpreter supporting the syntax and the features of
Python 2.7 including the stdlib for CPython 2.7.13</li><li>PyPy3.6: which is an interpreter supporting the syntax and the features of
Python 3.6, including the stdlib for CPython 3.6.9.</li><li>PyPy3.7 alpha: which is our first release of an interpreter supporting the
syntax and the features of Python 3.7, including the stdlib for CPython
3.7.9. We call this an alpha release since it is our first. It is based on PyPy 3.6, so
issues should be around compatibility and not stability. Please try it out
and let us know what is broken or missing. We have not implemented some of the
<a class="reference external" href="https://docs.python.org/3/whatsnew/3.7.html#re">documented changes</a> in the <code class="docutils literal notranslate"><span class="pre">re</span></code> module, and other pieces are also
missing. For more information, see the <a class="reference external" href="https://foss.heptapod.net/pypy/pypy/-/wikis/py3.7%20status">PyPy 3.7 wiki</a> page</li></ul><blockquote><div>
</div></blockquote>
<p>The interpreters are based on much the same codebase, thus the multiple
release. This is a micro release: all APIs are compatible with the 7.3.0 (Dec
2019) and 7.3.1 (April 2020) releases, but read on to find out what is new.</p>
<p>Conda Forge now <a class="reference external" href="https://conda-forge.org/blog//2020/03/10/pypy">supports PyPy</a> as a python interpreter. The support is quite
complete for linux and macOS. This is the result of a lot of
hard work and good will on the part of the Conda Forge team. A big shout out
to them for taking this on.</p>
<p>Development of PyPy has transitioned to <a class="reference external" href="https://foss.heptapod.net/pypy/pypy">https://foss.heptapod.net/pypy/pypy</a>.
This move was covered more extensively in this <a class="reference external" href="https://morepypy.blogspot.com/2020/02/pypy-and-cffi-have-moved-to-heptapod.html">blog post</a>. We have seen an
increase in the number of drive-by contributors who are able to use gitlab +
mercurial to create merge requests.</p>
<p>The <a class="reference external" href="https://cffi.readthedocs.io">CFFI</a> backend has been updated to version 1.14.2. We recommend using CFFI
rather than c-extensions to interact with C, and using <a class="reference external" href="https://cppyy.readthedocs.io">cppyy</a> for performant
wrapping of C++ code for Python.</p>
<p>NumPy has begun shipping wheels on PyPI for PyPy, currently for linux 64-bit
only. Wheels for PyPy windows will be available from the next NumPy release. Thanks to NumPy for their support.<br /></p>
<p>A new contributor took us up on the challenge to get windows 64-bit support.
The work is proceeding on the <code class="docutils literal notranslate"><span class="pre">win64</span></code> branch; more help in coding or
sponsorship is welcome.</p>
<p>As always, this release fixed several issues and bugs. We strongly recommend
updating. Many of the fixes are the direct result of end-user bug reports, so
please continue reporting issues as they crop up.</p><p>You can find links to download the v7.3.2 releases here:</p>
<blockquote>
<div><a class="reference external" href="https://pypy.org/download.html">https://pypy.org/download.html</a></div></blockquote>
<p>We would like to thank our donors for the continued support of the PyPy
project. Please help support us at <a href="https://opencollective.com/pypy#section-contribute" target="_blank">Open Collective</a>. If PyPy is not yet good enough for your needs, we are available for
direct consulting work.<br /></p>
<p>We would also like to thank our contributors and encourage new people to join
the project. PyPy has many layers and we need help with all of them: <a class="reference external" href="https://doc.pypy.org/en/latest/index.html">PyPy</a>
and <a class="reference external" href="https://rpython.readthedocs.org">RPython</a> documentation improvements, tweaking popular modules to run
on pypy, or general <a class="reference external" href="https://doc.pypy.org/en/latest/project-ideas.html">help</a> with making RPython’s JIT even better. Since the
previous release, we have accepted contributions from 8 new contributors,
thanks for pitching in.</p>
<p>If you are a python library maintainer and use c-extensions, please consider
making a cffi / cppyy version of your library that would be performant on PyPy.
In any case both <a class="reference external" href="https://github.com/joerick/cibuildwheel">cibuildwheel</a> and the <a class="reference external" href="https://github.com/matthew-brett/multibuild">multibuild system</a> support
building wheels for PyPy.</p>
<div class="section" id="what-is-pypy">
<h2 style="text-align: center;">What is PyPy?</h2>
<p>PyPy is a very compliant Python interpreter, almost a drop-in replacement for
CPython 2.7, 3.6, and 3.7. It’s fast (<a class="reference external" href="https://speed.pypy.org">PyPy and CPython 2.7.x</a> performance
comparison) due to its integrated tracing JIT compiler.</p>
<p>We also welcome developers of other <a class="reference external" href="https://rpython.readthedocs.io/en/latest/examples.html">dynamic languages</a> to see what RPython
can do for them.</p>
<p>This PyPy release supports:</p>
<blockquote>
<div><ul class="simple"><li><b>x86</b> machines on most common operating systems
(Linux 32/64 bits, Mac OS X 64 bits, Windows 32 bits, OpenBSD, FreeBSD)</li><li>big- and little-endian variants of <b>PPC64</b> running Linux,</li><li><b>s390x</b> running Linux</li><li>64-bit <b>ARM</b> machines running Linux.</li></ul>
</div></blockquote>
<p>PyPy does support ARM 32 bit processors, but does not release binaries.</p>
</div>
<div class="section" id="changelog">
<h2 style="text-align: center;">
<span style="font-size: x-large;">What else is new?</span></h2>
For more information about the 7.3.2 release, see the full <a href="https://pypy.readthedocs.io/en/latest/release-v7.3.2.html">changelog</a>.<br />
<br />
Please update, and continue to help us make PyPy better.<br />
<br />
Cheers,<br />
The PyPy team
</div>
<p> </p><p> </p>
<p>Posted by mattip</p>
<p><b>PyPy is on Open Collective</b> (2020-08-29)</p>
<p>Hi all,</p>
<p>PyPy is now a <a href="https://opencollective.com/pypy">member of Open Collective</a>, a fiscal host. We have been thinking about switching to this organization for a couple of years; we like it for various reasons, like the budget transparency and the lightweight touch. We can now officially announce our membership!</p>
<p>With this, we are now again free to use PyPy for all financial issues, like receiving funds professionally, paying parts of sprint budgets as we like, and so on. We will shortly be reintroducing buttons that link to Open Collective from the PyPy web site.</p>
<p>Although the old donation buttons were removed last year, we believe that there are still a few people who regularly send money to the SFC, the not-for-profit charity we were affiliated with. If you do, please stop doing it now (and, if you would like to do so, please set up an equivalent donation to <a href="https://opencollective.com/pypy">PyPy on Open Collective</a>).</p>
<p>And by the way, sorry to all of you who were getting mixed feelings from the previous blog post (co-written with the SFC). <b>PyPy is committed to continue being Open Source just like before.</b> This was never in question. What these two blog posts mean is only that we switched to a different organization for our internal finances.</p>
<p>We're looking forward to how this new relationship will go!</p>
<p>Armin Rigo, for the PyPy team</p>
<p><b>A new chapter for PyPy</b> (2020-08-12)</p>
<p><i>PyPy winds down its membership in the Software Freedom Conservancy</i></p>
<h1>Conservancy and PyPy's great work together</h1>
<p><a href="https://pypy.org/">PyPy</a> joined <a href="https://sfconservancy.org/">Conservancy</a> in
the <a href="https://sfconservancy.org/blog/2011/jan/02/oct-dec-2010/">second half of 2010</a>, shortly after the release of
PyPy 1.2, the first version to contain a fully functional JIT. <a href="https://lwn.net/Articles/550427/">In 2013</a>, PyPy
started supporting ARM, bringing its just-in-time speediness to many more devices and began working toward supporting NumPy to help
scientists crunch their numbers faster. Together, PyPy and Conservancy ran successful fundraising drives and facilitated payment
and oversight for <a href="https://sfconservancy.org/blog/2016/dec/01/pypy-2016/">contractors and code sprints</a>.</p>
<p>Conservancy supported PyPy's impressive growth as it expanded support for
different hardware platforms, greatly improved the performance of C extensions,
and added support for Python 3 as the language itself evolved.</p>
<h1>The road ahead</h1>
<p>Conservancy provides a fiscal and organizational home for projects that find the
freedoms and guardrails that come along with a charitable home advantageous for
their community goals. While this framework was a great fit for the early PyPy
community, times change and all good things must come to an end.</p>
<p>PyPy will remain a free and open source project, but the community's structure
and organizational underpinnings will be changing and the PyPy community will be
exploring options outside of the charitable realm for its next phase of growth
("charitable" in the legal sense -- PyPy will remain a community project).</p>
<p>During the last year PyPy and Conservancy have worked together to properly
utilise the generous donations made by stalwart PyPy enthusiasts over the years
and to wrap up PyPy's remaining charitable obligations. PyPy is grateful for
the Conservancy's help in shepherding the project toward its next chapter.</p>
<h1>Thank yous</h1><p>From Conservancy: <br /></p><p style="text-align: left;"></p><blockquote>"We are happy that Conservancy was able to help PyPy bring important software
for the public good during a critical time in its history. We wish the
community well and look forward to seeing it develop and succeed in new ways." <br /></blockquote><blockquote>— Karen Sandler, Conservancy's Executive Director</blockquote><p></p><p>From PyPy:</p><p></p><div style="text-align: left;"><div style="text-align: left;"><blockquote><p>"PyPy would like to thank Conservancy for their decade long support in
building the community and wishes Conservancy continued success in their
journey promoting, improving, developing and defending free and open source
software." <br /></p></blockquote><blockquote><p style="text-align: left;">— Simon Cross &amp; Carl Friedrich Bolz-Tereick, on behalf of PyPy.</p></blockquote></div></div><p></p><blockquote>
</blockquote>
<h1>About</h1>
<p><a class="reference external" href="https://pypy.org/">PyPy</a> is a multi-layer python interpreter with a built-in JIT compiler that runs
Python quickly across different computing environments.
<a class="reference external" href="https://sfconservancy.org/">Software Freedom Conservancy</a> (Conservancy) is a charity that provides a home
to over forty free and open source software projects.</p>
<p>Posted by hodgestar</p>
<p><b>PyPy 7.3.1 released</b> (2020-04-10)</p>
<div dir="ltr" style="text-align: left;" trbidi="on">
The PyPy team is proud to release the version 7.3.1 of PyPy, which includes
two different interpreters:<br />
<blockquote>
<div>
<ul class="simple">
<li>PyPy2.7, which is an interpreter supporting the syntax and the features of
Python 2.7 including the stdlib for CPython 2.7.13</li>
<li>PyPy3.6: which is an interpreter supporting the syntax and the features of
Python 3.6, including the stdlib for CPython 3.6.9.</li>
</ul>
</div>
</blockquote>
The interpreters are based on much the same codebase, thus the multiple
release. This is a micro release: no APIs have changed since the 7.3.0 release
in December, but read on to find out what is new.<br />
<br />
Conda Forge now <a class="reference external" href="https://conda-forge.org/blog//2020/03/10/pypy">supports PyPy</a> as a Python interpreter. The support right now
is being built out. After this release, many more c-extension-based
packages can be successfully built and uploaded. This is the result of a lot of
hard work and good will on the part of the Conda Forge team. A big shout out
to them for taking this on.<br />
<br />
We have worked with the Python packaging group to support tooling around
building third party packages for Python, so this release updates the pip and
setuptools installed when executing <code class="docutils literal notranslate"><span class="pre">pypy</span> <span class="pre">-mensurepip</span></code> to <code class="docutils literal notranslate"><span class="pre">pip>=20</span></code>. This
completes the work done to update the PEP 425 <a class="reference external" href="https://www.python.org/dev/peps/pep-0425/#python-tag">python tag</a> from <code class="docutils literal notranslate"><span class="pre">pp373</span></code>,
meaning “PyPy 7.3 running python3”, to <code class="docutils literal notranslate"><span class="pre">pp36</span></code>, meaning “PyPy running Python
3.6” (the format is recommended in the PEP). The tag itself was
changed in 7.3.0, but older pip versions build their own tag without querying
PyPy. This means that wheels built for the previous tag format will not be
discovered by pip from this version, so library authors should update their
PyPy-specific wheels on PyPI.<br />
<br />
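To make the tag change above concrete, here is a minimal sketch of the two naming schemes. The function names are made up for this example, and the digit order of the old tag is inferred from the description above ("PyPy 7.3 running python3"), so treat it as an assumption rather than pip's actual code.<br />

```python
def legacy_pypy_tag(py_major, pypy_version):
    """Old-style tag: Python major version followed by the PyPy major.minor.

    Assumed layout, inferred from "pp373" meaning PyPy 7.3 running Python 3.
    """
    pypy_major, pypy_minor = pypy_version
    return f"pp{py_major}{pypy_major}{pypy_minor}"


def pep425_python_tag(py_version, impl="pp"):
    """PEP 425 style tag: implementation abbreviation plus Python major.minor."""
    major, minor = py_version
    return f"{impl}{major}{minor}"


if __name__ == "__main__":
    print(legacy_pypy_tag(3, (7, 3)))   # old scheme for PyPy 7.3 on Python 3
    print(pep425_python_tag((3, 6)))    # new scheme for PyPy running Python 3.6
```

The new scheme encodes the Python language version rather than the PyPy version, so the tag stays the same across PyPy releases that implement the same Python.<br />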
Development of PyPy is transitioning to <a class="reference external" href="https://foss.heptapod.net/pypy/pypy">https://foss.heptapod.net/pypy/pypy</a>.
This move was covered more extensively in the <a class="reference external" href="https://morepypy.blogspot.com/2020/02/pypy-and-cffi-have-moved-to-heptapod.html">blog post</a> from last month.<br />
<br />
The <a class="reference external" href="http://cffi.readthedocs.io/">CFFI</a> backend has been updated to version 1.14.0. We recommend using CFFI
rather than c-extensions to interact with C, and using <a class="reference external" href="https://cppyy.readthedocs.io/">cppyy</a> for performant
wrapping of C++ code for Python. The <code class="docutils literal notranslate"><span class="pre">cppyy</span></code> backend has been enabled
experimentally for win32, try it out and let us know how it works.<br />
<br />
Enabling <code class="docutils literal notranslate"><span class="pre">cppyy</span></code> requires a more modern C compiler, so win32 is now built
with MSVC160 (Visual Studio 2019). This is true for PyPy 3.6 as well as for 2.7.<br />
<br />
We have improved warmup time by up to 20%, improved the performance of <code class="docutils literal notranslate"><span class="pre">io.StringIO</span></code> to
match, if not exceed, that of CPython, and improved JIT code generation for
generators (and generator expressions in particular) when passing them to
functions like <code class="docutils literal notranslate"><span class="pre">sum</span></code> and <code class="docutils literal notranslate"><span class="pre">map</span></code> that consume them. Performance of closures has also been improved in certain situations.<br />
<br />
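The generator case mentioned above is the common pattern of handing a generator expression straight to a consuming function. A small sketch of how one might time it on any interpreter with the standard <code>timeit</code> module; the function names, input size and repeat counts are arbitrary choices for illustration, not a PyPy benchmark.<br />

```python
import timeit


def sum_of_squares(n):
    """Pass a generator expression directly to sum(), which consumes it."""
    return sum(x * x for x in range(n))


def best_time(func, arg, number=10, repeat=5):
    """Return the fastest total time, in seconds, over `repeat` runs of `number` calls."""
    return min(timeit.repeat(lambda: func(arg), number=number, repeat=repeat))


if __name__ == "__main__":
    # Run the same script under CPython and PyPy to compare the two.
    elapsed = best_time(sum_of_squares, 100_000)
    print(f"sum over a generator expression: {elapsed:.4f}s for 10 calls")
```

Taking the minimum over several repeats reduces noise, but a JIT also needs warmup, so short runs like this one will understate steady-state speed.<br />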
As always, this release fixed several issues and bugs raised by the growing
community of PyPy users. We strongly recommend updating. Many of the fixes are
the direct result of end-user bug reports, so please continue reporting issues
as they crop up.<br />
You can find links to download the v7.3.1 releases here:<br />
<blockquote>
<div>
<a class="reference external" href="http://pypy.org/download.html">http://pypy.org/download.html</a></div>
</blockquote>
We would like to thank our donors for the continued support of the PyPy
project. If PyPy is not quite good enough for your needs, we are available for
direct consulting work.<br />
<br />
We would also like to thank our contributors and encourage new people to join
the project. PyPy has many layers and we need help with all of them: <a class="reference external" href="http://doc.pypy.org/en/latest/index.html">PyPy</a>
and <a class="reference external" href="https://rpython.readthedocs.org/">RPython</a> documentation improvements, tweaking popular modules to run
on PyPy, or general <a class="reference external" href="http://doc.pypy.org/en/latest/project-ideas.html">help</a> with making RPython’s JIT even better. Since the
previous release, we have accepted contributions from 13 new contributors,
thanks for pitching in.<br />
<br />
If you are a Python library maintainer and use c-extensions, please consider
making a cffi / cppyy version of your library that would be performant on PyPy.
In any case both <a class="reference external" href="https://github.com/joerick/cibuildwheel">cibuildwheel</a> and the <a class="reference external" href="https://github.com/matthew-brett/multibuild">multibuild system</a> support
building wheels for PyPy.<br />
<div class="section" id="what-is-pypy">
<h2 style="text-align: center;">
<span style="font-size: x-large;">What is PyPy?</span></h2>
PyPy is a very compliant Python interpreter, almost a drop-in replacement for
CPython 2.7, 3.6, and soon 3.7. It’s fast (<a class="reference external" href="http://speed.pypy.org/">PyPy and CPython 2.7.x</a> performance
comparison) due to its integrated tracing JIT compiler.<br />
<br />
We also welcome developers of other <a class="reference external" href="http://rpython.readthedocs.io/en/latest/examples.html">dynamic languages</a> to see what RPython
can do for them.<br />
<br />
This PyPy release supports:<br />
<blockquote>
<div>
<ul class="simple">
<li><strong>x86</strong> machines on most common operating systems
(Linux 32/64 bits, Mac OS X 64 bits, Windows 32 bits, OpenBSD, FreeBSD)</li>
<li>big- and little-endian variants of <strong>PPC64</strong> running Linux,</li>
<li><strong>s390x</strong> running Linux</li>
<li>64-bit <strong>ARM</strong> machines running Linux.</li>
</ul>
</div>
</blockquote>
<br />
<div class="section" id="changelog">
<h2 style="text-align: center;">
<span style="font-size: x-large;">What else is new?</span></h2>
For more information about the 7.3.1 release, see the full <a href="https://pypy.readthedocs.io/en/latest/release-v7.3.1.html">changelog</a>.<br />
<br />
Please update, and continue to help us make PyPy better.<br />
<br />
Cheers,<br />
The PyPy team
</div>
</div>
</div>
mattiphttp://www.blogger.com/profile/07336549270776418081noreply@blogger.com0tag:blogger.com,1999:blog-3971202189709462152.post-7645677773539558972020-03-17T22:57:00.001+01:002020-03-17T23:09:12.964+01:00Leysin 2020 Sprint ReportAt the end of February ten of us gathered in Leysin, Switzerland to work on<br />
a variety of topics including <a class="reference external" href="https://github.com/pyhandle/hpy/">HPy</a>, <a class="reference external" href="http://buildbot.pypy.org/summary?branch=py3.7">PyPy Python 3.7</a> support and the PyPy<br />
migration to <a class="reference external" href="https://foss.heptapod.net/pypy/">Heptapod</a>.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-PIs_hVhn3RY/XnFDceuihNI/AAAAAAAAbRg/LKMOMWxeFw4jhcwqy8jx7iKzKE01fbfxQCEwYBhgL/s1600/2020_leysin_sprint_attendees.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="720" data-original-width="1280" height="180" src="https://1.bp.blogspot.com/-PIs_hVhn3RY/XnFDceuihNI/AAAAAAAAbRg/LKMOMWxeFw4jhcwqy8jx7iKzKE01fbfxQCEwYBhgL/s320/2020_leysin_sprint_attendees.jpg" width="320" /></a></div>
<br />
We had a fun and productive week. The snow was beautiful. There was skiing<br />
and lunch at the top of <a class="reference external" href="https://en.wikipedia.org/wiki/Berneuse">Berneuse</a>, cooking together, some late nights at<br />
the pub next door, some even later nights coding, and of course the<br />
obligatory cheese fondue outing.<br />
<br />
There were a few of us participating in a PyPy sprint for the first time<br />
and a few familiar faces who had attended many sprints. Many different<br />
projects were represented including PyPy, <a class="reference external" href="https://github.com/pyhandle/hpy/">HPy</a>, <a class="reference external" href="https://github.com/graalvm/graalpython">GraalPython</a>,<br />
<a class="reference external" href="https://foss.heptapod.net/pypy/">Heptapod</a>, and <a class="reference external" href="https://github.com/dgrunwald/rust-cpython">rust-cpython</a>. The atmosphere was relaxed and welcoming, so if<br />
you're thinking of attending the next one -- please do!<br />
<br />
Topics worked on:<br />
<br />
<h2>
HPy</h2>
HPy is a new project to design and implement a better API for extending<br />
Python in C. If you're unfamiliar with it you can read more about it at<br />
<a class="reference external" href="https://github.com/pyhandle/hpy/">HPy</a>.<br />
<br />
A lot of attention was devoted to the Big HPy Design Discussion which<br />
took up two full mornings. So much was decided that this will likely<br />
get its own detailed write-up, but bigger topics included:<br />
<ul class="simple">
<li>the HPy GetAttr, SetAttr, GetItem and SetItem methods,</li>
<li>HPy_FromVoidP and HPy_AsVoidP for passing HPy handles to C functions<br />
that pass void* pointers to callbacks,</li>
<li>avoiding having va_args as part of the ABI,</li>
<li>exception handling,</li>
<li>support for creating custom types.</li>
</ul>
Quite a few things got worked on too:<br />
<ul class="simple">
<li>implemented support for writing methods that take keyword arguments with<br />
HPy_METH_KEYWORDS,</li>
<li>implemented HPy_GetAttr, HPy_SetAttr, HPy_GetItem, and HPy_SetItem,</li>
<li>started implementing support for adding custom types,</li>
<li>started implementing dumping JSON objects in ultrajson-hpy,</li>
<li>refactored the PyPy GIL to improve the interaction between HPy and<br />
PyPy's cpyext,</li>
<li>experimented with adding HPy support to rust-cpython.</li>
</ul>
And there was some discussion of the next steps of the HPy initiative<br />
including writing documentation, setting up websites and funding, and<br />
possibly organising another HPy gathering later in the year.<br />
<br />
<h2>
PyPy</h2>
<ul class="simple">
<li>Georges gave a presentation on the Heptapod topic and branch workflows<br />
and showed everyone how to use hg-evolve.</li>
<li>Work was done on improving the PyPy CI buildbot after the move to<br />
Heptapod, including a light-weight pre-merge CI and restricting<br />
when the full CI is run to only branch commits.</li>
<li>A lot of work was done improving the -D tests. </li>
</ul>
<br />
<h2>
Miscellaneous</h2>
<ul class="simple">
<li>Armin demoed VRSketch and NaN Industries in VR, including an implementation<br />
of the Game of Life within NaN Industries!</li>
<li>Skiing!</li>
</ul>
<br />
<h2>
Aftermath</h2>
Immediately after the sprint large parts of Europe and the world were<br />
hit by the COVID-19 epidemic. It was good to spend time together before<br />
travelling ceased to be a sensible idea and many gatherings were cancelled.<br />
<br />
Keep safe out there everyone.<br />
<br />
The HPy & PyPy Team & Friends<br />
<br />
<i>In joke for those who attended the sprint: Please don't replace this blog post<br />
with its Swedish translation (or indeed a translation to any other language :).</i>hodgestarhttp://www.blogger.com/profile/01625611082424480664noreply@blogger.com21854 Leysin, Switzerland46.3435634 7.01203346.2558859 6.8506715 46.4312409 7.1733945tag:blogger.com,1999:blog-3971202189709462152.post-57915951524727470322020-02-16T15:36:00.000+01:002020-02-16T15:36:00.573+01:00PyPy and CFFI have moved to Heptapod<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="document">
<div class="body">
<div class="section" id="pypy-and-cffi-have-moved-to-heptapod">
It has been a very busy month, not so much because of deep changes in the JIT of PyPy but more around the development, deployment, and packaging of the project.<br />
<div class="section" id="hosting">
<h2>
<span style="font-size: x-large;">Hosting</span></h2>
The biggest news is that we have moved the center of our development off Bitbucket and to the new <a class="reference external" href="https://foss.heptapod.net/pypy">https://foss.heptapod.net/pypy</a>. This is a friendly fork of GitLab called <a class="reference external" href="https://heptapod.net/">heptapod</a> that understands Mercurial and is hosted by <a class="reference external" href="https://www.clever-cloud.com/en/heptapod">Clever Cloud</a>. When Atlassian decided to close down Mercurial hosting on bitbucket.org, PyPy debated what to do. Our development model is based on long-lived branches, and we want to keep the ability to immediately see which branch each commit came from. Mercurial has this; git does not (see <a class="reference external" href="http://doc.pypy.org/en/latest/faq.html#why-doesn-t-pypy-use-git-and-move-to-github">our FAQ</a>). <a class="reference external" href="https://octobus.net/">Octobus</a>, whose business is Mercurial, developed a way to use Mercurial with GitLab called heptapod. The product is still <a class="reference external" href="https://heptapod.net/pages/getting-involved.html">under development</a>, but quite usable (i.e., it doesn't get in the way). Octobus partnered with Clever Cloud hosting to offer community FOSS projects hosted on Bitbucket who wish to remain with Mercurial a new home. PyPy took them up on the offer, and migrated its repos to <a class="reference external" href="https://foss.heptapod.net/pypy">https://foss.heptapod.net/pypy</a>. We were very happy with how smooth it was to import the repos to heptapod/GitLab, and are learning the small differences between Bitbucket and GitLab. All the pull requests, issues, and commits kept the same ids, but work is still being done to attribute the issues, pull requests, and comments to the correct users. So from now on, when you want to contribute to PyPy, you do so at the new home.<br />
<br />
CFFI, which previously was also hosted on Bitbucket, has joined the PyPy group at <a class="reference external" href="https://foss.heptapod.net/pypy/cffi">https://foss.heptapod.net/pypy/cffi</a>.</div>
<div class="section" id="website">
<h2>
<span style="font-size: x-large;">Website</span></h2>
Secondly, thanks to work by <a class="reference external" href="https://baroquesoftware.com/">https://baroquesoftware.com/</a> in leading a redesign and updating the logo, the <a class="reference external" href="https://www.pypy.org/">https://www.pypy.org</a> website has undergone a facelift. It should now be easier to use on small-screen devices. Thanks also to the PSF for hosting the site.</div>
<div class="section" id="packaging">
<h2>
<span style="font-size: x-large;">Packaging</span></h2>
Also, building PyPy from source takes a fair amount of time. While we provide downloads in the form of tarballs or zipfiles, and some platforms such as Debian and Homebrew provide packages, traditionally the downloads have only worked on a specific flavor of operating system. A few years ago squeaky-pl started providing <a class="reference external" href="https://github.com/squeaky-pl/portable-pypy">portable builds</a>. We have adopted that build system for our Linux offerings, so the <a class="reference external" href="https://buildbot.pypy.org/nightly">nightly downloads</a> and <a class="reference external" href="https://bitbucket.org/pypy/pypy/downloads">release downloads</a> should now work on any glibc platform that has not reached end-of-life. So there goes another excuse not to use PyPy. And the "but does it run scipy" excuse also no longer holds, although "does it speed up scipy" still has the wrong answer. For that we are working on <a class="reference external" href="https://morepypy.blogspot.com/2019/12/hpy-kick-off-sprint-report.html">HPy</a>, and will be <a class="reference external" href="https://morepypy.blogspot.com/2020/01/leysin-winter-sprint-2020-feb-28-march.html">sprinting soon</a>.<br />
The latest versions of pip, wheel, and setuptools, together with the manylinux2010 standard for Linux wheels and tools such as <a class="reference external" href="https://github.com/matthew-brett/multibuild/">multibuild</a> or <a class="reference external" href="https://github.com/joerick/cibuildwheel">cibuildwheel</a> (well, from the next version) make it easier for library developers to build binary wheels for PyPy. If you are having problems getting going with this, please reach out.</div>
</div>
<div class="section" id="give-it-a-try">
<h2 style="text-align: left;">
<span style="font-size: x-large;">Give it a try</span></h2>
Thanks to all the folks who provide the infrastructure PyPy depends on. We hope the new look will encourage more involvement and engagement. Help prove us right!<br />
<br />
The PyPy Team</div>
</div>
</div>
</div>
mattiphttp://www.blogger.com/profile/07336549270776418081noreply@blogger.com5tag:blogger.com,1999:blog-3971202189709462152.post-63497615247974090122020-01-17T11:36:00.000+01:002020-01-29T14:44:33.960+01:00Leysin Winter sprint 2020: Feb 29 - March 8th<a href="https://q-cf.bstatic.com/images/hotel/max1280x900/321/32136520.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="600" data-original-width="800" height="240" src="https://q-cf.bstatic.com/images/hotel/max1280x900/321/32136520.jpg" width="320" /></a>The next PyPy sprint will be in Leysin, Switzerland, for the fourteenth
time. This is a fully public sprint: newcomers and topics other than
those proposed below are welcome.<br />
<br />
<br />
<br />
<h3>
Goals and topics of the sprint</h3>
The list of topics is open. For reference, we would like to work at least partially on the following topics:<br />
<ul>
<li><a href="https://github.com/pyhandle/hpy">HPy</a> </li>
<li>Python 3.7 support (<a href="http://buildbot.pypy.org/summary?branch=py3.7">buildbot status</a>)</li>
</ul>
As usual, the main side goal is to have fun in winter sports :-)
We can take a day off (for skiing or anything else).<br />
<br />
<h3>
Times and accommodation</h3>
The sprint will run for one week, from Saturday, the 29th of February, to Sunday, the 8th of March 2020 <b>(dates were pushed back one day!)</b>. It will take place at <a href="https://www.booking.com/hotel/ch/les-airelles.html">Les Airelles</a>, a different bed-and-breakfast place from the traditional one in <span class="il">Leysin</span>. It is a nice old house at the top of the village.<br />
<br />
<strike>We have a 4- or 5-people room as well as up to three double-rooms. Please register early! These rooms are not booked for the sprint in advance, and might be already taken if you end up announcing yourself late.</strike> We have a big room for up to 7 people with nice view, which might be split in two or three sub-rooms; plus possibly separately-booked double rooms if needed. (But it is of course always possible to book at a different place in Leysin.)<br />
<br />
For more information, see our <a href="https://bitbucket.org/pypy/extradoc/src/extradoc/sprintinfo/leysin-winter-2020/">repository</a> or write to me directly at armin.rigo@gmail.com.Armin Rigohttp://www.blogger.com/profile/06300515270104686574noreply@blogger.com0tag:blogger.com,1999:blog-3971202189709462152.post-36140266200969636552019-12-24T14:55:00.004+01:002019-12-24T14:55:52.485+01:00PyPy 7.3.0 released<div dir="ltr" style="text-align: left;" trbidi="on">
The PyPy team is proud to release the version 7.3.0 of PyPy, which includes
two different interpreters:<br />
<ul class="simple">
<li>PyPy2.7, which is an interpreter supporting the syntax and the features of
Python 2.7, including the stdlib for CPython 2.7.13</li>
<li>PyPy3.6, which is an interpreter supporting the syntax and the features of
Python 3.6, including the stdlib for CPython 3.6.9</li>
</ul>
The interpreters are based on much the same codebase, thus the double
release.<br />
<br />
We have worked with the Python packaging group to support tooling around
building third-party packages for Python, so this release changes the ABI tag
for PyPy.<br />
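<br />
To see which values the running interpreter reports, a small sketch along these lines may help. It only uses the standard <code>sys</code> and <code>sysconfig</code> modules; the helper name <code>describe_interpreter</code> is ours, and <code>SOABI</code> is shown as a value related to the ABI tag, not as the exact tag that packaging tools write into wheel file names.<br />

```python
import sys
import sysconfig


def describe_interpreter():
    """Collect values related to the ABI tag of the running interpreter."""
    return {
        # "cpython" on CPython, "pypy" on PyPy
        "implementation": sys.implementation.name,
        # ABI string for compiled extensions; may be None on some platforms
        "soabi": sysconfig.get_config_var("SOABI"),
    }


for key, value in describe_interpreter().items():
    print(key, "=", value)
```

Running it under both the previous and the new PyPy release is one way to see what changed on your machine.<br />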
<br />
Based on the great work done in <a class="reference external" href="https://github.com/squeaky-pl/portable-pypy">portable-pypy</a>, the linux downloads we
provide are now built on top of the <a href="https://github.com/pypa/manylinux"><span class="problematic" id="id11">manylinux2010</span></a> CentOS6 docker image.
The tarballs include the needed shared objects to run on any platform that
supports manylinux2010 wheels, which should include all supported versions of
debian- and RedHat-based distributions (including Ubuntu, CentOS, and Fedora).<br />
<br />
The <a class="reference external" href="http://cffi.readthedocs.io/">CFFI</a> backend has been updated to version 1.13.1. We recommend using CFFI
rather than c-extensions to interact with C.<br />
The built-in <a href="https://cppyy.readthedocs.io/en/latest/"><code class="docutils literal notranslate"><span class="pre">cppyy</span></code></a> module was upgraded to 1.10.6, which
provides, among others, better template resolution, stricter <code class="docutils literal notranslate"><span class="pre">enum</span></code> handling,
anonymous struct/unions, cmake fragments for distribution, optimizations for
PODs, and faster wrapper calls. We recommend using <a class="reference external" href="https://cppyy.readthedocs.io/">cppyy</a> for performant
wrapping of C++ code for Python.<br />
<br />
The vendored pyrepl package for interaction inside the REPL was updated.<br />
<br />
Support for codepage encoding and decoding was added for Windows.<br />
<br />
As always, this release fixed several issues and bugs raised by the growing
community of PyPy users. We strongly recommend updating. Many of the fixes are
the direct result of end-user bug reports, so please continue reporting issues
as they crop up.<br />
You can download the v7.3 releases here:<br />
<blockquote>
<div>
<a class="reference external" href="http://pypy.org/download.html">http://pypy.org/download.html</a></div>
</blockquote>
We would like to thank our donors for the continued support of the PyPy
project. If PyPy is not quite good enough for your needs, we are available for
direct consulting work.<br />
<br />
We would also like to thank our contributors and encourage new people to join
the project. PyPy has many layers and we need help with all of them: <a class="reference external" href="http://doc.pypy.org/en/latest/index.html">PyPy</a>
and <a class="reference external" href="https://rpython.readthedocs.org/">RPython</a> documentation improvements, tweaking popular packages to run
on PyPy, or general <a class="reference external" href="http://doc.pypy.org/en/latest/project-ideas.html">help</a> with making RPython’s JIT even better. Since the
previous release, we have accepted contributions from 3 new contributors,
thanks for pitching in.<br />
<br />
If you are a Python library maintainer and use c-extensions, please consider making a cffi / cppyy version of your library that would be performant on PyPy. If you are stuck with using the C-API, you can use <a href="https://github.com/pypy/manylinux">docker images with PyPy built in</a> or the <a href="https://github.com/matthew-brett/multibuild/">multibuild system</a> to build wheels.<br />
<br />
<div class="section" id="what-is-pypy">
<h2 style="text-align: center;">
<span style="font-size: x-large;">What is PyPy?</span></h2>
PyPy is a very compliant Python interpreter, almost a drop-in replacement for
CPython 2.7 and 3.6. It’s fast (<a class="reference external" href="http://speed.pypy.org/">PyPy and CPython 2.7.x</a> performance
comparison) due to its integrated tracing JIT compiler.<br />
<br />
We also welcome developers of other <a class="reference external" href="http://rpython.readthedocs.io/en/latest/examples.html">dynamic languages</a> to see what RPython
can do for them.<br />
<br />
This PyPy release supports:<br />
<ul style="text-align: left;">
<li><b>x86</b> machines on most common operating systems
(Linux 32/64 bit, Mac OS X 64-bit, Windows 32-bit, OpenBSD, FreeBSD)</li>
<li>big- and little-endian variants of <b>PPC64</b> running Linux</li>
<li><b>s390x</b> running Linux</li>
<li>64-bit <b>ARM</b> machines running Linux</li>
</ul>
</div>
<div class="section" id="what-is-pypy">
Unfortunately at the moment of writing our ARM buildbots are out of service,
so for now we are <b>not</b> releasing any binary for the ARM architecture (32-bit), although PyPy does support ARM 32-bit processors.<br />
<br />
<div class="section" id="changelog">
<h2 style="text-align: center;">
<span style="font-size: x-large;">What else is new?</span></h2>
PyPy 7.2 was released in October 2019.
There have been many incremental improvements to RPython and PyPy since then. For more information about the 7.3.0 release, see the full <a href="https://pypy.readthedocs.io/en/latest/release-v7.3.0.html">changelog</a>.<br />
<br />
Please update, and continue to help us make PyPy better.<br />
<br />
Cheers,<br />
The PyPy team
</div>
<br />
<br /></div>
</div>
mattiphttp://www.blogger.com/profile/07336549270776418081noreply@blogger.com0tag:blogger.com,1999:blog-3971202189709462152.post-18408293360924909382019-12-18T14:38:00.000+01:002019-12-18T15:13:12.048+01:00HPy kick-off sprint report<style type="text/css">
</style>
<p>Recently Antonio, Armin and Ronan had a small internal sprint in the beautiful
city of Gdańsk to kick off the development of HPy. Here is a brief report of
what was accomplished during the sprint.</p>
<div class="section" id="what-is-hpy">
<h2>What is HPy?</h2>
<p>The TL;DR answer is "a better way to write C extensions for Python".</p>
<p>The idea of HPy was born during EuroPython 2019 in Basel, where there was an
informal meeting which included core developers of PyPy, CPython (Victor
Stinner and Mark Shannon) and Cython (Stefan Behnel). The ideas were later also
discussed with Tim Felgentreff of <a class="reference external" href="https://github.com/graalvm/graalpython">GraalPython</a>, to make sure they would also be
applicable to this very different implementation. Windel Bouwman of <a class="reference external" href="https://github.com/RustPython/RustPython">RustPython</a>
is following the project as well.</p>
<p>All of us agreed that the current design of the CPython C API is problematic
for various reasons and, in particular, because it is too tied to the current
internal design of CPython. The end result is that:</p>
<ul class="simple">
<li>alternative implementations of Python (PyPy and others) have a
<a class="reference external" href="https://morepypy.blogspot.com/2018/09/inside-cpyext-why-emulating-cpython-c.html">hard time</a> loading and executing existing C extensions;</li>
<li>CPython itself is unable to change some of its internal implementation
details without breaking the world. For example, as of today it would be
impossible to switch from using reference counting to using a real GC,
which in turn makes it hard, for example, to remove the GIL, as <a class="reference external" href="https://pythoncapi.readthedocs.io/gilectomy.html">gilectomy</a>
attempted.</li>
</ul>
<p>HPy tries to address these issues by following two major design guidelines:</p>
<ol class="arabic simple">
<li>objects are referenced and passed around using opaque handles, which are
similar to e.g., file descriptors in spirit. Multiple, different handles
can point to the same underlying object, handles can be duplicated and
each handle must be released independently of any other duplicate.</li>
<li>The internal data structures and C-level layout of objects are neither
visible nor accessible through the API, so each implementation is free to
use what fits best.</li>
</ol>
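<p>To make the first guideline more concrete, here is a toy model of handle semantics written in plain Python. It is only a sketch: the <tt class="docutils literal">HandleTable</tt> class and its method names are hypothetical and are not part of HPy, which is a C API. It shows that duplicating a handle yields a distinct handle for the same object, and that each handle must be closed on its own.</p>

```python
class HandleTable:
    """Toy model of opaque handles: small integers that refer to objects."""

    def __init__(self):
        self._objects = {}      # handle -> object
        self._next_handle = 1   # 0 is kept free to play the role of a null handle

    def new(self, obj):
        """Return a fresh handle referring to obj."""
        handle = self._next_handle
        self._next_handle += 1
        self._objects[handle] = obj
        return handle

    def dup(self, handle):
        """Return a distinct handle referring to the same object."""
        return self.new(self._objects[handle])

    def deref(self, handle):
        """Look up the object behind a handle (only the table can do this)."""
        return self._objects[handle]

    def close(self, handle):
        """Release one handle; its duplicates stay valid."""
        del self._objects[handle]

    def open_handles(self):
        """List the handles that were never closed, e.g. to find leaks."""
        return sorted(self._objects)


table = HandleTable()
h1 = table.new(["some", "object"])
h2 = table.dup(h1)
table.close(h1)
print("still open:", table.open_handles())
```

<p>Because user code only ever sees the integers, the table is free to store, move or count the underlying objects however it likes, which is the freedom the second guideline asks for.</p>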
<p>The other major design goal of HPy is to allow incremental transition and
porting, so existing modules can migrate their codebase one method at a time.
Moreover, Cython is considering optionally generating HPy code, so extension
modules written in Cython would be able to benefit from HPy automatically.</p>
<p>More details can be found in the README of the official <a class="reference external" href="https://github.com/pyhandle/hpy">HPy repository</a>.</p>
</div>
<div class="section" id="target-abi">
<h2>Target ABI</h2>
<p>When compiling an HPy extension you can choose one of two different target ABIs:</p>
<ul class="simple">
<li><strong>HPy/CPython ABI</strong>: in this case, <tt class="docutils literal">hpy.h</tt> contains a set of macros and
static inline functions. At compilation time this translates the HPy API
into the standard C-API. The compiled module will have no performance
penalty, and it will have a "standard" filename like
<tt class="docutils literal"><span class="pre">foo.cpython-37m-x86_64-linux-gnu.so</span></tt>.</li>
<li><strong>Universal HPy ABI</strong>: as the name implies, extension modules compiled
this way are "universal" and can be loaded unmodified by multiple Python
interpreters and versions. Moreover, it will be possible to dynamically
enable a special debug mode which will make it easy to find e.g., open
handles or memory leaks, <strong>without having to recompile the extension</strong>.</li>
</ul>
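<p>The "standard" file name mentioned for the HPy/CPython ABI is the one the interpreter's import system already expects for extension modules. As a small sketch, the standard library can list the suffixes the running interpreter accepts; the helper <tt class="docutils literal">candidate_filenames</tt> and the module name <tt class="docutils literal">foo</tt> are only for illustration, and the naming of universal modules is not covered here.</p>

```python
import importlib.machinery


def candidate_filenames(module_name):
    """File names the running interpreter would accept for an extension module."""
    return [module_name + suffix
            for suffix in importlib.machinery.EXTENSION_SUFFIXES]


for filename in candidate_filenames("foo"):
    print(filename)
```

<p>The output differs between interpreters and versions, which is why a module built this way has to be compiled once per target.</p>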
<p>Universal modules can <strong>also</strong> be loaded on CPython, thanks to the
<tt class="docutils literal">hpy_universal</tt> module which is under development. An extra layer of
indirection enables loading extensions compiled with the universal ABI. Users
of <tt class="docutils literal">hpy_universal</tt> will face a small performance penalty compared to the ones
using the HPy/CPython ABI.</p>
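<p>The idea of one source and two targets can be sketched in plain Python. Everything below is a hypothetical model, not HPy code: <tt class="docutils literal">DirectContext</tt> stands in for a target where a handle is simply the object, <tt class="docutils literal">TableContext</tt> stands in for a target with an extra layer of indirection, and <tt class="docutils literal">add_ints</tt> is written once against whichever context it is given.</p>

```python
class DirectContext:
    """Stand-in for a direct target: a handle is the object itself."""

    def from_long(self, value):
        return value

    def as_long(self, handle):
        return handle

    def close(self, handle):
        pass  # nothing to release in this model


class TableContext:
    """Stand-in for an indirect target: handles index into a table."""

    def __init__(self):
        self._table = {}
        self._next = 1

    def from_long(self, value):
        handle = self._next
        self._next += 1
        self._table[handle] = value
        return handle

    def as_long(self, handle):
        return self._table[handle]

    def close(self, handle):
        del self._table[handle]


def add_ints(ctx, h_a, h_b):
    """Extension-style function: it only talks to objects through ctx."""
    return ctx.from_long(ctx.as_long(h_a) + ctx.as_long(h_b))


for ctx in (DirectContext(), TableContext()):
    h_a = ctx.from_long(2)
    h_b = ctx.from_long(3)
    h_sum = add_ints(ctx, h_a, h_b)
    print(type(ctx).__name__, ctx.as_long(h_sum))
    for handle in (h_a, h_b, h_sum):
        ctx.close(handle)
```

<p>The indirect variant does a dictionary lookup on every access, which mirrors in spirit the small performance penalty described above.</p>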
<p>This setup gives several benefits:</p>
<ul class="simple">
<li>Extension developers can use the extra debug features given by the
Universal ABI with no need to use a special debug version of Python.</li>
<li>Projects which need the maximum level of performance can compile their
extension for each relevant version of CPython, as they are doing now.</li>
<li>Projects for which runtime speed is less important will have the choice of
distributing a single binary which will work on any version and
implementation of Python.</li>
</ul>
</div>
<div class="section" id="a-simple-example">
<h2>A simple example</h2>
<p>The HPy repo contains a <a class="reference external" href="https://github.com/pyhandle/hpy/blob/master/proof-of-concept/pof.c">proof of concept</a> module. Here is a simplified
version which illustrates what an HPy module looks like:</p>
<pre class="code C literal-block">
<span class="comment preproc">#include</span> <span class="comment preprocfile">"hpy.h"</span><span class="comment preproc">
</span>
<span class="name">HPy_DEF_METH_VARARGS</span><span class="punctuation">(</span><span class="name">add_ints</span><span class="punctuation">)</span>
<span class="keyword">static</span> <span class="name">HPy</span> <span class="name">add_ints_impl</span><span class="punctuation">(</span><span class="name">HPyContext</span> <span class="name">ctx</span><span class="punctuation">,</span> <span class="name">HPy</span> <span class="name">self</span><span class="punctuation">,</span> <span class="name">HPy</span> <span class="operator">*</span><span class="name">args</span><span class="punctuation">,</span> <span class="name">HPy_ssize_t</span> <span class="name">nargs</span><span class="punctuation">)</span>
<span class="punctuation">{</span>
<span class="keyword type">long</span> <span class="name">a</span><span class="punctuation">,</span> <span class="name">b</span><span class="punctuation">;</span>
<span class="keyword">if</span> <span class="punctuation">(</span><span class="operator">!</span><span class="name">HPyArg_Parse</span><span class="punctuation">(</span><span class="name">ctx</span><span class="punctuation">,</span> <span class="name">args</span><span class="punctuation">,</span> <span class="name">nargs</span><span class="punctuation">,</span> <span class="literal string">"ll"</span><span class="punctuation">,</span> <span class="operator">&</span><span class="name">a</span><span class="punctuation">,</span> <span class="operator">&</span><span class="name">b</span><span class="punctuation">))</span>
<span class="keyword">return</span> <span class="name">HPy_NULL</span><span class="punctuation">;</span>
<span class="keyword">return</span> <span class="name function">HPyLong_FromLong</span><span class="punctuation">(</span><span class="name">ctx</span><span class="punctuation">,</span> <span class="name">a</span><span class="operator">+</span><span class="name">b</span><span class="punctuation">);</span>
<span class="punctuation">}</span>
<span class="keyword">static</span> <span class="name">HPyMethodDef</span> <span class="name">PofMethods</span><span class="punctuation">[]</span> <span class="operator">=</span> <span class="punctuation">{</span>
<span class="punctuation">{</span><span class="literal string">"add_ints"</span><span class="punctuation">,</span> <span class="name">add_ints</span><span class="punctuation">,</span> <span class="name">HPy_METH_VARARGS</span><span class="punctuation">,</span> <span class="literal string">""</span><span class="punctuation">},</span>
<span class="punctuation">{</span><span class="name builtin">NULL</span><span class="punctuation">,</span> <span class="name builtin">NULL</span><span class="punctuation">,</span> <span class="literal number integer">0</span><span class="punctuation">,</span> <span class="name builtin">NULL</span><span class="punctuation">}</span>
<span class="punctuation">};</span>
<span class="keyword">static</span> <span class="name">HPyModuleDef</span> <span class="name">moduledef</span> <span class="operator">=</span> <span class="punctuation">{</span>
<span class="name">HPyModuleDef_HEAD_INIT</span><span class="punctuation">,</span>
<span class="punctuation">.</span><span class="name">m_name</span> <span class="operator">=</span> <span class="literal string">"pof"</span><span class="punctuation">,</span>
<span class="punctuation">.</span><span class="name">m_doc</span> <span class="operator">=</span> <span class="literal string">"HPy Proof of Concept"</span><span class="punctuation">,</span>
<span class="punctuation">.</span><span class="name">m_size</span> <span class="operator">=</span> <span class="operator">-</span><span class="literal number integer">1</span><span class="punctuation">,</span>
<span class="punctuation">.</span><span class="name">m_methods</span> <span class="operator">=</span> <span class="name">PofMethods</span>
<span class="punctuation">};</span>
<span class="name">HPy_MODINIT</span><span class="punctuation">(</span><span class="name">pof</span><span class="punctuation">)</span>
<span class="keyword">static</span> <span class="name">HPy</span> <span class="name">init_pof_impl</span><span class="punctuation">(</span><span class="name">HPyContext</span> <span class="name">ctx</span><span class="punctuation">)</span>
<span class="punctuation">{</span>
<span class="name">HPy</span> <span class="name">m</span><span class="punctuation">;</span>
<span class="name">m</span> <span class="operator">=</span> <span class="name">HPyModule_Create</span><span class="punctuation">(</span><span class="name">ctx</span><span class="punctuation">,</span> <span class="operator">&</span><span class="name">moduledef</span><span class="punctuation">);</span>
<span class="keyword">if</span> <span class="punctuation">(</span><span class="name">HPy_IsNull</span><span class="punctuation">(</span><span class="name">m</span><span class="punctuation">))</span>
<span class="keyword">return</span> <span class="name">HPy_NULL</span><span class="punctuation">;</span>
<span class="keyword">return</span> <span class="name">m</span><span class="punctuation">;</span>
<span class="punctuation">}</span>
</pre>
<p>People who are familiar with the current C-API will surely notice many
similarities. The biggest differences are:</p>
<ul class="simple">
<li>Instead of <tt class="docutils literal">PyObject *</tt>, objects have the type <tt class="docutils literal">HPy</tt>, which as
explained above represents a handle.</li>
<li>You need to explicitly pass an <tt class="docutils literal">HPyContext</tt> around: the intent is
primarily to be future-proof and make it easier to implement things like
sub-interpreters.</li>
<li><tt class="docutils literal">HPy_METH_VARARGS</tt> is implemented differently than CPython's
<tt class="docutils literal">METH_VARARGS</tt>: in particular, these methods receive an array of <tt class="docutils literal">HPy</tt>
and its length, instead of a fully constructed tuple: passing a tuple
makes sense on CPython where you have it anyway, but it might be an
unnecessary burden for alternate implementations. Note that this is
similar to the new <a class="reference external" href="https://www.python.org/dev/peps/pep-0580/">METH_FASTCALL</a> which was introduced in CPython.</li>
<li>HPy relies a lot on C macros, which most of the time are needed to support
the HPy/CPython ABI compilation mode. For example, <tt class="docutils literal">HPy_DEF_METH_VARARGS</tt>
expands into a trampoline which has the correct C signature that CPython
expects (i.e., <tt class="docutils literal">PyObject <span class="pre">*(*)(PyObject</span> *self, PyObject *args)</tt>) and
which calls <tt class="docutils literal">add_ints_impl</tt>.</li>
</ul>
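<p>To make the trampoline idea concrete, here is a toy model written in plain Python rather than C. It is only an illustration: <tt class="docutils literal">ToyContext</tt> and <tt class="docutils literal">def_meth_varargs</tt> are hypothetical names invented for this sketch and are not part of HPy, and real handles, <tt class="docutils literal">HPy_NULL</tt> error reporting and argument parsing are left out. The wrapper accepts the CPython-style <tt class="docutils literal">(self, args)</tt> call, where <tt class="docutils literal">args</tt> is a tuple, and forwards it to an implementation that receives a context, a sequence of arguments and its length:</p>

```python
class ToyContext:
    """Stand-in for HPyContext; hypothetical, for illustration only."""


def def_meth_varargs(impl):
    """Wrap impl(ctx, self, args, nargs) in a CPython-style (self, args) callable."""
    ctx = ToyContext()

    def trampoline(self, args):
        # CPython hands over one tuple; the HPy-style implementation
        # receives a sequence of arguments plus its length.
        return impl(ctx, self, list(args), len(args))

    return trampoline


def add_ints_impl(ctx, self, args, nargs):
    # Plays the role of add_ints_impl in the C example above.
    if nargs != 2:
        raise TypeError("add_ints expects exactly two arguments")
    a, b = args
    return a + b


# Plays the role of HPy_DEF_METH_VARARGS(add_ints).
add_ints = def_meth_varargs(add_ints_impl)
print(add_ints(None, (2, 3)))
```

<p>The caller only ever sees the tuple-based signature, while the implementation never has to know that a tuple existed. That separation is what the macro provides in the HPy/CPython ABI mode.</p>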
</div>
<div class="section" id="sprint-report-and-current-status">
<h2>Sprint report and current status</h2>
<p>After this long preamble, here is a rough list of what we accomplished during
the week-long sprint and the days immediately after.</p>
<p>On the HPy side, we kicked off the code in the repo: at the moment of writing
the layout of the directories is a bit messy because we moved things around
several times, but we identified several main sections:</p>
<ol class="arabic">
<li><p class="first">A specification of the API which serves both as documentation and as an
input for parts of the project which are automatically
generated. Currently, this lives in <a class="reference external" href="https://github.com/pyhandle/hpy/blob/9aa8a2738af3fd2eda69d4773b319d10a9a5373f/tools/public_api.h">public_api.h</a>.</p>
</li>
<li><p class="first">A set of header files which can be used to compile extension modules:
depending on whether the flag <tt class="docutils literal"><span class="pre">-DHPY_UNIVERSAL_ABI</span></tt> is passed to the
compiler, the extension can target the <a class="reference external" href="https://github.com/pyhandle/hpy/blob/9aa8a2738af3fd2eda69d4773b319d10a9a5373f/hpy-api/hpy_devel/include/cpython/hpy.h">HPy/CPython ABI</a> or the <a class="reference external" href="https://github.com/pyhandle/hpy/blob/9aa8a2738af3fd2eda69d4773b319d10a9a5373f/hpy-api/hpy_devel/include/universal/hpy.h">HPy
Universal ABI</a></p>
</li>
<li><p class="first">A <a class="reference external" href="https://github.com/pyhandle/hpy/tree/9aa8a2738af3fd2eda69d4773b319d10a9a5373f/cpython-universal/src">CPython extension module</a> called <tt class="docutils literal">hpy_universal</tt> which makes it
possible to import universal modules on CPython</p>
</li>
<li><p class="first">A set of <a class="reference external" href="https://github.com/pyhandle/hpy/tree/9aa8a2738af3fd2eda69d4773b319d10a9a5373f/test">tests</a> which are independent of the implementation and are meant
to be an "executable specification" of the semantics. Currently, these
tests are run against three different implementations of the HPy API:</p>
<ul class="simple">
<li>the headers which implement the "HPy/CPython ABI"</li>
<li>the <tt class="docutils literal">hpy_universal</tt> module for CPython</li>
<li>the <tt class="docutils literal">hpy_universal</tt> module for PyPy (these tests are run in the PyPy repo)</li>
</ul>
</li>
</ol>
<p>Moreover, we started a <a class="reference external" href="https://bitbucket.org/pypy/pypy/src/hpy/pypy/module/hpy_universal/">PyPy branch</a> in which to implement the
<tt class="docutils literal">hpy_universal</tt> module: at the moment of writing PyPy can pass all the HPy
tests apart from the ones which allow conversion to and from <tt class="docutils literal">PyObject *</tt>.
Among other things, this means that it is already possible to load the
very same binary module in both CPython and PyPy, which is impressive on its
own :).</p>
<p>Finally, we wanted a real-life use case to show how to port a module to HPy
and to do benchmarks. After some searching, we chose <a class="reference external" href="https://github.com/esnme/ultrajson">ultrajson</a>, for the
following reasons:</p>
<ul class="simple">
<li>it is a real-world extension module which was written with performance in
mind</li>
<li>when parsing a JSON file it does a lot of calls to the Python API to
construct the various parts of the result message</li>
<li>it uses only a small subset of the Python API</li>
</ul>
<p>This repo contains the <a class="reference external" href="https://github.com/pyhandle/ultrajson-hpy">HPy port of ultrajson</a>. This <a class="reference external" href="https://github.com/pyhandle/ultrajson-hpy/commit/efb35807afa8cf57db5df6a3dfd4b64c289fe907">commit</a> shows an example
of what the porting looks like.</p>
<p><tt class="docutils literal">ujson_hpy</tt> is also a very good example of incremental migration: so far
only <tt class="docutils literal">ujson.loads</tt> is implemented using the HPy API, while <tt class="docutils literal">ujson.dumps</tt>
is still implemented using the old C-API, and both can coexist nicely in the
same compiled module.</p>
</div>
<div class="section" id="benchmarks">
<h2>Benchmarks</h2>
<p>Once we had a fully working <tt class="docutils literal">ujson_hpy</tt> module, we could finally run
benchmarks! We tested several different versions of the module:</p>
<ul class="simple">
<li><tt class="docutils literal">ujson</tt>: this is the vanilla implementation of ultrajson using the
C-API. On PyPy this is executed by the infamous <tt class="docutils literal">cpyext</tt> compatibility
layer, so we expect it to be much slower than on CPython</li>
<li><tt class="docutils literal">ujson_hpy</tt>: our HPy port compiled to target the HPy/CPython ABI. We
expect it to be as fast as <tt class="docutils literal">ujson</tt></li>
<li><tt class="docutils literal">ujson_hpy_universal</tt>: same as above but compiled to target the
Universal HPy ABI. We expect it to be slightly slower than <tt class="docutils literal">ujson</tt> on
CPython, and much faster on PyPy.</li>
</ul>
<p>Finally, we also ran the benchmark using the builtin <tt class="docutils literal">json</tt> module. This is
not really relevant to HPy, but it might still be interesting as a
reference data point.</p>
<p>The <a class="reference external" href="https://github.com/pyhandle/ultrajson-hpy/blob/hpy/benchmark/main.py">benchmark</a> is very simple and consists of parsing a <a class="reference external" href="https://github.com/pyhandle/ultrajson-hpy/blob/hpy/benchmark/download_data.sh">big JSON file</a> 100
times. Here is the average time per iteration (in milliseconds) using the
various versions of the module, CPython 3.7 and the latest version of the hpy
PyPy branch:</p>
<table border="1" class="docutils">
<colgroup>
<col width="55%" />
<col width="24%" />
<col width="21%" />
</colgroup>
<tbody valign="top">
<tr><td> </td>
<td>CPython</td>
<td>PyPy</td>
</tr>
<tr><td>ujson</td>
<td>154.32</td>
<td>633.97</td>
</tr>
<tr><td>ujson_hpy</td>
<td>152.19</td>
<td> </td>
</tr>
<tr><td>ujson_hpy_universal</td>
<td>168.78</td>
<td>207.68</td>
</tr>
<tr><td>json</td>
<td>224.59</td>
<td>135.43</td>
</tr>
</tbody>
</table>
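<p>For readers who want to reproduce this kind of measurement, the following is a minimal sketch of such a timing loop, not the actual benchmark script linked above. The helper name <tt class="docutils literal">average_ms</tt> and the small synthetic document are assumptions made for this example. Only the standard-library <tt class="docutils literal">json</tt> module is timed here, but any other <tt class="docutils literal">loads</tt> function could be passed in instead:</p>

```python
import json
import time


def average_ms(loads, text, iterations=100):
    """Return the mean wall-clock time of loads(text), in milliseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        loads(text)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iterations


# Small synthetic document; the benchmark above parses a much larger file.
text = json.dumps(
    [{"id": i, "name": "item%d" % i, "tags": ["a", "b"]} for i in range(1000)]
)
print("json: %.2f ms per iteration" % average_ms(json.loads, text, iterations=20))
```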
<p>As expected, the benchmark shows that when targeting the HPy/CPython ABI, HPy
doesn't impose any performance penalty on CPython. The universal version is
~10% slower on CPython, but gives an impressive 3x speedup on PyPy! It is
worth noting that the PyPy hpy module is not fully optimized yet, and we
expect to be able to reach the same performance as CPython for this particular
example (or even more, thanks to our better GC).</p>
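<p>The ratios quoted above can be rechecked directly from the table. The snippet below is just that arithmetic: the dictionary layout and variable names are chosen for this example, while the numbers are copied from the table.</p>

```python
# Average time per iteration in milliseconds, copied from the table.
timings_ms = {
    ("ujson", "CPython"): 154.32,
    ("ujson", "PyPy"): 633.97,
    ("ujson_hpy", "CPython"): 152.19,
    ("ujson_hpy_universal", "CPython"): 168.78,
    ("ujson_hpy_universal", "PyPy"): 207.68,
}

# Universal ABI compared with the HPy/CPython ABI and with plain ujson, on CPython.
universal_vs_hpy = (
    timings_ms[("ujson_hpy_universal", "CPython")] / timings_ms[("ujson_hpy", "CPython")]
)
universal_vs_ujson = (
    timings_ms[("ujson_hpy_universal", "CPython")] / timings_ms[("ujson", "CPython")]
)

# cpyext-based ujson compared with the universal HPy build, on PyPy.
pypy_speedup = timings_ms[("ujson", "PyPy")] / timings_ms[("ujson_hpy_universal", "PyPy")]

print("universal vs ujson_hpy on CPython: %.2fx" % universal_vs_hpy)
print("universal vs ujson on CPython: %.2fx" % universal_vs_ujson)
print("ujson (cpyext) vs universal on PyPy: %.2fx" % pypy_speedup)
```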
<p>All in all, not a bad result for two weeks of intense hacking :)</p>
<p>It is also worth noting that PyPy's builtin <tt class="docutils literal">json</tt> module does <strong>really</strong>
well in this benchmark, thanks to the recent optimizations that were described
in an <a class="reference external" href="https://morepypy.blogspot.com/2019/10/pypys-new-json-parser.html">earlier blog post</a>.</p>
</div>
<div class="section" id="conclusion-and-future-directions">
<h2>Conclusion and future directions</h2>
<p>We think we can be very satisfied about what we have got so far. The
development of HPy is quite new, but these early results seem to indicate that
we are on the right track to bring Python extensions into the future.</p>
<p>At the moment, we can anticipate some of the next steps in the development of
HPy:</p>
<ul class="simple">
<li>Think about a proper API design: what we have done so far has
been a "dumb" translation of the API we needed to run <tt class="docutils literal">ujson</tt>. However,
one of the declared goals of HPy is to improve the design of the API. There
will be a trade-off between the desire of having a clean, fresh new API
and the need to be not too different than the old one, to make porting
easier. Finding the sweet spot will not be easy!</li>
<li>Implement the "debug" mode, which will help developers to find
bugs such as leaking handles or using invalid handles.</li>
<li>Instruct Cython to emit HPy code on request.</li>
<li>Eventually, we will also want to try to port parts of <tt class="docutils literal">numpy</tt> to HPy to
finally solve the long-standing problem of sub-optimal <tt class="docutils literal">numpy</tt>
performance in PyPy.</li>
</ul>
<p>Stay tuned!</p>
</div>Antonio Cunihttp://www.blogger.com/profile/17017456817083804792noreply@blogger.com7tag:blogger.com,1999:blog-3971202189709462152.post-10904065567263134952019-10-14T19:46:00.000+02:002019-10-14T19:46:10.479+02:00PyPy v7.2 released<div dir="ltr" style="text-align: left;" trbidi="on">
The PyPy team is proud to release the version 7.2.0 of PyPy, which includes
two different interpreters:<br />
<ul style="text-align: left;">
<li>PyPy2.7, which is an interpreter supporting the syntax and the features of
Python 2.7, including the stdlib for CPython 2.7.13.</li>
</ul>
<ul style="text-align: left;">
<li>PyPy3.6, which is an interpreter supporting the syntax and the features of
Python 3.6, including the stdlib for CPython 3.6.9.</li>
</ul>
The interpreters are based on much the same codebase, thus the double
release.<br />
<br />
As always, this release is 100% compatible with the previous one and fixes
several issues and bugs raised by the growing community of PyPy users.
We strongly recommend updating. Many of the fixes are the direct result of
end-user bug reports, so please continue reporting issues as they crop up.<br />
<br />
You can download the v7.2 releases here:<br />
<blockquote>
<div>
<a class="reference external" href="http://pypy.org/download.html">http://pypy.org/download.html</a></div>
</blockquote>
With the support of Arm Holdings Ltd. and <a class="reference external" href="https://crossbario.com/">Crossbar.io</a>, this release supports
the 64-bit <code class="docutils literal notranslate"><span class="pre">aarch64</span></code> ARM architecture. More about the work and the
performance data around this welcome development can be found in the <a class="reference external" href="https://morepypy.blogspot.com/2019/07/pypy-jit-for-aarch64.html">blog
post</a>.<br />
<br />
This release removes the “beta” tag from PyPy3.6. While there may still be some
small corner-case incompatibilities (around the exact error messages in
exceptions and the handling of faulty codec errorhandlers) we are happy with
the quality of the 3.6 series and are looking forward to working on a Python
3.7 interpreter.<br />
<br />
We updated our benchmark runner at <a class="reference external" href="https://speed.pypy.org/">https://speed.pypy.org</a> to a more modern
machine and updated the baseline python to CPython 2.7.11. Thanks to <a class="reference external" href="https://baroquesoftware.com/">Baroque
Software</a> for maintaining the benchmark runner.<br />
<br />
The CFFI-based <code class="docutils literal notranslate"><span class="pre">_ssl</span></code> module was backported to PyPy2.7 and updated to use
<a class="reference external" href="https://cryptography.io/en/latest">cryptography</a> version 2.7. Additionally, the <code class="docutils literal notranslate"><span class="pre">_hashlib</span></code> and <code class="docutils literal notranslate"><span class="pre">crypt</span></code> (or
<code class="docutils literal notranslate"><span class="pre">_crypt</span></code> on Python3) modules were converted to CFFI. This has two
consequences: end users and packagers can more easily update these libraries
for their platform by executing <code class="docutils literal notranslate"><span class="pre">(cd</span> <span class="pre">lib_pypy;</span> <span class="pre">../bin/pypy</span> <span class="pre">_*_build.py)</span></code>.
More significantly, since PyPy itself links to fewer system shared objects
(DLLs), on platforms with a single runtime namespace like linux, different CFFI
and c-extension modules can load different versions of the same shared object
into PyPy without collision (<a class="reference external" href="https://bitbucket.com/pypy/pypy/issues/2617">issue 2617</a>).<br />
<br />
Until downstream providers begin to distribute c-extension builds with PyPy, we
have made some common packages <a class="reference external" href="https://github.com/antocuni/pypy-wheels">available as wheels</a>.<br />
<br />
The <a class="reference external" href="http://cffi.readthedocs.io/">CFFI</a> backend has been updated to version 1.13.0. We recommend using CFFI
rather than c-extensions to interact with C, and <a class="reference external" href="https://cppyy.readthedocs.io/">cppyy</a> for interacting with
C++ code.<br />
<br />
Thanks to <a class="reference external" href="https://anvil.works/">Anvil</a>, we revived the <a class="reference external" href="https://morepypy.blogspot.com/2019/08">PyPy Sandbox</a> (soon to be released), which allows total control
over a Python interpreter’s interactions with the external world.<br />
<br />
We implemented a new JSON decoder that is much faster, uses less memory, and
uses a JIT-friendly specialized dictionary. More about that in the recent <a href="https://morepypy.blogspot.com/2019/10/pypys-new-json-parser.html">blog post</a>.<br />
<br />
We would like to thank our donors for the continued support of the PyPy
project. If PyPy is not quite good enough for your needs, we are available for
direct consulting work.
<br />
We would also like to thank our contributors and encourage new people to join
the project. PyPy has many layers and we need help with all of them: <a class="reference external" href="https://pypy.readthedocs.io/en/latest/index.html">PyPy</a>
and <a class="reference external" href="https://rpython.readthedocs.org/">RPython</a> documentation improvements, tweaking popular modules to run
on PyPy, or general <a class="reference external" href="https://pypy.readthedocs.io/en/latest/project-ideas.html">help</a> with making RPython’s JIT even better. Since the
previous release, we have accepted contributions from 27 new contributors,
so thanks for pitching in.<br />
<br />
<div class="section" id="what-is-pypy">
<h2 style="text-align: center;">
<span style="font-size: x-large;">What is PyPy?</span></h2>
PyPy is a very compliant Python interpreter, almost a drop-in replacement for
CPython 2.7 and 3.6. It’s fast (<a class="reference external" href="http://speed.pypy.org/">PyPy and CPython 2.7.x</a> performance
comparison) due to its integrated tracing JIT compiler.<br />
<br />
We also welcome developers of other <a class="reference external" href="http://rpython.readthedocs.io/en/latest/examples.html">dynamic languages</a> to see what RPython
can do for them.<br />
<br />
This PyPy release supports:<br />
<ul style="text-align: left;">
<li><strong>x86</strong> machines on most common operating systems
(Linux 32/64 bit, Mac OS X 64-bit, Windows 32-bit, OpenBSD, FreeBSD)</li>
</ul>
</div>
<div class="section" id="what-is-pypy">
<ul style="text-align: left;">
<li>big- and little-endian variants of <strong>PPC64</strong> running Linux<strong> </strong></li>
</ul>
</div>
<div class="section" id="what-is-pypy">
<ul style="text-align: left;">
<li><strong>s390x</strong> running Linux</li>
</ul>
</div>
<div class="section" id="what-is-pypy">
<ul style="text-align: left;">
<li>64-bit <strong>ARM</strong> machines running Linux</li>
</ul>
Unfortunately at the moment of writing our ARM buildbots are out of service,
so for now we are <strong>not</strong> releasing any binary for the ARM architecture (32-bit), although PyPy does support ARM 32-bit processors.<br />
<br />
<div class="section" id="changelog">
<h2 style="text-align: center;">
<span style="font-size: x-large;">What else is new?</span></h2>
PyPy 7.1 was released in March, 2019.
There are many incremental improvements to RPython and PyPy. For more information about the 7.2.0 release, see the full <a href="https://pypy.readthedocs.io/en/latest/release-v7.2.0.html">changelog</a>.<br />
<br />
Please update, and continue to help us make PyPy better.<br />
<br />
Cheers,<br />
The PyPy team
</div>
<br />
<br />
</div>
</div>
mattiphttp://www.blogger.com/profile/07336549270776418081noreply@blogger.com0tag:blogger.com,1999:blog-3971202189709462152.post-4929117240843055012019-10-08T13:37:00.000+02:002019-10-08T21:19:28.431+02:00PyPy's new JSON parser<h2>
Introduction</h2>
In the last year or two I have worked on and off on making PyPy's
<a class="reference external" href="https://www.json.org/">JSON</a> parser faster, particularly when parsing large
JSON files. In this post I am going to document those techniques and
measure their performance impact. Note that I am quite a lot more
constrained in what optimizations I can apply here, compared to some of
the much more advanced approaches like
<a class="reference external" href="https://www.microsoft.com/en-us/research/publication/mison-fast-json-parser-data-analytics/">Mison</a>,
<a class="reference external" href="https://dawn.cs.stanford.edu/2018/08/07/sparser/">Sparser</a> or
<a class="reference external" href="https://arxiv.org/abs/1902.08318">SimdJSON</a> because I don't want to
change the <tt class="docutils literal">json.loads</tt> API that Python programs expect, and because I
don't want to only support CPUs with wide SIMD extensions. With a more
expressive API, more optimizations would be possible.<br />
There are a number of problems with working with huge JSON files:
deserialization takes a long time on the one hand, and the resulting
data structures often take a lot of memory (usually they can be many
times bigger than the size of the file they originated from). Of course
these problems are related, because allocating and initializing a big
data structure takes longer than a smaller data structure. Therefore I
always tried to attack both of these problems at the same time.<br />
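As a rough illustration of the memory problem, the following sketch compares the size of a JSON document with an estimate of the size of the structure it deserializes to. It assumes CPython, where <tt class="docutils literal">sys.getsizeof</tt> reports per-object sizes; <tt class="docutils literal">deep_sizeof</tt> is a hypothetical helper written for this example, and the numbers it prints will vary between versions and platforms:<br />

```python
import json
import sys


def deep_sizeof(obj, seen=None):
    """Estimate the total size in bytes of obj and everything it references."""
    seen = set() if seen is None else seen
    if id(obj) in seen:
        return 0  # count shared objects (e.g. repeated key strings) only once
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_sizeof(k, seen) + deep_sizeof(v, seen) for k, v in obj.items())
    elif isinstance(obj, (list, tuple)):
        size += sum(deep_sizeof(item, seen) for item in obj)
    return size


# Synthetic document: many small objects that all have the same keys.
text = json.dumps([{"x": i, "y": i * 2.5, "label": "p%d" % i} for i in range(10000)])
data = json.loads(text)

print("characters of JSON text:", len(text))
print("estimated bytes in memory:", deep_sizeof(data))
```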
One common theme of the techniques I am describing is that of optimizing
the parser for how JSON files are typically used, not how they could
theoretically be used. This is a similar approach to the way dynamic
languages are optimized more generally: most JITs will optimize for
typical patterns of usage, at the cost of less common usage patterns,
which might even become slower as a result of the optimizations.<br />
<h2>
Maps</h2>
The first technique I investigated is to use maps in the JSON parser.
Maps, also called hidden classes or shapes, are a fairly common way to
(generally, not just in the context of JSON parsing) <a class="reference external" href="https://morepypy.blogspot.com/2010/11/efficiently-implementing-python-objects.html">optimize instances
of
classes</a>
in dynamic language VMs. Maps exploit the fact that while it is in
theory possible to add arbitrary fields to an instance, in practice most
instances of a class are going to have the same set of fields (or one of
a small number of different sets). Since JSON dictionaries or objects
often come from serialized instances of some kind, this property often
holds in JSON files as well: dictionaries often have the same fields in
the same order, within a JSON file.<br />
This property can be exploited in two ways: on the one hand, it can be
used to store the deserialized dictionaries in a more memory
efficient way by not using a hashmap in most cases, but instead
splitting the dictionary into a shared description of the set of keys
(the map) and an array of storage with the values. This makes the
deserialized dictionaries smaller if the same set of keys is repeated a
lot. This is completely transparent to the Python programmer: the
dictionary will look completely normal to the Python program but its
internal representation is different.<br />
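The split between a shared map and a per-dictionary storage array can be sketched in plain Python. This is only a model of the idea, not the RPython code in PyPy: the class names <tt class="docutils literal">Map</tt> and <tt class="docutils literal">MapDict</tt> are made up for this example, the wrapper is read-only, and duplicate keys inside one JSON object are not handled specially. It uses the standard <tt class="docutils literal">object_pairs_hook</tt> argument of <tt class="docutils literal">json.loads</tt> to intercept every parsed object:<br />

```python
import json


class Map:
    """Shared description of an ordered set of keys."""

    _cache = {}

    def __init__(self, keys):
        self.keys = keys
        self.index = {key: i for i, key in enumerate(keys)}

    @classmethod
    def for_keys(cls, keys):
        # Objects with the same keys in the same order share one Map.
        keys = tuple(keys)
        if keys not in cls._cache:
            cls._cache[keys] = cls(keys)
        return cls._cache[keys]


class MapDict:
    """Read-only dict-like object: a shared Map plus a per-instance value array."""

    __slots__ = ("map", "storage")

    def __init__(self, pairs):
        pairs = list(pairs)
        self.map = Map.for_keys(key for key, _ in pairs)
        self.storage = [value for _, value in pairs]

    def __getitem__(self, key):
        return self.storage[self.map.index[key]]

    def keys(self):
        return self.map.keys


text = '[{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 5, "b": 6}]'
data = json.loads(text, object_pairs_hook=MapDict)
print(data[1]["b"], data[0].map is data[2].map)
```

In this model every object with the keys <tt class="docutils literal">"a"</tt> and <tt class="docutils literal">"b"</tt> stores only a two-element list, while the key names and their positions live once in the shared <tt class="docutils literal">Map</tt>.<br />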
One downside of using maps is that sometimes files will contain many
dictionaries that have unique key sets. Since maps themselves are quite
large data structures and since dictionaries that use maps contain an
extra level of indirection we want to fall back to using normal hashmaps
to represent the dictionaries where that is the case. To detect such cases, we
collect some statistics at runtime on how often every map (i.e. set of
keys) is used in the file. For uncommonly used maps, the map is
discarded and the dictionaries that used the map are converted into using a
regular hashmap.<br />
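The fallback can be modelled in a few lines of Python as well. The sketch below makes several simplifying assumptions: it counts key-set usage while parsing and converts rarely used shapes in a separate pass afterwards, whereas a real parser could make that decision during parsing, and the names <tt class="docutils literal">SharedKeysDict</tt>, <tt class="docutils literal">loads_with_fallback</tt> and <tt class="docutils literal">min_uses</tt> are hypothetical:<br />

```python
import json
from collections import Counter


class SharedKeysDict:
    """Compact form: the keys tuple is shared, the values are per instance."""

    def __init__(self, keys, values):
        self.keys_tuple = keys
        self.values_list = values

    def __getitem__(self, key):
        return self.values_list[self.keys_tuple.index(key)]


def loads_with_fallback(text, min_uses=2):
    usage = Counter()  # how often each key set occurs in the document
    interned = {}      # one shared tuple object per distinct key set

    def hook(pairs):
        keys = tuple(key for key, _ in pairs)
        keys = interned.setdefault(keys, keys)
        usage[keys] += 1
        return SharedKeysDict(keys, [value for _, value in pairs])

    def finalize(obj):
        if isinstance(obj, list):
            return [finalize(item) for item in obj]
        if isinstance(obj, SharedKeysDict):
            values = [finalize(value) for value in obj.values_list]
            if usage[obj.keys_tuple] >= min_uses:
                obj.values_list = values
                return obj
            # Rare key set: sharing does not pay off, use a regular dict.
            return dict(zip(obj.keys_tuple, values))
        return obj

    return finalize(json.loads(text, object_pairs_hook=hook))


text = '[{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"unique": 5}]'
result = loads_with_fallback(text)
print([type(item).__name__ for item in result])
```

The threshold <tt class="docutils literal">min_uses</tt> stands in for whatever heuristic a real implementation applies; the value 2 is chosen only to keep the example small.<br />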
<h3>
Using Maps to Speed up Parsing</h3>
Another benefit of using maps to store deserialized dictionaries is that
we can use them to speed up the parsing process itself. To see how this
works, we need to understand maps a bit better. All the maps produced as
a side-effect of parsing JSON form a tree. The tree root is a map that
describes the object without any attributes. From every tree node we
have a number of edges going to other nodes, each edge for a specific
new attribute added:<br />
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="423pt" height="503pt" viewBox="0 0 623 503" version="1.1">
<defs>
<g>
<symbol overflow="visible" id="glyph0-0">
<path style="stroke:none;" d="M 0.640625 2.265625 L 0.640625 -9.015625 L 7.03125 -9.015625 L 7.03125 2.265625 Z M 1.359375 1.546875 L 6.328125 1.546875 L 6.328125 -8.296875 L 1.359375 -8.296875 Z M 1.359375 1.546875 "/>
</symbol>
<symbol overflow="visible" id="glyph0-1">
<path style="stroke:none;" d="M 7.1875 -3.78125 L 7.1875 -3.21875 L 1.90625 -3.21875 C 1.957031 -2.425781 2.195312 -1.820312 2.625 -1.40625 C 3.050781 -1 3.644531 -0.796875 4.40625 -0.796875 C 4.84375 -0.796875 5.269531 -0.847656 5.6875 -0.953125 C 6.101562 -1.066406 6.515625 -1.226562 6.921875 -1.4375 L 6.921875 -0.359375 C 6.515625 -0.179688 6.09375 -0.046875 5.65625 0.046875 C 5.21875 0.140625 4.78125 0.1875 4.34375 0.1875 C 3.21875 0.1875 2.328125 -0.132812 1.671875 -0.78125 C 1.023438 -1.4375 0.703125 -2.320312 0.703125 -3.4375 C 0.703125 -4.582031 1.007812 -5.488281 1.625 -6.15625 C 2.25 -6.832031 3.085938 -7.171875 4.140625 -7.171875 C 5.078125 -7.171875 5.816406 -6.863281 6.359375 -6.25 C 6.910156 -5.644531 7.1875 -4.820312 7.1875 -3.78125 Z M 6.046875 -4.125 C 6.035156 -4.75 5.859375 -5.25 5.515625 -5.625 C 5.171875 -6 4.71875 -6.1875 4.15625 -6.1875 C 3.507812 -6.1875 2.992188 -6.003906 2.609375 -5.640625 C 2.222656 -5.285156 2 -4.78125 1.9375 -4.125 Z M 6.046875 -4.125 "/>
</symbol>
<symbol overflow="visible" id="glyph0-2">
<path style="stroke:none;" d="M 6.65625 -5.65625 C 6.9375 -6.164062 7.273438 -6.546875 7.671875 -6.796875 C 8.078125 -7.046875 8.550781 -7.171875 9.09375 -7.171875 C 9.820312 -7.171875 10.382812 -6.914062 10.78125 -6.40625 C 11.175781 -5.894531 11.375 -5.164062 11.375 -4.21875 L 11.375 0 L 10.21875 0 L 10.21875 -4.1875 C 10.21875 -4.851562 10.097656 -5.347656 9.859375 -5.671875 C 9.628906 -6.003906 9.269531 -6.171875 8.78125 -6.171875 C 8.1875 -6.171875 7.710938 -5.972656 7.359375 -5.578125 C 7.015625 -5.179688 6.84375 -4.640625 6.84375 -3.953125 L 6.84375 0 L 5.6875 0 L 5.6875 -4.1875 C 5.6875 -4.863281 5.566406 -5.363281 5.328125 -5.6875 C 5.097656 -6.007812 4.734375 -6.171875 4.234375 -6.171875 C 3.648438 -6.171875 3.179688 -5.96875 2.828125 -5.5625 C 2.484375 -5.164062 2.3125 -4.628906 2.3125 -3.953125 L 2.3125 0 L 1.15625 0 L 1.15625 -7 L 2.3125 -7 L 2.3125 -5.90625 C 2.582031 -6.332031 2.898438 -6.648438 3.265625 -6.859375 C 3.628906 -7.066406 4.0625 -7.171875 4.5625 -7.171875 C 5.070312 -7.171875 5.503906 -7.039062 5.859375 -6.78125 C 6.222656 -6.519531 6.488281 -6.144531 6.65625 -5.65625 Z M 6.65625 -5.65625 "/>
</symbol>
<symbol overflow="visible" id="glyph0-3">
<path style="stroke:none;" d="M 2.3125 -1.046875 L 2.3125 2.65625 L 1.15625 2.65625 L 1.15625 -7 L 2.3125 -7 L 2.3125 -5.9375 C 2.5625 -6.351562 2.867188 -6.660156 3.234375 -6.859375 C 3.597656 -7.066406 4.039062 -7.171875 4.5625 -7.171875 C 5.40625 -7.171875 6.09375 -6.832031 6.625 -6.15625 C 7.15625 -5.476562 7.421875 -4.59375 7.421875 -3.5 C 7.421875 -2.394531 7.15625 -1.503906 6.625 -0.828125 C 6.09375 -0.148438 5.40625 0.1875 4.5625 0.1875 C 4.039062 0.1875 3.597656 0.0859375 3.234375 -0.109375 C 2.867188 -0.316406 2.5625 -0.628906 2.3125 -1.046875 Z M 6.234375 -3.5 C 6.234375 -4.34375 6.054688 -5.003906 5.703125 -5.484375 C 5.359375 -5.960938 4.882812 -6.203125 4.28125 -6.203125 C 3.664062 -6.203125 3.179688 -5.960938 2.828125 -5.484375 C 2.484375 -5.003906 2.3125 -4.34375 2.3125 -3.5 C 2.3125 -2.644531 2.484375 -1.976562 2.828125 -1.5 C 3.179688 -1.019531 3.664062 -0.78125 4.28125 -0.78125 C 4.882812 -0.78125 5.359375 -1.019531 5.703125 -1.5 C 6.054688 -1.976562 6.234375 -2.644531 6.234375 -3.5 Z M 6.234375 -3.5 "/>
</symbol>
<symbol overflow="visible" id="glyph0-4">
<path style="stroke:none;" d="M 2.34375 -8.984375 L 2.34375 -7 L 4.71875 -7 L 4.71875 -6.109375 L 2.34375 -6.109375 L 2.34375 -2.3125 C 2.34375 -1.738281 2.421875 -1.367188 2.578125 -1.203125 C 2.734375 -1.046875 3.050781 -0.96875 3.53125 -0.96875 L 4.71875 -0.96875 L 4.71875 0 L 3.53125 0 C 2.644531 0 2.03125 -0.164062 1.6875 -0.5 C 1.351562 -0.832031 1.1875 -1.4375 1.1875 -2.3125 L 1.1875 -6.109375 L 0.34375 -6.109375 L 0.34375 -7 L 1.1875 -7 L 1.1875 -8.984375 Z M 2.34375 -8.984375 "/>
</symbol>
<symbol overflow="visible" id="glyph0-5">
<path style="stroke:none;" d="M 4.125 0.65625 C 3.789062 1.488281 3.46875 2.03125 3.15625 2.28125 C 2.851562 2.53125 2.445312 2.65625 1.9375 2.65625 L 1.015625 2.65625 L 1.015625 1.703125 L 1.6875 1.703125 C 2 1.703125 2.242188 1.625 2.421875 1.46875 C 2.597656 1.320312 2.789062 0.96875 3 0.40625 L 3.21875 -0.109375 L 0.375 -7 L 1.59375 -7 L 3.78125 -1.53125 L 5.96875 -7 L 7.1875 -7 Z M 4.125 0.65625 "/>
</symbol>
<symbol overflow="visible" id="glyph0-6">
<path style="stroke:none;" d="M 4.390625 -3.515625 C 3.460938 -3.515625 2.816406 -3.40625 2.453125 -3.1875 C 2.097656 -2.976562 1.921875 -2.617188 1.921875 -2.109375 C 1.921875 -1.703125 2.050781 -1.378906 2.3125 -1.140625 C 2.582031 -0.898438 2.953125 -0.78125 3.421875 -0.78125 C 4.054688 -0.78125 4.566406 -1.003906 4.953125 -1.453125 C 5.335938 -1.910156 5.53125 -2.515625 5.53125 -3.265625 L 5.53125 -3.515625 Z M 6.671875 -4 L 6.671875 0 L 5.53125 0 L 5.53125 -1.0625 C 5.269531 -0.632812 4.941406 -0.316406 4.546875 -0.109375 C 4.160156 0.0859375 3.679688 0.1875 3.109375 0.1875 C 2.390625 0.1875 1.816406 -0.015625 1.390625 -0.421875 C 0.972656 -0.828125 0.765625 -1.363281 0.765625 -2.03125 C 0.765625 -2.820312 1.023438 -3.414062 1.546875 -3.8125 C 2.078125 -4.21875 2.867188 -4.421875 3.921875 -4.421875 L 5.53125 -4.421875 L 5.53125 -4.53125 C 5.53125 -5.0625 5.351562 -5.46875 5 -5.75 C 4.65625 -6.039062 4.171875 -6.1875 3.546875 -6.1875 C 3.140625 -6.1875 2.75 -6.140625 2.375 -6.046875 C 2 -5.953125 1.632812 -5.8125 1.28125 -5.625 L 1.28125 -6.671875 C 1.695312 -6.835938 2.101562 -6.960938 2.5 -7.046875 C 2.894531 -7.128906 3.28125 -7.171875 3.65625 -7.171875 C 4.675781 -7.171875 5.429688 -6.90625 5.921875 -6.375 C 6.421875 -5.851562 6.671875 -5.0625 6.671875 -4 Z M 6.671875 -4 "/>
</symbol>
<symbol overflow="visible" id="glyph0-7">
<path style="stroke:none;" d="M 5.8125 -5.9375 L 5.8125 -9.71875 L 6.953125 -9.71875 L 6.953125 0 L 5.8125 0 L 5.8125 -1.046875 C 5.570312 -0.628906 5.265625 -0.316406 4.890625 -0.109375 C 4.523438 0.0859375 4.082031 0.1875 3.5625 0.1875 C 2.71875 0.1875 2.03125 -0.148438 1.5 -0.828125 C 0.96875 -1.503906 0.703125 -2.394531 0.703125 -3.5 C 0.703125 -4.59375 0.96875 -5.476562 1.5 -6.15625 C 2.03125 -6.832031 2.71875 -7.171875 3.5625 -7.171875 C 4.082031 -7.171875 4.523438 -7.066406 4.890625 -6.859375 C 5.265625 -6.660156 5.570312 -6.351562 5.8125 -5.9375 Z M 1.890625 -3.5 C 1.890625 -2.644531 2.0625 -1.976562 2.40625 -1.5 C 2.757812 -1.019531 3.238281 -0.78125 3.84375 -0.78125 C 4.457031 -0.78125 4.9375 -1.019531 5.28125 -1.5 C 5.632812 -1.976562 5.8125 -2.644531 5.8125 -3.5 C 5.8125 -4.34375 5.632812 -5.003906 5.28125 -5.484375 C 4.9375 -5.960938 4.457031 -6.203125 3.84375 -6.203125 C 3.238281 -6.203125 2.757812 -5.960938 2.40625 -5.484375 C 2.0625 -5.003906 1.890625 -4.34375 1.890625 -3.5 Z M 1.890625 -3.5 "/>
</symbol>
<symbol overflow="visible" id="glyph0-8">
<path style="stroke:none;" d=""/>
</symbol>
<symbol overflow="visible" id="glyph0-9">
<path style="stroke:none;" d="M 1.5 -1.59375 L 2.8125 -1.59375 L 2.8125 -0.515625 L 1.796875 1.484375 L 0.984375 1.484375 L 1.5 -0.515625 Z M 1.5 -1.59375 "/>
</symbol>
<symbol overflow="visible" id="glyph0-10">
<path style="stroke:none;" d="M 6.234375 -3.5 C 6.234375 -4.34375 6.054688 -5.003906 5.703125 -5.484375 C 5.359375 -5.960938 4.882812 -6.203125 4.28125 -6.203125 C 3.664062 -6.203125 3.179688 -5.960938 2.828125 -5.484375 C 2.484375 -5.003906 2.3125 -4.34375 2.3125 -3.5 C 2.3125 -2.644531 2.484375 -1.976562 2.828125 -1.5 C 3.179688 -1.019531 3.664062 -0.78125 4.28125 -0.78125 C 4.882812 -0.78125 5.359375 -1.019531 5.703125 -1.5 C 6.054688 -1.976562 6.234375 -2.644531 6.234375 -3.5 Z M 2.3125 -5.9375 C 2.5625 -6.351562 2.867188 -6.660156 3.234375 -6.859375 C 3.597656 -7.066406 4.039062 -7.171875 4.5625 -7.171875 C 5.40625 -7.171875 6.09375 -6.832031 6.625 -6.15625 C 7.15625 -5.476562 7.421875 -4.59375 7.421875 -3.5 C 7.421875 -2.394531 7.15625 -1.503906 6.625 -0.828125 C 6.09375 -0.148438 5.40625 0.1875 4.5625 0.1875 C 4.039062 0.1875 3.597656 0.0859375 3.234375 -0.109375 C 2.867188 -0.316406 2.5625 -0.628906 2.3125 -1.046875 L 2.3125 0 L 1.15625 0 L 1.15625 -9.71875 L 2.3125 -9.71875 Z M 2.3125 -5.9375 "/>
</symbol>
<symbol overflow="visible" id="glyph0-11">
<path style="stroke:none;" d="M 6.25 -6.734375 L 6.25 -5.65625 C 5.914062 -5.832031 5.585938 -5.960938 5.265625 -6.046875 C 4.941406 -6.140625 4.613281 -6.1875 4.28125 -6.1875 C 3.53125 -6.1875 2.945312 -5.953125 2.53125 -5.484375 C 2.125 -5.015625 1.921875 -4.351562 1.921875 -3.5 C 1.921875 -2.644531 2.125 -1.976562 2.53125 -1.5 C 2.945312 -1.03125 3.53125 -0.796875 4.28125 -0.796875 C 4.613281 -0.796875 4.941406 -0.835938 5.265625 -0.921875 C 5.585938 -1.015625 5.914062 -1.148438 6.25 -1.328125 L 6.25 -0.265625 C 5.925781 -0.117188 5.59375 -0.0078125 5.25 0.0625 C 4.90625 0.144531 4.539062 0.1875 4.15625 0.1875 C 3.09375 0.1875 2.25 -0.144531 1.625 -0.8125 C 1.007812 -1.476562 0.703125 -2.375 0.703125 -3.5 C 0.703125 -4.632812 1.015625 -5.53125 1.640625 -6.1875 C 2.273438 -6.84375 3.132812 -7.171875 4.21875 -7.171875 C 4.570312 -7.171875 4.914062 -7.132812 5.25 -7.0625 C 5.59375 -6.988281 5.925781 -6.878906 6.25 -6.734375 Z M 6.25 -6.734375 "/>
</symbol>
<symbol overflow="visible" id="glyph0-12">
<path style="stroke:none;" d="M 4.75 -9.71875 L 4.75 -8.765625 L 3.65625 -8.765625 C 3.238281 -8.765625 2.945312 -8.679688 2.78125 -8.515625 C 2.625 -8.347656 2.546875 -8.046875 2.546875 -7.609375 L 2.546875 -7 L 4.4375 -7 L 4.4375 -6.109375 L 2.546875 -6.109375 L 2.546875 0 L 1.390625 0 L 1.390625 -6.109375 L 0.296875 -6.109375 L 0.296875 -7 L 1.390625 -7 L 1.390625 -7.484375 C 1.390625 -8.265625 1.570312 -8.832031 1.9375 -9.1875 C 2.300781 -9.539062 2.875 -9.71875 3.65625 -9.71875 Z M 4.75 -9.71875 "/>
</symbol>
<symbol overflow="visible" id="glyph0-13">
<path style="stroke:none;" d="M 7.015625 -7 L 4.5 -3.59375 L 7.15625 0 L 5.796875 0 L 3.765625 -2.75 L 1.71875 0 L 0.375 0 L 3.09375 -3.65625 L 0.59375 -7 L 1.953125 -7 L 3.8125 -4.5 L 5.671875 -7 Z M 7.015625 -7 "/>
</symbol>
<symbol overflow="visible" id="glyph0-14">
<path style="stroke:none;" d="M 3.921875 -6.1875 C 3.304688 -6.1875 2.816406 -5.945312 2.453125 -5.46875 C 2.097656 -4.988281 1.921875 -4.332031 1.921875 -3.5 C 1.921875 -2.65625 2.097656 -1.992188 2.453125 -1.515625 C 2.804688 -1.035156 3.296875 -0.796875 3.921875 -0.796875 C 4.535156 -0.796875 5.019531 -1.035156 5.375 -1.515625 C 5.726562 -2.003906 5.90625 -2.664062 5.90625 -3.5 C 5.90625 -4.320312 5.726562 -4.972656 5.375 -5.453125 C 5.019531 -5.941406 4.535156 -6.1875 3.921875 -6.1875 Z M 3.921875 -7.171875 C 4.921875 -7.171875 5.703125 -6.84375 6.265625 -6.1875 C 6.835938 -5.539062 7.125 -4.644531 7.125 -3.5 C 7.125 -2.351562 6.835938 -1.453125 6.265625 -0.796875 C 5.703125 -0.140625 4.921875 0.1875 3.921875 0.1875 C 2.910156 0.1875 2.117188 -0.140625 1.546875 -0.796875 C 0.984375 -1.453125 0.703125 -2.351562 0.703125 -3.5 C 0.703125 -4.644531 0.984375 -5.539062 1.546875 -6.1875 C 2.117188 -6.84375 2.910156 -7.171875 3.921875 -7.171875 Z M 3.921875 -7.171875 "/>
</symbol>
<symbol overflow="visible" id="glyph0-15">
<path style="stroke:none;" d="M 7.015625 -4.21875 L 7.015625 0 L 5.875 0 L 5.875 -4.1875 C 5.875 -4.851562 5.742188 -5.347656 5.484375 -5.671875 C 5.222656 -6.003906 4.835938 -6.171875 4.328125 -6.171875 C 3.703125 -6.171875 3.207031 -5.972656 2.84375 -5.578125 C 2.488281 -5.179688 2.3125 -4.640625 2.3125 -3.953125 L 2.3125 0 L 1.15625 0 L 1.15625 -7 L 2.3125 -7 L 2.3125 -5.90625 C 2.59375 -6.332031 2.914062 -6.648438 3.28125 -6.859375 C 3.65625 -7.066406 4.085938 -7.171875 4.578125 -7.171875 C 5.378906 -7.171875 5.984375 -6.921875 6.390625 -6.421875 C 6.804688 -5.921875 7.015625 -5.1875 7.015625 -4.21875 Z M 7.015625 -4.21875 "/>
</symbol>
<symbol overflow="visible" id="glyph0-16">
<path style="stroke:none;" d="M 5.265625 -5.921875 C 5.128906 -5.992188 4.984375 -6.046875 4.828125 -6.078125 C 4.679688 -6.117188 4.519531 -6.140625 4.34375 -6.140625 C 3.6875 -6.140625 3.179688 -5.925781 2.828125 -5.5 C 2.484375 -5.082031 2.3125 -4.476562 2.3125 -3.6875 L 2.3125 0 L 1.15625 0 L 1.15625 -7 L 2.3125 -7 L 2.3125 -5.90625 C 2.5625 -6.332031 2.878906 -6.648438 3.265625 -6.859375 C 3.648438 -7.066406 4.117188 -7.171875 4.671875 -7.171875 C 4.753906 -7.171875 4.84375 -7.164062 4.9375 -7.15625 C 5.03125 -7.144531 5.132812 -7.128906 5.25 -7.109375 Z M 5.265625 -5.921875 "/>
</symbol>
<symbol overflow="visible" id="glyph0-17">
<path style="stroke:none;" d="M 5.671875 -6.796875 L 5.671875 -5.703125 C 5.347656 -5.867188 5.007812 -5.992188 4.65625 -6.078125 C 4.300781 -6.160156 3.9375 -6.203125 3.5625 -6.203125 C 3 -6.203125 2.570312 -6.113281 2.28125 -5.9375 C 2 -5.769531 1.859375 -5.507812 1.859375 -5.15625 C 1.859375 -4.882812 1.957031 -4.671875 2.15625 -4.515625 C 2.363281 -4.367188 2.773438 -4.226562 3.390625 -4.09375 L 3.78125 -4 C 4.601562 -3.832031 5.1875 -3.585938 5.53125 -3.265625 C 5.875 -2.941406 6.046875 -2.5 6.046875 -1.9375 C 6.046875 -1.28125 5.785156 -0.757812 5.265625 -0.375 C 4.753906 0 4.050781 0.1875 3.15625 0.1875 C 2.78125 0.1875 2.390625 0.148438 1.984375 0.078125 C 1.578125 0.00390625 1.144531 -0.101562 0.6875 -0.25 L 0.6875 -1.4375 C 1.113281 -1.21875 1.53125 -1.050781 1.9375 -0.9375 C 2.351562 -0.832031 2.765625 -0.78125 3.171875 -0.78125 C 3.710938 -0.78125 4.128906 -0.875 4.421875 -1.0625 C 4.710938 -1.25 4.859375 -1.507812 4.859375 -1.84375 C 4.859375 -2.15625 4.753906 -2.394531 4.546875 -2.5625 C 4.335938 -2.726562 3.875 -2.890625 3.15625 -3.046875 L 2.765625 -3.140625 C 2.046875 -3.285156 1.53125 -3.515625 1.21875 -3.828125 C 0.90625 -4.140625 0.75 -4.566406 0.75 -5.109375 C 0.75 -5.765625 0.976562 -6.269531 1.4375 -6.625 C 1.90625 -6.988281 2.570312 -7.171875 3.4375 -7.171875 C 3.851562 -7.171875 4.25 -7.140625 4.625 -7.078125 C 5 -7.015625 5.347656 -6.921875 5.671875 -6.796875 Z M 5.671875 -6.796875 "/>
</symbol>
<symbol overflow="visible" id="glyph0-18">
<path style="stroke:none;" d="M 1.203125 -7 L 2.359375 -7 L 2.359375 0 L 1.203125 0 Z M 1.203125 -9.71875 L 2.359375 -9.71875 L 2.359375 -8.265625 L 1.203125 -8.265625 Z M 1.203125 -9.71875 "/>
</symbol>
<symbol overflow="visible" id="glyph0-19">
<path style="stroke:none;" d="M 1.203125 -9.71875 L 2.359375 -9.71875 L 2.359375 0 L 1.203125 0 Z M 1.203125 -9.71875 "/>
</symbol>
<symbol overflow="visible" id="glyph1-0">
<path style="stroke:none;" d="M 1.078125 3.84375 L 1.078125 -15.359375 L 11.96875 -15.359375 L 11.96875 3.84375 Z M 2.3125 2.640625 L 10.765625 2.640625 L 10.765625 -14.140625 L 2.3125 -14.140625 Z M 2.3125 2.640625 "/>
</symbol>
<symbol overflow="visible" id="glyph1-1">
<path style="stroke:none;" d="M 2.140625 -15.875 L 12.171875 -15.875 L 12.171875 -14.078125 L 4.28125 -14.078125 L 4.28125 -9.375 L 11.84375 -9.375 L 11.84375 -7.5625 L 4.28125 -7.5625 L 4.28125 -1.8125 L 12.375 -1.8125 L 12.375 0 L 2.140625 0 Z M 2.140625 -15.875 "/>
</symbol>
<symbol overflow="visible" id="glyph1-2">
<path style="stroke:none;" d="M 11.953125 -11.90625 L 7.640625 -6.109375 L 12.171875 0 L 9.875 0 L 6.40625 -4.671875 L 2.9375 0 L 0.625 0 L 5.25 -6.234375 L 1.015625 -11.90625 L 3.328125 -11.90625 L 6.484375 -7.671875 L 9.640625 -11.90625 Z M 11.953125 -11.90625 "/>
</symbol>
<symbol overflow="visible" id="glyph1-3">
<path style="stroke:none;" d="M 7.46875 -5.984375 C 5.882812 -5.984375 4.785156 -5.800781 4.171875 -5.4375 C 3.566406 -5.082031 3.265625 -4.46875 3.265625 -3.59375 C 3.265625 -2.894531 3.492188 -2.34375 3.953125 -1.9375 C 4.410156 -1.53125 5.03125 -1.328125 5.8125 -1.328125 C 6.894531 -1.328125 7.765625 -1.710938 8.421875 -2.484375 C 9.078125 -3.253906 9.40625 -4.273438 9.40625 -5.546875 L 9.40625 -5.984375 Z M 11.375 -6.796875 L 11.375 0 L 9.40625 0 L 9.40625 -1.8125 C 8.96875 -1.082031 8.414062 -0.546875 7.75 -0.203125 C 7.082031 0.140625 6.265625 0.3125 5.296875 0.3125 C 4.078125 0.3125 3.109375 -0.03125 2.390625 -0.71875 C 1.671875 -1.40625 1.3125 -2.320312 1.3125 -3.46875 C 1.3125 -4.8125 1.757812 -5.820312 2.65625 -6.5 C 3.550781 -7.175781 4.890625 -7.515625 6.671875 -7.515625 L 9.40625 -7.515625 L 9.40625 -7.703125 C 9.40625 -8.609375 9.109375 -9.304688 8.515625 -9.796875 C 7.929688 -10.296875 7.101562 -10.546875 6.03125 -10.546875 C 5.351562 -10.546875 4.691406 -10.460938 4.046875 -10.296875 C 3.398438 -10.128906 2.78125 -9.882812 2.1875 -9.5625 L 2.1875 -11.375 C 2.894531 -11.644531 3.585938 -11.847656 4.265625 -11.984375 C 4.941406 -12.128906 5.597656 -12.203125 6.234375 -12.203125 C 7.953125 -12.203125 9.238281 -11.753906 10.09375 -10.859375 C 10.945312 -9.960938 11.375 -8.609375 11.375 -6.796875 Z M 11.375 -6.796875 "/>
</symbol>
<symbol overflow="visible" id="glyph1-4">
<path style="stroke:none;" d="M 11.328125 -9.625 C 11.816406 -10.5 12.398438 -11.144531 13.078125 -11.5625 C 13.765625 -11.988281 14.566406 -12.203125 15.484375 -12.203125 C 16.722656 -12.203125 17.675781 -11.765625 18.34375 -10.890625 C 19.019531 -10.023438 19.359375 -8.789062 19.359375 -7.1875 L 19.359375 0 L 17.40625 0 L 17.40625 -7.125 C 17.40625 -8.269531 17.203125 -9.117188 16.796875 -9.671875 C 16.390625 -10.222656 15.769531 -10.5 14.9375 -10.5 C 13.925781 -10.5 13.125 -10.160156 12.53125 -9.484375 C 11.945312 -8.804688 11.65625 -7.890625 11.65625 -6.734375 L 11.65625 0 L 9.6875 0 L 9.6875 -7.125 C 9.6875 -8.269531 9.484375 -9.117188 9.078125 -9.671875 C 8.671875 -10.222656 8.046875 -10.5 7.203125 -10.5 C 6.210938 -10.5 5.421875 -10.160156 4.828125 -9.484375 C 4.242188 -8.804688 3.953125 -7.890625 3.953125 -6.734375 L 3.953125 0 L 1.984375 0 L 1.984375 -11.90625 L 3.953125 -11.90625 L 3.953125 -10.0625 C 4.390625 -10.789062 4.921875 -11.328125 5.546875 -11.671875 C 6.171875 -12.023438 6.914062 -12.203125 7.78125 -12.203125 C 8.644531 -12.203125 9.378906 -11.976562 9.984375 -11.53125 C 10.585938 -11.09375 11.035156 -10.457031 11.328125 -9.625 Z M 11.328125 -9.625 "/>
</symbol>
<symbol overflow="visible" id="glyph1-5">
<path style="stroke:none;" d="M 3.953125 -1.78125 L 3.953125 4.53125 L 1.984375 4.53125 L 1.984375 -11.90625 L 3.953125 -11.90625 L 3.953125 -10.109375 C 4.359375 -10.816406 4.875 -11.34375 5.5 -11.6875 C 6.125 -12.03125 6.875 -12.203125 7.75 -12.203125 C 9.195312 -12.203125 10.375 -11.625 11.28125 -10.46875 C 12.1875 -9.320312 12.640625 -7.8125 12.640625 -5.9375 C 12.640625 -4.070312 12.1875 -2.5625 11.28125 -1.40625 C 10.375 -0.257812 9.195312 0.3125 7.75 0.3125 C 6.875 0.3125 6.125 0.140625 5.5 -0.203125 C 4.875 -0.546875 4.359375 -1.070312 3.953125 -1.78125 Z M 10.609375 -5.9375 C 10.609375 -7.382812 10.3125 -8.515625 9.71875 -9.328125 C 9.125 -10.148438 8.3125 -10.5625 7.28125 -10.5625 C 6.238281 -10.5625 5.421875 -10.148438 4.828125 -9.328125 C 4.242188 -8.515625 3.953125 -7.382812 3.953125 -5.9375 C 3.953125 -4.5 4.242188 -3.367188 4.828125 -2.546875 C 5.421875 -1.734375 6.238281 -1.328125 7.28125 -1.328125 C 8.3125 -1.328125 9.125 -1.734375 9.71875 -2.546875 C 10.3125 -3.367188 10.609375 -4.5 10.609375 -5.9375 Z M 10.609375 -5.9375 "/>
</symbol>
<symbol overflow="visible" id="glyph1-6">
<path style="stroke:none;" d="M 2.046875 -16.546875 L 4.015625 -16.546875 L 4.015625 0 L 2.046875 0 Z M 2.046875 -16.546875 "/>
</symbol>
<symbol overflow="visible" id="glyph1-7">
<path style="stroke:none;" d="M 12.234375 -6.4375 L 12.234375 -5.484375 L 3.25 -5.484375 C 3.332031 -4.140625 3.734375 -3.113281 4.453125 -2.40625 C 5.179688 -1.695312 6.195312 -1.34375 7.5 -1.34375 C 8.25 -1.34375 8.972656 -1.4375 9.671875 -1.625 C 10.378906 -1.8125 11.082031 -2.085938 11.78125 -2.453125 L 11.78125 -0.609375 C 11.082031 -0.304688 10.363281 -0.078125 9.625 0.078125 C 8.882812 0.234375 8.132812 0.3125 7.375 0.3125 C 5.476562 0.3125 3.972656 -0.238281 2.859375 -1.34375 C 1.753906 -2.457031 1.203125 -3.957031 1.203125 -5.84375 C 1.203125 -7.789062 1.726562 -9.335938 2.78125 -10.484375 C 3.832031 -11.628906 5.253906 -12.203125 7.046875 -12.203125 C 8.640625 -12.203125 9.898438 -11.6875 10.828125 -10.65625 C 11.765625 -9.625 12.234375 -8.21875 12.234375 -6.4375 Z M 10.28125 -7.015625 C 10.269531 -8.085938 9.972656 -8.941406 9.390625 -9.578125 C 8.804688 -10.222656 8.03125 -10.546875 7.0625 -10.546875 C 5.96875 -10.546875 5.09375 -10.234375 4.4375 -9.609375 C 3.78125 -8.992188 3.40625 -8.128906 3.3125 -7.015625 Z M 10.28125 -7.015625 "/>
</symbol>
<symbol overflow="visible" id="glyph1-8">
<path style="stroke:none;" d=""/>
</symbol>
<symbol overflow="visible" id="glyph1-9">
<path style="stroke:none;" d="M 3.984375 -15.296875 L 3.984375 -11.90625 L 8.015625 -11.90625 L 8.015625 -10.390625 L 3.984375 -10.390625 L 3.984375 -3.921875 C 3.984375 -2.953125 4.113281 -2.328125 4.375 -2.046875 C 4.644531 -1.773438 5.191406 -1.640625 6.015625 -1.640625 L 8.015625 -1.640625 L 8.015625 0 L 6.015625 0 C 4.503906 0 3.457031 -0.28125 2.875 -0.84375 C 2.300781 -1.40625 2.015625 -2.429688 2.015625 -3.921875 L 2.015625 -10.390625 L 0.578125 -10.390625 L 0.578125 -11.90625 L 2.015625 -11.90625 L 2.015625 -15.296875 Z M 3.984375 -15.296875 "/>
</symbol>
<symbol overflow="visible" id="glyph1-10">
<path style="stroke:none;" d="M 8.953125 -10.078125 C 8.734375 -10.210938 8.492188 -10.304688 8.234375 -10.359375 C 7.972656 -10.421875 7.6875 -10.453125 7.375 -10.453125 C 6.269531 -10.453125 5.421875 -10.09375 4.828125 -9.375 C 4.242188 -8.65625 3.953125 -7.625 3.953125 -6.28125 L 3.953125 0 L 1.984375 0 L 1.984375 -11.90625 L 3.953125 -11.90625 L 3.953125 -10.0625 C 4.359375 -10.78125 4.890625 -11.316406 5.546875 -11.671875 C 6.210938 -12.023438 7.015625 -12.203125 7.953125 -12.203125 C 8.085938 -12.203125 8.234375 -12.191406 8.390625 -12.171875 C 8.554688 -12.148438 8.738281 -12.125 8.9375 -12.09375 Z M 8.953125 -10.078125 "/>
</symbol>
<symbol overflow="visible" id="glyph1-11">
<path style="stroke:none;" d="M 11.953125 -7.1875 L 11.953125 0 L 10 0 L 10 -7.125 C 10 -8.25 9.773438 -9.09375 9.328125 -9.65625 C 8.890625 -10.21875 8.234375 -10.5 7.359375 -10.5 C 6.304688 -10.5 5.472656 -10.160156 4.859375 -9.484375 C 4.253906 -8.804688 3.953125 -7.890625 3.953125 -6.734375 L 3.953125 0 L 1.984375 0 L 1.984375 -11.90625 L 3.953125 -11.90625 L 3.953125 -10.0625 C 4.410156 -10.78125 4.957031 -11.316406 5.59375 -11.671875 C 6.226562 -12.023438 6.960938 -12.203125 7.796875 -12.203125 C 9.160156 -12.203125 10.191406 -11.773438 10.890625 -10.921875 C 11.597656 -10.078125 11.953125 -8.832031 11.953125 -7.1875 Z M 11.953125 -7.1875 "/>
</symbol>
<symbol overflow="visible" id="glyph1-12">
<path style="stroke:none;" d="M 9.640625 -11.5625 L 9.640625 -9.703125 C 9.085938 -9.992188 8.515625 -10.207031 7.921875 -10.34375 C 7.328125 -10.488281 6.710938 -10.5625 6.078125 -10.5625 C 5.097656 -10.5625 4.363281 -10.410156 3.875 -10.109375 C 3.394531 -9.816406 3.15625 -9.375 3.15625 -8.78125 C 3.15625 -8.320312 3.328125 -7.960938 3.671875 -7.703125 C 4.023438 -7.441406 4.726562 -7.195312 5.78125 -6.96875 L 6.4375 -6.8125 C 7.832031 -6.519531 8.820312 -6.101562 9.40625 -5.5625 C 9.988281 -5.019531 10.28125 -4.257812 10.28125 -3.28125 C 10.28125 -2.175781 9.84375 -1.300781 8.96875 -0.65625 C 8.09375 -0.0078125 6.890625 0.3125 5.359375 0.3125 C 4.722656 0.3125 4.054688 0.25 3.359375 0.125 C 2.671875 0 1.945312 -0.1875 1.1875 -0.4375 L 1.1875 -2.453125 C 1.90625 -2.078125 2.613281 -1.796875 3.3125 -1.609375 C 4.019531 -1.421875 4.71875 -1.328125 5.40625 -1.328125 C 6.320312 -1.328125 7.03125 -1.484375 7.53125 -1.796875 C 8.03125 -2.117188 8.28125 -2.566406 8.28125 -3.140625 C 8.28125 -3.671875 8.097656 -4.078125 7.734375 -4.359375 C 7.378906 -4.640625 6.59375 -4.910156 5.375 -5.171875 L 4.703125 -5.34375 C 3.484375 -5.59375 2.601562 -5.984375 2.0625 -6.515625 C 1.53125 -7.046875 1.265625 -7.769531 1.265625 -8.6875 C 1.265625 -9.8125 1.660156 -10.675781 2.453125 -11.28125 C 3.242188 -11.894531 4.375 -12.203125 5.84375 -12.203125 C 6.5625 -12.203125 7.238281 -12.144531 7.875 -12.03125 C 8.519531 -11.925781 9.109375 -11.769531 9.640625 -11.5625 Z M 9.640625 -11.5625 "/>
</symbol>
<symbol overflow="visible" id="glyph1-13">
<path style="stroke:none;" d="M 2.046875 -11.90625 L 4.015625 -11.90625 L 4.015625 0 L 2.046875 0 Z M 2.046875 -16.546875 L 4.015625 -16.546875 L 4.015625 -14.078125 L 2.046875 -14.078125 Z M 2.046875 -16.546875 "/>
</symbol>
<symbol overflow="visible" id="glyph1-14">
<path style="stroke:none;" d="M 6.671875 -10.546875 C 5.617188 -10.546875 4.785156 -10.132812 4.171875 -9.3125 C 3.566406 -8.488281 3.265625 -7.363281 3.265625 -5.9375 C 3.265625 -4.519531 3.566406 -3.398438 4.171875 -2.578125 C 4.773438 -1.753906 5.609375 -1.34375 6.671875 -1.34375 C 7.710938 -1.34375 8.535156 -1.753906 9.140625 -2.578125 C 9.753906 -3.398438 10.0625 -4.519531 10.0625 -5.9375 C 10.0625 -7.351562 9.753906 -8.472656 9.140625 -9.296875 C 8.535156 -10.128906 7.710938 -10.546875 6.671875 -10.546875 Z M 6.671875 -12.203125 C 8.367188 -12.203125 9.703125 -11.644531 10.671875 -10.53125 C 11.648438 -9.425781 12.140625 -7.894531 12.140625 -5.9375 C 12.140625 -3.988281 11.648438 -2.457031 10.671875 -1.34375 C 9.703125 -0.238281 8.367188 0.3125 6.671875 0.3125 C 4.960938 0.3125 3.625 -0.238281 2.65625 -1.34375 C 1.6875 -2.457031 1.203125 -3.988281 1.203125 -5.9375 C 1.203125 -7.894531 1.6875 -9.425781 2.65625 -10.53125 C 3.625 -11.644531 4.960938 -12.203125 6.671875 -12.203125 Z M 6.671875 -12.203125 "/>
</symbol>
</g>
</defs>
<g id="surface53632">
<rect x="0" y="0" width="623" height="503" style="fill:rgb(100%,100%,100%);fill-opacity:1;stroke:none;"/>
<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 21 3 L 29 3 L 29 5 L 21 5 Z M 21 3 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-1" x="381.472656" y="65.154405"/>
<use xlink:href="#glyph0-2" x="389.347548" y="65.154405"/>
<use xlink:href="#glyph0-3" x="401.816081" y="65.154405"/>
<use xlink:href="#glyph0-4" x="409.941081" y="65.154405"/>
<use xlink:href="#glyph0-5" x="414.959798" y="65.154405"/>
</g>
<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 14 10 L 22 10 L 22 12 L 14 12 Z M 14 10 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-6" x="258.09375" y="205.154405"/>
</g>
<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 25 5 L 18.396094 9.716992 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 18.091016 9.934961 L 18.352539 9.441016 L 18.396094 9.716992 L 18.643164 9.847852 Z M 18.091016 9.934961 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-6" x="302" y="121.255968"/>
<use xlink:href="#glyph0-7" x="309.843696" y="121.255968"/>
<use xlink:href="#glyph0-7" x="317.968696" y="121.255968"/>
<use xlink:href="#glyph0-8" x="326.093696" y="121.255968"/>
<use xlink:href="#glyph0-6" x="330.162435" y="121.255968"/>
</g>
<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 14 17 L 22 17 L 22 19 L 14 19 Z M 14 17 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-6" x="249.949219" y="345.154405"/>
<use xlink:href="#glyph0-9" x="257.792914" y="345.154405"/>
<use xlink:href="#glyph0-8" x="261.861654" y="345.154405"/>
<use xlink:href="#glyph0-10" x="265.930393" y="345.154405"/>
</g>
<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 21 24 L 29 24 L 29 26 L 21 26 Z M 21 24 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-6" x="382.371094" y="485.154405"/>
<use xlink:href="#glyph0-9" x="390.214789" y="485.154405"/>
<use xlink:href="#glyph0-8" x="394.283529" y="485.154405"/>
<use xlink:href="#glyph0-10" x="398.352268" y="485.154405"/>
<use xlink:href="#glyph0-9" x="406.477268" y="485.154405"/>
<use xlink:href="#glyph0-8" x="410.546007" y="485.154405"/>
<use xlink:href="#glyph0-11" x="414.614746" y="485.154405"/>
</g>
<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 8 24 L 16 24 L 16 26 L 8 26 Z M 8 24 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-6" x="123.640625" y="485.154405"/>
<use xlink:href="#glyph0-9" x="131.484321" y="485.154405"/>
<use xlink:href="#glyph0-8" x="135.55306" y="485.154405"/>
<use xlink:href="#glyph0-10" x="139.621799" y="485.154405"/>
<use xlink:href="#glyph0-9" x="147.746799" y="485.154405"/>
<use xlink:href="#glyph0-8" x="151.815538" y="485.154405"/>
<use xlink:href="#glyph0-12" x="155.884277" y="485.154405"/>
</g>
<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 18 12 L 18 16.513281 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 18 16.888281 L 17.75 16.388281 L 18 16.513281 L 18.25 16.388281 Z M 18 16.888281 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-dasharray:1,0.266667,0.1,0.266667,0.1,0.266667;stroke-miterlimit:10;" d="M 18 19 L 12.374023 23.688281 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 12.085938 23.928516 L 12.309961 23.416211 L 12.374023 23.688281 L 12.630078 23.800391 Z M 12.085938 23.928516 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 18 19 L 24.603906 23.716992 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 24.908984 23.934961 L 24.356836 23.847852 L 24.603906 23.716992 L 24.647461 23.441016 Z M 24.908984 23.934961 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-6" x="219" y="281.255968"/>
<use xlink:href="#glyph0-7" x="226.843696" y="281.255968"/>
<use xlink:href="#glyph0-7" x="234.968696" y="281.255968"/>
<use xlink:href="#glyph0-8" x="243.093696" y="281.255968"/>
<use xlink:href="#glyph0-10" x="247.162435" y="281.255968"/>
</g>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-6" x="161" y="412.255968"/>
<use xlink:href="#glyph0-7" x="168.843696" y="412.255968"/>
<use xlink:href="#glyph0-7" x="176.968696" y="412.255968"/>
<use xlink:href="#glyph0-8" x="185.093696" y="412.255968"/>
<use xlink:href="#glyph0-12" x="189.162435" y="412.255968"/>
</g>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-6" x="341" y="410.255968"/>
<use xlink:href="#glyph0-7" x="348.843696" y="410.255968"/>
<use xlink:href="#glyph0-7" x="356.968696" y="410.255968"/>
<use xlink:href="#glyph0-8" x="365.093696" y="410.255968"/>
<use xlink:href="#glyph0-11" x="369.162435" y="410.255968"/>
</g>
<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 28 10 L 36 10 L 36 12 L 28 12 Z M 28 10 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-13" x="538.230469" y="205.154405"/>
</g>
<path style="fill-rule:evenodd;fill:rgb(100%,100%,100%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 28 17 L 36 17 L 36 19 L 28 19 Z M 28 17 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-13" x="530.359375" y="345.154405"/>
<use xlink:href="#glyph0-9" x="537.934245" y="345.154405"/>
<use xlink:href="#glyph0-8" x="542.002984" y="345.154405"/>
<use xlink:href="#glyph0-5" x="546.071723" y="345.154405"/>
</g>
<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-dasharray:1,0.266667,0.1,0.266667,0.1,0.266667;stroke-miterlimit:10;" d="M 25 5 L 31.603906 9.716992 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 31.908984 9.934961 L 31.356836 9.847852 L 31.603906 9.716992 L 31.647461 9.441016 Z M 31.908984 9.934961 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 32 12 L 32 16.513281 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 32 16.888281 L 31.75 16.388281 L 32 16.513281 L 32.25 16.388281 Z M 32 16.888281 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-6" x="482" y="122.255968"/>
<use xlink:href="#glyph0-7" x="489.843696" y="122.255968"/>
<use xlink:href="#glyph0-7" x="497.968696" y="122.255968"/>
<use xlink:href="#glyph0-8" x="506.093696" y="122.255968"/>
<use xlink:href="#glyph0-13" x="510.162435" y="122.255968"/>
</g>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-6" x="551" y="281.255968"/>
<use xlink:href="#glyph0-7" x="558.843696" y="281.255968"/>
<use xlink:href="#glyph0-7" x="566.968696" y="281.255968"/>
<use xlink:href="#glyph0-8" x="575.093696" y="281.255968"/>
<use xlink:href="#glyph0-5" x="579.162435" y="281.255968"/>
</g>
<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 5 4 L 9.513281 4 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 9.888281 4 L 9.388281 4.25 L 9.513281 4 L 9.388281 3.75 Z M 9.888281 4 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<path style="fill:none;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-dasharray:1,0.266667,0.1,0.266667,0.1,0.266667;stroke-miterlimit:10;" d="M 5 7 L 9.513281 7 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<path style="fill-rule:evenodd;fill:rgb(0%,0%,0%);fill-opacity:1;stroke-width:0.1;stroke-linecap:butt;stroke-linejoin:miter;stroke:rgb(0%,0%,0%);stroke-opacity:1;stroke-miterlimit:10;" d="M 9.888281 7 L 9.388281 7.25 L 9.513281 7 L 9.388281 6.75 Z M 9.888281 7 " transform="matrix(20,0,0,20,-98,-18.75)"/>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-11" x="2" y="81.255968"/>
<use xlink:href="#glyph0-14" x="9.037489" y="81.255968"/>
<use xlink:href="#glyph0-2" x="16.868707" y="81.255968"/>
<use xlink:href="#glyph0-2" x="29.33724" y="81.255968"/>
<use xlink:href="#glyph0-14" x="41.805773" y="81.255968"/>
<use xlink:href="#glyph0-15" x="49.63699" y="81.255968"/>
<use xlink:href="#glyph0-8" x="57.74924" y="81.255968"/>
<use xlink:href="#glyph0-4" x="61.81798" y="81.255968"/>
<use xlink:href="#glyph0-16" x="66.836697" y="81.255968"/>
<use xlink:href="#glyph0-6" x="72.099013" y="81.255968"/>
<use xlink:href="#glyph0-15" x="79.942708" y="81.255968"/>
<use xlink:href="#glyph0-17" x="88.054959" y="81.255968"/>
<use xlink:href="#glyph0-18" x="94.723524" y="81.255968"/>
<use xlink:href="#glyph0-4" x="98.279839" y="81.255968"/>
<use xlink:href="#glyph0-18" x="103.298557" y="81.255968"/>
<use xlink:href="#glyph0-14" x="106.854872" y="81.255968"/>
<use xlink:href="#glyph0-15" x="114.686089" y="81.255968"/>
</g>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph0-19" x="2" y="141.255968"/>
<use xlink:href="#glyph0-1" x="5.556315" y="141.255968"/>
<use xlink:href="#glyph0-17" x="13.431207" y="141.255968"/>
<use xlink:href="#glyph0-17" x="20.099772" y="141.255968"/>
<use xlink:href="#glyph0-8" x="26.768338" y="141.255968"/>
<use xlink:href="#glyph0-11" x="30.837077" y="141.255968"/>
<use xlink:href="#glyph0-14" x="37.874566" y="141.255968"/>
<use xlink:href="#glyph0-2" x="45.705783" y="141.255968"/>
<use xlink:href="#glyph0-2" x="58.174316" y="141.255968"/>
<use xlink:href="#glyph0-14" x="70.642849" y="141.255968"/>
<use xlink:href="#glyph0-15" x="78.474067" y="141.255968"/>
<use xlink:href="#glyph0-8" x="86.586317" y="141.255968"/>
<use xlink:href="#glyph0-4" x="90.655056" y="141.255968"/>
<use xlink:href="#glyph0-16" x="95.673774" y="141.255968"/>
<use xlink:href="#glyph0-6" x="100.936089" y="141.255968"/>
<use xlink:href="#glyph0-15" x="108.779785" y="141.255968"/>
<use xlink:href="#glyph0-17" x="116.892036" y="141.255968"/>
<use xlink:href="#glyph0-18" x="123.560601" y="141.255968"/>
<use xlink:href="#glyph0-4" x="127.116916" y="141.255968"/>
<use xlink:href="#glyph0-18" x="132.135634" y="141.255968"/>
<use xlink:href="#glyph0-14" x="135.691949" y="141.255968"/>
<use xlink:href="#glyph0-15" x="143.523166" y="141.255968"/>
</g>
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
<use xlink:href="#glyph1-1" x="242" y="21.26709"/>
<use xlink:href="#glyph1-2" x="255.758409" y="21.26709"/>
<use xlink:href="#glyph1-3" x="268.644965" y="21.26709"/>
<use xlink:href="#glyph1-4" x="281.988878" y="21.26709"/>
<use xlink:href="#glyph1-5" x="303.200629" y="21.26709"/>
<use xlink:href="#glyph1-6" x="317.022786" y="21.26709"/>
<use xlink:href="#glyph1-7" x="323.072591" y="21.26709"/>
<use xlink:href="#glyph1-8" x="336.469672" y="21.26709"/>
<use xlink:href="#glyph1-4" x="343.39133" y="21.26709"/>
<use xlink:href="#glyph1-3" x="364.603082" y="21.26709"/>
<use xlink:href="#glyph1-5" x="377.946994" y="21.26709"/>
<use xlink:href="#glyph1-8" x="391.769151" y="21.26709"/>
<use xlink:href="#glyph1-9" x="398.690809" y="21.26709"/>
<use xlink:href="#glyph1-10" x="407.228678" y="21.26709"/>
<use xlink:href="#glyph1-3" x="416.181315" y="21.26709"/>
<use xlink:href="#glyph1-11" x="429.525228" y="21.26709"/>
<use xlink:href="#glyph1-12" x="443.326226" y="21.26709"/>
<use xlink:href="#glyph1-13" x="454.67117" y="21.26709"/>
<use xlink:href="#glyph1-9" x="460.720974" y="21.26709"/>
<use xlink:href="#glyph1-13" x="469.258843" y="21.26709"/>
<use xlink:href="#glyph1-14" x="475.308648" y="21.26709"/>
<use xlink:href="#glyph1-11" x="488.631131" y="21.26709"/>
<use xlink:href="#glyph1-8" x="502.432129" y="21.26709"/>
<use xlink:href="#glyph1-9" x="509.353787" y="21.26709"/>
<use xlink:href="#glyph1-10" x="517.891656" y="21.26709"/>
<use xlink:href="#glyph1-7" x="526.365777" y="21.26709"/>
<use xlink:href="#glyph1-7" x="539.762858" y="21.26709"/>
</g>
</g>
</svg>
<br />
This map tree is the result of parsing a file that has dictionaries with
the keys a, b, c many times, the keys a, b, f less often, and also some
objects with the keys x, y.<br />
When parsing a dictionary we traverse this tree from the root, according
to the keys that we see in the input file. While doing this, we
potentially add new nodes, if we get key combinations that we have never
seen before. The set of keys of a dictionary parsed so far is
represented by the current tree node, while we can store the values into
an array. We can use the tree of nodes to speed up parsing. A lot of the
nodes only have one child, because after reading the first few keys of
an object, the remaining ones are often uniquely determined in a given
file. If we have only one child map node, we can speculatively parse the
next key by doing a <tt class="docutils literal">memcmp</tt> between the key that the map tree says is
likely to come next and the characters that follow the ',' that started
the next entry in the dictionary. If the <tt class="docutils literal">memcmp</tt> reports a match (i.e. it returns 0), this
means that the speculation paid off, and we can transition to the new map
that the edge points to, and parse the corresponding value. If not, we
fall back to general code that parses the string, handles escaping rules
etc. This trick was explained to me by some V8 engineers; the same trick
is supposedly used <a class="reference external" href="https://github.com/v8/v8/blob/master/src/json/json-parser.cc">as part of the V8 JSON parser</a>.<br />
This scheme doesn't immediately work for map tree nodes that have more
than one child. However, since we keep statistics anyway about how often
each map is used as the map of a parsed dictionary, we can speculate
that the most common map transition is taken more often than the others
in the future, and use that as the speculated next node.<br />
So for the example transition tree shown in the figure above the key
speculation would succeed for objects with keys <tt class="docutils literal">a, b, c</tt>. For objects
with keys <tt class="docutils literal">a, b, f</tt> the speculation would succeed for the first two
keys, but not for the third key <tt class="docutils literal">f</tt>. For objects with the keys
<tt class="docutils literal">x, y</tt> the speculation would fail for the first key <tt class="docutils literal">x</tt> but succeed
for the second key <tt class="docutils literal">y</tt>.<br />
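The speculation scheme can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the RPython code in PyPy: the names <tt class="docutils literal">MapNode</tt>, <tt class="docutils literal">likely_next</tt>, <tt class="docutils literal">step</tt> and <tt class="docutils literal">parse_keys</tt> are made up for the sketch, and comparing Python strings stands in for the <tt class="docutils literal">memcmp</tt> against the raw input.<br />

```python
class MapNode:
    """One node of the map tree: the sequence of keys seen so far."""

    def __init__(self, key=None, parent=None):
        self.key = key          # key on the edge from parent to this node
        self.parent = parent
        self.children = {}      # key -> MapNode
        self.transitions = {}   # key -> how often that edge was taken

    def likely_next(self):
        """Child behind the most frequently taken edge, or None."""
        if not self.transitions:
            return None
        key = max(self.transitions, key=self.transitions.get)
        return self.children[key]

    def step(self, key):
        """Follow (or create) the edge for key and count the transition."""
        child = self.children.get(key)
        if child is None:
            child = self.children[key] = MapNode(key, self)
        self.transitions[key] = self.transitions.get(key, 0) + 1
        return child


def parse_keys(root, keys):
    """Walk the tree for one object; report per key whether the guess was right."""
    node, hits = root, []
    for key in keys:
        guess = node.likely_next()
        # Stand-in for the memcmp between the guessed key and the input.
        hits.append(guess is not None and guess.key == key)
        node = node.step(key)
    return node, hits


# Build a tree like the one in the figure: a, b, c often; a, b, f less often; x, y rarely.
root = MapNode()
for _ in range(10):
    parse_keys(root, ["a", "b", "c"])
for _ in range(3):
    parse_keys(root, ["a", "b", "f"])
parse_keys(root, ["x", "y"])

_, abc = parse_keys(root, ["a", "b", "c"])
_, abf = parse_keys(root, ["a", "b", "f"])
_, xy = parse_keys(root, ["x", "y"])
print(abc, abf, xy)
```

With these (invented) frequencies the sketch behaves as described above: every guess is right for a, b, c, only the third guess fails for a, b, f, and only the first guess fails for x, y.<br />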
For real-world datasets these transition trees can become a lot more
complicated. For example, here is a visualization of a part of the
transition tree generated for parsing a New York Times dataset:<br />
<a href="https://2.bp.blogspot.com/-Jv_p8rFIq8Y/XZubp3WVptI/AAAAAAAAwuY/rhTCqXJpoMUdtEf22tzqK64dcEJmBE6fwCPcBGAYYCw/s1600/2019_json_nytimes.png" imageanchor="1"><img border="0" data-original-height="906" data-original-width="1169" height="310" src="https://2.bp.blogspot.com/-Jv_p8rFIq8Y/XZubp3WVptI/AAAAAAAAwuY/rhTCqXJpoMUdtEf22tzqK64dcEJmBE6fwCPcBGAYYCw/s400/2019_json_nytimes.png" width="400" /></a>
<br />
<h2>
Caching Strings</h2>
A rather obvious observation we can use to improve performance of the
parser is the fact that string values repeat a lot in most JSON files.
For strings that are used as dictionary keys this is pretty obvious.
However, it also happens for strings that are used as values in
dictionaries (or are stored in lists). We can use this fact to
intern/memoize strings and save memory. This is an approach that many
JSON parsers use, including
<a class="reference external" href="https://github.com/python/cpython/blob/3.7/Modules/_json.c#L749">CPython's</a>.
To do this, I keep a dictionary of strings that we have seen so far
during parsing and look up new strings that are deserialized. If we have
seen the string before, we can re-use the deserialized previous string.
Right now I only consider utf-8 strings for caching that do not contain
any escapes (whether stuff like <tt class="docutils literal">\", \n</tt> or escaped unicode chars).<br />
This simple approach works extremely well for dictionary keys, but needs
a number of improvements to be a win in general. The first observation
is that computing the hash to look up the string in the dictionary of
strings we've seen so far is basically free. We can compute the hash
while scanning the input for the end of the string we are currently
deserializing. Computing the hash while scanning doesn't increase the
time spent scanning much. This is not a new idea; I am sure many other
parsers do the same thing (but CPython doesn't seem to).<br />
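As a rough sketch of what "hashing while scanning" can look like, here is a pure-Python version of a scanning helper. It is an assumption-laden illustration: the FNV-1a hash is just a simple stand-in (PyPy uses its own string hash), the function works on a <tt class="docutils literal">bytes</tt> object, and the return convention (end position, escape flag, hash) is chosen to mirror the <tt class="docutils literal">find_end_of_string</tt> call in the pseudocode further down.<br />

```python
def find_end_of_string(pos, data):
    """Scan data (bytes) from pos up to the closing double quote.

    Returns (end, escapes, hash): the index of the closing quote (-1 if
    there is none), whether a backslash escape was seen, and an FNV-1a
    hash of the scanned bytes. The hash is only meaningful if escapes
    is False.
    """
    h = 0xCBF29CE484222325                  # FNV-1a 64-bit offset basis
    escapes = False
    i = pos
    while i < len(data):
        byte = data[i]
        if byte == 0x22:                    # closing '"'
            return i, escapes, h
        if byte == 0x5C:                    # backslash
            escapes = True
            i += 2                          # skip it and the escaped char
            continue
        # One xor and one multiply per byte, folded into the scan loop.
        h = ((h ^ byte) * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
        i += 1
    return -1, escapes, h


data = b'{"name": "pypy"}'
end, escapes, h = find_end_of_string(2, data)   # 2 is just after the opening quote
print(end, escapes, data[2:end])
```

The point of the design is that the loop already has to look at every byte to find the closing quote, so the extra work per byte is small compared to a second pass over the string.<br />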
Another improvement follows from the observation that inserting every
single deserialized non-key string into a hashmap is too expensive.
Instead, we insert strings into the cache more conservatively, by
keeping a small ring buffer of hashes of recently deserialized strings.
We look for the hash in the ring buffer, and only if it is
present do we insert the string into the memoization hashmap. This has the
effect of only inserting strings into the memoization hashmap that
re-occur a second time not too far into the file. This seems to give a
good trade-off between still re-using a lot of strings but keeping the
time spent updating and the size of the memoization hashmap low.<br />
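A minimal sketch of this "insert only on the second sighting" policy, assuming a tiny fixed-size ring buffer. The class and function names (<tt class="docutils literal">HashRingBuffer</tt>, <tt class="docutils literal">memoize</tt>) and the buffer size are invented for the example, and the cache is keyed by the string itself rather than by its hash to keep the sketch short.<br />

```python
class HashRingBuffer:
    """Fixed-size buffer remembering the hashes of recently seen strings."""

    def __init__(self, size=8):
        self.slots = [None] * size
        self.next = 0

    def __contains__(self, h):
        return h in self.slots

    def write(self, h):
        # Overwrite the oldest entry.
        self.slots[self.next] = h
        self.next = (self.next + 1) % len(self.slots)


def memoize(strings, ring_size=8):
    """Return (strings with nearby repeats shared, the final cache)."""
    ring, cache, out = HashRingBuffer(ring_size), {}, []
    for s in strings:
        cached = cache.get(s)
        if cached is not None:
            out.append(cached)          # cache hit: reuse the earlier object
            continue
        if hash(s) in ring:
            cache[s] = s                # second sighting: worth caching
        else:
            ring.write(hash(s))         # first sighting: only remember the hash
        out.append(s)
    return out, cache


out, cache = memoize(["alice", "u1", "alice", "u2", "alice"])
print(sorted(cache))

# With a ring of size 1 the repeat is too far away and nothing is cached.
_, small_cache = memoize(["alice", "u1", "alice"], ring_size=1)
print(sorted(small_cache))
```

Strings that occur only once (the "u1" and "u2" values here) never reach the hashmap, which is what keeps both its size and the time spent updating it low.<br />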
Another twist is that in a lot of situations caching strings is not
useful at all, because it will almost never succeed. Examples of this
are UUIDs (which are unique), or the content of a tweet in a JSON file
with many tweets (which is usually unique). However, in the same file it
might be useful to cache e.g. the user name of the Twitter user, because
many tweets from the same person could be in such a file. The
usefulness of the string cache therefore depends on which field of an object we are
deserializing the value of. For this reason we keep statistics per map field
and disable string memoization per individual field if the cache hit
rate falls below a certain threshold. This gives the best of both
worlds: in the cases where string values repeat a lot in certain fields
we use the cache to save time and memory. But for those fields that
mostly contain unique strings we don't waste time looking up and adding
strings in the memoization table. Strings outside of dictionaries are
quite rare anyway, so we just always try to use the cache for them.<br />
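The per-field decision could look roughly like the following. The class name, the warm-up length and the threshold value are assumptions made for this sketch; the actual heuristic and its constants are not specified here.<br />

```python
class FieldCacheStats:
    """Hit/miss counters for the string values of one field of one map."""

    def __init__(self, min_lookups=100, min_hit_rate=0.2):
        self.hits = 0
        self.misses = 0
        self.min_lookups = min_lookups      # assumed warm-up length
        self.min_hit_rate = min_hit_rate    # assumed threshold

    def record(self, hit):
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    def cache_disabled(self):
        total = self.hits + self.misses
        if total < self.min_lookups:
            return False                    # not enough evidence yet
        return self.hits / total < self.min_hit_rate


tweet_text = FieldCacheStats()
user_name = FieldCacheStats()
for _ in range(100):
    tweet_text.record(hit=False)            # tweet contents never repeat
for i in range(100):
    user_name.record(hit=i % 5 != 0)        # 80 hits, 20 misses
print(tweet_text.cache_disabled(), user_name.cache_disabled())
```

In this toy run the field holding tweet contents stops using the cache after the warm-up, while the user name field keeps it.<br />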
The following pseudocode sketches the code to deserialize a string in
the input at a given position. The function also takes a map, which is
the point in the map tree that we are currently deserializing a field
of (if we are deserializing a string in another context, some kind of
dummy map can be used there).<br />
<pre><code class="python">
def deserialize_string(pos, input, map):
    # input is the input string, pos is the position of the starting " of
    # the string

    # find end of string, check whether it contains escape codes,
    # compute hash, all at the same time
    end, escapes, hash = find_end_of_string(pos + 1, input)
    if end == -1:
        raise ParseError
    if escapes:
        # need to be much more careful with escaping
        return deserialize_string_escapes(pos, input)

    # should we cache at all?
    if map.cache_disabled():
        return input[pos + 1:end]

    # if string is in cache, return it
    if hash in cache:
        map.cache_hit += 1
        return cache[hash]

    result = input[pos + 1:end]
    map.cache_miss += 1

    # if hash is in the ring buffer of recently seen hashes,
    # add the string to the cache
    if hash in ring_buffer:
        cache[hash] = result
    else:
        ring_buffer.write(hash)
    return result
</code>
</pre>
<h2>
Evaluation</h2>
To find out how much the various techniques help, I implemented a number
of JSON parsers in PyPy with different combinations of the techniques
enabled. I compared the numbers with the JSON parser of CPython 3.7.3
(simplejson), with ujson, with the JSON parser of Node 12.11.1 (V8) and with
RapidJSON (in DOM mode).<br />
I collected a number of medium-to-large JSON files to try the JSON
parsers on:<br />
<ul class="simple">
<li><a class="reference external" href="https://censys.io/data">Censys</a>: A subset of the Censys port and
protocol scan data for websites in the Alexa top million domains</li>
<li><a class="reference external" href="https://www.gharchive.org/">Gharchive</a>: Github activity from
January 15-23, 2015 from Github Archive</li>
<li><a class="reference external" href="https://files.pushshift.io/reddit/comments/">Reddit</a>: Reddit
comments from May 2009</li>
<li>Rosie: The nested matches produced using the <a class="reference external" href="https://rosie-lang.org/">Rosie pattern
language</a> <tt class="docutils literal">all.things</tt> pattern on a log
file</li>
<li>NYTimes: Metadata of a collection of New York Times articles</li>
<li>TPCH: The TPC-H database benchmark's deals table as a JSON file</li>
<li>Twitter: A JSON export of the @pypyproject Twitter account data</li>
<li>Wikidata: A file storing a subset of the Wikidata fact dump from Nov
11, 2014</li>
<li><a class="reference external" href="https://www.yelp.com/dataset/download">Yelp</a>: A file of yelp
businesses</li>
</ul>
Here are the file sizes of the benchmarks:<br />
<table class="dataframe">
<thead>
<tr style="text-align: right;">
<th>Benchmark</th>
<th>File Size [MiB]</th>
</tr>
</thead>
<tbody>
<tr style="text-align: right;">
<td>Censys</td>
<td>898.45</td>
</tr>
<tr style="text-align: right;">
<td>Gharchive</td>
<td>276.34</td>
</tr>
<tr style="text-align: right;">
<td>NYTimes</td>
<td>12.98</td>
</tr>
<tr style="text-align: right;">
<td>Reddit</td>
<td>931.65</td>
</tr>
<tr style="text-align: right;">
<td>Rosie</td>
<td>388.88</td>
</tr>
<tr style="text-align: right;">
<td>TPCH</td>
<td>173.86</td>
</tr>
<tr style="text-align: right;">
<td>Wikidata</td>
<td>119.75</td>
</tr>
<tr style="text-align: right;">
<td>Yelp</td>
<td>167.61</td>
</tr>
</tbody>
</table>
I measured the times of each benchmark with a number of variations
of the improved PyPy algorithms:<br />
<ul class="simple">
<li>PyPyBaseline: The PyPy JSON parser as it was before my work with JSON
parsing started (PyPy version 5.8)</li>
<li>PyPyKeyStringCaching: Memoizing the key strings of dictionaries, but
not the other strings in a json file, and not using maps to represent
dictionaries (this is the JSON parser that PyPy has been shipping since
version 5.9; in the benchmarks I used 7.1).</li>
<li>PyPyMapNoCache: Like PyPyKeyStringCaching, but using maps to
represent dictionaries. This includes speculatively parsing the next
key using memcmp, but does not use string caching of non-key strings.</li>
<li>PyPyFull: Like PyPyMapNoCache but uses a string cache for all
strings, not just keys. This is equivalent to what will be released soon as part of PyPy 7.2.</li>
</ul>
In addition to wall clock time of parsing, I also measured the increase
in memory use of each implementation after the input string has been
deserialized, i.e. the size of the in-memory representation of every
JSON file.<br />
<h3>
Contributions of Individual Optimizations</h3>
Let's first look at the contributions of the individual optimizations to the
overall performance and memory usage.<br />
<img src="https://docs.google.com/uc?id=1oqsebLsZH8pj4exAVbDthaRhERXqg8mN" width="800" />
<img src="https://docs.google.com/uc?id=1_3UuTihT0A6wfM-F3sj8j9M9IoQuvcCt" width="800" />
<br />
All the benchmarks were run 30 times in new processes, and all the numbers are
normalized to PyPyFull.<br />
The biggest individual improvement to both parsing time and memory used comes
from caching just the keys in parsed dictionaries. This is the optimization in
PyPy's JSON parser that has been implemented for a while already. To understand
why this optimization is so useful, let's look at some numbers about each
benchmark, namely the number of total keys across all dictionaries in each
file, as well as the number of unique keys. As we can see, for all benchmarks
the number of unique keys is significantly smaller than the number of keys in
total.<br />
<table class="dataframe" style="tr: nth-child(even) {background: #DDD};">
<thead>
<tr style="text-align: right;">
<th>Benchmark</th>
<th>Number of keys</th>
<th>Number of unique keys</th>
</tr>
</thead>
<tbody>
<tr style="text-align: right;">
<td>Censys</td>
<td>14 404 234</td>
<td>163</td>
</tr>
<tr style="text-align: right;">
<td>Gharchive</td>
<td>6 637 881</td>
<td>169</td>
</tr>
<tr style="text-align: right;">
<td>NYTimes</td>
<td>417 337</td>
<td>60</td>
</tr>
<tr style="text-align: right;">
<td>Reddit</td>
<td>25 226 397</td>
<td>21</td>
</tr>
<tr style="text-align: right;">
<td>Rosie</td>
<td>28 500 101</td>
<td>5</td>
</tr>
<tr style="text-align: right;">
<td>TPCH</td>
<td>6 700 000</td>
<td>45</td>
</tr>
<tr style="text-align: right;">
<td>Wikidata</td>
<td>6 235 088</td>
<td>1 602</td>
</tr>
<tr style="text-align: right;">
<td>Yelp</td>
<td>5 133 914</td>
<td>61</td>
</tr>
</tbody>
</table>
The next big jump in deserialization time and memory comes from introducing
maps to represent deserialized dictionaries. With PyPyMapNoCache
deserialization time goes down because it's much cheaper to walk the tree
of maps and store all deserialized objects into an array of values than to
build hashmaps with the same keys again and again. Memory use goes down
for the same reason: it takes a lot less memory to store the shared
structure of each set of keys in the map, as opposed to repeating it again
and again in every hashmap.<br />
We can look at some numbers about every benchmark again. The table shows how
many map-based dictionaries are deserialized for every benchmark, and how many
hashmap-backed dictionaries. We see that the number of hashmap-backed
dictionaries is often zero, or at most a small percentage of all dictionaries
in each benchmark. Yelp has the biggest number of hashmap-backed dictionaries.
The reason for this is that the input file contains hashmaps that store
combinations of various features of Yelp businesses, and a lot of these
combinations are totally unique to a business. Therefore the heuristics
determine that it's better to store these using hashmaps.<br />
<table class="dataframe">
<thead>
<tr style="text-align: right;">
<th>Benchmark</th>
<th>Map Dicts</th>
<th>Regular Dicts</th>
<th>% Regular Dicts</th>
</tr>
</thead>
<tbody>
<tr style="text-align: right">
<td>Censys</td>
<td>4 049 235</td>
<td>1 042</td>
<td>0.03</td>
</tr>
<tr style="text-align: right">
<td>Gharchive</td>
<td>955 301</td>
<td>0</td>
<td>0.00</td>
</tr>
<tr style="text-align: right">
<td>NYTimes</td>
<td>80 393</td>
<td>0</td>
<td>0.00</td>
</tr>
<tr style="text-align: right">
<td>Reddit</td>
<td>1 201 257</td>
<td>0</td>
<td>0.00</td>
</tr>
<tr style="text-align: right">
<td>Rosie</td>
<td>6 248 966</td>
<td>0</td>
<td>0.00</td>
</tr>
<tr style="text-align: right">
<td>TPCH</td>
<td>1 000 000</td>
<td>0</td>
<td>0.00</td>
</tr>
<tr style="text-align: right">
<td>Wikidata</td>
<td>1 923 460</td>
<td>46 905</td>
<td>2.38</td>
</tr>
<tr style="text-align: right">
<td>Yelp</td>
<td>443 140</td>
<td>52 051</td>
<td>10.51</td>
</tr>
</tbody>
</table>
We can also look at numbers about how often the memcmp-based speculative
parsing of the next key of a given map succeeds. Looking at statistics
about each benchmark, we can see that the speculation of what key we
expect next pays off in a significant percentage of cases, between about 64% for
Wikidata, where the dictionary structures are quite irregular, and nearly 100% for
Reddit, where all the dictionaries have the same set of keys.<br />
<table class="dataframe">
<thead>
<tr style="text-align: right;">
<th>Benchmark</th>
<th>Number of Keys</th>
<th>Map Transitions</th>
<th>% Successful Speculation</th>
</tr>
</thead>
<tbody>
<tr style="text-align: right;">
<td>Censys</td>
<td>14 404 234</td>
<td>14 403 243</td>
<td>65.79</td>
</tr>
<tr style="text-align: right;">
<td>Gharchive</td>
<td>6 637 881</td>
<td>6 637 881</td>
<td>86.71</td>
</tr>
<tr style="text-align: right;">
<td>NYTimes</td>
<td>417 337</td>
<td>417 337</td>
<td>79.85</td>
</tr>
<tr style="text-align: right;">
<td>Reddit</td>
<td>25 226 397</td>
<td>25 226 397</td>
<td>100.00</td>
</tr>
<tr style="text-align: right;">
<td>Rosie</td>
<td>28 500 101</td>
<td>28 500 101</td>
<td>90.37</td>
</tr>
<tr style="text-align: right;">
<td>TPCH</td>
<td>6 700 000</td>
<td>6 700 000</td>
<td>86.57</td>
</tr>
<tr style="text-align: right;">
<td>Wikidata</td>
<td>6 235 088</td>
<td>5 267 744</td>
<td>63.68</td>
</tr>
<tr style="text-align: right;">
<td>Yelp</td>
<td>5 133 914</td>
<td>4 593 980</td>
<td>90.43</td>
</tr>
<tr style="text-align: right;">
<td>geomean</td>
<td></td>
<td></td>
<td>82.04</td>
</tr>
</tbody>
</table>
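For readers who want to check the summary row: the geomean entry is the geometric mean of the eight per-benchmark success rates. A small sketch of that calculation, using only the percentages from the table above (the function name is my own):<br />

```python
import math

# "% Successful Speculation" column from the table above.
success_rates = {
    "Censys": 65.79,
    "Gharchive": 86.71,
    "NYTimes": 79.85,
    "Reddit": 100.00,
    "Rosie": 90.37,
    "TPCH": 86.57,
    "Wikidata": 63.68,
    "Yelp": 90.43,
}


def geometric_mean(values):
    """n-th root of the product, computed via logarithms for stability."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


geomean = geometric_mean(success_rates.values())
print(round(geomean, 2))
```

The result should agree with the 82.04 in the table up to rounding. A geometric mean is a natural choice here because it is less dominated by the largest values than an arithmetic mean.<br />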
General string caching is the most unclear optimization. On the one hand its
impact on memory usage is quite substantial, leading to a 20% reduction for
Gharchive and Reddit, up to a 2× improvement for Yelp. On the other hand, the
effect on performance is less clear, since it even leads to a slowdown in
Gharchive and Reddit, and generally only a small improvement. Choosing the
right heuristic for when to disable the cache also has somewhat unclear effects
and is definitely a topic worthy of further investigation.<br />
<h3>
Comparison against other JSON Decoders</h3>
To get a more general feeling of the performance and memory usage of the
improved PyPy parser, we compare it against CPython's built-in json
parser, ujson for CPython, Node's (V8) JSON parser and RapidJSON. For
better context for the memory usage I also show the file size of the input
files.<br />
These benchmarks are not really an apples-to-apples comparison. All of the
implementations use different in-memory representations of strings in
the deserialized data-structure (Node uses two bytes per character in
a string, <a href="https://www.python.org/dev/peps/pep-0393/">in CPython it
depends</a> but 4 bytes on my
machine, PyPyBaseline uses four bytes, PyPy and RapidJSON use utf-8). But
it's still interesting to get some ballpark numbers. The results are as
follows:<br />
<img src="https://docs.google.com/uc?id=1Q-aFNXE-sWJi5kSKTwmQNLz5LI3DDgtm" width="800" />
<img src="https://docs.google.com/uc?id=1sgGyqp93_czrxN4IYkXeZ-jFWCAs37bu" width="800" />
<br />
As we can see, PyPyFull handily beats CPython and ujson, with a geometric
mean of the improvement of about 2.5×. The memory improvement can be even
more extreme, with an improvement of over 4× against CPython/ujson in some
cases (CPython gives better memory sizes than ujson, because its parser caches the
keys of dictionaries as well). Node is often more than 50% slower, whereas
RapidJSON beats us easily, by a factor of 2× on average.<br />
<h2>
Conclusions</h2>
While the speedup I managed to achieve over the course of this project is
nice and I am certainly happy to beat both CPython and Node, I am
ultimately still annoyed that RapidJSON manages to maintain such a clear
lead over PyPyFull, and would like to get closer to it. One problem that
PyPy suffers from, compared to RapidJSON, is the overhead of garbage collection.
Deserializing large JSON files is pretty much the worst case for the
generational GC that PyPy uses, since none of the deserialized objects die
young (and the GC expects that most objects do). That means that a lot of
the deserialization time of PyPy is wasted allocating the resulting
objects in the nursery, and then copying them into the old generation.
Somehow, this should be done in better ways, but all my attempts to not
have to do the copy did not seem to help much. So maybe more improvements
are possible, if I can come up with more ideas.<br />
On the memory side of things, Node/V8 is clearly beating PyPy, which might
indicate more general problems in how we represent Python objects in
memory. On the other hand, I think it's cool that we are competitive with
RapidJSON in terms of memory and often within 2× of the file size.<br />
An effect that I didn't consider at all in this blog post is the fact that
accessing the deserialized objects with constant strings is <i>also</i> faster
than with regular dictionaries, due to them being represented with maps.
More benchmarking work to do in the future!<br />
If you have your own programs that run on PyPy and use the json parser
a lot, please measure them on the new code and let me know whether you see
any difference!
<br />
Posted by Carl Friedrich Bolz-Tereick<br />
<br />
A second life for the Sandbox (2019-08-07)<br />
<br />
Hi all,<br />
<br />
<a href="https://anvil.works/" target="_blank">Anvil</a> is a UK-based company sponsoring one month of work to revive PyPy's
"sandbox" mode and upgrade it to PyPy3. Thanks to them, sandboxing will be
given a second life!<br />
<br />
The <a href="http://doc.pypy.org/en/latest/sandbox.html">sandboxed PyPy</a> is a special version of PyPy that runs
fully isolated. It gives a safe way to execute arbitrary Python
programs (<i>whole</i> programs, not small bits of code inside your larger Python
program). Such scripts can be fully untrusted, and they can try to do
anything—there are no syntax-based restrictions, for example—but whatever
they do, any communication with the external world is not actually done but
delegated to the parent process. This is similar to, but much more flexible than,
Linux's Seccomp approach, and it is more lightweight than setting up a full
virtual machine. It also works without operating system support.<br />
<br />
However, over the years the sandbox mode of PyPy has been
mostly unmaintained and unsupported by the core developers, largely because of
a lack of interest from users and because it took too much effort to maintain
it.<br />
<br />
Now we have found that we have an actual user, <a href="https://anvil.works/" target="_blank">Anvil</a>. As far as I can tell
they are still using a very old version of PyPy, the last one that supported
sandboxing. This is where this contract comes from: the goal is to modernize sandboxing and port it to PyPy3.<br />
<br />
Part of my motivation for accepting this work is that I may have found a way to
tweak the protocol on the pipe between the sandboxed PyPy and the parent
controller process. This should make the sandboxed PyPy more resilient against
future developments and easier to maintain; at most, in the future some tweaks will be needed in the
controller process but hopefully not deep inside the guts of the sandboxed
PyPy. Among the advantages, such a more robust solution should mean that we
can actually get a working sandboxed PyPy—or sandboxed PyPy3 or sandboxed
version of <a href="https://rpython.readthedocs.io/en/latest/examples.html">any other interpreter written in RPython</a>—with just an extra
argument when calling <span style="font-family: "Courier New", Courier, monospace;">rpython</span> to translate this interpreter. If everything
works as planned, sandboxing may be given a second life.<br />
<br />
Armin Rigo<br />
<br />
Posted by Armin Rigo<br />
<br />
PyPy JIT for Aarch64 (2019-07-25)<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<p>Hello everyone.</p>
<p>We are pleased to announce the availability of the new PyPy for AArch64. This
port brings PyPy's high-performance just-in-time compiler to the AArch64
platform, also known as 64-bit ARM. With the addition of AArch64, PyPy now
supports a total of 6 architectures: x86 (32 & 64bit), ARM (32 & 64bit), PPC64,
and s390x. The AArch64 work was funded by ARM Holdings Ltd. and Crossbar.io.</p>
<p>PyPy has a good record of boosting the performance of Python programs on the
existing platforms. To show how well the new PyPy port performs, we compare the
performance of PyPy against CPython on a set of benchmarks. As a point of
comparison, we include the results of PyPy on x86_64.</p>
<p>Note, however, that the results presented here were measured on a Graviton A1
machine from AWS, which comes with a very serious word of warning: Graviton A1
instances are virtual machines, and, as such, they are not suitable for benchmarking. If
someone has access to a beefy enough (16G) ARM64 server and is willing to give
us access to it, we are happy to redo the benchmarks on a real machine. One
major concern is that while a virtual CPU is 1-to-1 with a real CPU, it is not
clear to us how CPU caches are shared across virtual CPUs. Also, note that by no
means is this benchmark suite representative enough to average the results. Read
the numbers individually per benchmark.</p>
<p>The following graph shows the speedups on AArch64 of PyPy (hg id 2417f925ce94) compared to
CPython (2.7.15), as well as the speedups on an x86_64 Linux laptop
comparing the most recent release, PyPy 7.1.1, to CPython 2.7.16.</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-zC5JsKK5msM/XTmxQdJawEI/AAAAAAAAJgY/mDR_IbpJOAEImVSkGtVb2V5snEtqZcdnQCLcBGAs/s1600/2019-07-arm64-speedups.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-zC5JsKK5msM/XTmxQdJawEI/AAAAAAAAJgY/mDR_IbpJOAEImVSkGtVb2V5snEtqZcdnQCLcBGAs/s400/2019-07-arm64-speedups.png" width="400" height="231" data-original-width="925" data-original-height="535" /></a></div>
<p>In the majority of benchmarks, the speedups achieved on AArch64 match those
achieved on the x86_64 laptop. Over CPython, PyPy on AArch64 achieves speedups
between 0.6x and 44.9x. These speedups are comparable to x86_64, where the
numbers are between 0.6x and 58.9x.</p>
<p>The next graph compares the speedups achieved on AArch64 to the speedups
achieved on x86_64, i.e., how great the speedup is on AArch64 vs. the same
benchmark on x86_64. This comparison should give a rough idea about the
quality of the generated code for the new platform.</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-29YGxYG1SLU/XTmxbjoz9nI/AAAAAAAAJgc/efNeh3P4guwHtgqKXjyMgfwfUbMFl3eDACLcBGAs/s1600/2019-07-arm64-relative.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-29YGxYG1SLU/XTmxbjoz9nI/AAAAAAAAJgc/efNeh3P4guwHtgqKXjyMgfwfUbMFl3eDACLcBGAs/s400/2019-07-arm64-relative.png" width="400" height="133" data-original-width="925" data-original-height="307" /></a></div>
<p>Note that we see a large variance: There are generally three groups of
benchmarks - those that run at more or less the same speed, those that
run at 2x the speed, and those that run at 0.5x the speed of x86_64.</p>
<p>The variance and disparity are likely related to a variety of issues, mostly due
to differences in architecture. What <em>is</em> however interesting is that, compared
to measurements performed on older ARM boards, the branch predictor on the
Graviton A1 machine appears to have improved. As a result, the speedups achieved
by PyPy over CPython are smaller than on older ARM boards: sufficiently branchy
code, like CPython itself, simply runs a lot faster. Hence, the advantage
of the non-branchy code generated by PyPy's just-in-time compiler is smaller.</p>
<p>One takeaway here is that many possible improvements for PyPy have yet to be
implemented. This is true for both of the above platforms, but probably more so
for AArch64, which comes with a large number of CPU registers. The PyPy backend
was written with x86 (the 32-bit variant) in mind, which has a really low number
of registers. We think that we can improve in the area of emitting more modern
machine code, which may have a higher impact on AArch64 than on x86_64. There are
also a number of missing features in the AArch64 backend. These features are
currently implemented as expensive function calls instead of inlined native
instructions, something we intend to improve.</p>
<p>Best,</p>
<p>Maciej Fijalkowski, Armin Rigo and the PyPy team</p>
<br /></div>
Maciej Fijalkowskihttp://www.blogger.com/profile/11410841070239382771noreply@blogger.com5tag:blogger.com,1999:blog-3971202189709462152.post-65390236309912173672019-04-18T16:24:00.002+02:002019-04-18T16:24:25.530+02:00PyPy 7.1.1 Bug Fix Release<div dir="ltr" style="text-align: left;" trbidi="on">
The PyPy team is proud to release version 7.1.1 of PyPy, a bug-fix release which
includes two different interpreters:<br />
<ul style="text-align: left;">
<li>PyPy2.7, which is an interpreter supporting the syntax and the features of
Python 2.</li>
<li>PyPy3.6-beta: the second official release of PyPy to support 3.6
features.</li>
</ul>
<blockquote>
<div>
</div>
</blockquote>
The interpreters are based on much the same codebase, thus the double
release.<br />
<br />
This bugfix release fixes bugs related to large lists, dictionaries, and sets, some corner cases with unicode, and <a href="https://www.python.org/dev/peps/pep-3118/">PEP 3118</a> memory views of ctypes structures. It also fixes a few issues related to the ARM 32-bit backend. For the complete list see the <a href="http://doc.pypy.org/en/latest/release-v7.1.1.html">changelog</a>.<br />
<br />
You can download the v7.1.1 releases here:<br />
<blockquote>
<div>
<a class="reference external" href="http://pypy.org/download.html">http://pypy.org/download.html</a></div>
</blockquote>
<br />
As always, this release is 100% compatible with the previous one and fixes
several issues and bugs raised by the growing community of PyPy users.
We strongly recommend updating.<br />
<br />
The PyPy3.6 release is rapidly maturing, but is still considered beta-quality.<br />
<br />
The PyPy team </div>
mattiphttp://www.blogger.com/profile/07336549270776418081noreply@blogger.com0tag:blogger.com,1999:blog-3971202189709462152.post-47795480533593862842019-04-04T22:26:00.002+02:002019-04-04T22:26:37.325+02:00An RPython JIT for LPegs<p>The following is a guest post by Stefan Troost, in which he describes the work he did for his bachelor thesis:</p>
<p>In this project we have used the RPython infrastructure to generate an RPython
JIT for a
less-typical use-case: string pattern matching. The work in this project is
based on <a href="http://bford.info/pub/lang/peg.pdf">Parsing Expression Grammars</a> and
<a href="http://www.inf.puc-rio.br/~roberto/docs/peg.pdf">LPeg</a>, an implementation of PEGs
designed to be used in Lua. In this post I will showcase some of the work that
went into this project, explain PEGs in general and LPeg in particular, and
show some benchmarking results.</p>
<h1><a id="Parsing_Expression_Grammars_12"></a>Parsing Expression Grammars</h1>
<p>Parsing Expression Grammars (PEGs) are a type of formal grammar similar to
context-free grammars (CFGs), with the main difference being that they are unambiguous.
This is achieved by redefining the ambiguous choice operator of CFGs (usually
written as <code>|</code>) as an <em>ordered</em> choice operator. In practice this means that if a
rule in a PEG presents a choice, a PEG parser tries the alternatives from left to
right and commits to the first one that matches. Practical uses include parsing and
pattern-searching. In comparison to regular expressions (REs), PEGs stand out because
they can be parsed in linear time, are strictly more powerful than REs, and are arguably more readable.</p>
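<p>To make the ordered choice concrete, here is a minimal sketch in plain Python. The helper names <code>lit</code>, <code>seq</code> and <code>choice</code> are made up for this illustration and are not part of LPeg or of the project described in this post. Each matcher takes a text and a position and returns the new position, or <code>None</code> on failure.</p>

```python
# Hypothetical mini-combinators, only meant to illustrate ordered choice.

def lit(s):
    """Match the string s literally."""
    def match(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return match

def seq(*parsers):
    """Match each parser in turn; fail as soon as one of them fails."""
    def match(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return match

def choice(*parsers):
    """Ordered choice: commit to the first alternative that succeeds."""
    def match(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return match

a_or_ab = choice(lit("a"), lit("ab"))
ab_or_a = choice(lit("ab"), lit("a"))

print(a_or_ab("ab", 0))  # 1: "a" is tried first and wins
print(ab_or_a("ab", 0))  # 2: "ab" is tried first and wins

# The choice is not revisited once it has succeeded, so this fails
# even though the alternative "ab" followed by "c" would match.
print(seq(a_or_ab, lit("c"))("abc", 0))  # None
```

<p>The order of the alternatives therefore changes what is matched, which is exactly what removes the ambiguity.</p>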
<h1><a id="LPeg_24"></a>LPeg</h1>
<p>LPeg is an implementation of PEGs written in C to be used in the Lua
programming language. A crucial detail of this implementation is that it parses
high-level function calls, translates them to bytecode, and then interprets that
bytecode. Therefore, we are able to improve that implementation by replacing
LPeg's C interpreter with an RPython JIT. I use a modified version of LPeg to
parse PEGs and pass the generated intermediate representation, the LPeg
bytecode, to my VM.</p>
<h1><a id="The_LPeg_Library_35"></a>The LPeg Library</h1>
<p>The LPeg Interpreter executes bytecodes created by parsing a string of commands
using the LPeg library. Our JIT supports a subset of the LPeg library, with
some of the more advanced or obscure features being left out. Note that this
subset is still powerful enough to do things like parse JSON.</p>
<table class="table table-striped table-bordered">
<thead>
<tr>
<th>Operator</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>lpeg.P(string)</td>
<td>Matches string literally</td>
</tr>
<tr>
<td>lpeg.P(n)</td>
<td>Matches exactly n characters</td>
</tr>
<tr>
<td>lpeg.P(-n)</td>
<td>Matches only if fewer than n characters are left in the input</td>
</tr>
<tr>
<td>lpeg.S(string)</td>
<td>Matches any character in string (Set)</td>
</tr>
<tr>
<td>lpeg.R("xy")</td>
<td>Matches any character between x and y (Range)</td>
</tr>
<tr>
<td>pattern^n</td>
<td>Matches at least n repetitions of pattern</td>
</tr>
<tr>
<td>pattern^-n</td>
<td>Matches at most n repetitions of pattern</td>
</tr>
<tr>
<td>pattern1 * pattern2</td>
<td>Matches pattern1 followed by pattern2</td>
</tr>
<tr>
<td>pattern1 + pattern2</td>
<td>Matches pattern1 or pattern2 (ordered choice)</td>
</tr>
<tr>
<td>pattern1 - pattern2</td>
<td>Matches pattern1 if pattern2 does not match</td>
</tr>
<tr>
<td>-pattern</td>
<td>Equivalent to ("" - pattern)</td>
</tr>
</tbody>
</table>
<p>As a simple example, the pattern <code>lpeg.P"ab"+lpeg.P"cd"</code> would match either the
string <code>ab</code> or the string <code>cd</code>.</p>
<p>To extract semantic information from a pattern, captures are needed. The
following operations are supported for creating captures.</p>
<table class="table table-striped table-bordered">
<thead>
<tr>
<th>Operation</th>
<th>What it produces</th>
</tr>
</thead>
<tbody>
<tr>
<td>lpeg.C(pattern)</td>
<td>the match for pattern plus all captures made by pattern</td>
</tr>
<tr>
<td>lpeg.Cp()</td>
<td>the current position (matches the empty string)</td>
</tr>
</tbody>
</table>
<p>(tables taken from the <a href="http://www.inf.puc-rio.br/~roberto/lpeg/">LPeg documentation</a>)</p>
<p>These patterns are translated into bytecode by LPeg, at which point we are able
to pass them into our own VM.</p>
<h1><a id="The_VM_73"></a>The VM</h1>
<p>The state of the VM at any point is defined by the following variables:</p>
<ul>
<li><code>PC</code>: program counter indicating the current instruction</li>
<li><code>fail</code>: an indicator that some match failed and the VM must backtrack</li>
<li><code>index</code>: counter indicating the current character of the input string</li>
<li><code>stackentries</code>: stack of return addresses and choice points</li>
<li><code>captures</code>: stack of capture objects</li>
</ul>
<p>The execution of bytecode manipulates the values of these variables in order to
produce some output. How that works and what that output looks like will be
explained now.</p>
<h1><a id="The_Bytecode_88"></a>The Bytecode</h1>
<p>For simplicity’s sake I will not go over every individual bytecode, but instead
choose some that exemplify the core concepts of the bytecode set.</p>
<h2><a id="generic_character_matching_bytecodes_93"></a>generic character matching bytecodes</h2>
<ul>
<li>
<p><code>any</code>: Checks if there are any characters left in the input string. If so,
it advances the index and PC by 1; if not, the bytecode fails.</p>
</li>
<li>
<p><code>char c</code>: Checks if there is another character in the input and if that
character is equal to <code>c</code>. If so, it advances the index and PC by 1; otherwise the bytecode fails.</p>
</li>
<li>
<p><code>set c1-c2</code>: The bytecode fails unless there is another character in the input
and that character is between c1 and c2 (inclusive).</p>
</li>
</ul>
<p>These bytecodes are the easiest to understand with very little impact on the
VM. What it means for a bytecode to fail will be explained when
we get to control flow bytecodes.</p>
<p>To get back to the example, the first half of the pattern <code>lpeg.P"ab"</code> could be
compiled to the following bytecodes:</p>
<pre><code>char a
char b
</code></pre>
<h2><a id="control_flow_bytecodes_117"></a>control flow bytecodes</h2>
<ul>
<li>
<p><code>jmp n</code>: Sets <code>PC</code> to <code>n</code>, effectively jumping to the n’th bytecode. Has no defined
failure case.</p>
</li>
<li>
<p><code>testchar c n</code>: This is a lookahead bytecode. If the current character is equal
to <code>c</code> it advances the <code>PC</code> but not the index. Otherwise it jumps to <code>n</code>.</p>
</li>
<li>
<p><code>call n</code>: Puts a return address (the current <code>PC + 1</code>) on the <code>stackentries</code> stack
and sets the <code>PC</code> to <code>n</code>. Has no defined failure case.</p>
</li>
<li>
<p><code>ret</code>: Opposite of <code>call</code>. Removes the top value of the <code>stackentries</code> stack (if
the string of bytecodes is valid this will always be a return address) and
sets the <code>PC</code> to the removed value. Has no defined failure case.</p>
</li>
<li>
<p><code>choice n</code>: Puts a choice point on the <code>stackentries</code> stack. Has no defined
failure case.</p>
</li>
<li>
<p><code>commit n</code>: Removes the top value of the <code>stackentries</code> stack (if the string of
bytecodes is valid this will always be a choice point) and jumps to <code>n</code>. Has no
defined failure case.</p>
</li>
</ul>
<p>Using <code>testchar</code> we can implement the full pattern <code>lpeg.P"ab"+lpeg.P"cd"</code> with
bytecode as follows:</p>
<pre><code>testchar a -> L1
any
char b
end
any
L1: char c
char d
end
</code></pre>
<p>The <code>any</code> bytecode is needed because <code>testchar</code> does not consume a character
from the input.</p>
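<p>The listing above can be traced with a very small interpreter. The following is only a sketch in plain Python, not the RPython VM from this project: the tuple encoding of the bytecodes and the name <code>run</code> are assumptions made for illustration, the label <code>L1</code> is resolved by hand to position 5, and a failure simply ends the match because choice points have not been introduced yet.</p>

```python
def run(program, text):
    """Interpret a tiny subset of the bytecodes.

    Returns the index after the match, or None if the match fails.
    The tuple encoding of instructions is an assumption of this sketch.
    """
    pc = 0      # program counter
    index = 0   # current position in the input
    while True:
        op = program[pc]
        name = op[0]
        if name == "any":
            if index >= len(text):
                return None          # no character left: fail
            index += 1
            pc += 1
        elif name == "char":
            if index < len(text) and text[index] == op[1]:
                index += 1
                pc += 1
            else:
                return None          # fail
        elif name == "testchar":
            # Lookahead only: the index is never advanced here.
            if index < len(text) and text[index] == op[1]:
                pc += 1
            else:
                pc = op[2]
        elif name == "jmp":
            pc = op[1]
        elif name == "end":
            return index
        else:
            raise ValueError("unknown bytecode: %r" % (name,))

# lpeg.P"ab" + lpeg.P"cd", with the label L1 resolved to position 5
program = [
    ("testchar", "a", 5),
    ("any",),
    ("char", "b"),
    ("end",),
    ("any",),
    ("char", "c"),
    ("char", "d"),
    ("end",),
]

print(run(program, "ab"))  # 2
print(run(program, "cd"))  # 2
print(run(program, "ad"))  # None
```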
<h2><a id="Failure_Handling_Backtracking_and_Choice_Points_158"></a>Failure Handling, Backtracking and Choice Points</h2>
<p>A choice point consists of the VM’s current <code>index</code> and <code>capturestack</code> as well as a
<code>PC</code>. This is not the VM’s <code>PC</code> at the time of creating the
choicepoint, but rather the <code>PC</code> where we should continue trying to find
matches when a failure occurs later.</p>
<p>Now that we have talked about choice points, we can talk about how the VM
behaves in the fail state. If the VM is in the fail state, it removes entries
from the stackentries stack until it finds a choice point. Then it backtracks
by restoring the VM to the state defined by the choice point. If no choice
point is found this way, no match was found in the string and the VM halts.</p>
<p>Using choice points we could implement the example <code>lpeg.P"ab" + lpeg.P"cd"</code> in
bytecodes in a different way (LPeg uses the simpler way shown above, but for
more complex patterns it can’t use the lookahead solution using <code>testchar</code>):</p>
<pre><code>choice L1
char a
char b
commit
end
L1: char c
char d
end
</code></pre>
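<p>The failure handling described above can be sketched in plain Python as follows. This is an illustration under assumptions, not the project's RPython code: the class <code>ChoicePoint</code>, the tuple encoding of the bytecodes and the function <code>run</code> are invented for the example, <code>L1</code> is resolved by hand to position 5, and <code>commit</code> is given an explicit target (the following <code>end</code>), which the listing above leaves implicit.</p>

```python
class ChoicePoint(object):
    """State to restore when backtracking (hypothetical layout)."""
    def __init__(self, pc, index, captures):
        self.pc = pc
        self.index = index
        self.captures = captures

def run(program, text):
    pc = 0
    index = 0
    captures = []
    stack = []      # return addresses (ints) and ChoicePoint objects
    fail = False
    while True:
        if fail:
            # Drop return addresses until a choice point turns up.
            while stack and not isinstance(stack[-1], ChoicePoint):
                stack.pop()
            if not stack:
                return None          # nothing to backtrack to: no match
            cp = stack.pop()
            pc, index, captures = cp.pc, cp.index, list(cp.captures)
            fail = False
            continue
        op = program[pc]
        name = op[0]
        if name == "char":
            if index < len(text) and text[index] == op[1]:
                index += 1
                pc += 1
            else:
                fail = True
        elif name == "choice":
            # Remember where to continue if a later bytecode fails.
            stack.append(ChoicePoint(op[1], index, list(captures)))
            pc += 1
        elif name == "commit":
            stack.pop()              # discard the choice point
            pc = op[1]
        elif name == "call":
            stack.append(pc + 1)
            pc = op[1]
        elif name == "ret":
            pc = stack.pop()
        elif name == "end":
            return index
        else:
            raise ValueError("unknown bytecode: %r" % (name,))

program = [
    ("choice", 5),
    ("char", "a"),
    ("char", "b"),
    ("commit", 4),
    ("end",),
    ("char", "c"),
    ("char", "d"),
    ("end",),
]

print(run(program, "ab"))  # 2
print(run(program, "cd"))  # 2 (after backtracking to position 5)
print(run(program, "ax"))  # None
```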
<h2><a id="Captures_188"></a>Captures</h2>
<p>Some patterns require the VM to produce more output than just “the pattern
matched” or “the pattern did not match”. Imagine searching a document for an
IPv4 address and all your program responded was “I found one”. In order to
receive additional information about our input string, captures are used.</p>
<h3><a id="The_capture_object_195"></a>The capture object</h3>
<p>In my VM, two types of capture objects are supported, one of them being the
position capture. It consists of a single index referencing the point in the
inputstring where the object was created.</p>
<p>The other type of capture object is called simplecapture. It consists of an
index and a size value, which are used to reference a substring of the
inputstring. In addition, simplecaptures have a variable status indicating they
are either open or full. If a simplecapture object is open, that means that its
size is not yet determined, since the pattern we are capturing is of variable
length.</p>
<p>Capture objects are created using the following bytecodes:</p>
<ul>
<li>
<p><code>Fullcapture Position</code>: Pushes a positioncapture object with the current index
value to the capture stack.</p>
</li>
<li>
<p><code>Fullcapture Simple n</code>: Pushes a simplecapture object with current index value
and size=n to the capture stack.</p>
</li>
<li>
<p><code>Opencapture Simple</code>: Pushes an open simplecapture object with current index
value and undetermined size to the capture stack.</p>
</li>
<li>
<p><code>closecapture</code>: Sets the top element of the capturestack to full and sets its
size value using the difference between the current index and the index of
the capture object.</p>
</li>
</ul>
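<p>As a rough illustration of the capture objects and of the open/close bytecodes listed above, here is a sketch in plain Python. The class and function names are invented for the example and the real RPython classes may look different. An open capture is modelled here as a size of <code>None</code>.</p>

```python
class PositionCapture(object):
    """Records a single position in the input string."""
    def __init__(self, index):
        self.index = index

class SimpleCapture(object):
    """References a substring of the input; size None means 'open'."""
    def __init__(self, index, size=None):
        self.index = index
        self.size = size

    def is_open(self):
        return self.size is None

    def extract(self, text):
        assert not self.is_open()
        return text[self.index:self.index + self.size]

def fullcapture_position(captures, index):
    captures.append(PositionCapture(index))

def opencapture_simple(captures, index):
    captures.append(SimpleCapture(index))

def closecapture(captures, index):
    # The size is the distance travelled since the capture was opened.
    top = captures[-1]
    top.size = index - top.index

text = "ip: 10.0.0.1"
captures = []
opencapture_simple(captures, 4)    # the VM reaches the start of the address
closecapture(captures, len(text))  # ... and later the end of the input
print(captures[0].extract(text))   # 10.0.0.1
```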
<h1><a id="The_RPython_Implementation_224"></a>The RPython Implementation</h1>
<p>These and many more bytecodes were implemented in an interpreter written in RPython.
By adding JIT hints, we were able to generate an efficient JIT.
We will now take a closer look at the implementations of some bytecodes.</p>
<pre><code class="language-python">...
<span class="hljs-keyword">elif</span> instruction.name == <span class="hljs-string">"any"</span>:
    <span class="hljs-keyword">if</span> index >= len(inputstring):
        fail = <span class="hljs-keyword">True</span>
    <span class="hljs-keyword">else</span>:
        pc += <span class="hljs-number">1</span>
        index += <span class="hljs-number">1</span>
...
</code></pre>
<p>The code for the <code>any</code>-bytecode is relatively straight-forward. It either
advances the <code>pc</code> and <code>index</code> or sets the VM into the fail state,
depending on whether the end of the inputstring has been reached or not.</p>
<pre><code class="language-python">...
<span class="hljs-keyword">if</span> instruction.name == <span class="hljs-string">"char"</span>:
    <span class="hljs-keyword">if</span> index >= len(inputstring):
        fail = <span class="hljs-keyword">True</span>
    <span class="hljs-keyword">elif</span> instruction.character == inputstring[index]:
        pc += <span class="hljs-number">1</span>
        index += <span class="hljs-number">1</span>
    <span class="hljs-keyword">else</span>:
        fail = <span class="hljs-keyword">True</span>
...
</code></pre>
<p>The <code>char</code>-bytecode also looks as one would expect. If the VM’s string index is
out of range or the character comparison fails, the VM is put into the
fail state, otherwise the <code>pc</code> and <code>index</code> are advanced by 1. As you can see, the
character that the current input character is compared against is stored in the
instruction object. Note that this code example has been simplified for
clarity: the actual implementation includes a JIT optimization that
allows the VM to execute multiple successive char-bytecodes at once.</p>
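<p>The post does not show how that optimization is implemented. Purely to illustrate the general idea of handling a run of char-bytecodes in one step, the following sketch fuses successive <code>char</code> instructions into one hypothetical <code>chars</code> instruction ahead of time. It assumes a straight-line program in which no jump targets a position inside such a run, and it should not be read as the mechanism the actual VM uses.</p>

```python
def fuse_chars(program):
    """Merge runs of ("char", c) into a single ("chars", "abc").

    Assumption of this sketch: the program contains no jumps, so
    instruction positions do not need to be renumbered.
    """
    fused = []
    for op in program:
        if op[0] == "char" and fused and fused[-1][0] == "chars":
            # Extend the run that is already being collected.
            fused[-1] = ("chars", fused[-1][1] + op[1])
        elif op[0] == "char":
            fused.append(("chars", op[1]))
        else:
            fused.append(op)
    return fused

def match_chars(text, index, expected):
    """Return the index after `expected`, or None if it does not follow."""
    if text.startswith(expected, index):
        return index + len(expected)
    return None

program = [("char", "a"), ("char", "b"), ("end",)]
print(fuse_chars(program))          # [('chars', 'ab'), ('end',)]
print(match_chars("abc", 0, "ab"))  # 2
```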
<pre><code class="language-python">...
<span class="hljs-keyword">elif</span> instruction.name == <span class="hljs-string">"jmp"</span>:
    pc = instruction.goto
...
</code></pre>
<p>The <code>jmp</code>-bytecode comes with a <code>goto</code> value which is a <code>pc</code> that we want
execution to continue at.</p>
<pre><code class="language-python">...
<span class="hljs-keyword">elif</span> instruction.name == <span class="hljs-string">"choice"</span>:
    pc += <span class="hljs-number">1</span>
    choice_points = choice_points.push_choice_point(
        instruction.goto, index, captures)
...
</code></pre>
<p>As we can see here, the <code>choice</code>-bytecode puts a choice point onto the stack that
may be backtracked to if the VM enters the fail state. This choice point
consists of a pc to jump to, which is determined by the bytecode, as well as
the <code>index</code> and <code>captures</code> values at the time the choice
point was created. An ongoing topic of JIT optimization is which data structure
is best suited to store choice points and return addresses. Besides naive
implementations of stacks and singly linked lists, more case-specific
structures are also being tested for performance.</p>
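<p>Because <code>push_choice_point</code> in the snippet above returns the new stack, one plausible reading is an immutable singly linked list. The following sketch shows what such a structure could look like. The class name, the sentinel convention and the <code>pop</code> method are assumptions for illustration, and the post does not say which structure was finally used.</p>

```python
class ChoicePoints(object):
    """One node of an immutable singly linked stack of choice points.

    A node whose `prev` is None acts as the empty stack (a sentinel).
    """
    def __init__(self, pc=-1, index=-1, captures=None, prev=None):
        self.pc = pc
        self.index = index
        self.captures = captures
        self.prev = prev

    def is_empty(self):
        return self.prev is None

    def push_choice_point(self, pc, index, captures):
        # Nothing is mutated: the old stack stays valid and is shared.
        return ChoicePoints(pc, index, captures, self)

    def pop(self):
        assert not self.is_empty()
        return self.prev

choice_points = ChoicePoints()
choice_points = choice_points.push_choice_point(5, 0, None)
top = choice_points
choice_points = choice_points.pop()
print(top.pc, top.index, choice_points.is_empty())
```

<p>Pushing and popping are both constant-time operations here, and no list ever has to be resized.</p>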
<h1><a id="Benchmarking_Result_299"></a>Benchmarking Result</h1>
<p>In order to find out how much it helps to JIT LPeg patterns we ran a small
number of benchmarks. We used an otherwise idle Intel Core i5-2430M CPU with
3072 KiB of cache and 8 GiB of RAM, running at 2.40 GHz. The machine was
running Ubuntu 14.04 LTS and Lua 5.2.3, and we used GNU grep 2.16 as a point of
comparison for one of the benchmarks. Each benchmark was run 100 times,
each time in a new process. We measured the full runtime of the called process,
including starting the process.</p>
<p>Now we will take a look at some plots generated by measuring the runtime of
different iterations of my JIT compared to Lua and using bootstrapping to
generate a sampling distribution of mean values. The plots contain a few different
variants of pypeg, but only the one called "fullops" is important for this blog post.</p>
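<p>For readers unfamiliar with bootstrapping, the following sketch shows one common way to obtain a sampling distribution of the mean from a list of measured runtimes. It is not the script used for the plots. The function names are made up, and the runtimes in the example are placeholder values, not measurements from this project.</p>

```python
import random
import subprocess
import time

def time_process(argv):
    """Wall-clock time of one run in a fresh process, startup included."""
    start = time.time()
    subprocess.check_call(argv)
    return time.time() - start

def bootstrap_means(samples, n_resamples=1000, seed=0):
    """Resample with replacement and return the sorted resample means."""
    rng = random.Random(seed)
    n = len(samples)
    means = []
    for _ in range(n_resamples):
        resample = [rng.choice(samples) for _ in range(n)]
        means.append(sum(resample) / float(n))
    means.sort()
    return means

# Placeholder runtimes in seconds, for illustration only.
runtimes = [0.52, 0.49, 0.55, 0.51, 0.50, 0.53, 0.48, 0.54]
means = bootstrap_means(runtimes)

# A simple 95% percentile interval for the mean.
lower = means[int(0.025 * len(means))]
upper = means[int(0.975 * len(means)) - 1]
print(lower, upper)
```

<p>In a real measurement, <code>runtimes</code> would be filled by calling <code>time_process</code> once per run with the command line of the benchmark.</p>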
<div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-Qv3aZapdMOk/XKXMDhTGujI/AAAAAAAAsNo/b7QShypeeV8mvePwTjPgmDSzUVB6EsiaACLcBGAs/s1600/rawplot_100_kb_urlinput.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-Qv3aZapdMOk/XKXMDhTGujI/AAAAAAAAsNo/b7QShypeeV8mvePwTjPgmDSzUVB6EsiaACLcBGAs/s400/rawplot_100_kb_urlinput.png" width="400" height="300" data-original-width="800" data-original-height="600" /></a></div>
<p>This is the plot for a search pattern that searches a text file for valid URLs.
As we can see, if the input file is as small as 100 kb, the benefits of JIT
optimizations do not outweigh the time required to generate the
machine code. As a result, all of our attempts perform significantly slower
than LPeg.</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-mTry3w1vSFA/XKXMNoaeHOI/AAAAAAAAsNs/YhdGWoGmyjU3yxqFgcePBklGv-qw13wXgCLcBGAs/s1600/rawplot_500_kb_urlinput.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-mTry3w1vSFA/XKXMNoaeHOI/AAAAAAAAsNs/YhdGWoGmyjU3yxqFgcePBklGv-qw13wXgCLcBGAs/s400/rawplot_500_kb_urlinput.png" width="400" height="300" data-original-width="800" data-original-height="600" /></a></div>
<p>This is the plot for the same search pattern on a larger input file. As we can
see, for input files as small as 500 kb our VM already outperforms LPeg’s. An
ongoing goal of continued development is to get this lower boundary as small as
possible.</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-Cr4BE9Cejg8/XKXMUXamP3I/AAAAAAAAsN0/t5PTo0Q4vPMLwL12bdQ93Q4bAMIjJTEVACLcBGAs/s1600/rawplot_5_mb_urlinput.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-Cr4BE9Cejg8/XKXMUXamP3I/AAAAAAAAsN0/t5PTo0Q4vPMLwL12bdQ93Q4bAMIjJTEVACLcBGAs/s400/rawplot_5_mb_urlinput.png" width="400" height="300" data-original-width="800" data-original-height="600" /></a></div>
<p>The benefits of a JIT compared to an Interpreter become more and more relevant
for larger input files. Searching a file as large as 5 MB makes this fairly
obvious and is exactly the behavior we expect.</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-uIoguDb7ApE/XKXMngYEnSI/AAAAAAAAsOA/zdv2WAfdRwwruS1yOdX7jFz0nB_PPQqRACLcBGAs/s1600/rawplot_50_kb_jsoninput.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-uIoguDb7ApE/XKXMngYEnSI/AAAAAAAAsOA/zdv2WAfdRwwruS1yOdX7jFz0nB_PPQqRACLcBGAs/s400/rawplot_50_kb_jsoninput.png" width="400" height="300" data-original-width="800" data-original-height="600" /></a></div>
<p>This time we are looking at a different, more complicated pattern: one that parses JSON, applied to a
50 kb input file. As expected, LPeg outperforms us. However, something
unexpected happens as we increase the file size.</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-r1-Aq39Oe9I/XKXMuQlcB6I/AAAAAAAAsOE/Eqmj7i3JKz0zdTK6Cd1ai11aZCf-EZkVwCLcBGAs/s1600/rawplot_100_kb_jsoninput.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-r1-Aq39Oe9I/XKXMuQlcB6I/AAAAAAAAsOE/Eqmj7i3JKz0zdTK6Cd1ai11aZCf-EZkVwCLcBGAs/s400/rawplot_100_kb_jsoninput.png" width="400" height="300" data-original-width="800" data-original-height="600" /></a></div>
<p>Since LPeg has a defined maximum depth of 400 for the stack of choice points and
return addresses, LPeg by default refuses to parse files as small as
100kb. This raises the question of whether LPeg was intended to be used for parsing.
Until a way to increase LPeg’s maximum stack depth is found, no comparisons to
LPeg can be performed at this scale. This has been a low priority in the past
but may be addressed in the future.</p>
<p>To conclude, we see that at sufficiently high filesizes, our JIT outperforms
the native LPeg-interpreter. This lower boundary is currently as low as 100kb
in filesize.</p>
<h1><a id="Conclusion_353"></a>Conclusion</h1>
<p>Writing a JIT for PEGs has proven itself to be a challenge worth pursuing, as
the expected benefits of a JIT compared to an interpreter have been achieved.
Future goals include getting LPeg to be able to use parsing patterns on larger
files, further increasing the performance of our JIT and comparing it to other
well-known programs serving a similar purpose, like grep.</p>
<p>The prototype implementation that I described in this post can be found
<a href="https://github.com/sktroost/PyPeg/tree/master/pypeg">on Github</a>
(it's a bit of a hack in some places, though).</p>
Carl Friedrich Bolz-Tereickhttp://www.blogger.com/profile/00518922641059511014noreply@blogger.com0tag:blogger.com,1999:blog-3971202189709462152.post-4513240880287929122019-03-24T20:14:00.000+01:002019-03-24T20:14:11.851+01:00PyPy v7.1 released; now uses utf-8 internally for unicode strings<div dir="ltr" style="text-align: left;" trbidi="on">
The PyPy team is proud to release version 7.1.0 of PyPy, which includes
two different interpreters:<br />
<blockquote>
<div>
<ul class="simple">
<li>PyPy2.7, which is an interpreter supporting the syntax and the features of
Python 2.7</li>
<li>PyPy3.6-beta: this is the second official release of PyPy to support 3.6
features, although it is still considered beta quality.</li>
</ul>
</div>
</blockquote>
The interpreters are based on much the same codebase, thus the double
release.<br />
<br />
This release, coming fast on the heels of 7.0 in February, finally merges the
internal refactoring of unicode representation as UTF-8. Removing the
conversions from strings to unicode internally led to a nice speed bump. We merged the utf-8 changes to the py3.5 branch (Python 3.5.3) but will concentrate on 3.6 going forward.<br />
<br />
We also improved the ability to use the buffer protocol with ctypes structures
and arrays.<br />
<br />
The <a class="reference external" href="http://cffi.readthedocs.io/">CFFI</a> backend has been updated to version 1.12.2. We recommend using CFFI
rather than c-extensions to interact with C, and <a class="reference external" href="https://cppyy.readthedocs.io/">cppyy</a> for interacting with
C++ code.<br />
You can download the v7.1 releases here:<br />
<blockquote>
<div>
<a class="reference external" href="http://pypy.org/download.html">http://pypy.org/download.html</a></div>
</blockquote>
We would like to thank our donors for the continued support of the PyPy
project. If PyPy is not quite good enough for your needs, we are available for
direct consulting work.<br />
<br />
We would also like to thank our contributors and encourage new people to join
the project. PyPy has many layers and we need help with all of them: <a class="reference external" href="http://doc.pypy.org/en/latest/index.html">PyPy</a>
and <a class="reference external" href="https://rpython.readthedocs.org/">RPython</a> documentation improvements, tweaking popular modules to run
on pypy, or general <a class="reference external" href="http://doc.pypy.org/en/latest/project-ideas.html">help</a> with making RPython’s JIT even better.<br />
<div class="section" id="what-is-pypy">
<h2 style="text-align: center;">
<span style="font-size: x-large;">What is PyPy?</span></h2>
PyPy is a very compliant Python interpreter, almost a drop-in replacement for
CPython 2.7 and 3.6. It’s fast (<a class="reference external" href="http://speed.pypy.org/">PyPy and CPython 2.7.x</a> performance
comparison) due to its integrated tracing JIT compiler.<br />
<br />
We also welcome developers of other <a class="reference external" href="http://rpython.readthedocs.io/en/latest/examples.html">dynamic languages</a> to see what RPython
can do for them.<br />
This PyPy release supports:<br />
<strong> </strong></div>
<div class="section" id="what-is-pypy">
<ul class="simple">
<li><strong>x86</strong> machines on most common operating systems
(Linux 32/64 bits, Mac OS X 64 bits, Windows 32 bits, OpenBSD, FreeBSD)</li>
<li>big- and little-endian variants of <strong>PPC64</strong> running Linux</li>
<li><b>ARM32</b>, although we do not supply downloadable binaries at this time</li>
<li><strong>s390x</strong> running Linux</li>
</ul>
<div class="section" id="changelog">
<h2 style="text-align: center;">
<span style="font-size: x-large;">What else is new?</span></h2>
PyPy 7.0 was released in February 2019.
There are many incremental improvements to RPython and PyPy. For more information see the <a class="reference external" href="http://doc.pypy.org/en/latest/release-v7.1.0.html#changelog">changelog</a>.<br />
<br />
Please update, and continue to help us make PyPy better.<br />
<br />
<br />
Cheers, The PyPy team
</div>
<br />
</div>
</div>
mattiphttp://www.blogger.com/profile/07336549270776418081noreply@blogger.com9tag:blogger.com,1999:blog-3971202189709462152.post-6068753333561560762019-02-11T11:55:00.001+01:002019-02-11T11:56:02.177+01:00PyPy v7.0.0: triple release of 2.7, 3.5 and 3.6-alpha<style type="text/css">
</style>
<br />
<div class="document" id="pypy-v7-0-0-triple-release-of-2-7-3-5-and-3-6-alpha">
The PyPy team is proud to release the version 7.0.0 of PyPy, which includes
three different interpreters:<br />
<blockquote>
<ul class="simple">
<li>PyPy2.7, which is an interpreter supporting the syntax and the features of
Python 2.7</li>
<li>PyPy3.5, which supports Python 3.5</li>
<li>PyPy3.6-alpha: this is the first official release of PyPy to support 3.6
features, although it is still considered alpha quality.</li>
</ul>
</blockquote>
All the interpreters are based on much the same codebase, thus the triple
release.<br />
Until we can work with downstream providers to distribute builds with PyPy, we
have made some common packages <a class="reference external" href="https://github.com/antocuni/pypy-wheels">available as wheels</a>.<br />
The <a class="reference external" href="http://doc.pypy.org/en/latest/gc_info.html#semi-manual-gc-management">GC hooks</a>, which can be used to gain more insights into the GC's
performance, have been improved, and it is now possible to manually manage the
GC by using a combination of <tt class="docutils literal">gc.disable</tt> and <tt class="docutils literal">gc.collect_step</tt>. See the
<a class="reference external" href="https://morepypy.blogspot.com/2019/01/pypy-for-low-latency-systems.html">GC blog post</a>.<br />
We updated the <a class="reference external" href="http://cffi.readthedocs.io/">cffi</a> module included in PyPy to version 1.12, and the
<a class="reference external" href="https://cppyy.readthedocs.io/">cppyy</a> backend to 1.4. Please use these to wrap your C and C++ code,
respectively, for a JIT friendly experience.<br />
As always, this release is 100% compatible with the previous one and fixes
several issues and bugs raised by the growing community of PyPy users.
We strongly recommend updating.<br />
The PyPy3.6 release and the Windows PyPy3.5 release are still not production
quality so your mileage may vary. There are open issues with incomplete
compatibility and c-extension support.<br />
The utf8 branch that changes internal representation of unicode to utf8 did not
make it into the release, so there is still more goodness coming.<br />
You can download the v7.0 releases here:<br />
<blockquote>
<a class="reference external" href="http://pypy.org/download.html">http://pypy.org/download.html</a></blockquote>
We would like to thank our donors for the continued support of the PyPy
project. If PyPy is not quite good enough for your needs, we are available for
direct consulting work.<br />
We would also like to thank our contributors and encourage new people to join
the project. PyPy has many layers and we need help with all of them: <a class="reference external" href="https://www.blogger.com/index.html">PyPy</a>
and <a class="reference external" href="https://rpython.readthedocs.org/">RPython</a> documentation improvements, tweaking popular modules to run
on pypy, or general <a class="reference external" href="https://www.blogger.com/project-ideas.html">help</a> with making RPython's JIT even better.<br />
<div class="section" id="what-is-pypy">
<h1>
What is PyPy?</h1>
PyPy is a very compliant Python interpreter, almost a drop-in replacement for
CPython 2.7, 3.5 and 3.6. It's fast (<a class="reference external" href="http://speed.pypy.org/">PyPy and CPython 2.7.x</a> performance
comparison) due to its integrated tracing JIT compiler.<br />
We also welcome developers of other <a class="reference external" href="http://rpython.readthedocs.io/en/latest/examples.html">dynamic languages</a> to see what RPython
can do for them.<br />
The PyPy release supports:<br />
<blockquote>
<ul class="simple">
<li><strong>x86</strong> machines on most common operating systems
(Linux 32/64 bits, Mac OS X 64 bits, Windows 32 bits, OpenBSD, FreeBSD)</li>
<li>big- and little-endian variants of <strong>PPC64</strong> running Linux,</li>
<li><strong>s390x</strong> running Linux</li>
</ul>
</blockquote>
Unfortunately at the moment of writing our ARM buildbots are out of service,
so for now we are <strong>not</strong> releasing any binary for the ARM architecture.</div>
<div class="section" id="changelog">
<h1>
What else is new?</h1>
PyPy 6.0 was released in April, 2018.
There are many incremental improvements to RPython and PyPy, the complete listing is <a class="reference external" href="http://doc.pypy.org/en/latest/release-v7.0.0.html">here</a>.<br />
<br />
Please update, and continue to help us make PyPy better.<br />
<br />
<br />
Cheers, The PyPy team
</div>
</div>
Antonio Cunihttp://www.blogger.com/profile/17017456817083804792noreply@blogger.com4tag:blogger.com,1999:blog-3971202189709462152.post-61076236549163139052019-02-09T17:13:00.000+01:002019-02-09T17:13:16.887+01:00Düsseldorf Sprint Report 2019<p>Hello everyone!</p>
<p>We are happy to report a successful and well attended sprint that is wrapping up
in Düsseldorf, Germany. In the last week we had eighteen people sprinting
at the Heinrich-Heine-Universität Düsseldorf on various topics.</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-ZSjAODMWBmc/XF77I34X4TI/AAAAAAAAqgw/lc1uNqH5a30efONmAGKH8-wikbX0R47NwCLcBGAs/s1600/DypAt1VXcAA3HwD.jpeg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-ZSjAODMWBmc/XF77I34X4TI/AAAAAAAAqgw/lc1uNqH5a30efONmAGKH8-wikbX0R47NwCLcBGAs/s400/DypAt1VXcAA3HwD.jpeg" width="400" height="300" data-original-width="1600" data-original-height="1200" /></a><p><em>Totally serious work going on here constantly.</em></p></div>
<p>A big
chunk of the sprint was dedicated to various discussions, since we had not
managed to gather the core developers in one room for quite a while.
Discussion topics included:</p>
<ul class="simple">
<li>Funding and general sustainability of open source.</li>
<li>Catching up with CPython 3.7/3.8 – we are planning to release 3.6 some time
in the next few months and we will continue working on 3.7/3.8.</li>
<li>What to do with VMprof</li>
<li>How can we support Cython inside PyPy in a way that will be understood
by the JIT, hence fast.</li>
<li>The future of supporting the numeric stack on pypy – we have made significant
progress in the past few years and most of the numeric stack works out of the box,
but deployment and performance remain problems. Improving on those problems
remains a very important focus for PyPy as a project.</li>
<li>Taking advantage of the presence of a CPython developer (Łukasz Langa) and a Graal Python developer
(Tim Felgentreff), we discussed ways to collaborate in order to improve the Python
ecosystem across implementations.</li>
<li>Pierre-Yves David and Georges Racinet from octobus gave us an exciting demo
on <a class="reference external" href="https://heptapod.net">Heptapod</a>, which adds mercurial support to gitlab.</li>
<li>Maciej and Armin gave demos of their current (non-PyPy-related) project <a href="https://vrsketch.eu/">VRSketch</a>.</li>
</ul>
<div class="separator" style="clear: both; text-align: center;"><a href="https://2.bp.blogspot.com/-v0NI3qhGGo8/XF77mi713yI/AAAAAAAAqg4/pFvMbwjYn3MzUrnlazn_NyHdJv3cIpoDgCLcBGAs/s1600/DSC06342.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-v0NI3qhGGo8/XF77mi713yI/AAAAAAAAqg4/pFvMbwjYn3MzUrnlazn_NyHdJv3cIpoDgCLcBGAs/s400/DSC06342.JPG" width="400" height="267" data-original-width="1600" data-original-height="1067" /></a>
<p><em>Visiting the <a href="https://en.wikipedia.org/wiki/Landschaftspark_Duisburg-Nord">Landschaftspark Duisburg Nord</a> on the break day</em></p></div>
<p>Some highlights of the coding tasks worked on:</p>
<ul class="simple">
<li>Aarch64 (ARM64) JIT backend work has been started; we are able to run the first
test! Tobias Oberstein from Crossbar GmbH and Rodolph Perfetta from ARM joined the
sprint to help kickstart the project.</li>
<li>The long running math-improvements branch that was started by Stian Andreassen got merged
after bugfixes done by Alexander Schremmer. It should improve operations on large integers.</li>
<li>The arcane art of necromancy was used to revive the long-dormant regalloc branch started
and nearly finished by Carl Friedrich Bolz-Tereick. The branch got merged and gives
some modest speedups across the board.</li>
<li>Andrew Lawrence worked on an MSI installer for PyPy on Windows.</li>
<li>Łukasz worked on improving failing tests on the PyPy 3.6 branch. He knows very obscure
details of CPython (e.g. how pickling works), hence we managed to progress very quickly.</li>
<li>Matti Picus set up a new benchmarking server for PyPy 3 branches.</li>
<li>The Utf8 branch, which changes the internal representation of unicode, might finally be
merged at some point very soon. We discussed and improved upon the last few
blockers. It gives significant speedups in a lot of cases handling strings.</li>
<li>Zlib was missing a couple of methods, which were added by Ronan Lamy and Julian Berman.</li>
<li>Manuel Jacob fixed RevDB failures.</li>
<li>Antonio Cuni and Matti Picus worked on the 7.0 release, which should happen in a few days.</li>
</ul>
<p>Now we are all quite exhausted, and are looking forward to catching up on sleep.</p>
<p>Best regards,
Maciej Fijałkowski, Carl Friedrich Bolz-Tereick and the whole PyPy team.</p>
Carl Friedrich Bolz-Tereickhttp://www.blogger.com/profile/00518922641059511014noreply@blogger.com4tag:blogger.com,1999:blog-3971202189709462152.post-6131653933014019652019-01-03T15:21:00.001+01:002019-01-03T15:23:22.387+01:00PyPy for low-latency systems<h1 class="title">
PyPy for low-latency systems</h1>
Recently I have merged the gc-disable branch, introducing a couple of features
which are useful when you need to respond to certain events with the lowest
possible latency. This work has been kindly sponsored by <a class="reference external" href="https://www.gambitresearch.com/">Gambit Research</a>
(which, by the way, is a very cool and geeky place to <a class="reference external" href="https://www.gambitresearch.com/jobs.html">work</a>, in case you
are interested). Note also that this is a very specialized use case, so these
features might not be useful for the average PyPy user, unless you have the
same problems as described here.<br />
<br />
The PyPy VM manages memory using a generational, moving Garbage Collector.
Periodically, the GC scans the whole heap to find unreachable objects and
frees the corresponding memory. Although at first glance this strategy might
sound expensive, in practice the total cost of memory management is far less
than e.g. on CPython, which is based on reference counting. While maybe
counter-intuitive, the main advantage of a non-refcount strategy is
that allocation is very fast (especially compared to malloc-based allocators),
and deallocation of objects which die young is basically for free. More
information about the PyPy GC is available <a class="reference external" href="https://pypy.readthedocs.io/en/latest/gc_info.html#incminimark">here</a>.<br />
<br />
As we said, the total cost of memory management is less on PyPy than on
CPython, and it's one of the reasons why PyPy is so fast. However, one big
disadvantage is that while on CPython the cost of memory management is spread
all over the execution of the program, on PyPy it is concentrated into GC
runs, causing observable pauses which interrupt the execution of the user
program.<br />
To avoid excessively long pauses, the PyPy GC has been using an <a class="reference external" href="https://morepypy.blogspot.com/2013/10/incremental-garbage-collector-in-pypy.html">incremental
strategy</a> since 2013. The GC runs as a series of "steps", letting the user
program progress between steps.<br />
<br />
The following chart shows the behavior of a real-world, long-running process:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://3.bp.blogspot.com/-44yKwUVK3BE/XC4X9XL4BII/AAAAAAAABbE/XdTCIoyA-eYxvxIgJhFHaKnzxjhoWStHQCEwYBhgL/s1600/gc-timing.png" imageanchor="1" style="margin-right: 1em;"><img border="0" data-original-height="620" data-original-width="1600" height="246" src="https://3.bp.blogspot.com/-44yKwUVK3BE/XC4X9XL4BII/AAAAAAAABbE/XdTCIoyA-eYxvxIgJhFHaKnzxjhoWStHQCEwYBhgL/s640/gc-timing.png" width="640" /></a></div>
<br />
<br />
The orange line shows the total memory used by the program, which
increases linearly while the program progresses. Every ~5 minutes, the GC
kicks in and the memory usage drops from ~5.2GB to ~2.8GB (this ratio is controlled
by the <a class="reference external" href="https://pypy.readthedocs.io/en/latest/gc_info.html#environment-variables">PYPY_GC_MAJOR_COLLECT</a> env variable).<br />
The purple line shows aggregated data about the GC timing: the whole
collection takes ~1400 individual steps over the course of ~1 minute: each
point represents the <strong>maximum</strong> time a single step took during the past 10
seconds. Most steps take ~10-20 ms, although we see a horrible peak of ~100 ms
towards the end. We have not yet investigated what causes it, but we
suspect it is related to the deallocation of raw objects.<br />
<br />
These multi-millisecond pauses are a problem for systems where it is important
to respond to certain events with a latency which is both low and consistent.
If the GC kicks in at the wrong time, it might cause unacceptable pauses during
the collection cycle.<br />
<br />
Let's look again at our real-world example. This is a system which
continuously monitors an external stream; when a certain event occurs, we want
to take an action. The following chart shows the maximum time it takes to
complete one such action, aggregated every minute:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://4.bp.blogspot.com/-FO9uFHSqZzU/XC4YC8LZUpI/AAAAAAAABa8/B8ZOrEgbVJUHoO65wxvCMVpvciO_d_0TwCLcBGAs/s1600/normal-max.png" imageanchor="1" style="margin-right: 1em;"><img border="0" data-original-height="604" data-original-width="1600" height="240" src="https://4.bp.blogspot.com/-FO9uFHSqZzU/XC4YC8LZUpI/AAAAAAAABa8/B8ZOrEgbVJUHoO65wxvCMVpvciO_d_0TwCLcBGAs/s640/normal-max.png" width="640" /></a></div>
<br />
You can clearly see that the baseline response time is around ~20-30
ms. However, we can also see periodic spikes around ~50-100 ms, with peaks up
to ~350-450 ms! After a bit of investigation, we concluded that most (although
not all) of the spikes were caused by the GC kicking in at the wrong time.<br />
<br />
The work I did in the <tt class="docutils literal"><span class="pre">gc-disable</span></tt> branch aims to fix this problem by
introducing <a class="reference external" href="https://pypy.readthedocs.io/en/latest/gc_info.html#semi-manual-gc-management">two new features</a> to the <tt class="docutils literal">gc</tt> module:<br />
<blockquote>
<ul class="simple">
<li><tt class="docutils literal">gc.disable()</tt>, which previously only inhibited the execution of
finalizers without actually touching the GC, now disables the GC major
collections. After a call to it, you will see the memory usage grow
indefinitely.</li>
<li><tt class="docutils literal">gc.collect_step()</tt> is a new function which you can use to manually
execute a single incremental GC collection step.</li>
</ul>
</blockquote>
It is worth noting that <tt class="docutils literal">gc.disable()</tt> disables <strong>only</strong> the major
collections, while minor collections still run. Moreover, thanks to the
JIT's virtuals, many objects with a short and predictable lifetime are not
allocated at all. The end result is that most objects with short lifetime are
still collected as usual, so the impact of <tt class="docutils literal">gc.disable()</tt> on memory growth
is not as bad as it could sound.<br />
<br />
Combining these two functions, it is possible to take control of the GC to
make sure it runs only when it is acceptable to do so. For an example of
usage, you can look at the implementation of a <a class="reference external" href="https://bitbucket.org/antocuni/pypytools/src/0273afc3e8bedf0eb1ef630c3bc69e8d9dd661fe/pypytools/gc/custom.py?at=default&fileviewer=file-view-default">custom GC</a> inside <a class="reference external" href="https://pypi.org/project/pypytools/">pypytools</a>.
The peculiarity is that it also defines a "<tt class="docutils literal">with <span class="pre">nogc():"</span></tt> context manager
which you can use to mark performance-critical sections where the GC is not
allowed to run.<br />
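<br />
As a rough illustration of how the two functions can be combined, here is a minimal sketch. It is <strong>not</strong> the pypytools implementation: the names <tt class="docutils literal">nogc</tt> and <tt class="docutils literal">run_gc_steps</tt> are defined here purely for the example, and the sketch assumes that <tt class="docutils literal">gc.collect_step()</tt> returns a stats object with a <tt class="docutils literal">major_is_done</tt> attribute, as described in the PyPy GC documentation. On interpreters without <tt class="docutils literal">gc.collect_step</tt> it falls back to a plain <tt class="docutils literal">gc.collect()</tt>.<br />

```python
import gc
from contextlib import contextmanager


@contextmanager
def nogc():
    """Hypothetical helper: keep major collections off inside the block."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Restore the previous state instead of enabling unconditionally.
        if was_enabled:
            gc.enable()


def run_gc_steps(max_steps):
    """Run at most max_steps incremental GC steps; return how many ran.

    Meant to be called at a moment when a short pause is acceptable.
    """
    collect_step = getattr(gc, "collect_step", None)
    if collect_step is None:
        # Assumption: not running on a PyPy with collect_step support,
        # so do one ordinary full collection instead.
        gc.collect()
        return 0
    steps = 0
    for _ in range(max_steps):
        stats = collect_step()
        steps += 1
        if stats.major_is_done:  # the major collection cycle has finished
            break
    return steps


with nogc():
    # Latency-critical work goes here; no major collection should start.
    total = sum(range(1000))

# Outside the critical section, pay for a bounded amount of GC work.
run_gc_steps(10)
```

Capping the number of steps per call keeps each pause bounded, at the price of memory growing until enough steps have been run.<br />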
<br />
The following chart compares the behavior of the default PyPy GC and the new
custom GC, after a careful placing of <tt class="docutils literal">nogc()</tt> sections:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-bGqs0WrOEBk/XC4YJN0uZfI/AAAAAAAABbA/4EXOASvy830IKBoTFtrnmY22Vyd_api-ACLcBGAs/s1600/nogc-max.png" imageanchor="1" style="margin-right: 1em;"><img border="0" data-original-height="606" data-original-width="1600" height="242" src="https://1.bp.blogspot.com/-bGqs0WrOEBk/XC4YJN0uZfI/AAAAAAAABbA/4EXOASvy830IKBoTFtrnmY22Vyd_api-ACLcBGAs/s640/nogc-max.png" width="640" /></a></div>
<br />
The yellow line is the same as before, while the purple line shows the new
system: almost all spikes have gone, and the baseline performance is about 10%
better. There is still one spike towards the end, but after some investigation
we concluded that it was <strong>not</strong> caused by the GC.<br />
<br />
Note that this does <strong>not</strong> mean that the whole program became magically
faster: we simply moved the GC pauses to some other place which is <strong>not</strong>
shown in the graph. In this specific use case this technique was useful
because it allowed us to shift the GC work to places where pauses are more
acceptable.<br />
<br />
All in all, a pretty big success, I think. These features are already
available in the nightly builds of PyPy, and will be included in the next
release: take this as a New Year present :)<br />
<br />
Antonio Cuni and the PyPy teamAntonio Cunihttp://www.blogger.com/profile/17017456817083804792noreply@blogger.com3tag:blogger.com,1999:blog-3971202189709462152.post-71991104984515740742018-12-17T12:40:00.000+01:002019-02-04T10:38:42.919+01:00PyPy Winter Sprint Feb 4-9 in Düsseldorf<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: left;">
</div>
<h2 style="text-align: center;">
PyPy Sprint February 4th-9th 2019 in Düsseldorf</h2>
<div style="text-align: left;">
<br />
The next PyPy sprint will be held in the Computer Science department of Heinrich-Heine Universität Düsseldorf from the 4th to the 9th of February 2019 (nine years after the <a href="https://morepypy.blogspot.com/2010/10/dusseldorf-sprint-report-2010.html">last sprint there).</a> This is a fully public sprint, everyone is welcome to join us.</div>
<h3 style="text-align: center;">
Topics and goals</h3>
<div style="text-align: left;">
</div>
<ul style="text-align: left;">
<li>improve Python 3.6 support</li>
<li>discuss benchmarking situation</li>
<li>progress on utf-8 branches</li>
<li>cpyext performance and completeness</li>
<li>packaging: are we ready to upload to PyPI?</li>
<ul>
<li><a href="https://bitbucket.org/pypy/pypy/issues/2617">issue 2617</a> - we expose too many functions from lib-pypy.so</li>
<li><a href="https://github.com/pypa/manylinux/issues/179">manylinux2010</a> - will it solve our build issues?</li>
<li>formulate an ABI name and upgrade policy</li>
</ul>
</ul>
<ul style="text-align: left;">
<li><a href="https://bitbucket.org/pypy/pypy/issues/2930">memoryview(ctypes.Structure)</a> does not create the correct format string</li>
<li>discussing the state and future of PyPy and the wider Python ecosystem</li>
</ul>
<div style="text-align: left;">
</div>
<h3 style="text-align: center;">
Location</h3>
<div style="text-align: left;">
The sprint will take place in seminar room 25.12.02.55 of the computer science department. It is in the building 25.12 of the university campus, second floor. <a href="https://www.cs.hhu.de/en/research-groups/software-engineering-and-programming-languages/service-pages/contact/location-and-how-to-get-here.html">Travel instructions</a><br />
</div>
<h3 style="text-align: center;">
Exact times</h3>
<div style="text-align: left;">
Work days: starting February 4th (10:00), ending February 9th (~afternoon). The break day will probably be Thursday.</div>
<h3 style="text-align: center;">
Registration</h3>
<div style="text-align: left;">
<br />
Please register via Mercurial:<br />
https://bitbucket.org/pypy/extradoc/</div>
<div style="text-align: left;">
<a href="https://bitbucket.org/pypy/extradoc/src/extradoc/sprintinfo/ddorf2019/people.txt">https://bitbucket.org/pypy/extradoc/src/extradoc/sprintinfo/ddorf2019/people.txt</a><br />
<br />
or on the pypy-dev mailing list if you do not yet have check-in rights:<br />
</div>
<div style="text-align: left;">
<a href="http://mail.python.org/mailman/listinfo/pypy-dev">http://mail.python.org/mailman/listinfo/pypy-dev</a></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
Looking forward to seeing everyone there!</div>
</div>
mattiphttp://www.blogger.com/profile/07336549270776418081noreply@blogger.com2tag:blogger.com,1999:blog-3971202189709462152.post-53365579467985830632018-11-29T13:09:00.000+01:002018-11-29T20:50:18.964+01:00Funding for 64-bit Armv8-a support in PyPy<div dir="ltr" style="text-align: left;" trbidi="on">
<p>Hello everyone</p>
<p>At PyPy we are trying to support a relatively wide range of platforms. On the software side, we have PyPy working on OS X, Windows and various flavors of Linux (and unofficially various flavors of BSD); on the hardware side, we support x86, x86_64, PPC, 32-bit Arm (v7) and even zarch. This is harder than for other projects, since PyPy emits assembler on the fly from the just-in-time compiler, and porting it to a new platform requires a significant amount of work.</p>
<p>We are pleased to announce that <a href="https://www.arm.com/">Arm Limited</a>, together with <a href="https://crossbario.com/">Crossbar.io GmbH</a>, are sponsoring the development of 64-bit Armv8-a architecture support through <a href="https://baroquesoftware.com">Baroque Software OU</a>, which would allow PyPy to run on a new variety of low-power, high-density servers with that architecture. We believe this will be beneficial for the funders, for the PyPy project and for the wider community.</p>
<p>The work will commence soon and will be done some time early next year with expected speedups either comparable to x86 speedups or, if our <a href="https://morepypy.blogspot.com/2013/05/pypy-20-alpha-for-arm.html">current experience with ARM holds</a>, more significant than x86 speedups.</p>
<p>Best,<br>
Maciej Fijalkowski and the PyPy team</p>
<br /></div>
Maciej Fijalkowskihttp://www.blogger.com/profile/11410841070239382771noreply@blogger.com2tag:blogger.com,1999:blog-3971202189709462152.post-62714835146750068462018-11-15T09:06:00.001+01:002018-11-15T09:06:29.125+01:00Guest Post: Implementing a Calculator REPL in RPython
<p>This is a tutorial style post that walks through using the RPython translation
toolchain to create a REPL that executes basic math expressions. </p>
<p>We will do that by scanning the user's input into tokens, compiling those
tokens into bytecode and running that bytecode in our own virtual machine. Don't
worry if that sounds horribly complicated, we are going to explain it step by
step. </p>
<p>This post is a bit of a diversion while on my journey to create a compliant
<a href="http://www.craftinginterpreters.com/the-lox-language.html">lox</a> implementation
using the <a href="https://rpython.readthedocs.io">RPython translation toolchain</a>. The
majority of this work is a direct RPython translation of the low level C
guide from Bob Nystrom (<a href="https://twitter.com/munificentbob">@munificentbob</a>) in the
excellent book <a href="https://www.craftinginterpreters.com">craftinginterpreters.com</a>,
specifically chapters 14 – 17.</p>
<h2 id="theroadahead">The road ahead</h2>
<p>As this post is rather long I'll break it into a few major sections. In each section we will
have something that translates with RPython, and at the end it all comes together. </p>
<ul>
<li><a href="#arepl">REPL</a></li>
<li><a href="#avirtualmachine">Virtual Machine</a></li>
<li><a href="#scanningthesource">Scanning the source</a></li>
<li><a href="#compilingexpressions">Compiling Expressions</a></li>
<li><a href="#endtoend">End to end</a></li>
</ul>
<h2 id="arepl">A REPL</h2>
<p>So if you're a Python programmer you might be thinking this is pretty trivial, right?</p>
<p>I mean, if we ignore input errors, injection attacks etc., couldn't we just do something
like this:</p>
<pre><code class="python">"""
A pure python REPL that can parse simple math expressions
"""
while True:
    print(eval(raw_input("> ")))
</code></pre>
<p>Well it does appear to do the trick:</p>
<pre><code class="nohighlight">$ python2 section-1-repl/main.py
> 3 + 4 * ((1.0/(2 * 3 * 4)) + (1.0/(4 * 5 * 6)) - (1.0/(6 * 7 * 8)))
3.1880952381
</code></pre>
<p>So can we just ask RPython to translate this into a binary that runs magically
faster?</p>
<p>Let's see what happens. We need to add two functions for RPython to
get its bearings (<code>entry_point</code> and <code>target</code>) and call the file <code>targetXXX</code>:</p>
<p><a href="https://github.com/hardbyte/rpython-post/blob/master/section-1-repl/targetrepl1.py"><code>targetrepl1.py</code></a></p>
<pre><code class="python language-python">def repl():
    while True:
        print eval(raw_input('> '))

def entry_point(argv):
    repl()
    return 0

def target(driver, *args):
    return entry_point, None
</code></pre>
<p>Which at translation time gives us this admonishment that accurately tells us
we are trying to call a Python built-in <code>raw_input</code> that is unfortunately not
valid RPython.</p>
<pre><code class="nohighlight">$ rpython ./section-1-repl/targetrepl1.py
...SNIP...
[translation:ERROR] AnnotatorError:
object with a __call__ is not RPython: <built-in function raw_input>
Processing block:
block@18 is a <class 'rpython.flowspace.flowcontext.SpamBlock'>
in (target1:2)repl
containing the following operations:
v0 = simple_call((builtin_function raw_input), ('> '))
v1 = simple_call((builtin_function eval), v0)
v2 = str(v1)
v3 = simple_call((function rpython_print_item), v2)
v4 = simple_call((function rpython_print_newline))
</code></pre>
<p>Ok so we can't use <code>raw_input</code> or <code>eval</code> but that doesn't faze us. Let's get
the input from a stdin stream and just print it out (no evaluation).</p>
<p><a href="https://github.com/hardbyte/rpython-post/blob/master/section-1-repl/targetrepl2.py"><code>targetrepl2.py</code></a></p>
<pre><code class="python language-python">from rpython.rlib import rfile

LINE_BUFFER_LENGTH = 1024

def repl(stdin):
    while True:
        print "> ",
        line = stdin.readline(LINE_BUFFER_LENGTH)
        print line

def entry_point(argv):
    stdin, stdout, stderr = rfile.create_stdio()
    try:
        repl(stdin)
    except:
        return 0

def target(driver, *args):
    return entry_point, None
</code></pre>
<p>Translate <code>targetrepl2.py</code> – we can add an optimization level if we
are so inclined:</p>
<pre><code class="nohighlight">$ rpython --opt=2 section-1-repl/targetrepl2.py
...SNIP...
[Timer] Timings:
[Timer] annotate --- 1.2 s
[Timer] rtype_lltype --- 0.9 s
[Timer] backendopt_lltype --- 0.6 s
[Timer] stackcheckinsertion_lltype --- 0.0 s
[Timer] database_c --- 15.0 s
[Timer] source_c --- 1.6 s
[Timer] compile_c --- 1.9 s
[Timer] =========================================
[Timer] Total: --- 21.2 s
</code></pre>
<p>No errors!? Let's try it out:</p>
<pre><code class="nohighlight">$ ./target2-c
1 + 2
> 1 + 2
^C
</code></pre>
<p>Ahh our first success – let's quickly deal with the flushing fail by using the
stdout stream directly as well. Let's print out the input in quotes:</p>
<pre><code class="python language-python">from rpython.rlib import rfile

LINE_BUFFER_LENGTH = 1024

def repl(stdin, stdout):
    while True:
        stdout.write("> ")
        line = stdin.readline(LINE_BUFFER_LENGTH)
        print '"%s"' % line.strip()

def entry_point(argv):
    stdin, stdout, stderr = rfile.create_stdio()
    try:
        repl(stdin, stdout)
    except:
        pass
    return 0

def target(driver, *args):
    return entry_point, None
</code></pre>
<p>Translation works, and the test run too:</p>
<pre><code class="nohighlight">$ ./target3-c
> hello this seems better
"hello this seems better"
> ^C
</code></pre>
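<p>One thing the loop above does not handle is the end of input: when the stream is exhausted,
<code>readline</code> returns an empty string and the loop keeps spinning. Here is a minimal sketch of the
same loop with an end-of-input check, written in plain Python so it can be tried without translating.
The <code>io.StringIO</code> objects merely stand in for the <code>rfile</code> streams, and writing the echoed line through
<code>stdout.write</code> instead of <code>print</code> is a choice made for this example, not something taken from the repository.</p>

```python
import io

LINE_BUFFER_LENGTH = 1024


def repl(stdin, stdout):
    while True:
        stdout.write("> ")
        line = stdin.readline(LINE_BUFFER_LENGTH)
        if not line:
            # An empty string (not even a newline) signals end of input.
            break
        stdout.write('"%s"\n' % line.strip())


# Stand-ins for the real stdin/stdout streams.
fake_stdin = io.StringIO("hello this seems better\n1 + 2\n")
fake_stdout = io.StringIO()
repl(fake_stdin, fake_stdout)
print(fake_stdout.getvalue())
```

<p>With this check in place, piping a file into the REPL terminates cleanly instead of needing <code>^C</code>.</p>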
<p>So we are in a good place with taking user input and printing output... What about
the whole math evaluation thing we were promised? For that we can probably leave
our RPython REPL behind for a while and connect it up at the end.</p>
<h2 id="avirtualmachine">A virtual machine</h2>
<p>A virtual machine is the execution engine of our basic math interpreter. It will be very simple,
only able to do simple tasks like addition. I won't go into any depth to describe why we want
a virtual machine, but it is worth noting that many languages including Java and Python make
this decision to compile to an intermediate bytecode representation and then execute that with
a virtual machine. Alternatives are compiling directly to native machine code like (earlier versions of) the V8
JavaScript engine, or at the other end of the spectrum executing an abstract syntax tree –
which is what the <a href="https://blog.plan99.net/graal-truffle-134d8f28fb69">Truffle approach to building VMs</a> is based on. </p>
<p>We are going to keep things very simple. We will have a stack where we can push and pop values,
we will only support floats, and our VM will only implement a few very basic operations.</p>
<h3 id="opcodes">OpCodes</h3>
<p>In fact our entire instruction set is:</p>
<pre><code class="nohighlight">OP_CONSTANT
OP_RETURN
OP_NEGATE
OP_ADD
OP_SUBTRACT
OP_MULTIPLY
OP_DIVIDE
</code></pre>
<p>Since we are targeting RPython we can't use the nice <code>enum</code> module from the Python standard
library, so instead we just define a simple class with class attributes.</p>
<p>We should start to get organized, so we will create a new file
<a href="https://github.com/hardbyte/rpython-post/blob/master/section-2-vm/opcodes.py"><code>opcodes.py</code></a> and add this:</p>
<pre><code class="python language-python">class OpCode:
    OP_CONSTANT = 0
    OP_RETURN = 1
    OP_NEGATE = 2
    OP_ADD = 3
    OP_SUBTRACT = 4
    OP_MULTIPLY = 5
    OP_DIVIDE = 6
</code></pre>
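<p>To give a feel for how such a small instruction set drives a stack machine, here is a rough sketch
of an interpreter loop in plain Python. It is only an illustration and not the VM from the repository:
the <code>run</code> function, the separate <code>constants</code> list, and the convention that <code>OP_CONSTANT</code>
is followed by an index into that list are all assumptions made for this example.</p>

```python
class OpCode:
    OP_CONSTANT = 0
    OP_RETURN = 1
    OP_NEGATE = 2
    OP_ADD = 3
    OP_SUBTRACT = 4
    OP_MULTIPLY = 5
    OP_DIVIDE = 6


def run(code, constants):
    """Execute a list of int opcodes against a value stack of floats."""
    stack = []
    ip = 0  # instruction pointer: index of the next opcode to execute
    while ip < len(code):
        op = code[ip]
        ip += 1
        if op == OpCode.OP_CONSTANT:
            # Assumed encoding: the next slot holds an index into constants.
            stack.append(constants[code[ip]])
            ip += 1
        elif op == OpCode.OP_RETURN:
            return stack.pop()
        elif op == OpCode.OP_NEGATE:
            stack.append(-stack.pop())
        else:
            # Binary operators pop the right operand first.
            right = stack.pop()
            left = stack.pop()
            if op == OpCode.OP_ADD:
                stack.append(left + right)
            elif op == OpCode.OP_SUBTRACT:
                stack.append(left - right)
            elif op == OpCode.OP_MULTIPLY:
                stack.append(left * right)
            elif op == OpCode.OP_DIVIDE:
                stack.append(left / right)
            else:
                raise ValueError("unknown opcode %d" % op)
    raise ValueError("bytecode ended without OP_RETURN")


# Bytecode for -((1.0 + 2.0) * 4.0)
program = [
    OpCode.OP_CONSTANT, 0,
    OpCode.OP_CONSTANT, 1,
    OpCode.OP_ADD,
    OpCode.OP_CONSTANT, 2,
    OpCode.OP_MULTIPLY,
    OpCode.OP_NEGATE,
    OpCode.OP_RETURN,
]
result = run(program, [1.0, 2.0, 4.0])
print(result)  # -12.0
```

<p>Popping the right operand before the left one matters for the non-commutative operators, subtraction and division.</p>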
<h3 id="chunks">Chunks</h3>
<p>To start with we need to get some infrastructure in place before we write the VM engine.</p>
<p>Following <a href="https://www.craftinginterpreters.com/chunks-of-bytecode.html">craftinginterpreters.com</a>
we start with a <code>Chunk</code> object which will represent our bytecode. In RPython we have access
to Python-esque lists so our <code>code</code> object will just be a list of <code>OpCode</code> values – which are
just integers. A list of ints, couldn't get much simpler.</p>
<p><code>section-2-vm/chunk.py</code></p>
<pre><code class="python language-python">class Chunk:
    code = None

    def __init__(self):
        self.code = []

    def write_chunk(self, byte):
        self.code.append(byte)

    def disassemble(self, name):
        print "== %s ==\n" % name
        i = 0
        while i < len(self.code):
            i = disassemble_instruction(self, i)
</code></pre>
<p><em>From here on I'll only present minimal snippets of code instead of the whole lot, but
I'll link to the repository with the complete example code. For example, the
various debugging helpers, including <code>disassemble_instruction</code>, aren't particularly interesting
to include verbatim. See the <a href="https://github.com/hardbyte/rpython-post/">github repo</a> for full details.</em></p>
<p>We need to check that we can create a chunk and disassemble it. The quickest way to do this
is to use Python during development and debugging then every so often try to translate it.</p>
<p>Getting the disassemble part through the RPython translator was a hurdle for me as I
quickly found that many <code>str</code> methods such as <code>format</code> are not supported, and only very basic
<code>%</code> based formatting is supported. I ended up creating helper functions for string manipulation
such as:</p>
<pre><code class="python language-python">def leftpad_string(string, width, char=" "):
    l = len(string)
    if l > width:
        return string
    return char * (width - l) + string
</code></pre>
<p>Let's write a new <code>entry_point</code> that creates and disassembles a chunk of bytecode. We can
set the target output name to <code>vm1</code> at the same time:</p>
<p><a href="https://github.com/hardbyte/rpython-post/blob/master/section-2-vm/targetvm1.py"><code>targetvm1.py</code></a></p>
<pre><code class="python language-python">def entry_point(argv):
    bytecode = Chunk()
    bytecode.write_chunk(OpCode.OP_ADD)
    bytecode.write_chunk(OpCode.OP_RETURN)
    bytecode.disassemble("hello world")
    return 0

def target(driver, *args):
    driver.exe_name = "vm1"
    return entry_point, None
</code></pre>
<p>Running this isn't going to be terribly interesting, but it is always nice to
know that it is doing what you expect:</p>
<pre><code class="nohighlight">$ ./vm1
== hello world ==
0000 OP_ADD
0001 OP_RETURN
</code></pre>
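<p>The post leaves <code>disassemble_instruction</code> to the repository. As a rough idea of what such a helper has to do for the two operand-free instructions above, here is a minimal plain-Python sketch (not RPython, and not the repository's code). The <code>OPCODE_NAMES</code> table and the <code>format_instruction</code> and <code>disassemble</code> functions are names made up for this illustration.</p>

```python
class OpCode:
    OP_CONSTANT = 0
    OP_RETURN = 1
    OP_NEGATE = 2
    OP_ADD = 3
    OP_SUBTRACT = 4
    OP_MULTIPLY = 5
    OP_DIVIDE = 6

# Hypothetical lookup table: opcode value -> printable name.
OPCODE_NAMES = [
    "OP_CONSTANT", "OP_RETURN", "OP_NEGATE", "OP_ADD",
    "OP_SUBTRACT", "OP_MULTIPLY", "OP_DIVIDE",
]

def leftpad_string(string, width, char=" "):
    l = len(string)
    if l > width:
        return string
    return char * (width - l) + string

def format_instruction(code, offset):
    """Return (text, next_offset) for the operand-free instruction at offset."""
    opcode = code[offset]
    text = leftpad_string(str(offset), 4, "0") + " " + OPCODE_NAMES[opcode]
    return text, offset + 1

def disassemble(code, name):
    """Build the listing as a list of lines instead of printing directly."""
    lines = ["== %s ==" % name]
    i = 0
    while i < len(code):
        text, i = format_instruction(code, i)
        lines.append(text)
    return lines

listing = disassemble([OpCode.OP_ADD, OpCode.OP_RETURN], "hello world")
print("\n".join(listing))
```

<p>Returning the next offset from the formatting helper keeps the loop in charge of walking the bytecode, which matters once instructions carry operands.</p>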
<h3 id="chunksofdata">Chunks of data</h3>
<p>Ref: http://www.craftinginterpreters.com/chunks-of-bytecode.html#constants</p>
<p>So our bytecode is missing a very crucial element – the values to operate on!</p>
<p>As with the bytecode we can store these constant values as part of the chunk
directly in a list. Each chunk will therefore have a constant data component,
and a code component. </p>
<p>Edit the <code>chunk.py</code> file and add the new instance attribute <code>constants</code> as an
empty list, and a new method <code>add_constant</code>.</p>
<pre><code class="python language-python">    def add_constant(self, value):
        self.constants.append(value)
        return len(self.constants) - 1
</code></pre>
<p>Now to use this new capability we can modify our example chunk
to write in some constants before the <code>OP_ADD</code>:</p>
<pre><code class="python language-python">    bytecode = Chunk()
    constant = bytecode.add_constant(1.0)
    bytecode.write_chunk(OpCode.OP_CONSTANT)
    bytecode.write_chunk(constant)
    constant = bytecode.add_constant(2.0)
    bytecode.write_chunk(OpCode.OP_CONSTANT)
    bytecode.write_chunk(constant)
    bytecode.write_chunk(OpCode.OP_ADD)
    bytecode.write_chunk(OpCode.OP_RETURN)
    bytecode.disassemble("adding constants")
</code></pre>
<p>Which still translates with RPython and when run gives us the following disassembled
bytecode:</p>
<pre><code class="nohighlight">$ ./vm2
== adding constants ==
0000 OP_CONSTANT (00) '1'
0002 OP_CONSTANT (01) '2'
0004 OP_ADD
0005 OP_RETURN
</code></pre>
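<p>The offsets in that listing jump from <code>0000</code> to <code>0002</code> because <code>OP_CONSTANT</code> is followed by one operand, the index into the constants list. A small plain-Python sketch of that walk (the function name <code>instruction_offsets</code> is invented for this example) might look like this:</p>

```python
OP_CONSTANT, OP_RETURN, OP_ADD = 0, 1, 3   # same values as in opcodes.py

def instruction_offsets(code):
    """Return the offset at which each instruction starts."""
    offsets = []
    i = 0
    while i < len(code):
        offsets.append(i)
        # OP_CONSTANT is followed by one operand (the constant index),
        # so it occupies two slots; every other opcode occupies one.
        if code[i] == OP_CONSTANT:
            i += 2
        else:
            i += 1
    return offsets

code = [OP_CONSTANT, 0, OP_CONSTANT, 1, OP_ADD, OP_RETURN]
print(instruction_offsets(code))
```

<p>Note that the operand <code>0</code> at offset 1 has the same numeric value as <code>OP_CONSTANT</code>. It is never misread as an opcode because the walk skips over operands.</p>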
<p>We won't go down the route of serializing the bytecode to disk, but this bytecode chunk
(including the constant data) could be saved and executed on our VM later – like a Java
<code>.class</code> file. Instead we will pass the bytecode directly to our VM after we've created
it during the compilation process. </p>
<h3 id="emulation">Emulation</h3>
<p>So those four instructions of bytecode combined with the constant value mapping
<code>00 -> 1.0</code> and <code>01 -> 2.0</code> describe the individual steps for our virtual machine
to execute. One major point in favor of defining our own bytecode is we can
design it to be really simple to execute – this makes the VM really easy to implement.</p>
<p>As I mentioned earlier this virtual machine will have a stack, so let's begin with that.
Now the stack is going to be a busy little beast – as our VM takes instructions like
<code>OP_ADD</code> it will pop off the top two values from the stack, and push the result of adding
them together back onto the stack. Although dynamically resizing Python lists
are marvelous, they can be a little slow. RPython can take advantage of a constant sized
list which doesn't make our code much more complicated.</p>
<p>To do this we will define a constant sized list and track the <code>stack_top</code> directly. Note
how we can give the RPython translator hints by adding assertions about the state that
the <code>stack_top</code> will be in.</p>
<pre><code class="python language-python">class VM(object):
    STACK_MAX_SIZE = 256
    stack = None
    stack_top = 0

    def __init__(self):
        self._reset_stack()

    def _reset_stack(self):
        self.stack = [0] * self.STACK_MAX_SIZE
        self.stack_top = 0

    def _stack_push(self, value):
        assert self.stack_top < self.STACK_MAX_SIZE
        self.stack[self.stack_top] = value
        self.stack_top += 1

    def _stack_pop(self):
        # stack_top must be positive before the decrement, otherwise we
        # would read index -1 (the last slot) from an empty stack
        assert self.stack_top > 0
        self.stack_top -= 1
        return self.stack[self.stack_top]

    def _print_stack(self):
        print " ",
        if self.stack_top <= 0:
            print "[]",
        else:
            for i in range(self.stack_top):
                print "[ %s ]" % self.stack[i],
        print
</code></pre>
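<p>To see the fixed-size idea in isolation, here is a small plain-Python model of such a stack. It is only a sketch: the class name <code>FixedStack</code> and the tiny <code>MAX_SIZE</code> are made up for the example, and the underflow check is the strict form (<code>top > 0</code>) so that an empty stack is never indexed.</p>

```python
class FixedStack(object):
    """Plain-Python model of a preallocated stack (illustrative names)."""
    MAX_SIZE = 4

    def __init__(self):
        self.items = [0.0] * self.MAX_SIZE   # allocated once, never resized
        self.top = 0                         # index of the next free slot

    def push(self, value):
        assert self.top < self.MAX_SIZE, "stack overflow"
        self.items[self.top] = value
        self.top += 1

    def pop(self):
        assert self.top > 0, "stack underflow"
        self.top -= 1
        return self.items[self.top]

stack = FixedStack()
stack.push(1.0)
stack.push(2.0)
second = stack.pop()   # last value pushed comes off first
first = stack.pop()
print(first + second)
```

<p>Popping does not clear the slot. The old value simply sits above <code>top</code> until the next push overwrites it, which is why the list never changes length.</p>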
<p>Now we get to the main event, the hot loop, the VM engine. I hope I haven't built it up too
much; it is actually really simple! We loop until the instructions tell us to stop
(<code>OP_RETURN</code>), and dispatch to other simple methods based on the instruction.</p>
<pre><code class="python language-python">    def _run(self):
        while True:
            instruction = self._read_byte()
            if instruction == OpCode.OP_RETURN:
                print "%s" % self._stack_pop()
                return InterpretResultCode.INTERPRET_OK
            elif instruction == OpCode.OP_CONSTANT:
                constant = self._read_constant()
                self._stack_push(constant)
            elif instruction == OpCode.OP_ADD:
                self._binary_op(self._stack_add)
</code></pre>
<p>Now the <code>_read_byte</code> method will have to keep track of which instruction we are up
to. So add an instruction pointer (<code>ip</code>) to the VM with an initial value of <code>0</code>.
Then <code>_read_byte</code> is simply getting the next bytecode (int) from the chunk's <code>code</code>:</p>
<pre><code class="python language-python">    def _read_byte(self):
        instruction = self.chunk.code[self.ip]
        self.ip += 1
        return instruction
</code></pre>
<p>If the instruction is <code>OP_CONSTANT</code> we take the constant's address from the next byte
of the chunk's <code>code</code>, retrieve that constant value and add it to the VM's stack.</p>
<pre><code class="python language-python">    def _read_constant(self):
        constant_index = self._read_byte()
        return self.chunk.constants[constant_index]
</code></pre>
<p>Finally, our first arithmetic operation, <code>OP_ADD</code>. What it has to achieve doesn't
require much explanation: pop two values from the stack, add them together, push
the result. But since several operations share the same template we introduce a
layer of indirection – or abstraction – with a reusable <code>_binary_op</code>
helper method.</p>
<pre><code class="python language-python">    # needs, at module level: from rpython.rlib.objectmodel import specialize
    @specialize.arg(1)
    def _binary_op(self, operator):
        op2 = self._stack_pop()
        op1 = self._stack_pop()
        result = operator(op1, op2)
        self._stack_push(result)

    @staticmethod
    def _stack_add(op1, op2):
        return op1 + op2
</code></pre>
<p>Note we tell RPython to specialize <code>_binary_op</code> on its <code>operator</code> argument (index 1, since
<code>self</code> is index 0). This causes RPython to make a copy of <code>_binary_op</code> for every distinct operator passed,
which means that each copy contains a call to one particular operator, which can then be
inlined.</p>
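<p>One way to picture specialization is to write the copies out by hand. The following plain-Python sketch is only an analogy, not output from the RPython translator. The names <code>binary_op__add</code> and <code>binary_op__subtract</code> are invented, and a Python list stands in for the VM stack.</p>

```python
def stack_add(op1, op2):
    return op1 + op2

def stack_subtract(op1, op2):
    return op1 - op2

def binary_op(stack, operator):
    # Generic version: the operator is only known at run time,
    # so the call below is an indirect call.
    op2 = stack.pop()
    op1 = stack.pop()
    stack.append(operator(op1, op2))

def binary_op__add(stack):
    # Hand-written stand-in for the copy specialized on stack_add:
    # the operator is fixed, so its body can be written inline.
    op2 = stack.pop()
    op1 = stack.pop()
    stack.append(op1 + op2)

def binary_op__subtract(stack):
    # Same idea for stack_subtract.
    op2 = stack.pop()
    op1 = stack.pop()
    stack.append(op1 - op2)

generic = [5.0, 3.0]
binary_op(generic, stack_subtract)
specialized = [5.0, 3.0]
binary_op__subtract(specialized)
print(generic == specialized)
```

<p>Both versions leave the same result on the stack. The specialized ones simply have no function value left to look up.</p>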
<p>To be able to run our bytecode the only thing left to do is to pass in the chunk
and call <code>_run()</code>:</p>
<pre><code class="python language-python">    def interpret_chunk(self, chunk):
        if self.debug_trace:
            print "== VM TRACE =="
        self.chunk = chunk
        self.ip = 0
        try:
            result = self._run()
            return result
        except:
            return InterpretResultCode.INTERPRET_RUNTIME_ERROR
</code></pre>
<p><a href="https://github.com/hardbyte/rpython-post/blob/master/section-2-vm/targetvm3.py"><code>targetvm3.py</code></a> connects the pieces:</p>
<pre><code class="python language-python">def entry_point(argv):
    bytecode = Chunk()
    constant = bytecode.add_constant(1)
    bytecode.write_chunk(OpCode.OP_CONSTANT)
    bytecode.write_chunk(constant)
    constant = bytecode.add_constant(2)
    bytecode.write_chunk(OpCode.OP_CONSTANT)
    bytecode.write_chunk(constant)
    bytecode.write_chunk(OpCode.OP_ADD)
    bytecode.write_chunk(OpCode.OP_RETURN)
    vm = VM()
    vm.interpret_chunk(bytecode)
    return 0
</code></pre>
<p>I've added some trace debugging so we can see what the VM and stack are doing.</p>
<p>The whole thing translates with RPython, and when run gives us:</p>
<pre><code class="nohighlight">$ ./vm3
== VM TRACE ==
[]
0000 OP_CONSTANT (00) '1'
[ 1 ]
0002 OP_CONSTANT (01) '2'
[ 1 ] [ 2 ]
0004 OP_ADD
[ 3 ]
0005 OP_RETURN
3
</code></pre>
<p>Yes we just computed the result of <code>1+2</code>. Pat yourself on the back. </p>
<p>At this point it is probably valid to check that the translated executable is actually
faster than running our program directly in Python. For this trivial example under
<code>Python2</code>/<code>pypy</code> this <code>targetvm3.py</code> file runs in the 20ms – 90ms region, and the
compiled <code>vm3</code> runs in under 5ms. Something useful must be happening during the translation.</p>
<p>I won't go through the code adding support for our other instructions as they are
very similar and straightforward. Our VM is ready to execute our chunks of bytecode,
but we haven't yet worked out how to take the entered expression and turn that into
this simple bytecode. This is broken into two steps, scanning and compiling.</p>
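<p>For readers who would like to see the remaining instructions anyway, here is a compact plain-Python sketch of a run loop that covers all seven opcodes. It is not the repository's RPython code: it uses an ordinary growing list as the stack, and the function name <code>run</code> is chosen just for this example.</p>

```python
(OP_CONSTANT, OP_RETURN, OP_NEGATE, OP_ADD,
 OP_SUBTRACT, OP_MULTIPLY, OP_DIVIDE) = range(7)

def run(code, constants):
    """Execute bytecode and return the value left on top of the stack."""
    stack = []
    ip = 0
    while True:
        instruction = code[ip]
        ip += 1
        if instruction == OP_RETURN:
            return stack.pop()
        elif instruction == OP_CONSTANT:
            stack.append(constants[code[ip]])   # operand: constant index
            ip += 1
        elif instruction == OP_NEGATE:
            stack.append(-stack.pop())
        else:
            # Binary operators: the right operand was pushed last.
            op2 = stack.pop()
            op1 = stack.pop()
            if instruction == OP_ADD:
                stack.append(op1 + op2)
            elif instruction == OP_SUBTRACT:
                stack.append(op1 - op2)
            elif instruction == OP_MULTIPLY:
                stack.append(op1 * op2)
            elif instruction == OP_DIVIDE:
                stack.append(op1 / op2)
            else:
                raise ValueError("unknown opcode %d" % instruction)

# (1 - 2) * -4
program = [OP_CONSTANT, 0, OP_CONSTANT, 1, OP_SUBTRACT,
           OP_CONSTANT, 2, OP_NEGATE, OP_MULTIPLY, OP_RETURN]
result = run(program, [1.0, 2.0, 4.0])
print(result)
```

<p>The only subtlety is operand order: for subtraction and division the second value popped is the left-hand operand.</p>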
<h2 id="scanningthesource">Scanning the source</h2>
<p><em>All the source for this section can be found in
<a href="https://github.com/hardbyte/rpython-post/blob/master/section-3-scanning">section-3-scanning</a>.</em></p>
<p>The job of the scanner is to take the raw expression string and transform it into
a sequence of tokens. This scanning step will strip out whitespace and comments,
catch errors with invalid tokens and tokenize the string. For example the input
<code>"( 1 + 2 )"</code> would get tokenized into <code>LEFT_PAREN, NUMBER(1), PLUS, NUMBER(2), RIGHT_PAREN</code>.</p>
<p>As with our <code>OpCodes</code> we will just define a simple Python class to define an <code>int</code>
for each type of token:</p>
<pre><code class="python language-python">class TokenTypes:
    ERROR = 0
    EOF = 1
    LEFT_PAREN = 2
    RIGHT_PAREN = 3
    MINUS = 4
    PLUS = 5
    SLASH = 6
    STAR = 7
    NUMBER = 8
</code></pre>
<p>A token has to keep some other information as well – keeping track of the <code>location</code> and
<code>length</code> of the token will be helpful for error reporting. The <code>NUMBER</code> token clearly needs
some data about the value it is representing: we could include a copy of the source lexeme
(e.g. the string <code>2.0</code>), or parse the value and store that, or – what we will do in this
blog – use the <code>location</code> and <code>length</code> information as pointers into the original source
string. Every token type (except perhaps <code>ERROR</code>) will use this simple data structure: </p>
<pre><code class="python language-python">class Token(object):
    def __init__(self, start, length, token_type):
        self.start = start
        self.length = length
        self.type = token_type
</code></pre>
<p>Our soon to be created scanner will create these <code>Token</code> objects which refer back to
addresses in some source. If the scanner sees the source <code>"( 1 + 2.0 )"</code> it would emit
the following tokens:</p>
<pre><code class="python language-python">Token(0, 1, TokenTypes.LEFT_PAREN)
Token(2, 1, TokenTypes.NUMBER)
Token(4, 1, TokenTypes.PLUS)
Token(6, 3, TokenTypes.NUMBER)
Token(10, 1, TokenTypes.RIGHT_PAREN)
</code></pre>
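<p>Because each token only stores a start index and a length, the original text can always be recovered by slicing the source. A quick plain-Python check of the spans listed above (the <code>spans</code> list is just those pairs copied out by hand):</p>

```python
source = "( 1 + 2.0 )"

# (start, length) pairs copied from the token list above
spans = [(0, 1), (2, 1), (4, 1), (6, 3), (10, 1)]

# Slice the lexeme for each token back out of the source string.
lexemes = [source[start:start + length] for start, length in spans]
print(lexemes)

# Turning a NUMBER lexeme into a value can be deferred until it is needed.
value = float(lexemes[3])
print(value)
```

<p>Storing indices rather than substrings means the scanner creates no new strings while tokenizing.</p>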
<h3 id="scanner">Scanner</h3>
<p>Let's walk through the scanner <a href="https://github.com/hardbyte/rpython-post/blob/master/section-3-scanning/scanner.py">implementation</a> method
by method. The scanner will take the source and pass through it once, creating tokens
as it goes.</p>
<pre><code class="python language-python">class Scanner(object):
    def __init__(self, source):
        self.source = source
        self.start = 0
        self.current = 0
</code></pre>
<p>The <code>start</code> and <code>current</code> variables are character indices in the source string that point to
the current substring being considered as a token. </p>
<p>For example in the string <code>"(51.05+2)"</code> while we are tokenizing the number <code>51.05</code>
we will have <code>start</code> pointing at the <code>5</code>, and advance <code>current</code> character by character
until the character is no longer part of a number. Midway through scanning the number
the <code>start</code> and <code>current</code> values might point to <code>1</code> and <code>4</code> respectively:</p>
<table border='1' style='border-collapse:collapse'>
<thead>
<tr>
<th>0</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
</tr>
</thead>
<tbody>
<tr>
<td>"("</td>
<td>"5"</td>
<td>"1"</td>
<td>"."</td>
<td>"0"</td>
<td>"5"</td>
<td>"+"</td>
<td>"2"</td>
<td>")"</td>
</tr>
<tr>
<td></td>
<td> ^</td>
<td></td>
<td></td>
<td> ^</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
<p>From <code>current=4</code> the scanner peeks ahead and sees that the next character (<code>5</code>) is
a digit, so will continue to advance.</p>
<table border='1' style='border-collapse:collapse'>
<thead>
<tr>
<th>0</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
</tr>
</thead>
<tbody>
<tr>
<td>"("</td>
<td>"5"</td>
<td>"1"</td>
<td>"."</td>
<td>"0"</td>
<td>"5"</td>
<td>"+"</td>
<td>"2"</td>
<td>")"</td>
</tr>
<tr>
<td></td>
<td> ^</td>
<td></td>
<td></td>
<td></td>
<td> ^</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
<p>When the scanner peeks ahead and sees the <code>"+"</code> it will create the number
token and emit it. The method that carries out this tokenizing is <code>_number</code>:</p>
<pre><code class="python language-python">    def _number(self):
        while self._peek().isdigit():
            self.advance()
        # Look for decimal point
        if self._peek() == '.' and self._peek_next().isdigit():
            self.advance()
            while self._peek().isdigit():
                self.advance()
        return self._make_token(TokenTypes.NUMBER)
</code></pre>
<p>It relies on a few helpers to look ahead at the upcoming characters:</p>
<pre><code class="python language-python">    def _peek(self):
        if self._is_at_end():
            return '\0'
        return self.source[self.current]

    def _peek_next(self):
        # guard the index we actually read (current + 1), not just current
        if self.current + 1 >= len(self.source):
            return '\0'
        return self.source[self.current+1]

    def _is_at_end(self):
        return len(self.source) == self.current
</code></pre>
<p>If the character at <code>current</code> is still part of the number we want to call <code>advance</code>
to move on by one character.</p>
<pre><code class="python language-python">    def advance(self):
        self.current += 1
        return self.source[self.current - 1]
</code></pre>
<p>Once the <code>isdigit()</code> check fails in <code>_number()</code> we call <code>_make_token()</code> to emit the
token with the <code>NUMBER</code> type.</p>
<pre><code class="python language-python">    def _make_token(self, token_type):
        return Token(
            start=self.start,
            length=(self.current - self.start),
            token_type=token_type
        )
</code></pre>
<p>Note again that the token is linked to an index address in the source, rather than
including the string value.</p>
<p>Our scanner is pull based: a token will be requested via <code>scan_token</code>. First we skip
past whitespace, then emit the correct token depending on the next character:</p>
<pre><code class="python language-python">    def scan_token(self):
        # skip any whitespace
        while True:
            char = self._peek()
            if char in ' \r\t\n':
                self.advance()
            else:
                # stop at the first non-whitespace character
                break
        self.start = self.current
        if self._is_at_end():
            return self._make_token(TokenTypes.EOF)
        char = self.advance()
        if char.isdigit():
            return self._number()
        if char == '(':
            return self._make_token(TokenTypes.LEFT_PAREN)
        if char == ')':
            return self._make_token(TokenTypes.RIGHT_PAREN)
        if char == '-':
            return self._make_token(TokenTypes.MINUS)
        if char == '+':
            return self._make_token(TokenTypes.PLUS)
        if char == '/':
            return self._make_token(TokenTypes.SLASH)
        if char == '*':
            return self._make_token(TokenTypes.STAR)
        return ErrorToken("Unexpected character", self.current)
</code></pre>
<p>If this was a real programming language we were scanning, this would be the point where we
add support for different types of literals and any language identifiers/reserved words.</p>
<p>At some point we will need to parse the literal value for our numbers, but we leave that
job for some later component, for now we'll just add a <code>get_token_string</code> helper. To make
sure that RPython is happy to index arbitrary slices of <code>source</code> we add range assertions:</p>
<pre><code class="python language-python">    def get_token_string(self, token):
        if isinstance(token, ErrorToken):
            return token.message
        else:
            end_loc = token.start + token.length
            # a token may end exactly at the end of the source, so allow equality
            assert end_loc <= len(self.source)
            assert end_loc > 0
            return self.source[token.start:end_loc]
</code></pre>
<p>A simple entry point can be used to test our scanner with a hard coded
source string:</p>
<p><a href="https://github.com/hardbyte/rpython-post/blob/master/section-3-scanning/targetscanner1.py"><code>targetscanner1.py</code></a></p>
<pre><code class="python language-python">from scanner import Scanner, TokenTypes, TokenTypeToName
def entry_point(argv):
    source = "( 1 + 2.0 )"
    scanner = Scanner(source)
    t = scanner.scan_token()
    while t.type != TokenTypes.EOF and t.type != TokenTypes.ERROR:
        print TokenTypeToName[t.type],
        if t.type == TokenTypes.NUMBER:
            print "(%s)" % scanner.get_token_string(t),
        print
        t = scanner.scan_token()
    return 0
</code></pre>
<p>RPython didn't complain, and lo it works:</p>
<pre><code class="nohighlight">$ ./scanner1
LEFT_PAREN
NUMBER (1)
PLUS
NUMBER (2.0)
RIGHT_PAREN
</code></pre>
<p>Let's connect our REPL to the scanner.</p>
<p><a href="https://github.com/hardbyte/rpython-post/blob/master/section-3-scanning/targetscanner2.py"><code>targetscanner2.py</code></a></p>
<pre><code class="python language-python">from rpython.rlib import rfile
from scanner import Scanner, TokenTypes, TokenTypeToName
LINE_BUFFER_LENGTH = 1024
def repl(stdin, stdout):
    while True:
        stdout.write("> ")
        source = stdin.readline(LINE_BUFFER_LENGTH)
        scanner = Scanner(source)
        t = scanner.scan_token()
        while t.type != TokenTypes.EOF and t.type != TokenTypes.ERROR:
            print TokenTypeToName[t.type],
            if t.type == TokenTypes.NUMBER:
                print "(%s)" % scanner.get_token_string(t),
            print
            t = scanner.scan_token()

def entry_point(argv):
    stdin, stdout, stderr = rfile.create_stdio()
    try:
        repl(stdin, stdout)
    except:
        pass
    return 0
</code></pre>
<p>With our REPL hooked up we can now scan tokens from arbitrary input:</p>
<pre><code class="nohighlight">$ ./scanner2
> (3 *4) - -3
LEFT_PAREN
NUMBER (3)
STAR
NUMBER (4)
RIGHT_PAREN
MINUS
MINUS
NUMBER (3)
> ^C
</code></pre>
<h2 id="compilingexpressions">Compiling expressions</h2>
<h3 id="references">References</h3>
<ul>
<li>https://www.craftinginterpreters.com/compiling-expressions.html</li>
<li>http://effbot.org/zone/simple-top-down-parsing.htm</li>
</ul>
<p>The final piece is to turn this sequence of tokens into our low level
bytecode instructions for the virtual machine to execute. Buckle up,
we are about to write us a compiler.</p>
<p>Our compiler will take a single pass over the tokens using
<a href="https://en.wikipedia.org/wiki/Vaughan_Pratt">Vaughan Pratt’s</a>
parsing technique, and output a chunk of bytecode – if we do it
right it will be compatible with our existing virtual machine.</p>
<p>Remember the bytecode we defined above is really simple – by relying
on our stack we can transform a nested expression into a sequence of
our bytecode operations.</p>
<p>To make this more concrete let's go through by hand translating an
expression into bytecode.</p>
<p>Our source expression:</p>
<pre><code class="nohighlight">(3 + 2) - (7 * 2)
</code></pre>
<p>If we were to make an abstract syntax tree we'd get something
like this:</p>
<p><a href="https://4.bp.blogspot.com/-9mH1n1YF3rA/W-wxcRXRNPI/AAAAAAAAm5Y/PFqcPlOQ8KcSfIoxdDZHJO3Tby1vKqOKACPcBGAYYCw/s1600/ast.jpg" imageanchor="1" ><img border="0" src="https://4.bp.blogspot.com/-9mH1n1YF3rA/W-wxcRXRNPI/AAAAAAAAm5Y/PFqcPlOQ8KcSfIoxdDZHJO3Tby1vKqOKACPcBGAYYCw/s400/ast.jpg" width="400" height="187" data-original-width="1600" data-original-height="749" /></a></p>
<p>Now if we start at the first sub expression <code>(3+2)</code> we can clearly
note from the first open bracket that we <em>must</em> see a close bracket,
and that the expression inside that bracket <em>must</em> be valid on its
own. Not only that but regardless of the inside we know that the whole
expression still has to be valid. Let's focus on this first bracketed
expression, let our attention recurse into it so to speak.</p>
<p>This gives us a much easier problem – we just want to get our virtual
machine to compute <code>3 + 2</code>. In this bytecode dialect we would load the
two constants, and then add them with <code>OP_ADD</code> like so: </p>
<pre><code class="nohighlight">OP_CONSTANT (00) '3.000000'
OP_CONSTANT (01) '2.000000'
OP_ADD
</code></pre>
<p>The effect of our vm executing these three instructions is that sitting
pretty at the top of the stack is the result of the addition. Winning.</p>
<p>Jumping back out from our bracketed expression, our next token is <code>MINUS</code>,
at this point we have a fair idea that it must be used in an infix position.
In fact, whatever token follows the bracketed expression <strong>must</strong> be a
valid infix operator; if not, the expression is either over or has a syntax error. </p>
<p>Assuming the best from our user (naive), we handle <code>MINUS</code> the same way
we handled the first <code>PLUS</code>. We've already got the first operand on the
stack, now we compile the right operand and <strong>then</strong> write out the bytecode
for <code>OP_SUBTRACT</code>.</p>
<p>The right operand is another simple three instructions:</p>
<pre><code class="nohighlight">OP_CONSTANT (02) '7.000000'
OP_CONSTANT (03) '2.000000'
OP_MULTIPLY
</code></pre>
<p>Then we finish our top level binary expression and write an <code>OP_RETURN</code> to
return the value at the top of the stack as the execution's result. Our
final hand compiled program is:</p>
<pre><code class="nohighlight">OP_CONSTANT (00) '3.000000'
OP_CONSTANT (01) '2.000000'
OP_ADD
OP_CONSTANT (02) '7.000000'
OP_CONSTANT (03) '2.000000'
OP_MULTIPLY
OP_SUBTRACT
OP_RETURN
</code></pre>
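<p>The hand compilation above is just a post-order walk of the syntax tree: emit both operands, then the operator. A short plain-Python sketch makes that explicit. It is an illustration only: the nested-tuple tree, the <code>BINARY_OPS</code> table and the <code>emit</code> function are invented here, and the real compiler never builds a tree at all.</p>

```python
# Hypothetical mapping from operator symbol to opcode name.
BINARY_OPS = {
    "+": "OP_ADD",
    "-": "OP_SUBTRACT",
    "*": "OP_MULTIPLY",
    "/": "OP_DIVIDE",
}

def emit(node, out):
    """Post-order walk: operands first, then the operator."""
    if isinstance(node, tuple):
        operator, left, right = node
        emit(left, out)
        emit(right, out)
        out.append(BINARY_OPS[operator])
    else:
        out.append("OP_CONSTANT %s" % node)
    return out

# (3 + 2) - (7 * 2) as a nested tuple: (operator, left, right)
ast = ("-", ("+", 3, 2), ("*", 7, 2))
program = emit(ast, []) + ["OP_RETURN"]
print("\n".join(program))
```

<p>The instruction order matches the hand-compiled listing, which is why a stack is all the VM needs to evaluate nested expressions.</p>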
<p>Ok that wasn't so hard was it? Let's try to make our code do that.</p>
<p>We define a parser object which will keep track of where we are, and
whether things have all gone horribly wrong:</p>
<pre><code class="python language-python">class Parser(object):
    def __init__(self):
        self.had_error = False
        self.panic_mode = False
        self.current = None
        self.previous = None
</code></pre>
<p>The compiler will also be a class, we'll need one of our <code>Scanner</code> instances
to pull tokens from, and since the output is a bytecode <code>Chunk</code> let's go ahead
and make one of those in our compiler initializer:</p>
<pre><code class="python language-python">class Compiler(object):
    def __init__(self, source):
        self.parser = Parser()
        self.scanner = Scanner(source)
        self.chunk = Chunk()
</code></pre>
<p>Since we have this (empty) chunk of bytecode we will make a helper method
to add individual bytes. Every instruction will pass from our compiler into
an executable program through this simple method.</p>
<pre><code class="python language-python">    def emit_byte(self, byte):
        self.current_chunk().write_chunk(byte)
</code></pre>
<p>To quote from Bob Nystrom on the Pratt parsing technique:</p>
<blockquote>
<p>the implementation is a deceptively-simple handful of deeply intertwined code</p>
</blockquote>
<p>I don't actually think I can do justice to this section. Instead I suggest
reading his treatment in
<a href="http://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/">Pratt Parsers: Expression Parsing Made Easy</a>
which explains the magic behind the parsing component. Our only major difference is
instead of creating an AST we are going to directly emit bytecode for our VM.</p>
<p>Now that I've absolved myself from taking responsibility in explaining this somewhat
tricky concept, I'll discuss some of the code from
<a href="https://github.com/hardbyte/rpython-post/blob/master/section-4-compiler/compiler.py"><code>compiler.py</code></a>, and walk through what happens
for a particular rule.</p>
<p>I'll jump straight to the juicy bit: the table of parse rules. We define a <code>ParseRule</code>
for each token, and each rule comprises:</p>
<ul>
<li>an optional handler for when the token is used as a <em>prefix</em> (e.g. the minus in <code>(-2)</code>),</li>
<li>an optional handler for when the token is used <em>infix</em> (e.g. the slash in <code>2/47</code>),</li>
<li>a precedence value (a number that determines what is of higher precedence)</li>
</ul>
<pre><code class="python language-python">rules = [
    ParseRule(None, None, Precedence.NONE), # ERROR
    ParseRule(None, None, Precedence.NONE), # EOF
    ParseRule(Compiler.grouping, None, Precedence.CALL), # LEFT_PAREN
    ParseRule(None, None, Precedence.NONE), # RIGHT_PAREN
    ParseRule(Compiler.unary, Compiler.binary, Precedence.TERM), # MINUS
    ParseRule(None, Compiler.binary, Precedence.TERM), # PLUS
    ParseRule(None, Compiler.binary, Precedence.FACTOR), # SLASH
    ParseRule(None, Compiler.binary, Precedence.FACTOR), # STAR
    ParseRule(Compiler.number, None, Precedence.NONE), # NUMBER
]
</code></pre>
<p>These rules really are the magic of our compiler. When we get to a particular
token such as <code>MINUS</code> we see if it is an infix operator and if so we've gone and
got its first operand ready. At all times we rely on the relative precedence; consuming
everything with higher precedence than the operator we are currently evaluating.</p>
<p>In the expression:</p>
<pre><code class="nohighlight">2 + 3 * 4
</code></pre>
<p>The <code>*</code> has higher precedence than the <code>+</code>, so <code>3 * 4</code> will be parsed together
as the second operand to the first infix operator (the <code>+</code>) which follows
the <a href="https://en.wikipedia.org/wiki/Order_of_operations#Mnemonics">BEDMAS</a>
order of operations I was taught at high school.</p>
<p>To encode these precedence values we make another Python object moonlighting
as an enum:</p>
<pre><code class="python language-python">class Precedence(object):
    NONE = 0
    DEFAULT = 1
    TERM = 2 # + -
    FACTOR = 3 # * /
    UNARY = 4 # ! - +
    CALL = 5 # ()
    PRIMARY = 6
</code></pre>
<p>What happens in our compiler when turning <code>-2.0</code> into bytecode? Assume we've just
pulled the token <code>MINUS</code> from the scanner. Every expression <strong>has</strong> to start with some
type of prefix – whether that is:</p>
<ul>
<li>a bracket group <code>(</code>, </li>
<li>a number <code>2</code>, </li>
<li>or a prefix unary operator <code>-</code>. </li>
</ul>
<p>Knowing that, our compiler assumes there is a <code>prefix</code> handler in the rule table – in
this case it points us at the <code>unary</code> handler.</p>
<pre><code class="python language-python">    def parse_precedence(self, precedence):
        # parses any expression of a given precedence level or higher
        self.advance()
        prefix_rule = self._get_rule(self.parser.previous.type).prefix
        prefix_rule(self)
</code></pre>
<p><code>unary</code> is called:</p>
<pre><code class="python language-python">    def unary(self):
        op_type = self.parser.previous.type
        # Compile the operand
        self.parse_precedence(Precedence.UNARY)
        # Emit the operator instruction
        if op_type == TokenTypes.MINUS:
            self.emit_byte(OpCode.OP_NEGATE)
</code></pre>
<p>Here – before writing the <code>OP_NEGATE</code> opcode we recurse back into <code>parse_precedence</code>
to ensure that <em>whatever</em> follows the <code>MINUS</code> token is compiled – provided it has
higher precedence than <code>unary</code> – e.g. a bracketed group.
Crucially at run time this recursive call will ensure that the result is left
on top of our stack. Armed with this knowledge, the <code>unary</code> method just
has to emit a single byte with the <code>OP_NEGATE</code> opcode.</p>
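<p>The snippets above only show the prefix half of <code>parse_precedence</code>. To give a feel for how the prefix and infix halves cooperate, here is a self-contained plain-Python sketch of a Pratt-style compiler. It is a simplified stand-in rather than the code in <code>compiler.py</code>: it works on a pre-split list of token strings, emits opcode names instead of bytes, and the names <code>MiniCompiler</code>, <code>INFIX</code> and <code>compile_tokens</code> are invented for the example.</p>

```python
# Precedence levels, lowest to highest (same ordering as the Precedence class).
NONE, DEFAULT, TERM, FACTOR, UNARY = range(5)

INFIX = {  # token -> (precedence, opcode name)
    "+": (TERM, "OP_ADD"),
    "-": (TERM, "OP_SUBTRACT"),
    "*": (FACTOR, "OP_MULTIPLY"),
    "/": (FACTOR, "OP_DIVIDE"),
}

class MiniCompiler(object):
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.code = []

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return None

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def parse_precedence(self, precedence):
        # Prefix position: a group, a unary minus, or a number.
        token = self.advance()
        if token == "(":
            self.parse_precedence(DEFAULT)
            assert self.advance() == ")", "expected ')'"
        elif token == "-":
            self.parse_precedence(UNARY)
            self.code.append("OP_NEGATE")
        else:
            self.code.append("OP_CONSTANT %s" % float(token))
        # Infix position: continue while the next operator binds at
        # least as tightly as the level we were asked to parse.
        while self.peek() in INFIX and INFIX[self.peek()][0] >= precedence:
            op_precedence, opcode = INFIX[self.advance()]
            # The right operand must bind tighter (left associativity).
            self.parse_precedence(op_precedence + 1)
            self.code.append(opcode)

def compile_tokens(tokens):
    compiler = MiniCompiler(tokens)
    compiler.parse_precedence(DEFAULT)
    compiler.code.append("OP_RETURN")
    return compiler.code

print("\n".join(compile_tokens(["2", "+", "3", "*", "4"])))
```

<p>In this sketch the operator is always emitted after its operands, so the multiplication in <code>2 + 3 * 4</code> lands before the addition without any explicit tree being built.</p>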
<h3 id="testcompilation">Test compilation</h3>
<p>Now we can test our compiler by outputting disassembled bytecode
of our user entered expressions. Create a new entry point
<a href="https://github.com/hardbyte/rpython-post/blob/master/section-4-compiler/targetcompiler1.py"><code>targetcompiler1.py</code></a>:</p>
<pre><code class="python language-python">from rpython.rlib import rfile
from compiler import Compiler
LINE_BUFFER_LENGTH = 1024
def entry_point(argv):
    stdin, stdout, stderr = rfile.create_stdio()
    try:
        while True:
            stdout.write("> ")
            source = stdin.readline(LINE_BUFFER_LENGTH)
            compiler = Compiler(source, debugging=True)
            compiler.compile()
    except:
        pass
    return 0
</code></pre>
<p>Translate it and test it out:</p>
<pre><code class="nohighlight">$ ./compiler1
> (2/4 + 1/2)
== code ==
0000 OP_CONSTANT (00) '2.000000'
0002 OP_CONSTANT (01) '4.000000'
0004 OP_DIVIDE
0005 OP_CONSTANT (02) '1.000000'
0007 OP_CONSTANT (00) '2.000000'
0009 OP_DIVIDE
0010 OP_ADD
0011 OP_RETURN
</code></pre>
<p>Now if you've made it this far you'll be eager to finally connect everything
together by executing this bytecode with the virtual machine.</p>
<h2 id="endtoend">End to end</h2>
<p>All the pieces slot together rather easily at this point. Create a new
file <a href="https://github.com/hardbyte/rpython-post/blob/master/section-5-execution/targetcalc.py"><code>targetcalc.py</code></a> and define our
entry point:</p>
<pre><code class="python language-python">from rpython.rlib import rfile
from compiler import Compiler
from vm import VM
LINE_BUFFER_LENGTH = 4096
def entry_point(argv):
    stdin, stdout, stderr = rfile.create_stdio()
    vm = VM()
    try:
        while True:
            stdout.write("> ")
            source = stdin.readline(LINE_BUFFER_LENGTH)
            if source:
                compiler = Compiler(source, debugging=False)
                compiler.compile()
                vm.interpret_chunk(compiler.chunk)
    except:
        pass
    return 0

def target(driver, *args):
    driver.exe_name = "calc"
    return entry_point, None
</code></pre>
<p>Let's try to catch it out with a double negative:</p>
<pre><code class="nohighlight">$ ./calc
> 2--3
== VM TRACE ==
[]
0000 OP_CONSTANT (00) '2.000000'
[ 2.000000 ]
0002 OP_CONSTANT (01) '3.000000'
[ 2.000000 ] [ 3.000000 ]
0004 OP_NEGATE
[ 2.000000 ] [ -3.000000 ]
0005 OP_SUBTRACT
[ 5.000000 ]
0006 OP_RETURN
5.000000
</code></pre>
<p>Ok well let's evaluate the first 50 terms of the
<a href="https://en.wikipedia.org/wiki/Pi#Infinite_series">Nilakantha Series</a>:</p>
<pre><code class="nohighlight">$ ./calc
> 3 + 4 * ((1/(2 * 3 * 4)) + (1/(4 * 5 * 6)) - (1/(6 * 7 * 8)) + (1/(8 * 9 * 10)) - (1/(10 * 11 * 12)) + (1/(12 * 13 * 14)) - (1/(14 * 15 * 16)) + (1/(16 * 17 * 18)) - (1/(18 * 19 * 20)) + (1/(20 * 21 * 22)) - (1/(22 * 23 * 24)) + (1/(24 * 25 * 26)) - (1/(26 * 27 * 28)) + (1/(28 * 29 * 30)) - (1/(30 * 31 * 32)) + (1/(32 * 33 * 34)) - (1/(34 * 35 * 36)) + (1/(36 * 37 * 38)) - (1/(38 * 39 * 40)) + (1/(40 * 41 * 42)) - (1/(42 * 43 * 44)) + (1/(44 * 45 * 46)) - (1/(46 * 47 * 48)) + (1/(48 * 49 * 50)) - (1/(50 * 51 * 52)) + (1/(52 * 53 * 54)) - (1/(54 * 55 * 56)) + (1/(56 * 57 * 58)) - (1/(58 * 59 * 60)) + (1/(60 * 61 * 62)) - (1/(62 * 63 * 64)) + (1/(64 * 65 * 66)) - (1/(66 * 67 * 68)) + (1/(68 * 69 * 70)) - (1/(70 * 71 * 72)) + (1/(72 * 73 * 74)) - (1/(74 * 75 * 76)) + (1/(76 * 77 * 78)) - (1/(78 * 79 * 80)) + (1/(80 * 81 * 82)) - (1/(82 * 83 * 84)) + (1/(84 * 85 * 86)) - (1/(86 * 87 * 88)) + (1/(88 * 89 * 90)) - (1/(90 * 91 * 92)) + (1/(92 * 93 * 94)) - (1/(94 * 95 * 96)) + (1/(96 * 97 * 98)) - (1/(98 * 99 * 100)) + (1/(100 * 101 * 102)))
== VM TRACE ==
[]
0000 OP_CONSTANT (00) '3.000000'
[ 3.000000 ]
0002 OP_CONSTANT (01) '4.000000'
...SNIP...
0598 OP_CONSTANT (101) '102.000000'
[ 3.000000 ] [ 4.000000 ] [ 0.047935 ] [ 1.000000 ] [ 10100.000000 ] [ 102.000000 ]
0600 OP_MULTIPLY
[ 3.000000 ] [ 4.000000 ] [ 0.047935 ] [ 1.000000 ] [ 1030200.000000 ]
0601 OP_DIVIDE
[ 3.000000 ] [ 4.000000 ] [ 0.047935 ] [ 0.000001 ]
0602 OP_ADD
[ 3.000000 ] [ 4.000000 ] [ 0.047936 ]
0603 OP_MULTIPLY
[ 3.000000 ] [ 0.191743 ]
0604 OP_ADD
[ 3.191743 ]
0605 OP_RETURN
3.191743
</code></pre>
<p>We just executed a little over 600 bytes of bytecode to compute pi to 1dp!</p>
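<p>As a cross-check outside the VM, the same sum can be reproduced in a few lines of plain Python. This sketch is not part of the original code and the helper name <code>term</code> is made up. It evaluates the expression with the signs exactly as typed at the prompt above, where the first two terms are both added, and also the strictly alternating form of the series for comparison.</p>

```python
import math

def term(n):
    """n-th term of the series: 1 / (2n * (2n + 1) * (2n + 2))."""
    return 1.0 / ((2 * n) * (2 * n + 1) * (2 * n + 2))

# Signs as typed at the prompt: term 1 and term 2 are both added,
# after which the signs alternate (-, +, -, ...).
as_typed = 3 + 4 * (term(1) + sum((-1) ** n * term(n) for n in range(2, 51)))

# Strictly alternating signs, starting with a plus on term 1.
alternating = 3 + 4 * sum((-1) ** (n + 1) * term(n) for n in range(1, 51))

print("%.6f" % as_typed)
print("%.6f" % alternating)
print("%.6f" % math.pi)
```

<p>The first value agrees with the VM's answer, which suggests the calculator is evaluating the input faithfully. The strictly alternating form lands much closer to pi.</p>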
<p>This brings us to the end of this tutorial. To recap we've walked through the whole
compilation process: from the user providing an expression string on the REPL, scanning
the source string into tokens, parsing the tokens while accounting for relative
precedence via a Pratt parser, generating bytecode, and finally executing the bytecode
on our own VM. RPython translated what we wrote into C and compiled it, meaning
our resulting <code>calc</code> REPL is really fast.</p>
<blockquote>
<p>“The world is a thing of utter inordinate complexity and richness and strangeness that is absolutely awesome.”</p>
<p>― Douglas Adams </p>
</blockquote>
<p>Many thanks to Bob Nystrom for writing the book that inspired this post, and thanks to
Carl Friedrich and Matt Halverson for reviewing.</p>
<p>― Brian (<a href="https://twitter.com/thorneynz">@thorneynzb</a>)</p>