PEP 600 – Future ‘manylinux’ Platform Tags for Portable Linux Built Distributions
- Author: Nathaniel J. Smith <njs at pobox.com>, Thomas Kluyver <thomas at kluyver.me.uk>
- Sponsor: Paul Moore <p.f.moore at gmail.com>
- BDFL-Delegate: Paul Moore <p.f.moore at gmail.com>
- Discussions-To: Discourse thread
- Status: Final
- Type: Standards Track
- Topic: Packaging
- Created: 03-May-2019
- Post-History: 03-May-2019
- Replaces: 513, 571, 599
- Resolution: Discourse post
Abstract
This PEP proposes a scheme for new ‘manylinux’ wheel tags to be defined without requiring a PEP for every specific tag, similar to how Windows and macOS tags already work. This will allow package maintainers to take advantage of new tags more quickly, while making better use of limited volunteer time.
Non-goals include: handling non-glibc-based platforms; integrating with external package managers or handling external dependencies such as CUDA; making manylinux tags more sophisticated than their Windows/macOS equivalents; doing anything besides taking our existing tried-and-tested approach and streamlining it. These are important issues and other PEPs may address them in the future, but for this PEP they’re out of scope.
Rationale
Python users appreciate it when PyPI has pre-compiled packages for their platform, because it makes installation fast and simple. But distributing pre-compiled binaries on Linux is challenging because of the diversity of Linux-based platforms. For example, Debian, Android, and Alpine all use the Linux kernel, but with radically different userspace libraries, which makes it difficult or impossible to create a single wheel that works on all three. This complexity has caused many previous discussions of Linux wheels to stall out.
The “manylinux” project succeeded by adopting a strategy of ruthless pragmatism. We chose a large but tractable set of Linux platforms – specifically, mainstream glibc-based distributions like Debian, OpenSuSE, Ubuntu, RHEL, etc. – and then we did whatever it took to make wheels that work across all these platforms.
This approach requires many compromises. Manylinux wheels can only rely on external libraries that maintain a consistent ABI and are universally available across all these distributions, which in practice restricts them to a small set of core libraries like glibc and a few others. Wheels have to be built on carefully-chosen platforms of the oldest possible vintage, using a Python that is itself built in a carefully-chosen configuration. Other shared library dependencies have to be bundled into the wheel, which requires a complex process to avoid collisions between unrelated wheels. And finally, the details of these requirements change over time, as new distro versions are released, and old ones fall out of use.
It turns out that these requirements are not too onerous: they’re essentially equivalent to what you have to do to ship Windows or macOS wheels, and the manylinux approach has achieved substantial uptake among both package maintainers and end-users. But any manylinux PEP needs some way to address these complexities.
In previous manylinux PEPs (PEP 513, PEP 571, PEP 599), we’ve done this by attempting to write down in the PEP the exact set of libraries, symbol versions, Python configuration, etc. that we believed would lead to wheels that work on all mainstream glibc-based Linux systems. But this created several problems:
First, PEPs are generally supposed to be normative references: if software doesn’t match the PEP, then we fix the software. But in this case, the PEPs are attempting to describe Linux distributions, which are a moving target, and do not consider our PEPs to constrain their behavior. This means that we’ve been taking on an unbounded commitment to keep updating every manylinux PEP whenever the Linux distro landscape changes. This is a substantial commitment for unfunded volunteers to take on, and it’s not clear that this work produces value for our users.
And second, every time we move manylinux forward to a newer range of supported platforms, or add support for a new architecture, we have to go through a fairly elaborate process: writing a new PEP, updating the PyPI and pip codebases to recognize the new tag, waiting for the new pip to percolate to users, etc. None of this happens on Windows/macOS; it’s only a tax on Linux maintainers. This slows deployment of new manylinux versions, and consumes part of our community’s limited PEP review bandwidth, thus slowing progress of the Python packaging ecosystem as a whole. This is especially problematic for less-popular architectures, which have fewer volunteer resources to overcome these barriers.
How can we fix it?
A manylinux PEP has to address three main audiences:
- Package installers, like pip, need to be able to determine which wheel tags are compatible with the system they find themselves running on. This requires some automated process to introspect the system and match it up with wheel tags.
- Package indexes, like PyPI, need to be able to validate which wheel tags are valid. Generally, this just requires something like a list of valid tags, or a regex they match, with no need to know anything about the actual semantics for individual tags. (But see the discussion of upload verification below.)
- Package maintainers need to be able to build wheels that meet the requirements for a given wheel tag.
Here’s the key insight behind this new PEP: it’s crucial that different package installers and package indexes all agree on which manylinux tags are valid and which systems they install on, so we need a PEP to specify these – but, these are straightforward, and don’t really change between manylinux versions. The complicated part that keeps changing is the process of actually building the wheels – but, if there are multiple competing build environments, it doesn’t matter whether they use exactly the same rules as each other, as long as they all produce wheels that work on end-user systems. Therefore, we don’t need an interoperability standard for building wheels, so we don’t need to write the details into a PEP.
To further convince ourselves that this approach will work, let’s look again at how we handle wheels on Windows and macOS: the PEPs describe which tags are valid, and which systems they’re supposed to work on, but not how to actually build wheels for those platforms. And in practice, if you want to distribute Windows or macOS wheels, you might have to jump through some complicated and poorly documented hoops in order to bundle dependencies, target the right range of OS versions, etc. But the system works, and the way to improve it is to write better docs and build better tooling; no-one thinks that the way to make Windows wheels work better is to publish a PEP describing which symbols we think Microsoft should be including in their libraries and how their linker ought to work. This PEP extends that philosophy to manylinux as well.
Specification
Core definition
Tags using the new scheme will look like:
manylinux_2_17_x86_64
Or more generally:
manylinux_${GLIBCMAJOR}_${GLIBCMINOR}_${ARCH}
This tag is a promise: the wheel’s creator promises that the wheel will work on any mainstream Linux distro that uses glibc version ${GLIBCMAJOR}.${GLIBCMINOR} or later, and where the ${ARCH} matches the return value from distutils.util.get_platform(). (For more detail about architecture tags, see PEP 425.)
If a user installs this wheel into an environment that matches these requirements and it doesn’t work, then that wheel does not comply with this specification. This should be considered a bug in the wheel, and it’s the wheel creator’s responsibility to look for a fix (possibly with the help of the broader community).
The word “mainstream” is intentionally somewhat vague, and should be interpreted expansively. The goal is to rule out weird homebrew Linux systems; generally any distro you’ve actually heard of should be considered “mainstream”. We also provide a way for maintainers of “weird” distros to manually override this check, though based on experience with previous manylinux PEPs, we don’t expect this feature to see much use.
And finally, compliant wheels are required to “play well with others”, i.e., installing a manylinux wheel must not cause other unrelated packages to break.
Any method of producing wheels which meets these criteria is acceptable. However, in practice we expect that the auditwheel project will maintain an up-to-date set of tools and build images for producing manylinux wheels, as well as documentation about how they work and how to use them, and that most maintainers will want to use those. For the latest information on building manylinux wheels, including recommendations about which build images to use, see https://packaging.python.org.
Since these requirements are fairly high-level, here are some examples of how they play out in specific situations:
Example: if a wheel is tagged as manylinux_2_17_x86_64, but it uses symbols that were only added in glibc 2.18, then that wheel won’t work on systems with glibc 2.17. Therefore, we can conclude that this wheel is in violation of this specification.
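The check described in this example can be sketched in a few lines. This is only an illustration, not how any particular tool works: it assumes you have already extracted the versioned symbol references from the wheel's shared libraries (for instance from the output of a tool such as objdump -T), and the function names are hypothetical.

```python
import re

# Matches version nodes such as "GLIBC_2.14" or "GLIBC_2.2.5";
# only the major and minor components are captured.
_GLIBC_VERSION_RE = re.compile(r"GLIBC_(\d+)\.(\d+)")


def max_required_glibc(versioned_symbols):
    """Return the newest (major, minor) glibc version referenced, or None."""
    versions = []
    for symbol in versioned_symbols:
        m = _GLIBC_VERSION_RE.search(symbol)
        if m:
            versions.append((int(m.group(1)), int(m.group(2))))
    return max(versions, default=None)


def violates_tag(tag_glibc, versioned_symbols):
    """True if any referenced symbol is newer than the tag's glibc version."""
    required = max_required_glibc(versioned_symbols)
    return required is not None and required > tag_glibc


# Hypothetical symbol list for a wheel tagged manylinux_2_17_x86_64.
symbols = ["memcpy@GLIBC_2.14", "__cxa_thread_atexit_impl@GLIBC_2.18"]
print(violates_tag((2, 17), symbols))
```

Versions are compared as integer tuples rather than strings, so that 2.9 sorts below 2.17.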
Example: Until ~2017, all major Linux distros included libncursesw.so.5 as part of their default install. Until that date, a wheel that linked to libncursesw.so.5 was compliant with this specification. Then, distros started switching to ncurses 6, which has a different name and incompatible ABI, and stopped installing libncursesw.so.5 by default. So after that date, a wheel that linked to libncursesw.so.5 was no longer compliant with this specification.
Example: The Linux ELF linker places all shared library SONAMEs into a single process-global namespace. If independent wheels used the same SONAME for their bundled libraries, they might end up colliding and using the wrong library version, which would violate the “play well with others” rule. Therefore, this specification requires that wheels use globally-unique names for all bundled libraries. (Auditwheel currently accomplishes this by renaming all bundled libraries to include a globally-unique hash.)
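To make the renaming idea concrete, here is a minimal sketch of deriving a unique name from a library's contents. It is not auditwheel's actual algorithm; the function name, the use of SHA-256, and the eight-character digest length are all assumptions chosen for illustration.

```python
import hashlib


def unique_bundled_name(soname, content):
    """Insert a short content hash into an SONAME such as "libfoo.so.1".

    Two wheels that bundle different builds of the same library end up
    with different names, so they cannot collide in the process-global
    SONAME namespace.
    """
    digest = hashlib.sha256(content).hexdigest()[:8]
    base, sep, rest = soname.partition(".so")
    if not sep:
        # No ".so" component found; just append the hash.
        return f"{soname}-{digest}"
    return f"{base}-{digest}{sep}{rest}"


# Hypothetical library contents; a real tool would read the file's bytes.
print(unique_bundled_name("libfoo.so.1", b"contents of build A"))
print(unique_bundled_name("libfoo.so.1", b"contents of build B"))
```

A real tool would also have to rewrite the references to the old name in every binary that depends on the renamed library, which this sketch does not attempt.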
Example: we’ve observed certain wheels using C++ in ways that interfere with other packages via an unclear mechanism. This is also a violation of the “play well with others” rule, so those wheels aren’t compliant with this specification.
Example: The imaginary architecture LEG v7 has both big-endian and little-endian variants. Big-endian binaries require a big-endian system, and little-endian binaries require a little-endian system. But unfortunately, it’s discovered that due to a bug in PEP 425, both variants use the same architecture tag, legv7. This makes it impossible to create a compliant manylinux_2_17_legv7 wheel: no matter what we do, it will crash on some users’ systems. So, we write a new PEP defining architecture tags legv7le and legv7be; now we can ship manylinux LEG v7 wheels.
Example: There’s also a LEG v8. It also has big-endian and little-endian variants. But fortunately, it turns out that PEP 425 already does the right thing for LEG v8, so LEG v8 enthusiasts can start shipping manylinux_2_17_legv8le and manylinux_2_17_legv8be wheels immediately once this PEP is implemented, even though the authors of this PEP don’t know anything at all about LEG v8.
Package installers
Generally, package installers should install manylinux wheels on systems that have an appropriate glibc and architecture, and not otherwise. If there are multiple compatible manylinux wheels available, then the wheel with the highest glibc version should be preferred, in order to take advantage of newer compilers and glibc features.
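The preference rule can be sketched as follows. This assumes the installer has already filtered the candidates down to compatible tags in the new manylinux_X_Y_ARCH form; the helper names are hypothetical.

```python
import re

_TAG_RE = re.compile(r"manylinux_([0-9]+)_([0-9]+)_(.+)")


def glibc_version_of(tag):
    """Return the (major, minor) glibc version encoded in a manylinux tag."""
    m = _TAG_RE.fullmatch(tag)
    if m is None:
        raise ValueError(f"not a manylinux_X_Y_ARCH tag: {tag!r}")
    return int(m.group(1)), int(m.group(2))


def pick_preferred_tag(compatible_tags):
    """Among already-compatible tags, prefer the highest glibc version."""
    return max(compatible_tags, key=glibc_version_of)


candidates = [
    "manylinux_2_5_x86_64",
    "manylinux_2_17_x86_64",
    "manylinux_2_12_x86_64",
]
print(pick_preferred_tag(candidates))
```

Comparing integer tuples matters here: a plain string comparison would rank manylinux_2_5 above manylinux_2_17.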
In addition, we follow previous specifications, and allow for Python distributors to manually override this check by adding a _manylinux module to their standard library. If this package is importable, and if it defines a function called manylinux_compatible, then package installers should call this function, passing in the major version, minor version, and architecture from the manylinux tag, and it will either return a boolean saying whether wheels with the given tag should be considered compatible with the current system, or else None to indicate that the default logic should be used.
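As a rough illustration, the contents of a distributor’s _manylinux module might look like the sketch below. The policy shown (refusing all i686 wheels and deferring otherwise) is invented for the example; only the function name and its three arguments come from the text above.

```python
# Hypothetical contents of a distributor-provided _manylinux.py.


def manylinux_compatible(tag_major, tag_minor, tag_arch):
    """Override hook called by installers with the parsed tag components.

    Returns True or False to force a decision, or None to let the
    installer fall back to its default glibc and architecture check.
    """
    # Invented policy: this distribution does not ship the 32-bit
    # system libraries that manylinux i686 wheels would rely on.
    if tag_arch == "i686":
        return False
    # No opinion about anything else; use the default logic.
    return None


print(manylinux_compatible(2, 17, "i686"))
print(manylinux_compatible(2, 17, "x86_64"))
```

Returning None rather than True in the default case keeps the installer’s own glibc version check in force.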
For compatibility with previous specifications, if the tag is manylinux1 or manylinux_2_5 exactly, then we also check the module for a boolean attribute manylinux1_compatible; if the tag version is manylinux2010 or manylinux_2_12 exactly, then we also check the module for a boolean attribute manylinux2010_compatible; and if the tag version is manylinux2014 or manylinux_2_17 exactly, then we also check the module for a boolean attribute manylinux2014_compatible. If both the new and old attributes are defined, then manylinux_compatible takes precedence.
Here’s some example code. You don’t have to actually use this code, but you can use it for reference if you have questions about the exact semantics:
    import re

    LEGACY_ALIASES = {
        "manylinux1_x86_64": "manylinux_2_5_x86_64",
        "manylinux1_i686": "manylinux_2_5_i686",
        "manylinux2010_x86_64": "manylinux_2_12_x86_64",
        "manylinux2010_i686": "manylinux_2_12_i686",
        "manylinux2014_x86_64": "manylinux_2_17_x86_64",
        "manylinux2014_i686": "manylinux_2_17_i686",
        "manylinux2014_aarch64": "manylinux_2_17_aarch64",
        "manylinux2014_armv7l": "manylinux_2_17_armv7l",
        "manylinux2014_ppc64": "manylinux_2_17_ppc64",
        "manylinux2014_ppc64le": "manylinux_2_17_ppc64le",
        "manylinux2014_s390x": "manylinux_2_17_s390x",
    }

    # system_uses_glibc(), get_system_glibc_version() and get_system_arch()
    # are not defined here; each installer supplies its own implementation.

    def manylinux_tag_is_compatible_with_this_system(tag):
        # Normalize and parse the tag
        tag = LEGACY_ALIASES.get(tag, tag)
        m = re.match("manylinux_([0-9]+)_([0-9]+)_(.*)", tag)
        if not m:
            return False
        tag_major_str, tag_minor_str, tag_arch = m.groups()
        tag_major = int(tag_major_str)
        tag_minor = int(tag_minor_str)

        # Default logic: glibc must be present and new enough,
        # and the architecture must match exactly
        if not system_uses_glibc():
            return False
        sys_major, sys_minor = get_system_glibc_version()
        if (sys_major, sys_minor) < (tag_major, tag_minor):
            return False
        sys_arch = get_system_arch()
        if sys_arch != tag_arch:
            return False

        # Check for manual override
        try:
            import _manylinux
        except ImportError:
            pass
        else:
            if hasattr(_manylinux, "manylinux_compatible"):
                result = _manylinux.manylinux_compatible(
                    tag_major, tag_minor, tag_arch,
                )
                if result is not None:
                    return bool(result)
            else:
                # Legacy attributes are consulted only when the new
                # function is not defined
                if (tag_major, tag_minor) == (2, 5):
                    if hasattr(_manylinux, "manylinux1_compatible"):
                        return bool(_manylinux.manylinux1_compatible)
                if (tag_major, tag_minor) == (2, 12):
                    if hasattr(_manylinux, "manylinux2010_compatible"):
                        return bool(_manylinux.manylinux2010_compatible)
                if (tag_major, tag_minor) == (2, 17):
                    if hasattr(_manylinux, "manylinux2014_compatible"):
                        return bool(_manylinux.manylinux2014_compatible)

        return True
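The reference code above leaves system_uses_glibc(), get_system_glibc_version(), and get_system_arch() undefined. One possible way to fill them in is sketched below; this is an assumption about how an installer could do it, not a description of what pip or any other tool does. It relies on os.confstr("CS_GNU_LIBC_VERSION"), which on glibc systems returns a string such as "glibc 2.17", and on sysconfig.get_platform() as a stand-in for distutils.util.get_platform(). Unlike the reference code, these helpers return None when detection fails.

```python
import os
import re
import sysconfig


def parse_glibc_version(version_str):
    """Parse a string such as "glibc 2.17" into (2, 17), or return None."""
    m = re.search(r"(\d+)\.(\d+)", version_str)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def get_system_glibc_version():
    """Return (major, minor) of the running glibc, or None if not detected."""
    try:
        raw = os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, OSError, ValueError):
        # os.confstr is missing, or the name is unknown: not a glibc system.
        return None
    if not raw or not raw.startswith("glibc"):
        return None
    return parse_glibc_version(raw)


def system_uses_glibc():
    return get_system_glibc_version() is not None


def get_system_arch():
    """Return e.g. "x86_64" from "linux-x86_64", or None off Linux."""
    plat = sysconfig.get_platform()
    prefix = "linux-"
    return plat[len(prefix):] if plat.startswith(prefix) else None


print(get_system_glibc_version(), get_system_arch())
```

This sketch ignores corner cases such as a 32-bit Python running on a 64-bit kernel, which a real installer would need to handle.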
Package indexes
The exact set of wheel tags accepted by PyPI, or any package index, is a policy question, and up to the maintainers of that index. But, we recommend that package indexes accept any wheels whose platform tag matches the following regexes:
manylinux1_(x86_64|i686)
manylinux2010_(x86_64|i686)
manylinux2014_(x86_64|i686|aarch64|armv7l|ppc64|ppc64le|s390x)
manylinux_[0-9]+_[0-9]+_(.*)
Package indexes may impose additional requirements; for example, they might audit uploaded wheels and reject those that contain known problems, such as a manylinux_2_17 wheel that references symbols from later glibc versions, or dependencies on external libraries that are known not to exist on all systems. Or a package index might decide to be conservative and reject wheels tagged manylinux_2_999, on the grounds that no-one knows what the Linux distro landscape will look like when glibc 2.999 is released. We leave the details of any such checks to the discretion of the package index maintainers.
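As a minimal sketch, an index could apply the recommended regexes like this. The function name is hypothetical, and the choice to match against the whole tag string with fullmatch is an assumption, since the text above does not say how the patterns should be anchored.

```python
import re

# The four recommended patterns, copied from the list above.
RECOMMENDED_PATTERNS = [
    r"manylinux1_(x86_64|i686)",
    r"manylinux2010_(x86_64|i686)",
    r"manylinux2014_(x86_64|i686|aarch64|armv7l|ppc64|ppc64le|s390x)",
    r"manylinux_[0-9]+_[0-9]+_(.*)",
]
_COMPILED = [re.compile(p) for p in RECOMMENDED_PATTERNS]


def is_accepted_platform_tag(tag):
    """True if the whole tag matches one of the recommended patterns."""
    return any(p.fullmatch(tag) is not None for p in _COMPILED)


for tag in ["manylinux_2_17_x86_64", "manylinux2014_s390x", "manylinux1_aarch64"]:
    print(tag, is_accepted_platform_tag(tag))
```

An index that wants to be more conservative, for example by capping the glibc version it accepts, could layer further checks on top of this.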
Rejected alternatives
Continuing the manylinux20XX series: As discussed above, this leads to much more effort-intensive, slower, and more complex rollouts of new versions. And while there are two places where it seems at first to have some compensating benefits, if you look more closely this turns out not to be the case.
First, this forces us to produce human-readable descriptions of how Linux distros work, in the text of the PEP. But this is less valuable than it might seem at first, and can actually be handled better by the new “perennial” approach anyway.
If you’re trying to build wheels, the main thing you need is a tutorial on how to use the build images and tooling around them. If you’re trying to add support for a new build profile or create a competitor to auditwheel, then your best resources will be the auditwheel source code and issue tracker, which are always going to be more detailed, precise, and reliable than a summary spec written in English and without tests. Documentation like the old manylinux20XX PEPs does add value! But in both cases, it’s primarily as a secondary reference to provide overview and context.
And furthermore, the PEP process is poorly suited to maintaining this kind of reference documentation – there’s a reason we don’t keep the pip user manual in the PEPs repository! The auditwheel maintainers are the best situated to understand what kinds of documentation are useful to their users, and to maintain that documentation over time. For example, there’s substantial overlap between the different manylinux versions, and the PEP process currently forces us to handle this by copy-pasting everything between a growing list of documents; instead, the auditwheel maintainers might choose to factor out the common parts into a single piece of shared documentation.
A related concern was that with the perennial approach, it may become harder for package maintainers to decide which build profile to target: instead of having to pick between manylinux1, manylinux2010, manylinux2014, …, they now have a wider array of options like manylinux_2_5, manylinux_2_6, …, manylinux_2_20, … But again, we don’t believe this will be a problem in practice. In either system, most package maintainers won’t be starting by reading PEPs and trying to implement them from scratch. If you’re a particularly expert and ambitious package maintainer who needs to target a new version or new architecture, the perennial approach gives you additional flexibility. But for regular everyday maintainers, we expect they’ll start from a tutorial like packaging.python.org, and by choosing from existing build images. A tutorial can just as easily recommend manylinux_2_17 as it can recommend manylinux2014, and we expect the actual set of pre-provided build images to be identical in both cases. And again, by maintaining this documentation in the right place, instead of trying to do it in the PEPs repository, we expect that we’ll end up with documentation that’s higher-quality and more fitted to purpose.
Finally, some participants have pointed out that it’s very nice to be able to look at a wheel and tell definitively whether it meets the requirements of the spec. With the new “perennial” approach, we can never say with 100% certainty that a wheel does meet the spec, because that depends on the Linux distros. As engineers we have a well-justified dislike for that kind of uncertainty.
However: as demonstrated by the examples above, we can still tell definitively when a wheel doesn’t meet the spec, which turns out to be what’s important in practice. And, in practice, with the manylinux20XX approach, whenever distros change, we actually change the spec; it just takes a bit longer. So even if a wheel is compliant today, it might become non-compliant tomorrow. This is frustrating, but unfortunately this uncertainty is unavoidable if what you care about is distributing working wheels to users.
So even on these points where the old approach initially seems to have advantages, we expect the new approach to actually do as well or better.
Switching to perennial tags, but continuing to write a PEP for each version: This was proposed as a kind of hybrid, to try to get some of the advantages of the perennial tagging system – like easier rollouts of new versions – while keeping the advantages of the manylinux20XX scheme, like forcing us to write documentation about Linux distros, simplifying options for package maintainers, and being able to definitively tell when a wheel meets the spec. But as discussed above, on a closer look, it turns out that these advantages are largely illusory. And this also inherits significant disadvantages from the manylinux20XX scheme, like creating indefinite obligations to update a growing list of copy-pasted PEPs.
Making auditwheel normative: Another possibility that was considered was to make auditwheel the normative reference on the definition of manylinux, i.e., a wheel would be compliant if and only if auditwheel check completed without errors. This was rejected because the point of packaging PEPs is to define interoperability between tools, not to bless specific tools.
Adding extra words to the tag string: Another proposal we considered was to add extra words to the wheel tag, e.g. manylinux_glibc_2_17 instead of manylinux_2_17. The motivation would be to leave the door open to other kinds of versioning heuristics in the future – for example, we could have manylinux_glibc_$VERSION and manylinux_alpine_$VERSION.
But “manylinux” has always been a synonym for “broad compatibility with mainstream glibc-based distros”; reusing it for unrelated build profiles like alpine is more confusing than helpful. Also, some early reviewers who aren’t steeped in the details of packaging found the word glibc actively misleading, jumping to the conclusion that it meant they needed a system with exactly that glibc version. And tags like manylinux_$VERSION and alpine_$VERSION also have the advantages of parsimony and directness. So we’ll go with that.
Source: https://github.com/python/peps/blob/main/pep-0600.rst
Last modified: 2022-06-21 21:47:58 GMT