PEP 421 – Adding sys.implementation
- Author:
- Eric Snow <ericsnowcurrently at gmail.com>
- BDFL-Delegate:
- Barry Warsaw
- Status:
- Final
- Type:
- Standards Track
- Created:
- 26-Apr-2012
- Python-Version:
- 3.3
- Post-History:
- 26-Apr-2012
- Resolution:
- Python-Dev message
Abstract
This PEP introduces a new attribute for the sys module: sys.implementation. The attribute holds consolidated information about the implementation of the running interpreter. Thus sys.implementation is the source to which the standard library may look for implementation-specific information.
The proposal in this PEP is in line with a broader emphasis on making Python friendlier to alternate implementations. It describes the new variable and the constraints on what that variable contains. The PEP also explains some immediate use cases for sys.implementation.
Motivation
For a number of years now, the distinction between Python-the-language and CPython (the reference implementation) has been growing. Most of this change is due to the emergence of Jython, IronPython, and PyPy as viable alternate implementations of Python.
Consider, however, the nearly two decades of CPython-centric Python (i.e. most of its existence). That focus has understandably contributed to quite a few CPython-specific artifacts both in the standard library and exposed in the interpreter. Though the core developers have made an effort in recent years to address this, quite a few of the artifacts remain.
Part of the solution is presented in this PEP: a single namespace in which to consolidate implementation specifics. This will help focus efforts to differentiate the implementation specifics from the language. Additionally, it will foster a multiple-implementation mindset.
Proposal
We will add a new attribute to the sys module, called sys.implementation, as an object with attribute-access (as opposed to a mapping). It will contain implementation-specific information.
The attributes of this object will remain fixed during interpreter execution and through the course of an implementation version. This ensures that behavior which depends on attributes of sys.implementation does not change between versions.
The object has each of the attributes described in the Required Attributes section below. Those attribute names will never start with an underscore. The standard library and the language definition will rely only on those required attributes.
This proposal takes a conservative approach in requiring only a small number of attributes. As more become appropriate, they may be added with discretion, as described in Adding New Required Attributes.
While this PEP places no other constraints on sys.implementation, it also recommends that no one rely on capabilities outside those described here. The only exception to that recommendation is for attributes starting with an underscore. Implementers may use those as appropriate to store per-implementation data.
Required Attributes
These are attributes in sys.implementation on which the standard library and language definition will rely, meaning implementers must define them:
- name
- A lower-case identifier representing the implementation. Examples include ‘pypy’, ‘jython’, ‘ironpython’, and ‘cpython’.
- version
- The version of the implementation, as opposed to the version of the language it implements. This value conforms to the format described in Version Format.
- hexversion
- The version of the implementation in the same hexadecimal format as sys.hexversion.
- cache_tag
- A string used for the PEP 3147 cache tag. It would normally be a composite of the name and version (e.g. ‘cpython-33’ for CPython 3.3). However, an implementation may explicitly use a different cache tag. If cache_tag is set to None, it indicates that module caching should be disabled.
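As a minimal sketch, the four required attributes can be read straight off the namespace. The helper name required_attributes and the REQUIRED tuple are illustrative only and are not part of this proposal.

```python
import sys

# The four attribute names this PEP requires every implementation to define.
REQUIRED = ("name", "version", "hexversion", "cache_tag")


def required_attributes(impl=sys.implementation):
    """Return the required attributes of impl as a plain dict."""
    return {attr: getattr(impl, attr) for attr in REQUIRED}


info = required_attributes()
for attr, value in info.items():
    # The values printed depend on the interpreter running the code.
    print(f"{attr} = {value!r}")
```

On CPython 3.3 the name would be 'cpython' and the cache tag 'cpython-33'; other implementations report their own values.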
Adding New Required Attributes
In time more required attributes will be added to sys.implementation. However, each must have a meaningful use case across all Python implementations in order to be considered. This is made most clear by a use case in the standard library or language specification.
All proposals for new required attributes will go through the normal PEP process. Such a PEP need not be long, just long enough. It will need to sufficiently spell out the rationale for the new attribute, its use cases, and the impact it will have on the various Python implementations.
Version Format
A main point of sys.implementation is to contain information that will be used internally in the standard library. In order to facilitate the usefulness of the version attribute, its value should be in a consistent format across implementations.
As such, the format of sys.implementation.version will follow that of sys.version_info, which is effectively a named tuple. It is a familiar format and generally consistent with normal version format conventions.
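The following sketch shows how the version and hexversion attributes relate, assuming the bit layout that CPython documents for sys.hexversion (major, minor, micro, a release-level nibble, and the serial). The helper pack_version and the _LEVELS table are hypothetical names introduced for illustration.

```python
import sys
from types import SimpleNamespace

# Release-level nibbles used by the sys.hexversion encoding.
_LEVELS = {"alpha": 0xA, "beta": 0xB, "candidate": 0xC, "final": 0xF}


def pack_version(version):
    """Pack a version_info-style value into the sys.hexversion layout."""
    return (
        (version.major << 24)
        | (version.minor << 16)
        | (version.micro << 8)
        | (_LEVELS[version.releaselevel] << 4)
        | version.serial
    )


# A stand-in for the version of a 3.3.0 final release.
example = SimpleNamespace(major=3, minor=3, micro=0,
                          releaselevel="final", serial=0)
print(hex(pack_version(example)))  # 0x30300f0

# Because the format follows sys.version_info, the real value supports
# the same field names and tuple-style comparisons.
impl = sys.implementation
print(impl.version.major, impl.version.minor)
print(impl.version >= (3, 3))
```

Sharing one format means code that already handles sys.version_info can handle the implementation version without a second parser.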
Rationale
The status quo for implementation-specific information gives us that information in a more fragile, harder to maintain way. It is spread out over different modules or inferred from other information, as we see with platform.python_implementation().
This PEP is the main alternative to that approach. It consolidates the implementation-specific information into a single namespace and makes explicit that which was implicit.
Type Considerations
It’s very easy to get bogged down in discussions about the type of sys.implementation. However, its purpose is to support the standard library and language definition. As such, there isn’t much that really matters regarding its type, as opposed to a feature that would be more generally used. Thus characteristics like immutability and sequence-ness have been disregarded.
The only real choice has been between an object with attribute access and a mapping with item access. This PEP espouses dotted access to reflect the relatively fixed nature of the namespace.
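To make the two options concrete, here is a small sketch that models the same data both ways. The stand-in objects are built with types.SimpleNamespace and a dict purely for illustration; this PEP does not mandate a concrete type for sys.implementation.

```python
from types import SimpleNamespace

# The same data modeled as an object and as a mapping (illustrative values).
impl_as_object = SimpleNamespace(name="cpython", cache_tag="cpython-33")
impl_as_mapping = {"name": "cpython", "cache_tag": "cpython-33"}

print(impl_as_object.name)       # dotted access, the style this PEP chose
print(impl_as_mapping["name"])   # item access, the alternative considered

# Non-required attributes may be absent, so callers that look at them
# can supply a default instead of catching AttributeError.
print(getattr(impl_as_object, "vcs_url", None))  # None
```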
Non-Required Attributes
Earlier versions of this PEP included a required attribute called metadata that held any non-required, per-implementation data [16]. However, this proved to be an unnecessary addition considering the purpose of sys.implementation.
Ultimately, non-required attributes are virtually ignored in this PEP. They have no impact other than that careless use may collide with future required attributes. That, however, is but a marginal concern for sys.implementation.
Why a Part of sys?
The sys module holds the new namespace because sys is the depot for interpreter-centric variables and functions. Many implementation-specific attributes are already found in sys.
Why Strict Constraints on Any of the Values?
As already noted in Version Format, values in sys.implementation are intended for use by the standard library. Constraining those values, essentially specifying an API for them, allows them to be used consistently, regardless of how they are otherwise implemented. However, care should be taken not to over-specify the constraints.
Discussion
The topic of sys.implementation came up on the python-ideas list in 2009, where the reception was broadly positive [1]. I revived the discussion recently while working on a pure-python imp.get_tag() [2]. Discussion has been ongoing [3]. The messages in issue #14673 are also relevant.
A good part of the recent discussion centered on the type to use for sys.implementation.
Use-cases
platform.python_implementation()
“explicit is better than implicit”
The platform module determines the Python implementation by looking for clues in a couple of different sys variables [11]. However, this approach is fragile, requiring changes to the standard library each time an implementation changes. Beyond that, support in platform is limited to those implementations that core developers have blessed by special-casing them in the platform module.
With sys.implementation the various implementations would explicitly set the values in their own version of the sys module.
Another concern is that the platform module is part of the stdlib, which ideally should minimize implementation details such as would be moved to sys.implementation.
Any overlap between sys.implementation and the platform module would simply defer to sys.implementation (with the same interface in platform wrapping it).
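A wrapper of that kind might look like the sketch below. It is not the actual platform module code; the _DISPLAY_NAMES table and the impl parameter are assumptions made for illustration.

```python
import sys
from types import SimpleNamespace

# Hypothetical table of display names. Unknown implementations fall back
# to whatever identifier they report about themselves.
_DISPLAY_NAMES = {
    "cpython": "CPython",
    "ironpython": "IronPython",
    "jython": "Jython",
    "pypy": "PyPy",
}


def python_implementation(impl=sys.implementation):
    """Return a display name derived from impl.name, with no guessing."""
    return _DISPLAY_NAMES.get(impl.name, impl.name)


print(python_implementation())

# An implementation the table has never heard of still gets an answer,
# without any change to the standard library.
print(python_implementation(SimpleNamespace(name="newpython")))  # newpython
```

The point of the design is that the implementation states its own name, so the standard library no longer has to infer it from clues.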
Cache Tag Generation in Frozen Importlib
PEP 3147 defined the use of a module cache and cache tags for file names. The importlib bootstrap code, frozen into the Python binary as of 3.3, uses the cache tags during the import process. Part of the project to bootstrap importlib has been to clean code out of Python/import.c that did not need to be there any longer.
The cache tag defined in Python/import.c was hard-coded to "cpython" MAJOR MINOR. For importlib the options are either hard-coding it in the same way, or guessing the implementation in the same way as does platform.python_implementation().
As long as the hard-coded tag is limited to CPython-specific code, it is livable. However, inasmuch as other Python implementations use the importlib code to work with the module cache, a hard-coded tag would become a problem.
Directly using the platform module in this case is a non-starter. Any module used in the importlib bootstrap must be built-in or frozen, neither of which applies to the platform module. This is the point that led to the recent interest in sys.implementation.
Regardless of the outcome for the implementation name used, another problem relates to the version used in the cache tag. That version is likely to be the implementation version rather than the language version. However, the implementation version is not readily identified anywhere in the standard library.
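As a rough sketch of how import machinery could use the new namespace instead of a hard-coded tag, the functions below build the composite tag and a PEP 3147 style cache path. The names default_cache_tag and cache_filename are hypothetical, and the real importlib code differs in detail.

```python
import sys
from types import SimpleNamespace


def default_cache_tag(impl=sys.implementation):
    """Compose the usual tag from name and version, e.g. 'cpython-33'."""
    return f"{impl.name}-{impl.version.major}{impl.version.minor}"


def cache_filename(module_name, impl=sys.implementation):
    """Return a PEP 3147 style cache path, or None if caching is disabled."""
    tag = impl.cache_tag
    if tag is None:
        return None  # cache_tag of None means module caching is disabled
    return f"__pycache__/{module_name}.{tag}.pyc"


# A stand-in for CPython 3.3, so the output does not depend on the
# interpreter running the example.
cpython33 = SimpleNamespace(
    name="cpython",
    version=SimpleNamespace(major=3, minor=3),
    cache_tag="cpython-33",
)
print(default_cache_tag(cpython33))        # cpython-33
print(cache_filename("spam", cpython33))   # __pycache__/spam.cpython-33.pyc

# An implementation that opts out of caching.
no_cache = SimpleNamespace(cache_tag=None)
print(cache_filename("spam", no_cache))    # None
```

Since an implementation may choose a tag that is not the composite, consumers should read cache_tag directly rather than rebuild it from name and version.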
Implementation-Specific Tests
Currently there are a number of implementation-specific tests in the test suite under Lib/test. The test support module (Lib/test/support.py) provides some functionality for dealing with these tests. However, like the platform module, test.support must do some guessing that sys.implementation would render unnecessary.
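A test helper built on sys.implementation might look like the following sketch. The decorator impl_only and the test case are hypothetical and are not the helpers that test.support provides.

```python
import sys
import unittest


def impl_only(name):
    """Skip a test unless it runs on the named implementation."""
    return unittest.skipUnless(
        sys.implementation.name == name,
        f"requires the {name} implementation",
    )


class RefcountTests(unittest.TestCase):
    @impl_only("cpython")
    def test_getrefcount(self):
        # sys.getrefcount is a CPython detail, so other implementations
        # skip this test rather than fail it.
        self.assertGreaterEqual(sys.getrefcount(object()), 1)


if __name__ == "__main__":
    unittest.main()
```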
Jython’s os.name Hack
In Jython, os.name is set to ‘java’ to accommodate special treatment of the java environment in the standard library [14] [15]. Unfortunately it masks the os name that would otherwise go there. sys.implementation would help obviate the need for this special case. Currently Jython sets os._name for the normal os.name value.
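A sketch of the check that standard library code could make instead of testing os.name follows. The function name running_on_jython and its impl parameter are illustrative assumptions.

```python
import sys
from types import SimpleNamespace


def running_on_jython(impl=sys.implementation):
    """Detect Jython by its implementation name rather than by os.name."""
    return impl.name == "jython"


# With the implementation identified this way, os.name would be free to
# carry the real operating system name.
print(running_on_jython(SimpleNamespace(name="jython")))   # True
print(running_on_jython(SimpleNamespace(name="cpython")))  # False
```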
The Problem With sys.(version|version_info|hexversion)
Earlier versions of this PEP made the mistake of calling sys.version_info (and friends) the version of the Python language, in contrast to the implementation. However, this is not the case. Instead, it is the version of the CPython implementation. Incidentally, the first two components of sys.version_info (major and minor) also reflect the version of the language definition.
As Barry Warsaw noted, the “semantics of sys.version_info have been sufficiently squishy in the past” [17]. With sys.implementation we have the opportunity to improve this situation by first establishing an explicit location for the version of the implementation.
This PEP makes no other effort to directly clarify the semantics of sys.version_info. Regardless, having an explicit version for the implementation will definitely help to clarify the distinction from the language version.
Feedback From Other Python Implementers
IronPython
Jeff Hardy responded to a request for feedback [4]. He said, “I’ll probably add it the day after it’s approved” [6]. He also gave useful feedback on both the type of sys.implementation and on the metadata attribute (which has since been removed from the PEP).
Jython
In 2009 Frank Wierzbicki said this (relative to Jython implementing the required attributes) [8]:
Speaking for Jython, so far it looks like something we would adopt
soonish after it was accepted (it looks pretty useful to me).
PyPy
Some of the PyPy developers have responded to a request for feedback [9]. Armin Rigo said the following [10]:
For myself, I can only say that it looks like a good idea, which we
will happily adhere to when we migrate to Python 3.3.
He also expressed support for keeping the required list small. Both Armin and Laura Creighton indicated that an effort to better catalog Python’s implementation would be welcome. Such an effort, for which this PEP is a small start, will be considered separately.
Past Efforts
PEP 3139
PEP 3139, from 2008, recommended a clean-up of the sys module in part by extracting implementation-specific variables and functions into a separate module. PEP 421 is a less ambitious version of that idea. While PEP 3139 was rejected, its goals are reflected in PEP 421 to a large extent, though with a much lighter approach.
PEP 399
PEP 399 dictates policy regarding the standard library, helping to make it friendlier to alternate implementations. PEP 421 is proposed in that same spirit.
The Bigger Picture
It’s worth noting again that this PEP is a small part of a larger on-going effort to identify the implementation-specific parts of Python and mitigate their impact on alternate implementations.
sys.implementation is a focal point for implementation-specific data, acting as a nexus for cooperation between the language, the standard library, and the different implementations. As time goes by it is feasible that sys.implementation will assume current attributes of sys and other builtin/stdlib modules, where appropriate. In this way, it is a PEP 3139-lite, but starting as small as possible.
However, as already noted, many other efforts predate sys.implementation. Neither is it necessarily a major part of the effort. Rather, consider it as part of the infrastructure of the effort to make Python friendlier to alternate implementations.
Alternatives
Since the single-namespace-under-sys approach is relatively straightforward, no alternatives have been considered for this PEP.
Examples of Other Attributes
These are examples only and not part of the proposal. Most of them were suggested during previous discussions, but did not fit into the goals of this PEP. (See Adding New Required Attributes if they get you excited.)
- common_name
- The case-sensitive name by which the implementation is known.
- vcs_url
- A URL for the main VCS repository for the implementation project.
- vcs_revision_id
- A value that identifies the VCS revision of the implementation.
- build_toolchain
- The tools used to build the interpreter.
- build_date
- The timestamp of when the interpreter was built.
- homepage
- The URL of the implementation’s website.
- site_prefix
- The preferred site prefix for the implementation.
- runtime
- The run-time environment in which the interpreter is running, as in “Common Language Runtime” (.NET CLR) or “Java Runtime Executable”.
- gc_type
- The type of garbage collection used, like “reference counting” or “mark and sweep”.
Open Issues
Currently none.
Implementation
The implementation of this PEP is covered in issue #14673.
References
Copyright
This document has been placed in the public domain.
Source: https://github.com/python/peps/blob/main/pep-0421.txt
Last modified: 2022-08-24 22:40:18 GMT