Commit eeaf2b98 authored by PJ Eby's avatar PJ Eby

Documentation for namespace packages, working sets, and supporting custom

PEP 302 importers.  Once the "Distribution" class is documented, this will
be a complete API reference for pkg_resources.

--HG--
branch : setuptools
extra : convert_revision : svn%3A6015fed2-1504-0410-9fe1-9d1591cc4771/sandbox/trunk/setuptools%4041196
parent 6ad319bf
......@@ -46,19 +46,231 @@ Developer's Guide
API Reference
-------------
'require', 'run_script',
Namespace Package Support
=========================
declare_namespace, fixup_namespace_packages, register_namespace_handler
A namespace package is a package that only contains other packages and modules,
with no direct contents of its own. Such packages can be split across
multiple, separately-packaged distributions. Normally, you do not need to use
the namespace package APIs directly; instead you should supply the
``namespace_packages`` argument to ``setup()`` in your project's ``setup.py``.
See the `setuptools documentation on namespace packages`_ for more information.
However, if for some reason you need to manipulate namespace packages or
directly alter ``sys.path`` at runtime, you may find these APIs useful:
``declare_namespace(name)``
Declare that the dotted package name `name` is a "namespace package" whose
contained packages and modules may be spread across multiple distributions.
The named package's ``__path__`` will be extended to include the
corresponding package in all distributions on ``sys.path`` that contain a
package of that name. (More precisely, if an importer's
``find_module(name)`` returns a loader, then it will also be searched for
the package's contents.) Whenever a Distribution's ``to_install()`` method
is invoked, it checks for the presence of namespace packages and updates
their ``__path__`` contents accordingly.
``fixup_namespace_packages(path_item)``
Declare that `path_item` is a newly added item on ``sys.path`` that may
need to be used to update existing namespace packages. Ordinarily, this is
called for you when an egg is automatically added to ``sys.path``, but if
your application modifies ``sys.path`` to include locations that may
contain portions of a namespace package, you will need to call this
function to ensure they are added to the existing namespace packages.
Although by default ``pkg_resources`` only supports namespace packages for
filesystem and zip importers, you can extend its support to other "importers"
compatible with PEP 302 using the ``register_namespace_handler()`` function.
See the section below on `Supporting Custom Importers`_ for details.
.. _setuptools documentation on namespace packages: http://peak.telecommunity.com/DevCenter/setuptools#namespace-packages
``WorkingSet`` Objects
======================
Listeners
working_set
The ``WorkingSet`` class provides access to a collection of "active"
distributions. In general, there is only one meaningful ``WorkingSet``
instance: the one that represents the distributions that are currently active
on ``sys.path``. This global instance is available under the name
``working_set`` in the ``pkg_resources`` module. However, specialized
tools may wish to manipulate working sets that don't correspond to
``sys.path``, and therefore may wish to create other ``WorkingSet`` instances.
It's important to note that the global ``working_set`` object is initialized
from ``sys.path`` when ``pkg_resources`` is first imported, but is only updated
if you do all future ``sys.path`` manipulation via ``pkg_resources`` APIs. If
you manually modify ``sys.path``, you must invoke the appropriate methods on
the ``working_set`` instance to keep it in sync. Unfortunately, Python does
not provide any way to detect arbitrary changes to a list object like
``sys.path``, so ``pkg_resources`` cannot automatically update the
``working_set`` based on changes to ``sys.path``.
``WorkingSet(entries=None)``
Create a ``WorkingSet`` from an iterable of path entries. If `entries`
is not supplied, it defaults to the value of ``sys.path`` at the time
the constructor is called.
Note that you will not normally construct ``WorkingSet`` instances
yourbut instead you will implicitly or explicitly use the global
``working_set`` instance. For the most part, the ``pkg_resources`` API
is designed so that the ``working_set`` is used by default, such that you
don't have to explicitly refer to it most of the time.
Basic ``WorkingSet`` Methods
----------------------------
The following methods of ``WorkingSet`` objects are also available as module-
level functions in ``pkg_resources`` that apply to the default ``working_set``
instance. Thus, you can use e.g. ``pkg_resources.require()`` as an
abbreviation for ``pkg_resources.working_set.require()``:
``require(*requirements)``
Ensure that distributions matching `requirements` are activated
`requirements` must be a string or a (possibly-nested) sequence
thereof, specifying the distributions and versions required. The
return value is a sequence of the distributions that needed to be
activated to fulfill the requirements; all relevant distributions are
included, even if they were already activated in this working set.
For the syntax of requirement specifiers, see the section below on
`Requirements Parsing`_.
Note: in general, it should not be necessary for you to call this method
directly. It's intended more for use in quick-and-dirty scripting and
interactive interpreter hacking than for production use. If you're creating
an actual library or application, it's strongly recommended that you create
a "setup.py" script using ``setuptools``, and declare all your requirements
there. That way, tools like EasyInstall can automatically detect what
requirements your package has, and deal with them accordingly.
``run_script(requires, script_name)``
Locate distribution specified by `requires` and run its `script_name`
script. `requires` must be a string containing a requirement specifier.
(See `Requirements Parsing`_ below for the syntax.)
The script, if found, will be executed in *the caller's globals*. That's
because this method is intended to be called from wrapper scripts that
act as a proxy for the "real" scripts in a distribution. A wrapper script
usually doesn't need to do anything but invoke this function with the
correct arguments.
If you need more control over the script execution environment, you
probably want to use the ``run_script()`` method of a ``Distribution``
object's `Metadata API`_ instead.
``iter_entry_points(group, name=None)``
Yield entry point objects from `group` matching `name`
If `name` is None, yields all entry points in `group` from all
distributions in the working set, otherwise only ones matching both
`group` and `name` are yielded. Entry points are yielded from the active
distributions in the order that the distributions appear in the working
set. (For the global ``working_set``, this should be the same as the order
that they are listed in ``sys.path``.) Note that within the entry points
advertised by an individual distribution, there is no particular ordering.
Please see the section below on `Entry Points`_ for more information.
``WorkingSet`` Methods and Attributes
-------------------------------------
These methods are used to query or manipulate the contents of a specific
working set, so they must be explicitly invoked on a particular ``WorkingSet``
instance:
``add_entry(entry)``
Add a path item to the ``entries``, finding any distributions on it. You
should use this when you add additional items to ``sys.path`` and you want
the global ``working_set`` to reflect the change. This method is also
called by the ``WorkingSet()`` constructor during initialization.
This method uses ``find_distributions(entry,False)`` to find distributions
corresponding to the path entry, and then ``add()`` them. `entry` is
always appended to the ``entries`` attribute, even if it is already
present, however. (This is because ``sys.path`` can contain the same value
more than once, and the ``entries`` attribute should be able to reflect
this.)
``__contains__(dist)``
True if `dist` is active in this ``WorkingSet``. Note that only one
distribution for a given project can be active in a given ``WorkingSet``.
``__iter__()``
Yield distributions for non-duplicate projects in the working set.
The yield order is the order in which the items' path entries were
added to the working set.
``find(req)``
Find a distribution matching `req` (a ``Requirement`` instance).
If there is an active distribution for the requested project, this
returns it, as long as it meets the version requirement specified by
`req`. But, if there is an active distribution for the project and it
does *not* meet the `req` requirement, ``VersionConflict`` is raised.
If there is no active distribution for the requested project, ``None``
is returned.
``resolve(requirements, env=None, installer=None)``
List all distributions needed to (recursively) meet `requirements`
`requirements` must be a sequence of ``Requirement`` objects. `env`,
if supplied, should be an ``Environment`` instance. If
not supplied, an ``Environment`` is created from the working set's
``entries``. `installer`, if supplied, will be invoked with each
requirement that cannot be met by an already-installed distribution; it
should return a ``Distribution`` or ``None``. (See the ``obtain()`` method
of `Environment Objects`_, below, for more information on the `installer`
argument.)
``add(dist, entry=None)``
Add `dist` to working set, associated with `entry`
If `entry` is unspecified, it defaults to ``dist.location``. On exit from
this routine, `entry` is added to the end of the working set's ``.entries``
(if it wasn't already present).
`dist` is only added to the working set if it's for a project that
doesn't already have a distribution active in the set. If it's
successfully added, any callbacks registered with the ``subscribe()``
method will be called. (See `Receiving Change Notifications`_, below.)
Note: ``add()`` is automatically called for you by the ``require()``
method, so you don't normally need to use this method directly.
``entries``
This attribute represents a "shadow" ``sys.path``, primarily useful for
debugging. If you are experiencing import problems, you should check
the global ``working_set`` object's ``entries`` against ``sys.path``, to
ensure that they match. If they do not, then some part of your program
is manipulating ``sys.path`` without updating the ``working_set``
accordingly. IMPORTANT NOTE: do not directly manipulate this attribute!
Setting it equal to ``sys.path`` will not fix your problem, any more than
putting black tape over an "engine warning" light will fix your car! If
this attribute is out of sync with ``sys.path``, it's merely an *indicator*
of the problem, not the cause of it.
Receiving Change Notifications
------------------------------
Extensible applications and frameworks may need to receive notification when
a new distribution (such as a plug-in component) has been added to a working
set. This is what the ``subscribe()`` method and ``add_activation_listener()``
function are for.
``subscribe(callback)``
Invoke ``callback(distribution)`` once for each active distribution that is
in the set now, or gets added later. Because the callback is invoked for
already-active distributions, you do not need to loop over the working set
yourself to deal with the existing items; just register the callback and
be prepared for the fact that it will be called immediately by this method.
``pkg_resources.add_activation_listener()`` is an alternate spelling of
``pkg_resources.working_set.subscribe()``.
``Environment`` Objects
......@@ -317,7 +529,7 @@ to avoid collision with other packages' entry point groups.
``EntryPoint`` instance in that group.
``iter_entry_points(group, name=None)``
Yield entry point objects from `group` matching `name`
Yield entry point objects from `group` matching `name`.
If `name` is None, yields all entry points in `group` from all
distributions in the working set on sys.path, otherwise only ones matching
......@@ -326,6 +538,9 @@ to avoid collision with other packages' entry point groups.
sys.path. (Within entry points for a particular distribution, however,
there is no particular ordering.)
(This API is actually a method of the global ``working_set`` object; see
the section above on `Basic WorkingSet Methods`_ for more information.)
Creating and Parsing
--------------------
......@@ -411,16 +626,37 @@ addition, the following methods are provided:
``EntryPoint.parse()`` to yield an equivalent ``EntryPoint``.
``Distribution`` Objects
========================
Factories: get_provider, get_distribution, find_distributions; see also
Factories: see also
WorkingSet and Environment APIs.
register_finder
register_loader_type
'load_entry_point', 'get_entry_map', 'get_entry_info'
Getting or Creating Distributions
---------------------------------
``find_distributions(path_item, only=False)``
Yield distributions accessible via `path_item`. If `only` is true, yield
only distributions whose ``location`` is equal to `path_item`. In other
words, if `only` is true, this yields any distributions that would be
importable if `path_item` were on ``sys.path``. If `only` is false, this
also yields distributions that are "in" or "under" `path_item`, but would
not be importable unless their locations were also added to ``sys.path``.
``get_distribution(dist_spec)``
Return a ``Distribution`` object for a given ``Requirement`` or string.
If `dist_spec` is already a ``Distribution`` instance, it is returned.
If it is a ``Requirement`` object or a string that can be parsed into one,
it is used to locate and activate a matching distribution, which is then
returned.
``ResourceManager`` API
=======================
......@@ -525,14 +761,16 @@ Resource Extraction
details.)
Resources are extracted to subdirectories of this path based upon
information given by the ``IResourceProvider``. You may set this to a
information given by the resource provider. You may set this to a
temporary directory, but then you must call ``cleanup_resources()`` to
delete the extracted files when done. There is no guarantee that
``cleanup_resources()`` will be able to remove all extracted files.
``cleanup_resources()`` will be able to remove all extracted files. (On
Windows, for example, you can't unlink .pyd or .dll files that are still
in use.)
(Note: you may not change the extraction path for a given resource
Note that you may not change the extraction path for a given resource
manager once resources have been extracted, unless you first call
``cleanup_resources()``.)
``cleanup_resources()``.
``cleanup_resources(force=False)``
Delete all extracted resource files and directories, returning a list
......@@ -550,7 +788,7 @@ Resource Extraction
If you are implementing an ``IResourceProvider`` and/or ``IMetadataProvider``
for a new distribution archive format, you may need to use the following
``ResourceManager`` methods to co-ordinate extraction of resources to the
``IResourceManager`` methods to co-ordinate extraction of resources to the
filesystem. If you're not implementing an archive format, however, you have
no need to use these methods. Unlike the other methods listed above, they are
*not* available as top-level functions tied to the global ``ResourceManager``;
......@@ -592,12 +830,11 @@ begin with a ``/``. You should not use ``os.path`` routines to manipulate
resource paths.
The metadata API is provided by objects implementing the ``IMetadataProvider``
interface. ``Distribution`` objects created with a ``metadata`` setting
implement this interface, as do objects returned by the ``get_provider()``
function:
or ``IResourceProvider`` interfaces. ``Distribution`` objects implement this
interface, as do objects returned by the ``get_provider()`` function:
``get_provider(package_or_requirement)``
If a package name is supplied, return an ``IMetadataProvider`` for the
If a package name is supplied, return an ``IResourceProvider`` for the
package. If a ``Requirement`` is supplied, resolve it by returning a
``Distribution`` from the current working set (searching the current
``Environment`` if necessary and adding the newly found ``Distribution``
......@@ -606,12 +843,15 @@ function:
Note also that if you supply a package name, and the package is not part
of a pluggable distribution (i.e., it has no metadata), then you will still
get an ``IMetadataProvider`` object, but it will just be an empty one that
answers ``False`` when asked if any given metadata resource file or
directory exists.
get an ``IResourceProvider`` object, but it will return ``False`` when
asked if any metadata files or directories exist.
``IMetadataProvider`` Methods
-----------------------------
The methods provided by ``IMetadataProvider`` (and ``Distribution`` objects
with a ``metadata`` attribute set) are:
The methods provided by objects (such as ``Distribution`` instances) that
implement the ``IMetadataProvider`` or ``IResourceProvider`` interfaces are:
``has_metadata(name)``
Does the named metadata resource exist?
......@@ -668,6 +908,150 @@ occur when processing requests to locate and activate packages::
was requested from.
Supporting Custom Importers
===========================
By default, ``pkg_resources`` supports normal filesystem imports, and
``zipimport`` importers. If you wish to use the ``pkg_resources`` features
with other (PEP 302-compatible) importers or module loaders, you may need to
register various handlers and support functions using these APIs:
``register_finder(importer_type, distribution_finder)``
Register `distribution_finder` to find distributions in ``sys.path`` items.
`importer_type` is the type or class of a PEP 302 "Importer" (``sys.path``
item handler), and `distribution_finder` is a callable that, when passed a
path item, the importer instance, and an `only` flag, yields
``Distribution`` instances found under that path item. (The `only` flag,
if true, means the finder should yield only ``Distribution`` objects whose
``location`` is equal to the path item provided.)
See the source of the ``pkg_resources.find_on_path`` function for an
example finder function.
``register_loader_type(loader_type, provider_factory)``
Register `provider_factory` to make ``IResourceProvider`` objects for
`loader_type`. `loader_type` is the type or class of a PEP 302
``module.__loader__``, and `provider_factory` is a function that, when
passed a module object, returns an `IResourceProvider`_ for that module,
allowing it to be used with the `ResourceManager API`_.
``register_namespace_handler(importer_type, namespace_handler)``
Register `namespace_handler` to declare namespace packages for the given
`importer_type`. `importer_type` is the type or class of a PEP 302
"importer" (sys.path item handler), and `namespace_handler` is a callable
with a signature like this::
def namespace_handler(importer, path_entry, moduleName, module):
# return a path_entry to use for child packages
Namespace handlers are only called if the relevant importer object has
already agreed that it can handle the relevant path item. The handler
should only return a subpath if the module ``__path__`` does not already
contain an equivalent subpath. Otherwise, it should return None.
For an example namespace handler, see the source of the
``pkg_resources.file_ns_handler`` function, which is used for both zipfile
importing and regular importing.
IResourceProvider
-----------------
``IResourceProvider`` is an abstract class that documents what methods are
required of objects returned by a `provider_factory` registered with
``register_loader_type()``. ``IResourceProvider`` is a subclass of
``IMetadataProvider``, so objects that implement this interface must also
implement all of the `IMetadataProvider Methods`_ as well as the methods
shown here. The `manager` argument to the methods below must be an object
that supports the full `ResourceManager API`_ documented above.
``get_resource_filename(manager, resource_name)``
Return a true filesystem path for `resource_name`, co-ordinating the
extraction with `manager`, if the resource must be unpacked to the
filesystem.
``get_resource_stream(manager, resource_name)``
Return a readable file-like object for `resource_name`.
``get_resource_string(manager, resource_name)``
Return a string containing the contents of `resource_name`.
``has_resource(resource_name)``
Does the package contain the named resource?
``resource_isdir(resource_name)``
Is the named resource a directory? Return a false value if the resource
does not exist or is not a directory.
``resource_listdir(resource_name)``
Return a list of the contents of the resource directory, ala
``os.listdir()``. Requesting the contents of a non-existent directory may
raise an exception.
Note, by the way, that your provider classes need not (and should not) subclass
``IResourceProvider`` or ``IMetadataProvider``! These classes exist solely
for documentation purposes and do not provide any useful implementation code.
You may instead wish to subclass one of the `built-in resource providers`_.
Built-in Resource Providers
---------------------------
``pkg_resources`` includes several provider classes, whose inheritance
tree looks like this::
NullProvider
EggProvider
DefaultProvider
PathMetadata
ZipProvider
EggMetadata
EmptyProvider
``NullProvider``
This provider class is just an abstract base that provides for common
provider behaviors (such as running scripts), given a definition for just
a few abstract methods.
``EggProvider``
This provider class adds in some egg-specific features that are common
to zipped and unzipped eggs.
``DefaultProvider``
This provider class is used for unpacked eggs and "plain old Python"
filesystem modules.
``ZipProvider``
This provider class is used for all zipped modules, whether they are eggs
or not.
``EmptyProvider``
This provider class always returns answers consistent with a provider that
has no metadata or resources. ``Distribution`` objects created without
a ``metadata`` argument use an instance of this provider class instead.
Since all ``EmptyProvider`` instances are equivalent, there is no need
to have more than one instance. ``pkg_resources`` therefore creates a
global instance of this class under the name ``empty_provider``, and you
may use it if you have need of an ``EmptyProvider`` instance.
``PathMetadata(path, egg_info)``
Create an ``IResourceProvider`` for a filesystem-based distribution, where
`path` is the filesystem location of the importable modules, and `egg_info`
is the filesystem location of the distribution's metadata directory.
`egg_info` should usually be the ``EGG-INFO`` subdirectory of `path` for an
"unpacked egg", and a ``ProjectName.egg-info`` subdirectory of `path` for
a "development egg". However, other uses are possible for custom purposes.
``EggMetadata(zipimporter)``
Create an ``IResourceProvider`` for a zipfile-based distribution. The
`zipimporter` should be a ``zipimport.zipimporter`` instance, and may
represent a "basket" (a zipfile containing multiple ".egg" subdirectories)
a specific egg *within* a basket, or a zipfile egg (where the zipfile
itself is a ".egg"). It can also be a combination, such as a zipfile egg
that also contains other eggs.
Utility Functions
=================
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment