Commit 6c6568db authored by alvyjudy's avatar alvyjudy

docs: dedicate file for package discovery

parent c0794ed0
===================
Package Discovery
===================
Using ``find_packages()``
-------------------------
For simple projects, it's usually easy enough to manually add packages to
the ``packages`` argument of ``setup()``. However, for very large projects
(Twisted, PEAK, Zope, Chandler, etc.), it can be a big burden to keep the
package list updated. That's what ``setuptools.find_packages()`` is for.
``find_packages()`` takes a source directory and two lists of package name
patterns to exclude and include. If omitted, the source directory defaults to
the same
directory as the setup script. Some projects use a ``src`` or ``lib``
directory as the root of their source tree, and those projects would of course
use ``"src"`` or ``"lib"`` as the first argument to ``find_packages()``. (And
such projects also need something like ``package_dir={"": "src"}`` in their
``setup()`` arguments, but that's just a normal distutils thing.)
Anyway, ``find_packages()`` walks the target directory, filtering by inclusion
patterns, and finds Python packages (any directory). Packages are only
recognized if they include an ``__init__.py`` file. Finally, exclusion
patterns are applied to remove matching packages.
Inclusion and exclusion patterns are package names, optionally including
wildcards. For
example, ``find_packages(exclude=["*.tests"])`` will exclude all packages whose
last name part is ``tests``. Or, ``find_packages(exclude=["*.tests",
"*.tests.*"])`` will also exclude any subpackages of packages named ``tests``,
but it still won't exclude a top-level ``tests`` package or the children
thereof. In fact, if you really want no ``tests`` packages at all, you'll need
something like this::
find_packages(exclude=["*.tests", "*.tests.*", "tests.*", "tests"])
in order to cover all the bases. Really, the exclusion patterns are intended
to cover simpler use cases than this, like excluding a single, specified
package and its subpackages.
Regardless of the parameters, the ``find_packages()``
function returns a list of package names suitable for use as the ``packages``
argument to ``setup()``, and so is usually the easiest way to set that
argument in your setup script. Especially since it frees you from having to
remember to modify your setup script whenever your project grows additional
top-level packages or subpackages.
``find_namespace_packages()``
-----------------------------
In Python 3.3+, ``setuptools`` also provides the ``find_namespace_packages`` variant
of ``find_packages``, which has the same function signature as
``find_packages``, but works with `PEP 420`_ compliant implicit namespace
packages. Here is a minimal setup script using ``find_namespace_packages``::
from setuptools import setup, find_namespace_packages
setup(
name="HelloWorld",
version="0.1",
packages=find_namespace_packages(),
)
Keep in mind that according to PEP 420, you may have to either re-organize your
codebase a bit or define a few exclusions, as the definition of an implicit
namespace package is quite lenient, so for a project organized like so::
├── namespace
│   └── mypackage
│   ├── __init__.py
│   └── mod1.py
├── setup.py
└── tests
└── test_mod1.py
A naive ``find_namespace_packages()`` would install both ``namespace.mypackage`` and a
top-level package called ``tests``! One way to avoid this problem is to use the
``include`` keyword to whitelist the packages to include, like so::
from setuptools import setup, find_namespace_packages
setup(
name="namespace.mypackage",
version="0.1",
packages=find_namespace_packages(include=["namespace.*"])
)
Another option is to use the "src" layout, where all package code is placed in
the ``src`` directory, like so::
├── setup.py
├── src
│   └── namespace
│   └── mypackage
│   ├── __init__.py
│   └── mod1.py
└── tests
└── test_mod1.py
With this layout, the package directory is specified as ``src``, as such::
setup(name="namespace.mypackage",
version="0.1",
package_dir={"": "src"},
packages=find_namespace_packages(where="src"))
.. _PEP 420: https://www.python.org/dev/peps/pep-0420/
Namespace Packages
------------------
Sometimes, a large package is more useful if distributed as a collection of
smaller eggs. However, Python does not normally allow the contents of a
package to be retrieved from more than one location. "Namespace packages"
are a solution for this problem. When you declare a package to be a namespace
package, it means that the package has no meaningful contents in its
``__init__.py``, and that it is merely a container for modules and subpackages.
The ``pkg_resources`` runtime will then automatically ensure that the contents
of namespace packages that are spread over multiple eggs or directories are
combined into a single "virtual" package.
The ``namespace_packages`` argument to ``setup()`` lets you declare your
project's namespace packages, so that they will be included in your project's
metadata. The argument should list the namespace packages that the egg
participates in. For example, the ZopeInterface project might do this::
setup(
# ...
namespace_packages=["zope"]
)
because it contains a ``zope.interface`` package that lives in the ``zope``
namespace package. Similarly, a project for a standalone ``zope.publisher``
would also declare the ``zope`` namespace package. When these projects are
installed and used, Python will see them both as part of a "virtual" ``zope``
package, even though they will be installed in different locations.
Namespace packages don't have to be top-level packages. For example, Zope 3's
``zope.app`` package is a namespace package, and in the future PEAK's
``peak.util`` package will be too.
Note, by the way, that your project's source tree must include the namespace
packages' ``__init__.py`` files (and the ``__init__.py`` of any parent
packages), in a normal Python package layout. These ``__init__.py`` files
*must* contain the line::
__import__("pkg_resources").declare_namespace(__name__)
This code ensures that the namespace package machinery is operating and that
the current package is registered as a namespace package.
You must NOT include any other code and data in a namespace package's
``__init__.py``. Even though it may appear to work during development, or when
projects are installed as ``.egg`` files, it will not work when the projects
are installed using "system" packaging tools -- in such cases the
``__init__.py`` files will not be installed, let alone executed.
You must include the ``declare_namespace()`` line in the ``__init__.py`` of
*every* project that has contents for the namespace package in question, in
order to ensure that the namespace will be declared regardless of which
project's copy of ``__init__.py`` is loaded first. If the first loaded
``__init__.py`` doesn't declare it, it will never *be* declared, because no
other copies will ever be loaded!
\ No newline at end of file
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment